For some time I’ve been working as a contractor developing a new style and vector tile schema for the Wikimedia Foundation. It’s been completed but not deployed for several months. As my contract finishes this month and Wikimedia Foundation leadership has decided to not deploy the new map styles, I’m writing up the technical lessons learned from my experiences on the style. I’m not going to be discussing the organizational factors that led to the decision, but looking at how I’d code things differently if starting over.
*Overview*
A complete map style consists of three parts: the database loading rules, the feature selection rules, and the styling rules. For a style written in the languages used by the WMF stack, these are expressed in osm2pgsql instructions, a tm2source project with SQL, and CartoCSS. The first tells you how to get the data into the database, the second defines what data goes into the vector tiles, and the third is how to draw features in the vector tiles. What goes in the vector tiles is also known as the “schema” and can be expressed in terms of what features appear and when, e.g. secondary roads first appear on zoom 12. To add to the confusion, the database also has a “schema”, and both of these are distinct from a PostgreSQL “SCHEMA.”
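As a rough illustration of a vector tile schema being "what features appear and when", here is a minimal Python sketch. The real schema lives in the SQL of the tm2source project, not in Python. Only the secondary road entry comes from the example above; the other class names and zoom levels are made up for illustration.

```python
# Minimum zoom at which each feature class first enters the vector tiles.
# Only "secondary" reflects the example in the text; the rest are hypothetical.
MIN_ZOOM = {
    "motorway": 6,      # hypothetical value
    "secondary": 12,    # secondary roads first appear on zoom 12
    "residential": 14,  # hypothetical value
}


def features_at(zoom):
    """Return the feature classes present in a tile at the given zoom."""
    return sorted(name for name, min_zoom in MIN_ZOOM.items() if zoom >= min_zoom)


if __name__ == "__main__":
    for zoom in (10, 12, 14):
        print(zoom, features_at(zoom))
```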
In the current style, the parts are the osm2pgsql C transforms, osm-bright.tm2source, and WMF’s fork of osm-bright.tm2. In the new style, the parts are ClearTables + osm2pgsql, meddo, and brightmed.
The goal with the style changes was to improve the representation of disputed borders, switch to a vector tile schema without a legal cloud over it, and make some styling improvements. In this it succeeded.
*Database schema*
The decision was made early on to go with ClearTables. This is an alternative set of rules for osm2pgsql which loads the data into many more tables for greater performance, easier style rules, and a bigger layer of abstraction between raw OSM tags and the SQL you need to write. I started it before my work at WMF, and only a few features were added for this project.
ClearTables does what it is designed to do, but it was a mistake for this project. I still believe it is technically the better solution, but its advantages are not worth the costs of doing something different.
The two most common database schemas are the built-in osm2pgsql “C transforms” and the one used by OpenStreetMap Carto. Their code isn’t any better; with its test suite, ClearTables probably has fewer bugs. But there are many guides on how to set them up, and they require fewer components.
Setting up the database isn’t an issue for WMF production servers, but is considered one of the more difficult steps for potential contributors to any style. Minimizing differences from other setups here helps greatly. A second issue is that many potential users of the style already have a database. I have heard from multiple people who would like to run the style if it could be used with their existing databases.
*Static data*
Map styles need some forms of “static” data loaded, such as oceans, low-zoom data, and borders. Normally this is done on an ad-hoc basis with a long, complicated shp2pgsql or ogr2ogr command, but I wrote a Python script that downloads the data and loads it with ogr2ogr, as well as handling all the SQL needed to update the data without a service interruption.
This script is useful enough that I have reused it for other projects, which was made easy because I didn’t hard-code the files used into the script, but used another file to define them.
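The idea of keeping the file list outside the script can be sketched roughly as follows. This is not the actual script: the table name, URL, schema names, and helper functions are hypothetical, and the definitions are inlined as a dict only to keep the example self-contained. The ogr2ogr options used (`-f`, `-nln`, `-overwrite`, `-lco`, `-t_srs`) are standard ones. Loading into a staging schema and then swapping tables inside one transaction is one way to avoid a service interruption.

```python
# Hypothetical definitions; in practice these would be read from a separate
# file (for example YAML or JSON) so the script itself stays generic.
SOURCES = {
    "ocean_polygons": {
        "url": "https://example.com/ocean-polygons.zip",  # placeholder URL
        "file": "ocean_polygons.shp",
    },
}


def ogr2ogr_command(table, source, dbname="gis", schema="loading"):
    """Build an ogr2ogr command that loads one file into a staging schema."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",
        f"PG:dbname={dbname}",
        source["file"],
        "-nln", f"{schema}.{table}",   # destination table name
        "-overwrite",                  # replace any previous staging copy
        "-lco", "GEOMETRY_NAME=way",   # geometry column name (an assumption)
        "-t_srs", "EPSG:3857",         # reproject to Web Mercator
    ]


def swap_sql(table, schema="loading", target="public"):
    """SQL that replaces the live table with the staged one in one transaction."""
    return (
        "BEGIN;\n"
        f'DROP TABLE IF EXISTS "{target}"."{table}";\n'
        f'ALTER TABLE "{schema}"."{table}" SET SCHEMA "{target}";\n'
        "COMMIT;"
    )


if __name__ == "__main__":
    for name, src in SOURCES.items():
        # A real script would download src["url"], run the command with
        # subprocess.run(..., check=True), then execute the swap SQL.
        print(" ".join(ogr2ogr_command(name, src)))
        print(swap_sql(name))
```

Because the swap happens inside a transaction, readers see either the old table or the new one, never a missing table. Adding a new dataset only means adding an entry to the definitions file.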
*Borders*
One of the drivers of the work was to better display disputed borders. A pre-processing step was considered necessary for this, and I wrote the required program in C++ with libosmium. This worked, but I should have made more of an effort to get it packaged by Debian GIS and run on Jochen Topf’s OpenStreetMapData.com servers, so that others could use the work and more developers would be encouraged to participate in maintenance. I should also have given pyosmium a more detailed look.
*Vector tile schema*
One of the reasons for switching to a new schema was legal threats against people using the Mapbox Streets schema. This meant osm2vectortiles also had to switch schemas at the same time. There was an effort to work with them to use a common schema, but it never happened because we had different needs. In retrospect, we should have either gone with a common schema and tm2source project, or done nothing in common. Either choice is valid, and it’s a balance of coordination work against a common development direction.
It was useful to have someone external to discuss ideas with, but this wouldn’t have been required if there had been other people on the team.
*Style*
The original plan was to largely stick with the cartography of osm-bright. This changed once we got into implementation and we realized how insane some parts of the osm-bright cartography were, and efforts were made towards redoing the style.
The road colours selected were from ColorBrewer2 OrRd6, with casing colours done by adjusting the Lch lightness and chroma. It would have been better to pick endpoints and generate colours using a script, similar to osm-carto. This would have allowed easier changes and sped up development by reducing the number of variables that need to be manually set.
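A script along these lines might look like the sketch below. It is not the osm-carto script or anything used in the style. The endpoint values, road class names, and casing offsets are hypothetical, and hue is interpolated linearly without wrap-around, which assumes the two endpoints have nearby hues. Colours are interpolated in CIE LCh and converted to sRGB with the standard D65 formulas, clipping anything outside the sRGB gamut.

```python
import math

# D65 reference white for the CIE XYZ <-> Lab conversion.
WHITE = (0.95047, 1.0, 1.08883)


def lch_to_hex(l, c, h):
    """Convert CIE LCh(ab), h in degrees, to an sRGB hex string (gamut-clipped)."""
    a = c * math.cos(math.radians(h))
    b = c * math.sin(math.radians(h))
    # Lab -> XYZ
    fy = (l + 16) / 116
    fx = fy + a / 500
    fz = fy - b / 200

    def finv(t):
        return t ** 3 if t > 6 / 29 else 3 * (6 / 29) ** 2 * (t - 4 / 29)

    x, y, z = (w * finv(f) for w, f in zip(WHITE, (fx, fy, fz)))
    # XYZ -> linear sRGB
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def to_8bit(u):
        u = min(max(u, 0.0), 1.0)  # clip out-of-gamut values
        v = 12.92 * u if u <= 0.0031308 else 1.055 * u ** (1 / 2.4) - 0.055
        return round(v * 255)

    return "#{:02x}{:02x}{:02x}".format(*(to_8bit(u) for u in linear))


def ramp(start, end, steps):
    """Linearly interpolate between two (L, C, h) endpoints, inclusive."""
    return [
        tuple(s + (e - s) * i / (steps - 1) for s, e in zip(start, end))
        for i in range(steps)
    ]


def casing(lch, dl=-15, dc=10):
    """Derive a casing colour by shifting lightness and chroma (offsets are hypothetical)."""
    l, c, h = lch
    return (max(l + dl, 0), max(c + dc, 0), h)


if __name__ == "__main__":
    # Hypothetical road classes and endpoints, for illustration only.
    classes = ["motorway", "trunk", "primary", "secondary", "tertiary", "minor"]
    fills = ramp((50, 80, 40), (95, 15, 80), len(classes))
    for name, fill in zip(classes, fills):
        print(f"@{name}-fill: {lch_to_hex(*fill)};")
        print(f"@{name}-casing: {lch_to_hex(*casing(fill))};")
```

With this approach, changing the look of the whole road hierarchy means editing two endpoints and two offsets instead of a dozen hand-picked colour variables.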
*Overall*
The style was completed successfully and on time, and none of the changes would have significantly changed that. They would have mainly made it easier to attract external contributors if an effort had been put into that. As attracting external contributors wasn’t a priority, they didn’t matter.