For some time I’ve been working as a contractor developing a new style
and vector tile schema for the Wikimedia Foundation. The work has been
complete for several months but has not been deployed. As my contract
finishes this month and Wikimedia Foundation leadership has decided not
to deploy the new map styles, I’m writing up the technical lessons
learned from my work on the style. I’m not going to discuss the
organizational factors that led to the decision, but will look at how
I’d code things differently if starting over.
*Overview*
A complete map style consists of three parts: the database loading
rules, the feature selection rules, and the styling rules. For a style
written in the languages used by the WMF stack, these are expressed as
osm2pgsql instructions, a tm2source project with SQL, and CartoCSS. The
first defines how to get the data into the database, the second defines
what data goes into the vector tiles, and the third defines how to draw
the features in the vector tiles. What goes in the vector tiles is also
known as the “schema” and can be expressed in terms of what features
appear and when, e.g. secondary roads first appear on zoom 12. For
increased confusion, the database also has a “schema”, and both of
these are distinct from a PostgreSQL “SCHEMA.”
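
As a rough illustration of what "what features appear and when" means,
a vector tile schema can be thought of as a table of minimum zooms. This
is only a sketch: apart from secondary roads at zoom 12, the feature
names and zoom values below are made up and are not taken from meddo.

```python
# Hypothetical minimum zooms. Only the "secondary" value comes from the
# text above; the rest are placeholders for illustration.
MIN_ZOOM = {
    "motorway": 6,
    "primary": 10,
    "secondary": 12,
    "residential": 14,
}


def features_at(zoom):
    """Return the feature classes present in a tile at the given zoom."""
    return sorted(name for name, first in MIN_ZOOM.items() if zoom >= first)


if __name__ == "__main__":
    for zoom in (11, 12):
        print(zoom, features_at(zoom))
```

In a real tm2source project this logic lives in the SQL for each layer
rather than in a lookup table.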
In the current style, the parts are the osm2pgsql C transforms,
osm-bright.tm2source, and WMF’s fork of osm-bright.tm2. In the new
style, the parts are ClearTables + osm2pgsql, meddo, and brighmed.
The goal with the style changes was to improve the representation of
disputed borders, switch to a vector tile schema without a legal cloud
over it, and make some styling improvements. In this it succeeded.
*Database schema*
The decision was made early on to go with ClearTables. This is an
alternative set of rules for osm2pgsql which loads the data into many
more tables for greater performance, easier style rules, and a bigger
layer of abstraction between raw OSM tags and the SQL you need to write.
I started it before my work at WMF, and only a few features were added
for this work.
ClearTables does what it is designed to do, yet it was a mistake for
this project. I still believe it is technically a better solution, but
the advantages are not worth the costs of doing something different.
The two most common database schemas are the one from the built-in
osm2pgsql “C transforms” and the OpenStreetMap Carto one. Their code
isn’t any better, and with its test suite ClearTables probably has
fewer bugs, but there are many guides on how to set them up, and they
require fewer components.
Setting up the database isn’t an issue for WMF production servers, but
is considered one of the more difficult steps for potential contributors
to any style. Minimizing differences from other setups here helps
greatly. A second issue is that many potential users of the style
already have a database. I have heard from multiple people who would
like to run the style if it could be used with their existing databases.
*Static data*
Map styles need some forms of “static” data loaded such as oceans,
low-zoom data, and borders. Normally this is done on an ad-hoc basis
with a long, complicated shp2pgsql or ogr2ogr command, but I wrote a
Python script that downloads the data, loads it with ogr2ogr, and
handles all the SQL needed to update the data without a service
interruption.
This script is useful enough that I have reused it for other projects.
That was easy because the files to load are not hard-coded into the
script, but defined in a separate file.
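
A minimal sketch of that design, not the actual script: the list of
files is kept apart from the code (a dict here, standing in for a
separate config file), each file is loaded into a staging schema with
ogr2ogr, and the new table replaces the old one inside a single
transaction. The source names, URLs, schema names and database name are
all assumptions for illustration, and the download step is left out.

```python
import shlex

# Hypothetical config. In practice this would live in its own file.
# Names and URLs are placeholders, not the real data sources.
SOURCES = {
    "ocean_polygons": {
        "url": "https://example.org/ocean-polygons.zip",
        "file": "ocean_polygons.shp",
    },
    "admin_borders": {
        "url": "https://example.org/borders.zip",
        "file": "borders.shp",
    },
}

STAGING_SCHEMA = "loading"  # assumed to already exist in the database


def ogr2ogr_command(name, source, dbname="gis"):
    """Build the ogr2ogr call that loads one file into the staging schema."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",
        "-lco", "GEOMETRY_NAME=way",
        "-nln", f"{STAGING_SCHEMA}.{name}",
        f"PG:dbname={dbname}",   # destination
        source["file"],          # source, already downloaded and unpacked
    ]


def swap_sql(name, target_schema="public"):
    """SQL that replaces the live table with the freshly loaded one."""
    return (
        "BEGIN;\n"
        f'DROP TABLE IF EXISTS "{target_schema}"."{name}";\n'
        f'ALTER TABLE "{STAGING_SCHEMA}"."{name}" '
        f'SET SCHEMA "{target_schema}";\n'
        "COMMIT;"
    )


if __name__ == "__main__":
    # Print what would be run instead of running it.
    for name, source in SOURCES.items():
        print(shlex.join(ogr2ogr_command(name, source)))
        print(swap_sql(name))
```

Because PostgreSQL DDL is transactional, the drop and the move commit
together, so queries should see either the old table or the new one.
Adding a data source only means adding an entry to the config.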
*Borders*
One of the drivers of the work was to better display disputed borders.
To do this a pre-processing step was considered necessary, and I wrote
the necessary program in C++ with libosmium. This worked, but I should
have made more of an effort to get it packaged by Debian GIS and run on
Jochen Topf’s OpenStreetMapData.com servers so that others could use
the work, which might have encouraged more developers to participate in
maintenance. I should also have given pyosmium a more detailed look.
*Vector tile schema*
One of the reasons for switching to a new schema was legal threats
against people using the Mapbox Streets schema. This meant
osm2vectortiles also had to switch schemas at the same time. There was
an effort to work with them to use a common schema, but it never
happened because we had different needs. In retrospect, we should have
either gone with a common schema and tm2source project, or done nothing
in common. Either choice is valid, and it’s a balance of coordination
work against a common development direction.
It was useful to have someone external to discuss ideas with, but this
wouldn’t have been needed had there been other people on the team.
*Style*
The original plan was to largely stick with the cartography of
osm-bright. This changed once we got into implementation and we realized
how insane some parts of the osm-bright cartography were, and efforts
were made towards redoing the style.
The road colours selected were from ColorBrewer2 OrRd6, with casing
colours done by adjusting the Lch lightness and chroma. It would have
been better to pick endpoints and generate colours using a script,
similar to osm-carto. This would have allowed easier changes and sped up
development by reducing the number of variables that need to be manually
set.
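
A sketch of what such a script could look like, assuming linear
interpolation between two Lch endpoints and a standard Lch to sRGB
conversion. This is not the osm-carto script, and the endpoint values
and casing offsets are placeholders, not the colours used in the style.

```python
import math

# D65 reference white, the white point used by sRGB
WHITE = (0.95047, 1.0, 1.08883)


def lch_to_hex(l, c, h):
    """Convert CIE Lch(ab), hue in degrees, to an sRGB hex string.

    Out-of-gamut values are clamped per channel.
    """
    # Lch -> Lab
    a = c * math.cos(math.radians(h))
    b = c * math.sin(math.radians(h))

    # Lab -> XYZ
    fy = (l + 16) / 116
    fx = fy + a / 500
    fz = fy - b / 200

    def f_inv(t):
        delta = 6 / 29
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)

    x, y, z = (w * f_inv(t) for w, t in zip(WHITE, (fx, fy, fz)))

    # XYZ -> linear sRGB
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def encode(v):
        # Clamp, apply the sRGB transfer function, scale to 0..255
        v = min(max(v, 0.0), 1.0)
        v = 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055
        return round(v * 255)

    return "#{:02x}{:02x}{:02x}".format(*(encode(v) for v in linear))


def road_ramp(start, end, steps):
    """Interpolate between two (L, c, h) endpoints; steps must be >= 2."""
    ramp = []
    for i in range(steps):
        t = i / (steps - 1)
        ramp.append(tuple(s + (e - s) * t for s, e in zip(start, end)))
    return ramp


def casing(lch, darken=20, chroma_boost=10):
    """Derive a casing colour by adjusting lightness and chroma.

    The offsets are hypothetical defaults.
    """
    l, c, h = lch
    return (max(l - darken, 0), c + chroma_boost, h)


if __name__ == "__main__":
    # Placeholder endpoints for six road classes
    for lch in road_ramp((60, 60, 30), (95, 20, 90), 6):
        print(lch_to_hex(*lch), lch_to_hex(*casing(lch)))
```

With this approach, changing the whole road palette means editing two
endpoints and two offsets rather than a dozen hand-picked colours.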
*Overall*
The style was completed successfully and on time, and none of the
changes described above would have significantly altered that. They
would mainly have made it easier to attract external contributors, had
an effort been put into that. As attracting external contributors
wasn’t a priority, they didn’t matter.