Hello everyone,
Just after I wrote last week that everything was running stably on
the OpenStreetMap tile rendering server, the PostgreSQL database
seems to have become corrupted the very next day and is now only
partially functional.
Due to the database corruption, replication (diff imports) is
suspended for the moment and most rendering of new map tiles is
disabled as well. This will affect both WIWOSM and the OSM tiles
shown in the osm_gadget. Any changes that occurred in the last 3
days or so, as well as future updates to the OSM database, won't show up
in either until the toolserver-osm database can be fixed again. This
is unfortunately likely going to take a few days, if not weeks
should a full new import be necessary due to the corruption.
More technically:
Initially queries seemed to fail every couple of hours with the
error message
"DETAIL: The postmaster has commanded this server process to roll
back the current transaction and exit, because another server
process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database
and repeat your command.
The connection to the server was lost. Attempting reset: Failed."
But otherwise things more or less "worked".
Now, though, diff imports fail quickly, and e.g. the query "SELECT *
FROM planet_ways WHERE id = 67780465;" consistently results in the
error
"unexpected chunk number 0 (expected 1) for toast value 28214399 in
pg_toast_3406700" which sounds like definite database corruption.
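To narrow down how much of planet_ways is affected, one option is to
read the rows one id at a time and note which reads fail. The
following is only a rough sketch of that idea, not something that has
been run against the toolserver database. The function name
find_unreadable_ids and the read_row callback are made up for
illustration; against a real database, read_row would run the SELECT
for a single id and roll back the transaction after a failure.

```python
from typing import Callable, Iterable, List, Tuple


def find_unreadable_ids(
    ids: Iterable[int],
    read_row: Callable[[int], object],
) -> List[Tuple[int, str]]:
    """Try to read each id and collect (id, error message) for failures.

    read_row is a hypothetical callback: it should fetch the full row
    for one id (so that any TOASTed columns are actually read) and
    raise an exception if the read fails.
    """
    unreadable = []
    for row_id in ids:
        try:
            read_row(row_id)
        except Exception as exc:  # any failed read marks the id as suspect
            unreadable.append((row_id, str(exc)))
    return unreadable


if __name__ == "__main__":
    # Simulated reader standing in for a real database query; it fails
    # only for the way id mentioned in the error report above.
    def simulated_read(row_id: int) -> dict:
        if row_id == 67780465:
            raise RuntimeError("unexpected chunk number 0 (expected 1)")
        return {"id": row_id}

    for row_id, message in find_unreadable_ids(
        range(67780463, 67780468), simulated_read
    ):
        print(row_id, message)
```

Scanning row by row is slow on a table of this size, but it would show
whether the damage is limited to a handful of ways or is widespread.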
I have no idea what caused the issues, but they seem to have started
on the evening of the 23rd.
It appears that the planet_ways table definitely has problems, but I
don't know if the rendering tables are corrupt as well. Potentially
this means that processes that work on the other tables still work,
although I'd treat them with care, as it may well be that they are
corrupt as well.
I will see if reindexing the tables or vacuuming them helps to
recover from the corruption.
If not, then a full reimport will presumably be necessary.
This is a bit of an awkward time for this, as due to the license
change in OSM, there are no up-to-date planet files at the moment
(the last one was 3 weeks ago) and their publication will likely not
resume until the license change is over. When that will be is not yet
entirely clear.
So that might add another couple of weeks of delay until a new
import can take place.
Hopefully this can be resolved as soon as possible, but it will
likely be an inconvenience to WIWOSM and tile rendering.
Kai