Hi
At the moment the load on cassini is around 20 and the diff imports are fighting to stay up. I think it is cmarq's hillshade generator that takes the most resources on cassini. top reports ~57.1% IO wait, so I think there's a lot of disk activity going on.
I tried to do a "du -hs /mnt/user-store/osm_hillshading", but it ran for so long that I canceled it.
What I think is going on is that the generate_tiles script creates a really huge number of single PNGs. Unlike mod_tile, it does not combine the small tiles into larger chunks (mod_tile packs 8x8 tiles into a metatile). Also, unlike mod_tile, it renders each and every tile in each and every zoom level, while mod_tile only renders tiles that are actually viewed (e.g. it leaves out most of the Atlantic at zoom level 18, as nobody would view the empty sea at that zoom level).
Both problems together create a really huge number of files in a single directory, which is not good for most filesystems. To make things worse, /mnt/user-store is connected to cassini via NFS, so there's another bottleneck.
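Just to put rough numbers on it, here's a quick back-of-the-envelope in Python (plain tile-pyramid arithmetic; the 8x8 metatile factor is the one mod_tile uses):

    # Total tiles in a full pyramid per zoom level, and how many files
    # would remain if 64 tiles (8x8) were packed into one metatile.
    # Worst case for the whole world; Europe-only is smaller, but the
    # growth per zoom level is the same factor of 4.
    METATILE = 8

    for z in range(0, 19):
        tiles = 4 ** z                            # 2^z * 2^z tiles
        metatiles = max(1, tiles // METATILE ** 2)
        print("z%-2d  %15d tiles   %13d metatile files" % (z, tiles, metatiles))

At zoom 18 alone that is already ~6.9 * 10^10 tiles worldwide, which is why only rendering what people actually look at makes such a difference.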
I don't know if this really is a problem, but I think it's worth talking about.
Peter
2009/11/25 Peter Körner osm-lists@mazdermind.de:
At the moment the load on cassini is around 20 and the diff imports are fighting to stay up. I think it is cmarq's hillshade generator that takes the most resources on cassini. top reports ~57.1% IO wait, so I think there's a lot of disk activity going on.
This must have gotten worse as generate_tiles goes to lower zoom levels. I can kill it and maybe run it with a single thread or even some artificial slowdown tonight. Right now it's three threads, which hasn't been a problem so far.
I tried to do a "du -hs /mnt/user-store/osm_hillshading", but it ran for so long that I canceled it.
What I think is going on is that the generate_tiles script creates a really huge number of single PNGs. Unlike mod_tile, it does not combine the small tiles into larger chunks (mod_tile packs 8x8 tiles into a metatile). Also, unlike mod_tile, it renders each and every tile in each and every zoom level, while mod_tile only renders tiles that are actually viewed (e.g. it leaves out most of the Atlantic at zoom level 18, as nobody would view the empty sea at that zoom level).
Well, I generate hillshading for the land mass of Europe (plus the UK) right now, so there aren't that many empty sea tiles. It would indeed be nice to generate metatiles instead of the single PNGs though.
I'm also not rendering down to zoom 18. Pre-rendering those hillshading tiles (up to a certain zoom level at least) is in general not bad IMO, since they are very much static (they don't contain any changing information), so they will be useful for many maps for the next several years.
Dynamically creating these tiles (and then never expiring them) would also be possible if renderd or mapnik were extended, but right now, only generate_tiles can do the necessary postprocessing.
Both problems together create a really huge number of files in a single directory, which is not good for most filesystems. To make things worse, /mnt/user-store is connected to cassini via NFS, so there's another bottleneck.
We could determine what the maximum sensible number of files in a single directory is, and from that see up to which zoom level we could go (whether pre-rendered or dynamically created). Maybe mod_tile could even be extended to handle metatiles of e.g. 16x16 tiles? An alternative would be a special file system, but I suppose that's rather undesirable...
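Back-of-the-envelope, assuming the plain z/x/y.png layout that generate_tiles.py writes (one directory per x column, one PNG per y row, so up to 2^z entries in a directory in the worst case), here is a small Python sketch of how fast a column directory fills up and what packing N x N tiles per file would reduce it to (the metatile-per-column layout is hypothetical, just for the comparison):

    # Worst-case entries in a single z/<x>/ directory with the plain
    # z/x/y.png layout, versus a hypothetical one-file-per-N-rows layout.
    # Purely illustrative; mod_tile actually uses a hashed directory scheme.
    for z in (10, 12, 14, 16, 18):
        plain = 2 ** z
        print("z%-2d  plain: %6d   8x8: %5d   16x16: %5d"
              % (z, plain, plain // 8, plain // 16))

So with a budget of a few thousand entries per directory, plain single tiles get uncomfortable somewhere around zoom 12-13, while 8x8 or 16x16 packing pushes that out by another 3-4 zoom levels.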
Cheers Colin
Colin Marquardt wrote:
2009/11/25 Peter Körner osm-lists@mazdermind.de:
At the moment the load on cassini is around 20 and the diff imports are fighting to stay up. I think it is cmarq's hillshade generator that takes the most resources on cassini. top reports ~57.1% IO wait, so I think there's a lot of disk activity going on.
This must have gotten worse as generate_tiles goes to lower zoom levels. I can kill it and maybe run it with a single thread or even some artificial slowdown tonight. Right now it's three threads, which hasn't been a problem so far.
I don't think that this is necessary at the moment. I just wanted to talk about this upcoming issue. Can you estimate how long it will keep running with the current parameters? Getting some days out of sync is not that big a problem (we're not expiring tiles currently anyway).
I tried to do a "du -hs /mnt/user-store/osm_hillshading", but it ran for so long that I canceled it.
Well, I generate hillshading for the land mass of Europe (plus the UK) right now, so there aren't that many empty sea tiles.
Okay, cool.
It would indeed be nice to generate metatiles instead of the single PNGs though.
Yes, it would. Is there a tool to pack the existing tiles into metatiles?
I'm also not rendering down to zoom 18. Pre-rendering those hillshading tiles (up to a certain zoom level at least) is in general not bad IMO, since they are very much static (they don't contain any changing information), so they will be useful for many maps for the next several years.
Yes, I see that too, but rendering on the fly and then caching the results is a compromise between disk space and CPU power.
Dynamically creating these tiles (and then never expiring them) would also be possible if renderd or mapnik were extended, but right now, only generate_tiles can do the necessary postprocessing.
Both problems together create a really huge number of files in a single directory, which is not good for most filesystems. To make things worse, /mnt/user-store is connected to cassini via NFS, so there's another bottleneck.
We could determine what the maximum sensible number of files in a single directory is, and from that see up to which zoom level we could go
It's not the absolute "max number" I'm worried about, but the decreasing performance when working with a lot of files in a directory (and maybe the inode usage). Both could be solved by using metatiles.
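For reference, this is roughly how mod_tile lays its metatiles out on disk, reconstructed from memory in Python (the real thing is in store.c, so treat details like the path prefix and style name as placeholders):

    # Rough transcription of mod_tile's metatile path scheme: 8x8 tiles go
    # into one .meta file, and x/y are split into 4-bit chunks, so no
    # directory level ever holds more than 256 entries.
    METATILE = 8

    def meta_path(x, y, z, tile_dir="/var/lib/mod_tile", style="default"):
        mask = METATILE - 1
        x &= ~mask                    # round down to the metatile origin
        y &= ~mask
        parts = []
        for _ in range(5):            # five path components of 4+4 bits each
            parts.append(((x & 0x0F) << 4) | (y & 0x0F))
            x >>= 4
            y >>= 4
        parts.reverse()
        return "%s/%s/%d/%d/%d/%d/%d/%d.meta" % ((tile_dir, style, z) + tuple(parts))

    # 64 neighbouring z18 tiles all end up in the same file:
    print(meta_path(140000, 85000, 18))

That takes care of both the per-directory fan-out and the inode count in one go.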
Peter
2009/11/25 Peter Körner osm-lists@mazdermind.de:
Colin Marquardt wrote:
2009/11/25 Peter Körner osm-lists@mazdermind.de:
At the moment the load on cassini is around 20 and the diff imports are fighting to stay up. I think it is cmarq's hillshade generator that takes the most resources on cassini. top reports ~57.1% IO wait, so I think there's a lot of disk activity going on.
This must have gotten worse as generate_tiles goes to lower zoom levels. I can kill it and maybe run it with a single thread or even some artificial slowdown tonight. Right now it's three threads, which hasn't been a problem so far.
I don't think that this is necessary at the moment. I just wanted to talk about this upcoming issue. Can you estimate how long it will keep running with the current parameters? Getting some days out of sync is not that big a problem (we're not expiring tiles currently anyway).
Maybe two more days, but it's really no problem to slow it down a bit. The maximum run time of the current job is bounded anyway with cassini being repurposed.
I tried to do a "du -hs /mnt/user-store/osm_hillshading", but it ran for so long that I canceled it.
Well, I generate hillshading for the land mass of Europe (plus the UK) right now, so there aren't that many empty sea tiles.
Okay, cool.
It would indeed be nice to generate metatiles instead of the single PNGs though.
Yes, it would. Is there a tool to pack the existing tiles into metatiles?
I just found that there is a tool called convert_meta: http://svn.openstreetmap.org/applications/utils/mod_tile/readme.txt
I hope it will work with the paletted PNGs I'm using.
I'm also not rendering down to zoom 18. Pre-rendering those hillshading tiles (up to a certain zoom level at least) is in general not bad IMO, since they are very much static (they don't contain any changing information), so they will be useful for many maps for the next several years.
Yes, I see that too, but rendering on the fly and then caching the results is a compromise between disk space and CPU power.
ACK. I'll try to get the necessary tool support.
It's not the absolute "max number" I'm worried about, but the decreasing performance when working with a lot of files in a directory (and maybe the inode usage). Both could be solved by using metatiles.
I see. Let's try convert_meta on a subset of the data tonight and see how it goes.
Cheers Colin
Maybe two more days, but it's really no problem to slow it down a bit. The maximum run time of the current job is bounded anyway with cassini being repurposed.
Okay then I think you should let it run.
Yes, I see that too, but rendering on the fly and then caching the results is a compromise between disk space and CPU power.
ACK. I'll try to get the necessary tool support.
Maybe packing the pre-rendered tiles into metatiles will solve it for the moment, but I don't know how it will be with zoom level 18, or maybe 19, 20 or even higher in some high-density cities.
Peter
2009/11/25 Peter Körner osm-lists@mazdermind.de:
Yes, I see that too, but rendering on the fly and then caching the results is a compromise between disk space and CPU power.
ACK. I'll try to get the necessary tool support.
Maybe packing the pre-rendered tiles into metatiles will solve it for the moment, but I don't know how it will be with zoom level 18, or maybe 19, 20 or even higher in some high-density cities.
*Hillshading* is not really useful at zooms above 16 or 17, I would say, at least not with the DEM data that is freely available outside the US (basically SRTM3). Contour lines might be useful in some areas.
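To put some numbers on that (a quick Python calculation; the ~90 m figure is the nominal SRTM3 cell size, and I'm using 50N as a stand-in for central Europe):

    # Ground resolution of 256px web-mercator tiles at ~50N versus the
    # ~90 m SRTM3 cell size: by zoom 16 a single DEM cell is already
    # smeared over dozens of screen pixels.
    import math

    SRTM3_CELL_M = 90.0
    EQUATOR_M = 40075016.686          # earth circumference in metres
    LAT = 50.0

    for z in range(10, 19):
        m_per_px = EQUATOR_M * math.cos(math.radians(LAT)) / (256.0 * 2 ** z)
        print("z%-2d  %7.2f m/pixel   one DEM cell ~ %4.0f pixels"
              % (z, m_per_px, SRTM3_CELL_M / m_per_px))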
When you say "high-density cities", you are probably referring to road maps, and those can be expired rather aggressively if necessary, no?
Cheers Colin
2009/11/25 Peter Körner osm-lists@mazdermind.de:
At the moment the load on cassini is around 20 and the diff imports are fighting to stay up. I think it is cmarq's hillshade generator that takes the most resources on cassini. top reports ~57.1% IO wait, so I think there's a lot of disk activity going on.
Load is now back to ~4 without me changing anything. Maybe it was some interaction with other processes?
Cheers Colin
Colin Marquardt wrote:
2009/11/25 Peter Körner osm-lists@mazdermind.de:
At the moment the load on cassini is around 20 and the diff imports are fighting to stay up. I think it is cmarq's hillshade generator that takes the most resources on cassini. top reports ~57.1% IO wait, so I think there's a lot of disk activity going on.
Load is now back to ~4 without me changing anything. Maybe it was some interaction with other processes?
Just looking at the munin graphs (i.e. not looking at more detailed stats), my guess would be that it is probably not the hillshading, but rather the standard mod_tile/renderd setup that is causing the load spikes. Those load spikes correlate quite strongly with renderd queue lengths and rendering times. Currently, I think renderd is set to use 8 threads. So if there is a render queue, all 8 threads will try to run, each connecting to a separate postgres process. Therefore, there are about 16 processes/threads trying to run, and depending on how much of the wait time gets counted towards the load, a load of 20 doesn't sound all that unreasonable.
I think the problem is that the current setup is too slow to render the low zooms, with single metatiles sometimes taking over a minute to render. Rendering those on the fly is therefore clearly not feasible. However, with the 200+ styles on cassini, the ratio of low-zoom to high-zoom tiles is skewed in the wrong direction. Once you get down to zoom level 12 or so, it starts to be just about fast enough to render on the fly.
I think it is therefore necessary to pre-render all tiles down to a zoom level of probably at least 9-10. One can use render_list to pre-render in the background through renderd, thus playing nicely with mod_tile's on-the-fly rendering. It will probably also be necessary not to expire the low-zoom tiles as frequently, so that they don't need to be re-rendered as often. The OSM mapnik server basically doesn't expire low-zoom tiles at all, other than after a full DB reimport. Given that probably most of the name localisation still happens on large features like country and major city names, the low-zoom tiles here probably change much more, though. So perhaps the low-zoom tiles can be bunched up and only expired once a day and then immediately pre-rendered again. render_old can probably handle this: it goes through all of the previously rendered metatiles, checks if they are marked dirty, and if so submits them for background rendering.
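As a toy illustration of the "bunch up the low-zoom expiry once a day" idea, something along these lines could list the stale low-zoom metatiles (path and layout are assumptions; the actual resubmission to renderd would be done by render_old or render_list, not by this script):

    # List low-zoom metatiles that haven't been touched for a day; these
    # would then be queued for background re-rendering. Paths are guesses.
    import os
    import time

    TILE_DIR = "/var/lib/mod_tile/default"   # assumed metatile location
    MAX_ZOOM = 10                             # "low zoom" cutoff
    MAX_AGE = 24 * 3600                       # one day

    now = time.time()
    for z in range(0, MAX_ZOOM + 1):
        zdir = os.path.join(TILE_DIR, str(z))
        for dirpath, _dirs, files in os.walk(zdir):
            for name in files:
                if not name.endswith(".meta"):
                    continue
                path = os.path.join(dirpath, name)
                if now - os.path.getmtime(path) > MAX_AGE:
                    print(path)               # candidate for re-rendering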
It might still make sense to convert the hillshading tiles to metatiles, though. The program convert_meta might allow converting individual PNGs into metatiles, but I haven't used convert_meta, so I don't know if it actually does that.
Kai
Cheers Colin
I think the problem is that the current setup is too slow to render the low zooms, with single metatiles sometimes taking over a minute to render. Rendering those on the fly is therefore clearly not feasible. However, with the 200+ styles on cassini, the ratio of low-zoom to high-zoom tiles is skewed in the wrong direction. Once you get down to zoom level 12 or so, it starts to be just about fast enough to render on the fly.
Maybe adding an index on place and capital would already help. How are the indexes on the main OSM tile server?
Peter
Peter Körner wrote:
I think the problem is that the current setup is too slow to render the low zooms, with single metatiles sometimes taking over a minute to render. Rendering those on the fly is therefore clearly not feasible. However, with the 200+ styles on cassini, the ratio of low-zoom to high-zoom tiles is skewed in the wrong direction. Once you get down to zoom level 12 or so, it starts to be just about fast enough to render on the fly.
Maybe adding an index on place and capital would already help. How are the indexes on the main OSM tile server?
After a chat with a couple of people on the OSM IRC yesterday, it seems to be like the following. The indexes on the main site are only those that osm2pgsql creates, which are indexes on the geometry columns for all of the tables, but none of the other columns. The two SQL calls that are done on layers 2-4 are the search for places and admin boundaries. Doing those queries with a simple count of results, without returning the results, only takes about 2-4 seconds on either cassini or the OSM tile server. So an index could save some time, but not particularly much, at least not compared to the overall time of tens of seconds to a minute it seems to take to render a metatile. However, the simple count/lookup might not be the problem; returning the results may be. At some point those calls apparently returned results in excess of 1 GB of data, although I don't know if that is still the case or whether some recent optimisations have helped there. So there has been some thought about splitting the admin boundaries out into separate tables for the low zooms and running line simplification algorithms on those to reduce the data volume, as the full accuracy isn't really needed on low-zoom tiles.
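A rough sketch of what that split-out, simplified low-zoom boundary table could look like (table and column names assume the default osm2pgsql schema, and the 1 km tolerance and table name are just placeholders that would need tuning):

    # Build a separate, line-simplified table for low-zoom admin boundaries.
    # Assumes the default osm2pgsql schema with a web-mercator 'way' column
    # in metres.
    import psycopg2

    conn = psycopg2.connect("dbname=gis")
    cur = conn.cursor()
    cur.execute("""
        CREATE TABLE lowzoom_admin AS
          SELECT osm_id, admin_level,
                 ST_SimplifyPreserveTopology(way, 1000) AS way
          FROM planet_osm_roads
          WHERE boundary = 'administrative'
    """)
    cur.execute(
        "CREATE INDEX lowzoom_admin_way_idx ON lowzoom_admin USING GIST (way)")
    conn.commit()
    conn.close()

The low-zoom style rules would then select from that table instead of the full-resolution one.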
But it is probably worth checking which queries actually take all the time. Frederik Ramm has written some scripts to analyse the postgres log files to check where the time is actually spent in rendering. This requires enabling the statement and duration logging features of postgres, which might be worth doing for a while to see where the bottleneck is.
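For reference, the relevant knobs would be something along these lines in postgresql.conf (worth double-checking against the postgres version running on cassini), followed by a reload:

    # Log every statement together with how long it took; raise the
    # threshold (in milliseconds) later to cut down on log volume.
    log_min_duration_statement = 0
    log_line_prefix = '%t [%p] '      # timestamp and pid, so the log is parseable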
Kai
Peter
Kai Krueger:
Frederik Ramm has written some scripts to analyse the postgres log files to check where the time is actually spent in rendering.
Where can I find these scripts?
- river.
River Tarnell wrote:
Kai Krueger:
Frederik Ramm has written some scripts to analyse the postgres log files to check where the time is actually spent in rendering.
Where can I find these scripts?
The scripts are in the osm-svn in the applications/rendering/mapnik/utils directory ( http://trac.openstreetmap.org/browser/applications/rendering/mapnik/utils )
stylecheck.pl analyzes the style file and shows what SQL queries get generated for each zoom level, and analyze_postgis_log.pl checks the log files to see how much time each query takes. The latter probably needs some adaptation to match the local log structure.
Kai
P.S. Is there another problem with rendering on cassini at the moment? None of the tiles seem to have been rendered after the 14th of November. At least, all the ones I checked with the /status page show the tiles as rendered prior to the 15th and all needing re-rendering, even after requesting renders for them via /dirty. The munin graphs for renderd show that they seem to get rendered, but the tiles don't appear. Is it possible that the tiles don't get saved any more (e.g. due to permission issues)?
- river.
Peter Körner:
Both problems together create a really huge number of files in a single directory, which is not good for most filesystems. To make things worse, /mnt/user-store is connected to cassini via NFS, so there's another bottleneck.
I had a look at osm_hillshading/, but I couldn't see any especially large directories. The largest under tiles/ seemed to have about 3,000 entries, which is well within the capability of the filesystem (VxFS).
Since user-store is on NFS, it should generate very little local load; while iowait might show up as increased load average, it should not have an effect on PostgreSQL speed, which runs on different disks entirely.
- river.
2009/11/25 River Tarnell river@loreley.flyingparchment.org.uk:
I had a look at osm_hillshading/, but I couldn't see any especially large directories. The largest under tiles/ seemed to have about 3,000 entries, which is well within the capability of the filesystem (VxFS).
Ah, thanks for investigating. I was looking into using convert_meta but it isn't as straightforward as I thought. I'll still keep it in mind so that we can reclaim some space later and save on inodes.
Cheers Colin