hstore on Query-to-map project

List overview All Threads
Download

newer

older

Re: [Maps-l] hstore on...

Changes to osm account

Tim Alder

15 May 2010 15 May '10

6:54 p.m.

I'm happy to can say: "The light is on.".

I moved with my project Query-to-map[1] to hstore-DB which was really easy, so I can now also show lit=yes[2]. ;-) As far as I know is no project active on the old tables, so we could shut off them?

"drink:club-mate=yes" (rare key with commons value) works fine, also in a huge area, so I'm very happy ;-) But "amenity=baby_hatch" (commons key with rare value) and others doesn't work fine in a huge area, that is frustrating. Peter explains the reasons in his mail. But we should really find an answer for this problem. Is there a way to indexing the hstore with alternating the keys and the values or indexing the values? I read in in the documentation[3] only that "%% hstore" convert hstore to array of alternating keys and values.

Or could we use a second alternated hstore? This should work and there are a lot of use-cases.

Independent from that I want to ask if I can get an index on "name" and "ref", because I use this often to link on objects from featurelist.

The tools seems in the moment not very fast because we have high load on the database until we are up-to-date. So I will wait with further testing.

Greetings Kolossos

[1] http://wiki.openstreetmap.org/wiki/Query-to-map [2] http://toolserver.org/~kolossos/qtm2/queryinmap.php?BBOX=13.5333,50.95,13.93... [3] http://developer.postgresql.org/pgdocs/postgres/hstore.html

Show replies by date

Colin Marquardt

15 May 15 May

7:30 p.m.

2010/5/15 Tim Alder tim.alder@s2002.tu-chemnitz.de:

...

I'm happy to can say: "The light is on.".

"Prisoner: You lucky bastard!"</brian>

...

I moved with my project Query-to-map[1] to hstore-DB which was really easy, so I can now also show lit=yes[2]. ;-) As far as I know is no project active on the old tables, so we could shut off them?

No, I still need to convert the hikebike style.

Cheers Colin

Peter Körner

9:52 p.m.

...

I moved with my project Query-to-map[1] to hstore-DB which was really easy, so I can now also show lit=yes[2]. ;-) As far as I know is no project active on the old tables, so we could shut off them?

Nice to see that this new database is of use.

...

"drink:club-mate=yes" (rare key with commons value) works fine, also in a huge area, so I'm very happy ;-) But "amenity=baby_hatch" (commons key with rare value) and others doesn't work fine in a huge area, that is frustrating. Peter explains the reasons in his mail.

The query is twice fast when you use the regular column. This should be as fast as it was without the hstore. I'd suggest using the regular columns where possible and only use the hstore if there is no column for the key you are looking for.

osm_mapnik=> explain analyze select tags from planet_point where (tags ? 'amenity') and (tags->'amenity' = 'baby_hatch'); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on planet_point (cost=1660.92..72555.72 rows=103 width=198) (actual time=21259.021..100168.469 rows=55 loops=1) Recheck Cond: (tags ? 'amenity'::text) Filter: ((tags -> 'amenity'::text) = 'baby_hatch'::text) -> Bitmap Index Scan on planet_point_tags (cost=0.00..1660.89 rows=20501 width=0) (actual time=14650.118..14650.118 rows=1920725 loops=1) Index Cond: (tags ? 'amenity'::text) Total runtime: 100168.620 ms (6 rows)

osm_mapnik=> explain analyze select tags from planet_point where amenity = 'baby_hatch'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Seq Scan on planet_point (cost=0.00..957754.05 rows=45012 width=198) (actual time=62.098..58970.003 rows=55 loops=1) Filter: (amenity = 'baby_hatch'::text) Total runtime: 58970.149 ms (3 rows)

...

Or could we use a second alternated hstore? This should work and there are a lot of use-cases.

I dont think you can the indexes on both hstores in one query, so either the key-lookup is slow or the value-lookup is.

Another thing is, that so, in the audience on our lightning talk at fossgis said, that the index efficiency drops noticeable with an inhomogeneous value distribution and you'd get as lot of "yes"es or "no"s for a value-index.

...

Independent from that I want to ask if I can get an index on "name" and "ref", because I use this often to link on objects from featurelist.

I think we can do this: CREATE INDEX planet_point_name ON planet_point (name); CREATE INDEX planet_point_ref ON planet_point (ref);

CREATE INDEX planet_line_name ON planet_line (name); CREATE INDEX planet_line_ref ON planet_line (ref);

CREATE INDEX planet_polygon_name ON planet_polygon (name); CREATE INDEX planet_polygon_ref ON planet_polygon (ref);

CREATE INDEX planet_roads_name ON planet_roads (name); CREATE INDEX planet_roads_ref ON planet_roads (ref);

I'd like to wait until the diff import has catched up. I think this will take not more than one day.

Peter

Kai Krueger

10:12 p.m.

On 05/15/2010 10:52 PM, Peter Körner wrote:

...

...

osm_mapnik=> explain analyze select tags from planet_point where (tags ? 'amenity') and (tags->'amenity' = 'baby_hatch'); QUERY PLAN

Bitmap Heap Scan on planet_point (cost=1660.92..72555.72 rows=103 width=198) (actual time=21259.021..100168.469 rows=55 loops=1) Recheck Cond: (tags ? 'amenity'::text) Filter: ((tags -> 'amenity'::text) = 'baby_hatch'::text) -> Bitmap Index Scan on planet_point_tags (cost=0.00..1660.89 rows=20501 width=0) (actual time=14650.118..14650.118 rows=1920725 loops=1) Index Cond: (tags ? 'amenity'::text) Total runtime: 100168.620 ms (6 rows)

osm_mapnik=> explain analyze select tags from planet_point where amenity = 'baby_hatch'; QUERY PLAN

Seq Scan on planet_point (cost=0.00..957754.05 rows=45012 width=198) (actual time=62.098..58970.003 rows=55 loops=1) Filter: (amenity = 'baby_hatch'::text) Total runtime: 58970.149 ms (3 rows)

...
Or could we use a second alternated hstore? This should work and there are a lot of use-cases.

I dont think you can the indexes on both hstores in one query, so either the key-lookup is slow or the value-lookup is.

There does seem to be another option that is much faster:

osm_mapnik=> EXPLAIN ANALYZE select count(*) from planet_point where (tags @> 'amenity=>baby_hatch'); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=72509.57..72509.58 rows=1 width=0) (actual time=163.902..163.902 rows=1 loops=1) -> Bitmap Heap Scan on planet_point (cost=1666.01..72458.31 rows=20501 width=0) (actual time=163.286..163.872 rows=55 loops=1) Filter: (tags @> '"amenity"=>"baby_hatch"'::hstore) -> Bitmap Index Scan on planet_point_tags (cost=0.00..1660.89 rows=20501 width=0) (actual time=163.250..163.250 rows=55 loops=1) Index Cond: (tags @> '"amenity"=>"baby_hatch"'::hstore) Total runtime: 163.986 ms

compared to

osm_mapnik=> EXPLAIN ANALYZE select count(*) from planet_point where amenity='baby_hatch'; QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=957866.58..957866.59 rows=1 width=0) (actual time=11162.156..11162.156 rows=1 loops=1) -> Seq Scan on planet_point (cost=0.00..957754.05 rows=45012 width=0) (actual time=78.598..11161.997 rows=55 loops=1) Filter: (amenity = 'baby_hatch'::text) Total runtime: 11162.236 ms (4 rows)

They both seem to give the same result of 55 entries, although I haven't looked at how the @> operator behaves in other cases.

...

Another thing is, that so, in the audience on our lightning talk at fossgis said, that the index efficiency drops noticeable with an inhomogeneous value distribution and you'd get as lot of "yes"es or "no"s for a value-index.

...
Independent from that I want to ask if I can get an index on "name" and "ref", because I use this often to link on objects from featurelist.

Can you not use the the osm_id to link to objects?

...

I think we can do this: CREATE INDEX planet_point_name ON planet_point (name); CREATE INDEX planet_point_ref ON planet_point (ref);

CREATE INDEX planet_line_name ON planet_line (name); CREATE INDEX planet_line_ref ON planet_line (ref);

CREATE INDEX planet_polygon_name ON planet_polygon (name); CREATE INDEX planet_polygon_ref ON planet_polygon (ref);

CREATE INDEX planet_roads_name ON planet_roads (name); CREATE INDEX planet_roads_ref ON planet_roads (ref);

I'd like to wait until the diff import has catched up. I think this will take not more than one day.

Peter

Kai

...

Maps-l mailing list Maps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/maps-l

Peter Körner

10:53 p.m.

16.05.2010 00:12, schrieb Kai Krueger:

...

There does seem to be another option that is much faster:

osm_mapnik=> EXPLAIN ANALYZE select count(*) from planet_point where (tags @> 'amenity=>baby_hatch');

Oh whow, that's cool!

Peter

Peter Körner

11:01 p.m.

Am 16.05.2010 00:53, schrieb Peter Körner:

...

16.05.2010 00:12, schrieb Kai Krueger:

...
There does seem to be another option that is much faster:

osm_mapnik=> EXPLAIN ANALYZE select count(*) from planet_point where (tags @> 'amenity=>baby_hatch');

Oh whow, that's cool!

That's amasing! even a query for "all restaurants in the whole world" (111746 rows) takes just 3 seconds: select name from planet_point where (tags @> 'amenity=>restaurant');

That's really nice! Peter

Tim Alder

16 May 16 May

8:33 a.m.

Hello, thank you Kai for your "@>" hint.

It works now also fine on huge areas with rare objects (pipelines): http://toolserver.org/~kolossos/qtm2/queryinmap.php?name=*&key=man_made&...

Amazing. ----

Kai Krueger schrieb:

...

Can you not use the the osm_id to link to objects?

No, a street with a bridge in the middle have more than one osm_id and this IDs are changing with each new segment. My plan for the future is to link from Wikipedia-articles to complex OSM-Objects and this links should be human-readable, powerfull and long-time stable so I decide to use BBOX, key=value, name and/or ref.

---- One hint from my side: OSM key-values can contain whitespaces so we need doubleqoute: (tags @> '"ref"=>"A 4"') works fine.

---- For the indexes on name and ref I will test next week if I could live also with the index of hstore.

Greetings Kolossos

Peter Körner

9:04 a.m.

Am 16.05.2010 10:33, schrieb Tim Alder:

...

For the indexes on name and ref I will test next week if I could live also with the index of hstore.

The diff import is nearly done, so I think I can add the indexes today. I'll only add them on the point table and then we can compare them against the hstore and check if we really need them.

Peter

5181

Age (days ago)

5182

Last active (days ago)

maps-l@lists.wikimedia.org

7 comments

4 participants

tags (0)

participants (4)

Colin Marquardt
Kai Krueger
Peter Körner
Tim Alder