Hi all,
I have been looking at the Wikipedia database schema and I haven't found any field that suggests that some content is geographically located. Am I wrong?
If it is possible, I would like to download the geographically located content of Wikipedia, to do something similar to what Google Earth does with its Wikipedia layer. Is that possible?
Thanks in advance.
On 14/03/12 14:26, toni hernández wrote:
Hi Toni,
The geographical locations are stored in the article text, using templates, and are not (yet) available in the main database, although I believe there is long-term work planned to remedy this.
However, there is a project that parses and consolidates all this data across all Wikipedia languages. The simplest way to get this data is to get a Toolserver account and access the kolossos database on the Wikipedia Toolserver. See http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Georeferenzierung/Wikiped... for more details.
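As a very rough sketch of what a query against that database might look like from a Toolserver account: note that the host, database, table and column names below are guesses rather than the documented schema, so check them with SHOW TABLES / DESCRIBE before relying on them.

# Rough sketch only: the host, database name, table name and columns
# (wp_coords_red0, page_id, lat, lon) are assumptions, not the real
# schema -- inspect it with SHOW TABLES / DESCRIBE first.
import MySQLdb  # the usual Python MySQL driver on the Toolserver

conn = MySQLdb.connect(
    host="sql-s1",                  # hypothetical Toolserver DB host
    read_default_file="~/.my.cnf",  # Toolserver accounts keep credentials here
    db="u_kolossos",                # hypothetical database name
)
cur = conn.cursor()

# Fetch everything inside a rough bounding box around London.
cur.execute(
    "SELECT page_id, lat, lon FROM wp_coords_red0 "
    "WHERE lat BETWEEN %s AND %s AND lon BETWEEN %s AND %s",
    (51.3, 51.7, -0.5, 0.3),
)
for page_id, lat, lon in cur.fetchall():
    print(page_id, lat, lon)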
Regards,
Neil
2012/3/14 Neil Harris neil@tonal.clara.co.uk:
CC'ing Max, as this is pretty close to what he's been working on for us. In short, we're looking at adding a parser hook to store coordinates in a separate part of the DB so that we can query them much faster through our API.
--tomas
On 14/03/12 19:02, Tomasz Finc wrote:
Hi Tomas,
Parsing the coordinates directly from the page source can be a bit awkward, because many coordinates are generated indirectly via chains of templates, and there are a large number of variations in the syntax used to manage coordinates in templates across article sub-projects and Wikipedia languages.
However, since they all end up generating links to the geohack page in a fairly simple format that is standardized across all Wikipedia editions, you can find these coordinates quite easily, either by parsing the rendered HTML for a page or (more efficiently, if you have direct database access or can download Wikipedia dumps) by looking at the links recorded in the externallinks table.
You can find the spec for the geohack syntax here: https://wiki.toolserver.org/view/GeoHack
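As a rough illustration (not a full implementation of the spec, and the example URL below is made up), a small Python snippet along these lines can pull the common degrees/minutes/seconds forms out of those links:

# Sketch of extracting coordinates from geohack links, e.g. from rows
# of the externallinks table or from a page's rendered HTML.  Handles
# the common D, D_M and D_M_S "params=" forms; see the geohack spec
# linked above for the full syntax.
import re

PARAMS_RE = re.compile(
    r"geohack\.php\?[^\"'\s]*params="
    r"(?P<lat>[0-9._]+)_(?P<ns>[NS])_(?P<lon>[0-9._]+)_(?P<ew>[EW])"
)

def dms_to_decimal(parts):
    """Convert '51_30_26' (degrees[_minutes[_seconds]]) to decimal degrees."""
    deg, *rest = [float(p) for p in parts.split("_") if p]
    minutes = rest[0] if len(rest) > 0 else 0.0
    seconds = rest[1] if len(rest) > 1 else 0.0
    return deg + minutes / 60 + seconds / 3600

def extract_coords(text):
    for m in PARAMS_RE.finditer(text):
        lat = dms_to_decimal(m.group("lat"))
        lon = dms_to_decimal(m.group("lon"))
        if m.group("ns") == "S":
            lat = -lat
        if m.group("ew") == "W":
            lon = -lon
        yield lat, lon

# A link as it might appear in the externallinks table (invented example):
url = ("http://toolserver.org/geohack/geohack.php?pagename=London"
       "&params=51_30_26_N_0_7_39_W_region:GB_type:city")
print(list(extract_coords(url)))  # approximately [(51.5072, -0.1275)]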
Alternatively, you might want to take a look at DBpedia, which does a lot of this for you, although I'm not sure how fresh or accurate their data currently is: see http://dbpedia.org/About for more on this project.
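For example, something like the following should pull a few geo-tagged resources out of DBpedia's public SPARQL endpoint, which publishes coordinates using the W3C Basic Geo vocabulary. Treat it as a sketch; the endpoint's behaviour may change over time.

# Sketch: querying DBpedia's public SPARQL endpoint for geo-tagged
# resources via the W3C Basic Geo vocabulary (geo:lat / geo:long).
import json
import urllib.parse
import urllib.request

QUERY = """
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?page ?lat ?long WHERE {
  ?page geo:lat ?lat ; geo:long ?long .
} LIMIT 10
"""

url = "http://dbpedia.org/sparql?" + urllib.parse.urlencode(
    {"query": QUERY, "format": "application/sparql-results+json"}
)
with urllib.request.urlopen(url) as resp:
    rows = json.load(resp)["results"]["bindings"]

for row in rows:
    print(row["page"]["value"], row["lat"]["value"], row["long"]["value"])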
-- Neil
Neil,
I've been taking a look at the Toolserver and it looks really helpful. Thanks.
I have even downloaded a dump. There is only one table (wp_coords_red0), with a page_id field, but this page_id does not match the id field in the Wikipedia database.
I have no idea how to join the kolossos table with the page table on Wikipedia. Is that even possible?
We are currently developing the GeoData extension[1], which should store page coordinates. We hope to deploy it sooner rather than later, but have no clear ETA.
--- [1] https://www.mediawiki.org/wiki/Extension:GeoData
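To give a flavour of the kind of query this enables, here is a sketch against the list=geosearch API module that GeoData provides; since the extension was still in development at the time of this thread, treat the parameter names as illustrative.

# Sketch: radius search around a point via GeoData's geosearch module.
import json
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "action": "query",
    "list": "geosearch",
    "gscoord": "51.5072|-0.1275",  # lat|lon for central London
    "gsradius": 10000,             # search radius in metres
    "gslimit": 10,
    "format": "json",
})
url = "https://en.wikipedia.org/w/api.php?" + params
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

for page in data["query"]["geosearch"]:
    print(page["title"], page["lat"], page["lon"])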
On 14/03/12 19:03, Max Semenik wrote:
Hi Max,
Your project is the "longer-term solution" I was alluding to. It's an excellent idea: it will add significant, desirable functionality and promote longer-term standardization, yet it can be deployed without any disruption to current working practices, as a simple drop-in replacement at the lowest level of the various current ad-hoc template mechanisms. I look forward to seeing it deployed as soon as possible.
You might also want to take a look at the current work on attaching KML files to articles on the English-language Wikipedia, for some ideas about where things might progress beyond simple point-of-interest coding. That approach could quite easily be generalized to using KML files as a multi-purpose, longer-term extension to the current coordinate coding mechanism.
-- Neil