Hi, I'd like a sanity check on my plans to reduce the GeoData index. I plan to:
1) Not store coordinates from pages outside of content namespaces: main and file.
2) Not store secondary coordinates that don't specify the coordinate name. There is currently a bunch of secondary coordinates that make no sense because there's no way to tell them apart or tell what they are for.
I hope that these measures will make our database of geographical coordinates more useful on average, but please tell me if there's a fatal flaw in my plans:)
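To make the two rules concrete, here is a rough sketch of the filter in Python. It is not the actual GeoData code: the `Coordinate` record, `should_store` and `filter_coordinates` are hypothetical names made up for illustration. The only fixed facts it relies on are the standard MediaWiki namespace IDs (0 for main, 6 for File). The set of content namespaces is a parameter, so a wiki with extra content namespaces could pass its own set.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional

# Standard MediaWiki namespace IDs: 0 is main, 6 is File.
DEFAULT_CONTENT_NAMESPACES: FrozenSet[int] = frozenset({0, 6})


@dataclass(frozen=True)
class Coordinate:
    """Hypothetical record for one coordinate found on a page."""
    page_namespace: int
    lat: float
    lon: float
    primary: bool
    name: Optional[str] = None


def should_store(
    coord: Coordinate,
    content_namespaces: FrozenSet[int] = DEFAULT_CONTENT_NAMESPACES,
) -> bool:
    """Apply both proposed rules to a single coordinate."""
    # Rule 1: skip pages outside the content namespaces.
    if coord.page_namespace not in content_namespaces:
        return False
    # Rule 2: skip secondary coordinates that carry no usable name.
    if not coord.primary and not (coord.name or "").strip():
        return False
    return True


def filter_coordinates(
    coords: Iterable[Coordinate],
    content_namespaces: FrozenSet[int] = DEFAULT_CONTENT_NAMESPACES,
) -> List[Coordinate]:
    """Return only the coordinates that would stay in the index."""
    return [c for c in coords if should_store(c, content_namespaces)]
```

Under this sketch a primary coordinate is kept whether or not it has a name; only unnamed secondary ones are dropped.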
Hoi Max,
Is it possible to compare your index with the items in Wikidata? There are several things that I would like to do:
* add all coordinates to Wikidata items
* report on all coordinates where what you know differs from Wikidata
A next obvious question is how we can keep the two data sets in sync.
Thanks, GerardM
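As a rough sketch of what such a comparison could look like, assuming both data sets have already been reduced to a mapping from a Wikidata item ID to a (latitude, longitude) pair: the function names, the item-ID keying and the 100 m tolerance below are all assumptions for illustration, not anything GeoData or Wikidata provides. Distances use the haversine formula on a spherical Earth.

```python
import math
from typing import Dict, List, Tuple

LatLon = Tuple[float, float]


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    earth_radius_m = 6371000.0  # mean Earth radius, spherical approximation
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(h))


def compare_with_wikidata(
    index: Dict[str, LatLon],
    wikidata: Dict[str, LatLon],
    tolerance_m: float = 100.0,  # assumed threshold, tune as needed
) -> Tuple[List[str], List[str]]:
    """Return (missing, differing) item IDs.

    missing:   items with a coordinate in the index but none in Wikidata
    differing: items present in both whose coordinates are further apart
               than tolerance_m
    """
    missing = sorted(item for item in index if item not in wikidata)
    differing = sorted(
        item
        for item in index
        if item in wikidata and haversine_m(index[item], wikidata[item]) > tolerance_m
    )
    return missing, differing
```

The first list would feed the "add all coordinates" task and the second the report of differences; keeping the two sets in sync would amount to rerunning such a comparison periodically.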
On 2 April 2014 11:39, Max Semenik maxsem.wiki@gmail.com wrote:
Hi, I'd like a sanity check on my plans to reduce the GeoData index. I plan to:
1) Not store coordinates from pages outside of content namespaces: main and file.
2) Not store secondary coordinates that don't specify the coordinate name. There is currently a bunch of secondary coordinates that make no sense because there's no way to tell them apart or tell what they are for.
I hope that these measures will make our database of geographical coordinates more useful on average, but please tell me if there's a fatal flaw in my plans:)
-- Best regards, Max Semenik ([[User:MaxSem]])
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
2014-04-02 12:39 GMT+03:00 Max Semenik maxsem.wiki@gmail.com:
1) Not store coordinates from pages outside of content namespaces: main and file.
2) Not store secondary coordinates that don't specify the coordinate name. There is currently a bunch of secondary coordinates that make no sense because there's no way to tell them apart or tell what they are for.
Hi Max,
You don't really need to tell the coordinates apart in order for them to be useful. One of the things that I would like to do / see done at some point in https://toolserver.org/~kolossos/openlayers/kml-on-ol.php is an option to display only the coordinates _in the current page_. This would be useful on pages with large numbers of coordinates (mostly lists) even without a way to name them. Granted, having a name would help, but it doesn't seem compulsory to me.
Strainu
Max Semenik wrote:
1) Not store coordinates from pages outside of content namespaces: main and file.
Eh, many wikis have additional namespaces that are considered content namespaces. I'm not sure what the intention is here. Disk space usage reduction? A tighter data set? Without knowing what the savings are, it's difficult for me to assess whether the proposed plan is worth reducing the potential utility of this data.
MZMcBride