On 06/04/12 13:41, John Erling Blad wrote:
I believe this is very important, not only for spatial objects but also for time and for spatiotemporal objects, and even for a lot of other types of objects. One option could be to add some kind of subtypes, or other variations of types, one of which could be the default in a locale or be given as a user preference. A subtype could then refer to an SRID, which in turn refers to a specific datum, geoid, coordinate system, projection, and whatever. As Markus said, usability is an issue here, but I think most of the complexity can be hidden. Or I hope so. ;)
One reason for making them subclasses of a type is that alternate subtypes could replace each other while still being valid for a specific property. For example, a coordinate could be given in NAD27 and then converted to WGS84 with some error. It could be difficult or impossible to convert coordinates in general (old coordinates referring to a flat map could be an example), but perhaps some types are important enough.
Note that even identifying which ones are important enough would be difficult, not to mention converting between them. It is not an error in general to be unable to convert between subtypes, even if the two subtypes belong to the same supertype.
Note also that this is an inherent problem in all kinds of measurement where a value refers to some form of official or unofficial measuring method. Other examples are length measurements in feet in the UK and Germany (all are valid lengths, but conversion is difficult and in some cases unknown), and old time standards that could be different for individual European cities (not sure if the differences are well known at all). An even better example could be currency, where USD 1 would not have a fixed value compared to EUR 1. Again, both USD and EUR are valid currencies, but the conversion rate is unknown for the future and known only with some error for the past.
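To make the subtype idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the class `TypedCoordinate`, the registry and the function names are hypothetical and not part of any Wikidata draft. The point it tries to show is that a missing conversion between two subtypes is reported as "unknown" (`None`) rather than treated as an error.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Coord = Tuple[float, float]  # a pair of numbers; its meaning depends on the SRID


@dataclass(frozen=True)
class TypedCoordinate:
    """A coordinate tagged with the SRID of its subtype (hypothetical class)."""
    srid: int
    value: Coord


# Hypothetical registry: (source SRID, target SRID) -> conversion function.
_CONVERTERS: Dict[Tuple[int, int], Callable[[Coord], Coord]] = {}


def register_converter(src: int, dst: int, fn: Callable[[Coord], Coord]) -> None:
    """Declare that values of subtype `src` can be converted to subtype `dst`."""
    _CONVERTERS[(src, dst)] = fn


def convert(coord: TypedCoordinate, dst: int) -> Optional[TypedCoordinate]:
    """Return the coordinate in the target subtype, or None if no conversion is known."""
    if coord.srid == dst:
        return coord
    fn = _CONVERTERS.get((coord.srid, dst))
    if fn is None:
        # Not an error: the value stays valid in its own subtype.
        return None
    return TypedCoordinate(dst, fn(coord.value))
```

A caller would register a real datum transformation (for example NAD27, EPSG:4267, to WGS84, EPSG:4326) through `register_converter`; the sketch deliberately does not include one, since the actual shift has to come from a geodesy library.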
Many complicated issues are mentioned here. I think the only way in which we can decide for or against additional complexity is to collect more use cases. It would be useful to collect (and briefly explain) examples of challenging infoboxes at
http://meta.wikimedia.org/wiki/Wikidata/Infoboxes
as suggested by Lydia recently. We can then consider how each case can be supported appropriately. So far, I do not know of any example where Wikipedia uses unusual SRs or old-time units in infoboxes.
Markus
P.S. Let's try to keep this thread on geo; numeric unit conversion issues or time-related discussions deserve separate threads.
On Fri, Apr 6, 2012 at 1:46 PM, Markus Krötzsch markus@semantic-mediawiki.org wrote:
Hi Andreas,
thanks for the input. I have drafted the current text about geo-related datatypes, but I am far from being an expert in this area. Our mapping expert in Wikidata is Katie (Aude), who has also been working with OpenStreetMap, but further expert input on this topic would be quite valuable.
As in all areas, we need to find a balance between generality and usability, so I am slightly in favour of committing to one SR for now (as I understand, the data can be converted easily between SRs but -- as opposed to other cases where people measure something -- most of the world seems to be happy with one of them).
I have now included a link to this thread into an editorial remark in the data model, so we do not forget about this discussion when working out the details.
Markus
On 04/04/12 14:16, Andreas Trawoeger wrote:
Hi everybody!
As the guy who has the honor of shortly receiving some funding from Wikimedia Germany for handling spatial open government data [0], I would like to make some remarks on the current geo definitions in the Wikidata model:
- Spatial Reference System Identifier (SRID [1]) definition is missing
Every GeoCoordinatesValue field should either have a corresponding SRID field that defines the used spatial reference system (SRS [2]) or mandate the use of a single SRS like WGS84 [3] which is currently the standard used by GPS, OpenStreetMap and Wikipedia.
- Geographic shapes should be defined in either Well-known text (WKT [4]) or GeoJSON [5]
WKT is the de facto standard for storing spatial data in a relational database, and GeoJSON is the de facto standard for accessing geo data via the web. Both formats can be easily transformed into each other, so which one you choose pretty much depends on your preferred choice of SQL vs. NoSQL database.
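As a rough illustration of how directly the two formats map onto each other, here is a small Python sketch using only the standard library. The function names are made up for this example, and it covers only Point and Polygon geometries; a real system would more likely rely on an existing geometry library.

```python
import re
from typing import Any, Dict, Sequence


def _position(pos: Sequence[float]) -> str:
    """Format one GeoJSON position as WKT 'x y' (both use x/longitude first)."""
    return " ".join(str(c) for c in pos)


def geojson_to_wkt(geom: Dict[str, Any]) -> str:
    """Convert a GeoJSON Point or Polygon geometry to WKT (sketch only)."""
    kind = geom["type"]
    coords = geom["coordinates"]
    if kind == "Point":
        return "POINT ({})".format(_position(coords))
    if kind == "Polygon":
        rings = ", ".join(
            "({})".format(", ".join(_position(p) for p in ring)) for ring in coords
        )
        return "POLYGON ({})".format(rings)
    raise ValueError("unsupported geometry type: {}".format(kind))


_POINT_RE = re.compile(r"\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*", re.IGNORECASE)


def wkt_point_to_geojson(wkt: str) -> Dict[str, Any]:
    """Convert a two-dimensional WKT POINT back to a GeoJSON geometry."""
    match = _POINT_RE.fullmatch(wkt)
    if match is None:
        raise ValueError("not a 2D WKT point: {!r}".format(wkt))
    return {"type": "Point", "coordinates": [float(match.group(1)), float(match.group(2))]}
```

For example, `geojson_to_wkt({"type": "Point", "coordinates": [16.37, 48.21]})` gives `POINT (16.37 48.21)`, and `wkt_point_to_geojson` turns that string back into the same GeoJSON geometry.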
So in summary I would propose the following data model for spatial data:
Geographic locations
  Datatype IRI: http://wikidata.org/vocabulary/datatype_geocoords
  Value: GeoCoordinatesValue
  Mandatory spatial reference system: EPSG 4326 (WGS 84/GPS)
  Type: Decimal

Geographic objects
  Datatype IRI: http://wikidata.org/vocabulary/datatype_geoobjects
  Value: GeoObjectsValue
  Type: GeoJSON [5]

Geographic objects SRID
  Datatype IRI: http://wikidata.org/vocabulary/datatype_geoobjects_srid
  Value: GeoObjectsSridValue
  Type: EPSG Spatial Reference System Identifier (SRID [1])

That model would allow a structure where every spatial object can have a complex geometry stored in its original geodetic system and still have an easily manageable location in GPS format.
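The proposed structure could be sketched roughly as follows in Python. Only the name GeoCoordinatesValue comes from the proposal above; the `SpatialObject` class, its field names and the range checks are assumptions added for illustration, not a definitive implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict

WGS84_SRID = 4326  # EPSG 4326, the mandatory SRS for plain locations


@dataclass(frozen=True)
class GeoCoordinatesValue:
    """A location in decimal degrees, always interpreted as EPSG 4326."""
    latitude: float
    longitude: float

    def __post_init__(self) -> None:
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")


@dataclass(frozen=True)
class SpatialObject:
    """Hypothetical container combining the three proposed datatypes."""
    location: GeoCoordinatesValue  # datatype_geocoords (WGS 84)
    geometry: Dict[str, Any]       # datatype_geoobjects (GeoJSON geometry)
    geometry_srid: int             # datatype_geoobjects_srid (EPSG code of the geometry)

    def geometry_is_wgs84(self) -> bool:
        """True if the complex geometry needs no reprojection to match the location."""
        return self.geometry_srid == WGS84_SRID
```

The geometry keeps whatever EPSG code it was published in, while `location` stays a simple latitude/longitude pair that any consumer can use without reprojection.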
cu andreas
[0] http://de.wikipedia.org/wiki/Wikipedia:Community-Projektbudget#2._kartenwerk...
[1] https://en.wikipedia.org/wiki/Spatial_reference_system_identifier
[2] https://en.wikipedia.org/wiki/Spatial_reference_system
[3] https://en.wikipedia.org/wiki/WGS84
[4] https://en.wikipedia.org/wiki/Well-known_text
[5] https://en.wikipedia.org/wiki/GeoJSON
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l