On Thu, Aug 29, 2013 at 11:48 AM, Byrial Jensen <byrial@vip.cybercity.dk> wrote:
On 29-08-2013 08:58, Byrial Jensen wrote:

On 22-08-2013 11:33, Markus Krötzsch wrote:
Hi all,

I think one source of confusion here is the overlapping names of
property datatypes and datavalue types. Basically, the mapping is as
follows right now:

[Format: property type => datavalue type occurring in current dumps]

'wikibase-item' => 'wikibase-entityid'
'string' => 'string'
'time' => 'time'
'globe-coordinate' => 'globecoordinate'
'commonsMedia' => 'string'

Note that in the 2013-08-27 database dump you will also find:

  'globe-coordinate' => 'bad'

for values which were accepted before stricter format checking was
introduced in the latest software revision, but cannot be accepted now
(values without indication of globe or precision).
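As a rough illustration of how a dump consumer might use the mapping above, here is a minimal Python sketch. The names EXPECTED_VALUE_TYPE and classify are made up for this example and are not part of Wikibase; the table only contains the pairs listed in this thread.

```python
# Property datatype -> datavalue type, copied from the mapping quoted above.
EXPECTED_VALUE_TYPE = {
    "wikibase-item": "wikibase-entityid",
    "string": "string",
    "time": "time",
    "globe-coordinate": "globecoordinate",
    "commonsMedia": "string",
}


def classify(property_type, value_type):
    """Classify a (property datatype, datavalue type) pair (hypothetical helper).

    Returns "bad" for values flagged as unparseable, "ok" when the pair
    matches the table above, and "unexpected" for anything else.
    """
    if value_type == "bad":
        return "bad"
    if EXPECTED_VALUE_TYPE.get(property_type) == value_type:
        return "ok"
    return "unexpected"
```

Keeping "bad" separate from "unexpected" lets a script list the flagged values without confusing them with pairs that are simply missing from the table.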


I just found that there also are cases with

'time' => 'bad'

See for example https://www.wikidata.org/w/api.php?action=wbgetclaims&entity=Q7415505&format=xml. It has the time given as "+00000001984-23-01T00:00:00Z"; note that the month number is 23.
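To spot values like this one in a script, a simple range check on the time string may be enough. The sketch below is only an assumption about the layout, based on the example above (sign, year digits, then -MM-DDThh:mm:ssZ); it is not the validation Wikibase itself performs. It also assumes that month and day 00 are allowed for less precise dates. TIME_RE and time_problems are hypothetical names.

```python
import re

# Assumed layout, inferred from the example value in this thread:
# sign, year digits, "-MM-DD", "T", "hh:mm:ss", "Z".
TIME_RE = re.compile(r"^([+-])(\d+)-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z$")


def time_problems(value):
    """Return a list of problems found in a time string (empty if none)."""
    match = TIME_RE.match(value)
    if match is None:
        return ["does not match the assumed layout"]

    month, day, hour, minute, second = (int(match.group(i)) for i in range(3, 8))
    problems = []
    # Assumption: 00 is tolerated for month and day (imprecise dates).
    if month > 12:
        problems.append("month %d is out of range" % month)
    if day > 31:
        problems.append("day %d is out of range" % day)
    if hour > 23:
        problems.append("hour %d is out of range" % hour)
    if minute > 59:
        problems.append("minute %d is out of range" % minute)
    if second > 59:
        problems.append("second %d is out of range" % second)
    return problems


print(time_problems("+00000001984-23-01T00:00:00Z"))
```

The check does not verify day counts per month or leap years; it only catches out-of-range fields such as the month 23 seen here.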


A list of "bad" time values would also be very helpful. The "bad" value type is used to flag values that can't be parsed into one of the valid types. These were likely added when Wikidata had less strict validation of API input, so they are still in the database.

A bot would be able to fix them, or they can be removed and re-added.

Cheers,
Katie

 

Regards,
- Byrial


_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l



--
Katie Filbert
Wikidata Developer

Wikimedia Germany e.V. | NEW: Obentrautstr. 72 | 10963 Berlin
Phone (030) 219 158 26-0

http://wikimedia.de

Wikimedia Germany - Society for the Promotion of free knowledge eV Entered in the register of Amtsgericht Berlin-Charlottenburg under the number 23 855 as recognized as charitable by the Inland Revenue for corporations I Berlin, tax number 27/681/51985.