Tool for consuming left-over data from import - Wikidata

4 Aug 2017

Hi all!

As part of the Connected Open Heritage project Wikimedia Sverige have been
migrating Wiki Loves Monuments datasets from Wikipedias to Wikidata.

In the course of doing this we keep a note of the data which we fail to
migrate. For each of these left-over bits we know which item and which
property it belongs to, as well as the source field and language from the
Wikipedia list. An example would be a "type of building" field where
we could not match the text to an item on Wikidata but know that the target
property is P31.

We have created dumps of these (such as
https://tools.wmflabs.org/coh/_total_se-ship_new.json; don't worry, this one
is tiny) but are now looking for an easy way for users to consume them.

Does anyone know of a tool which could do this today? The Wikidata Game
only allows (AFAIK) for yes/no/skip, whereas here you would want something
like <enter_value>/invalid/skip. If not, are there any tools which could be
made to do it with a bit of forking?
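
To make the <enter_value>/invalid/skip idea a bit more concrete, here is a
rough Python sketch of how a tool might turn a user's answer for one
left-over entry into a decision. Everything in it is an assumption made for
illustration: the LeftOver field names do not mirror the actual layout of
the JSON dumps, the item IDs are placeholders, and decide() is a
hypothetical helper rather than part of any existing tool.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeftOver:
    """One unmatched value from an import (field names are assumed)."""
    item: str          # Wikidata item the value belongs to (placeholder ID)
    prop: str          # target property, e.g. "P31"
    source_field: str  # field in the Wikipedia list the text came from
    language: str      # language of the Wikipedia list
    text: str          # the raw text that could not be matched


@dataclass
class Decision:
    action: str                  # "value", "invalid" or "skip"
    value: Optional[str] = None  # item ID entered by the user, if any


def decide(record: LeftOver, answer: str) -> Decision:
    """Map a user's free-text answer onto <enter_value>/invalid/skip."""
    answer = answer.strip()
    if answer == "":
        # An empty answer leaves the entry for someone else.
        return Decision("skip")
    if answer.lower() == "invalid":
        # The source text is not usable for this property.
        return Decision("invalid")
    if re.fullmatch(r"Q[1-9][0-9]*", answer):
        # Looks like an item ID; a real tool would also check it exists.
        return Decision("value", answer)
    raise ValueError(f"Cannot interpret answer {answer!r} for {record.prop}")


# Placeholder record: the IDs and text are made up for the example.
example = LeftOver(item="Q123", prop="P31", source_field="type of building",
                   language="sv", text="kvarn")
print(decide(example, "Q456"))
print(decide(example, "invalid"))
print(decide(example, ""))
```

The point of the sketch is only that, unlike yes/no/skip, one of the three
outcomes carries a value supplied by the user, so a forked tool would need
an input field and some validation in addition to the usual buttons.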

We have only published a few dumps but there are more to come. I would also
imagine that this, or a similar, format could be useful for other
imports/template harvests where some fields are more easily handled by
humans.

Any thoughts and suggestions are welcome.
Cheers,
André
André Costa | Senior Developer, Wikimedia Sverige | Andre.Costa(a)wikimedia.se
| +46 (0)733-964574

Support free knowledge, become a member of Wikimedia Sverige.
Read more at blimedlem.wikimedia.se