Hello,
As you may know, the Wikidata development team has been working on a tool
that lets editors review mismatching data between Wikidata and external
databases. The tool is now ready to be used, and you can access it here
<https://mismatch-finder.toolforge.org/> and read more details on
Wikidata:Mismatch Finder
<https://www.wikidata.org/wiki/Wikidata:Mismatch_Finder>. We hope
that this tool can be useful to people who are working on data quality and
matching external databases with Wikidata, and we are looking forward to
your feedback if you give it a try!
What is the purpose of Mismatch Finder?
The tool helps highlight differences in the data between Wikidata and other
databases, in order to improve data quality in Wikidata and make the whole
linked open data web more robust. The tool itself doesn’t check these
databases automatically: it is necessary for someone to compare an external
database to Wikidata first and then upload a list of possible mismatches
into the Mismatch Finder, so they can be analyzed and processed by Wikidata
editors.
By providing such a tool, we hope to help Wikidata editors spot and fix
mistakes in Wikidata, and to support organizations reusing Wikidata’s
data, who now have a convenient way to contribute back by reporting lists
of possible mismatches.
How to use the tool to check mismatches?
On the Mismatch Finder tool page <https://mismatch-finder.toolforge.org/>,
you can check Items by entering a list of Q-IDs (for example taken from a
SPARQL query). After clicking on “Check Items”, the tool will check if
there are mismatches for these Items in the mismatch store, and display any
issue that was found with a specific part of the data.
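If your Q-IDs come from a SPARQL query, a small script can turn the query
results into a list ready to paste into the tool. The following is only a
sketch: it assumes the results were saved in the standard SPARQL JSON
results format, that the query variable is called "item", and that the
tool accepts one Q-ID per line. The function name is made up for this
example and is not part of Mismatch Finder.

```python
import re

# Matches a Wikidata Item ID such as Q42 at the end of an entity URI.
QID_PATTERN = re.compile(r"/entity/(Q[1-9][0-9]*)$")


def qids_from_sparql_json(result, var="item"):
    """Collect unique Item IDs from SPARQL JSON results, keeping their order.

    `var` is the name of the query variable holding the Item URI;
    "item" is an assumption about how the query was written.
    """
    seen = set()
    qids = []
    for binding in result.get("results", {}).get("bindings", []):
        cell = binding.get(var)
        # Skip rows where the variable is unbound or is not a URI.
        if not cell or cell.get("type") != "uri":
            continue
        match = QID_PATTERN.search(cell["value"])
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            qids.append(match.group(1))
    return qids


# Hand-written sample in the SPARQL JSON results format (not real query output).
sample = {
    "head": {"vars": ["item"]},
    "results": {
        "bindings": [
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"}},
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q64"}},
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"}},
            {"item": {"type": "literal", "value": "Q5"}},
        ]
    },
}

# One Q-ID per line (assumed input format for the "Check Items" box).
print("\n".join(qids_from_sparql_json(sample)))
```

Duplicates and non-URI values are dropped, so the same Item is not
checked twice.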
From this page, after logging in with your Wikidata account via OAuth,
you can choose a status for each mismatch, indicating which part of the
data is wrong, and open the Item on wikidata.org to edit the data if
needed. Mismatch Finder does not perform any automatic editing on
Wikidata.
Once the status is changed from “waiting for review” to another value, the
mismatch will not appear in the list anymore.
You can also use the Mismatch Finder user script
<https://www.wikidata.org/wiki/Wikidata:Mismatch_Finder/Gadget>, which
displays an alert at the top of Item pages on wikidata.org, along with a
link to the Mismatch Finder tool where you can learn more about the
potential mismatches. See Help:User scripts
<https://www.wikidata.org/wiki/Help:User_scripts> for how to enable the
user script for your account.
Where does the information come from?
Information about the potential mismatches is stored in the Mismatch Store,
a database separate from Wikidata where organizations, researchers and
editors can upload lists of mismatches.
The Mismatch Store is hosted on Toolforge and its content can be accessed
via an API. You can find more information about the database, how to get
data from the API, and how to prepare and upload a mismatches file in this
user guide
<https://github.com/wmde/wikidata-mismatch-finder/blob/main/docs/UserGuide.md>.
We hope that the Mismatch Finder tool will help to build up feedback loops
with data re-users to get them actively involved in improving the data on
Wikidata. Feel free to try out the tool and let us know what you think on the
talk page <https://www.wikidata.org/wiki/Wikidata_talk:Mismatch_Finder>.
You can also join us for an intro session and discussion at the upcoming Data
Reuse Days
<https://www.wikidata.org/wiki/Wikidata:Events/Data_Reuse_Days_2022>.
For a quick introduction and demo of how the Mismatch Finder tool works,
please see this short video
<https://commons.wikimedia.org/wiki/File:Mismatch_Finder_intro.webm>.
We would especially like to thank Mike Peel and Marco Fossati for
providing the first mismatches and real-world testing data for the Mismatch
Finder to get us started. More will follow in the coming days and weeks.
Cheers,
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
https://wikimedia.de