<div dir="ltr">
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" id="gmail-docs-internal-guid-4a87cf93-7fff-3b53-9724-f5d522f23136"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Hi everyone,</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Over the past month, we started looking at ways to help us all better understand the quality of Wikidata's data in a specific area of interest. For this purpose, we built two tools: an </span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Item Quality Evaluator</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"> and a </span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Constraint Violation Checker</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">. Both tools are now available at:</span></font></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"><br></span></font></span></p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><a href="https://item-quality-evaluator.toolforge.org" style="text-decoration:none"><span style="color:rgb(17,85,204);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">Item Quality Evaluator</span></a></font></span></p></li><li dir="ltr" style="list-style-type:disc;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><a href="https://github.com/wmde/wikidata-constraints-violation-checker" style="text-decoration:none"><span style="color:rgb(17,85,204);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">Constraint Violation Checker</span></a></font></span></p></li></ul><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span 
style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Data quality on Wikidata has many aspects. The constraint violations and ORES quality scores that these tools use are two indicators of certain aspects of quality that we hope you will find useful.</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">As you may know, Wikidata’s data quality is very unevenly distributed: some areas are very well maintained and others much less so. We currently provide ORES quality scores only at the global and per-Item level. 
This has two consequences:</span></font></span></p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Editors taking care of a specific area of Wikidata want to improve that area but currently don’t have an easy way to find the lowest-quality Items to focus their time on in order to raise the quality of that area.</span></font></span></p></li><li dir="ltr" style="list-style-type:disc;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Re-users of Wikidata’s data are usually interested only in a subset of Wikidata’s Items and, by extension, in the quality of that subset. 
It is currently hard for them to know what quality level they are getting for their subset of interest.</span></font></span></p></li></ul><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">To address this issue, we put together two small tools. The Item Quality Evaluator is a simple website that provides ORES quality scores for a list of Items in Wikidata. The Constraint Violation Checker is a small command-line script that retrieves the number of constraint violations and ORES scores for a list of Items for further analysis.</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">How does the Item Quality Evaluator tool work?</span></font></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">You provide it with a list of Item IDs or a SPARQL query, and it retrieves the ORES score for each Item as well as the average score over all the Items you provided, and presents them on a web page. This way, you can more easily identify the lowest-quality Items in an area you are interested in and improve them.</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">How does the Constraint Violation Checker script work?</span></font></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">When you run it, it outputs a CSV file that lists, for each Item in the list you provide, the number of statements, the number of constraint violations at each severity level, the number of sitelinks (to all projects and to Wikipedia), and the ORES score.</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Why didn't we integrate the constraint violation data into the Item Quality Evaluator?</span></font></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">We want to do that in the long term, but right now it is not possible because the constraint violation data is not easily accessible, and retrieving it for a large list of Items takes several hours.</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Please try these tools and let us know if you encounter any issues or have general feedback.</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2"><br></font></span><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:arial,sans-serif"><font size="2"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Cheers,</span></font></span></p><span style="font-family:arial,sans-serif"><font size="2">

</font></span><br>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Mohammed Sadat<br><b>Community Communications Manager for Wikidata/Wikibase</b><br><br>Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin<br>Phone: +49 (0)30 219 158 26-0<br><a href="https://wikimedia.de" target="_blank">https://wikimedia.de</a><br></div></div></div>