Hello all,
Wikidata’s content is growing, and our data is used in more and more high-profile places. This means the pressure around data quality is rising. We want to provide people with good data. One important piece of the data quality puzzle is being able to *understand where we currently stand quality-wise and how that changes over time*. We need to be able to do this at scale, in an automated and repeatable way, because no one wants to do this by hand for 90 million Items.
That’s where ORES https://www.mediawiki.org/wiki/ORES, the machine learning system, comes in. One of the things it can do is judge the quality of an Item, or more precisely, some aspects of the quality of an Item. It puts each Item into a quality class between A (amazing) and E (ewwww, terrible). It has been doing this for a while already, but the quality judgments it provided were not very good. This was because it took only a limited number of signals into account (things like the number of References or Labels on the Item) and because it was trained on rather old data. Wikidata’s data has changed a lot since then, so ORES could not tell what to do with newer kinds of Items, like astronomical objects, because it had never seen them before.
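ORES serves these predictions through its public scores API. As a minimal sketch of how one might read the itemquality prediction out of a response: the nesting below follows the general shape of the ORES v3 scores API, but the revision ID and the probability numbers are made up for illustration.

```python
# Illustrative shape of an ORES v3 scores response for the
# "itemquality" model on wikidatawiki. The revision ID and the
# probabilities are hypothetical example values, not real data.
sample_response = {
    "wikidatawiki": {
        "scores": {
            "123456789": {  # a hypothetical revision ID
                "itemquality": {
                    "score": {
                        "prediction": "C",
                        "probability": {
                            "A": 0.02, "B": 0.10, "C": 0.55,
                            "D": 0.25, "E": 0.08,
                        },
                    }
                }
            }
        }
    }
}

def item_quality(response: dict, rev_id: str) -> tuple[str, float]:
    """Return the predicted quality class (A-E) and its probability."""
    score = response["wikidatawiki"]["scores"][rev_id]["itemquality"]["score"]
    prediction = score["prediction"]
    return prediction, score["probability"][prediction]

cls, prob = item_quality(sample_response, "123456789")
print(cls, prob)  # prints the predicted class and its probability
```

In a real client you would fetch this JSON from the ORES scores endpoint for a given revision; the parsing step stays the same.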
We wanted to improve that and *make the quality judgments ORES provides better*. We did this by:
- adding a number of new signals (e.g. does this Item have an image attached)
- changing existing signals (e.g. missing references on external ID statements no longer punish the Item so much)
- retraining the model on more current data so it better understands scientific papers, astronomical objects, etc.
While we were at it, we also wanted to better understand how data quality changes over time on Wikidata. Previously we only looked at the global average quality score. But how do individual Items change over time? How many Items are improved from class D to C, or even B, for example? To better understand this, we started creating diagrams like this one https://commons.wikimedia.org/wiki/File:Wikidata_quality_diagram,_January_2019_to_January_2020.png. It shows the development from January 2019 to January 2020.
We’re happy to present these improvements for Wikidata’s eighth birthday https://www.wikidata.org/wiki/Wikidata:Eighth_Birthday/Presents, and we hope they will give us a better and more accurate view of the data quality on Wikidata.
If you want to see the quality score near the header of each Item, you can include the following user script in your common.js https://www.wikidata.org/wiki/Special:MyPage/common.js page:

importScript("User:EpochFail/ArticleQuality.js")
*What’s coming next on the same topic?*
- ORES can’t judge all aspects of quality. For example, it cannot tell whether a statement is generally considered true. We will look at ways of judging this aspect of quality as well, but it is considerably harder. If you have ideas for how to go about it, let us know.
- We will build a small tool that will let you provide a list of Items and then get the quality of that subset of Wikidata, as well as its lowest- and highest-quality Items. This will hopefully give WikiProjects etc. a good overview of their data.
If you have any questions or feedback, or want to keep discussing Item quality, feel free to use this talk page https://www.wikidata.org/wiki/Wikidata_talk:Item_quality. Cheers,