One of the applications of ORES [1] is determining article quality: for example, what the best assessment of an article would be at a given revision. Users in WikiProjects use ORES data to check whether articles need re-assessment, e.g. an article that is rated "Start" but is now good enough to be a "B" article.
As part of our Q4 goals, we made a dataset of article quality scores for all articles in English Wikipedia [2] (here's the link to download the dataset [3]), and we are publishing it on figshare so that you can cite it [4]. We are also working on publishing monthly data so that researchers can track how article quality changes over time [5].
As a pet project of mine, I always wanted to put these data in a database so that we can query it and get much more useful data, for example the quality of articles in the category 'History_of_Essex' [6] [7]. The weighted sum is a measure of quality: a decimal number between 0 (really a stub) and 5 (definitely a featured article). We also have a prediction column, which is a number from this map [8]; for example, if the prediction is 5, it means ORES thinks the article should be a featured article.
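To make the two columns more concrete, here is a minimal sketch of how a weighted sum and a prediction number could be derived from per-class probabilities. It assumes the map in [8] orders the classes Stub=0, Start=1, C=2, B=3, GA=4, FA=5 (consistent with 5 meaning featured article), and that the weighted sum is the probability-weighted average of those numbers. The function names and the example probabilities are made up for illustration and are not taken from the dataset.

```python
# Assumed class-to-number map, lowest to highest quality.
# The authoritative map is the one linked in [8].
CLASS_WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}


def weighted_sum(probabilities):
    """Probability-weighted average of the class numbers (0 to 5)."""
    return sum(CLASS_WEIGHTS[label] * p for label, p in probabilities.items())


def predicted_level(probabilities):
    """Number of the single most likely class."""
    most_likely = max(probabilities, key=probabilities.get)
    return CLASS_WEIGHTS[most_likely]


# Hypothetical probabilities for one revision (they sum to 1).
example = {"Stub": 0.05, "Start": 0.10, "C": 0.20,
           "B": 0.40, "GA": 0.15, "FA": 0.10}

print(round(weighted_sum(example), 2))  # 2.8, a bit below "B"
print(predicted_level(example))         # 3, i.e. "B"
```

The weighted sum is finer-grained than the prediction: two articles that are both predicted "B" can still be ranked against each other by how far their weighted sums lean toward "C" or "GA".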
I leave more use cases to your imagination :)
I'm looking for a more permanent place to put these data. Please tell me if it's useful for you.

[1] ORES is not an anti-vandalism tool, it's an infrastructure to use AI in Wikipedia.
[2] https://phabricator.wikimedia.org/T135684
[3] (117 MB) https://datasets.wikimedia.org/public-datasets/enwiki/article_quality/wp10-s...
[4] https://phabricator.wikimedia.org/T145332
[5] https://phabricator.wikimedia.org/T145655
[6] https://quarry.wmflabs.org/query/12647
[7] https://quarry.wmflabs.org/query/12662
[8] https://github.com/wiki-ai/wikiclass/blob/3ff2f6c44c52905c7202515c5c8b525fb1...
Have fun!
Amir