On 2/23/2011 9:20 PM, geni wrote:
Problem is that this is in practice a far better fit for wikipedia where such lists are generated in passing than commons.
And that's a critical insight.
From my perspective, Wikipedia is a skeleton and Commons is the flesh that's hanging on it. If you want to improve the organization of Commons, you've got the most incredible resource in the world to do that... Wikipedia.
Today, Freebase and DBpedia can be used together to form a rich and powerful database and ontology that describes the contents of Wikipedia. A "top 100" list doesn't need to be compiled by experts or even by humans, but can be produced by a largely automated process. For instance, you could look for things that are typed '/people/person' in Freebase, sort them by how many Wikipedia articles link to them, and produce a list of the "top 100" people that is pretty good (except for the minor embarrassment that U.S. President #43 is the most linked person in my sample).
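The ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes the Freebase types and the Wikipedia link graph have already been loaded into memory, and the names `top_linked`, `types`, and `links`, the type '/example/machine', and the toy data are all made up for the example rather than taken from any real dump or API.

```python
from collections import Counter


def top_linked(types, links, wanted_type="/people/person", n=100):
    """Rank topics of one type by how many distinct articles link to them.

    types: dict mapping an article title to its set of Freebase type ids
           (assumed to be extracted beforehand; hypothetical layout).
    links: iterable of (source_title, target_title) pairs, one per wikilink.
    """
    inbound = Counter()
    # set() collapses repeated links from the same source article.
    for source, target in set(links):
        if source != target and wanted_type in types.get(target, ()):
            inbound[target] += 1
    # Highest count first; ties are broken alphabetically for stable output.
    return sorted(inbound.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Toy data, invented purely for illustration.
types = {
    "Ada_Lovelace": {"/people/person"},
    "Charles_Babbage": {"/people/person"},
    "Analytical_Engine": {"/example/machine"},
}
links = [
    ("Analytical_Engine", "Charles_Babbage"),
    ("Ada_Lovelace", "Charles_Babbage"),
    ("Analytical_Engine", "Ada_Lovelace"),
    ("Charles_Babbage", "Analytical_Engine"),
]
ranking = top_linked(types, links)
print(ranking)
```

Counting distinct linking articles rather than raw link occurrences keeps one article with many repeated links from dominating the list; whether that matches the counts in my sample is an assumption.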
With a little work, it should be possible to build something that makes lists like "Train stations in Poland that don't have pictures in en.wikipedia", though it's a query that's not at my fingertips because my system was built to pay attention to things that have photographs and ignore things that don't.
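A "missing pictures" list of that kind amounts to a filter plus a set difference. The sketch below is hypothetical throughout: the record layout, the type id '/example/train_station', the function name `missing_pictures`, and the station names are placeholders, and it assumes you already know which en.wikipedia articles carry an image.

```python
def missing_pictures(topics, wanted_type, country, with_image):
    """Return titles of topics of a given type and country that lack an image.

    topics:     list of dicts with "title", "types" (a set) and "country" keys
                (hypothetical record layout, assumed to come from Freebase).
    with_image: set of article titles known to include at least one picture.
    """
    return sorted(
        t["title"]
        for t in topics
        if wanted_type in t["types"]
        and t["country"] == country
        and t["title"] not in with_image
    )


# Placeholder records, invented for illustration only.
topics = [
    {"title": "Station_A", "types": {"/example/train_station"}, "country": "Poland"},
    {"title": "Station_B", "types": {"/example/train_station"}, "country": "Poland"},
    {"title": "Station_C", "types": {"/example/train_station"}, "country": "Germany"},
    {"title": "Museum_D", "types": {"/example/museum"}, "country": "Poland"},
]
with_image = {"Station_B", "Station_C"}

todo = missing_pictures(topics, "/example/train_station", "Poland", with_image)
print(todo)
```

The hard part in practice is building `with_image` reliably, which is exactly the information my system was not designed to keep.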
Anyhow, Freebase is CC-BY so there's no problem feeding data from it back into Wikipedia/Commons. I'll offer that data quality is a lot better in Freebase than DBpedia, so anyone going down this route will save a lot of time by relying primarily on Freebase and using DBpedia to fill gaps.