On Dec 20, 2007 10:07 AM, Brian <Brian.Mingus@colorado.edu> wrote:
I can provide a list of the top 40,000 articles, as rated for quality by the Wikipedia editorial team. A random sample is unlikely to be interesting, as more than 70% of articles are stubs.
Well, we don't really want just the top articles. A broad range is important, so that we can see how the system behaves at different levels of quality, but we could certainly take the top 10,000 and add another 10,000 at random. At the end of the day, this is just a demo, and even 100 articles will do -- nobody on this list is going to read through the whole 40K. If anyone
@Gregory: Could you post some of those cases to the list so that they can be imported manually whenever Luca is available?
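For what it's worth, the mix suggested above (the top 10,000 by rating plus another 10,000 at random) could be sketched roughly as follows. This is only an illustration under assumptions: it supposes the ranked list and the full article list are plain Python lists of titles, with the ranked list ordered best first, and that the random half is drawn from all articles not already in the top set. The function name mixed_sample and its parameters are hypothetical, not part of any existing tool.

```python
import random

def mixed_sample(ranked_titles, all_titles, n_top=10_000, n_random=10_000, seed=None):
    """Return the n_top best-rated titles followed by n_random titles drawn
    uniformly at random from the articles not already taken from the top.

    ranked_titles: titles ordered best first (assumed input format).
    all_titles:    every candidate title (assumed input format).
    """
    rng = random.Random(seed)  # a seed makes the selection reproducible
    top = list(ranked_titles[:n_top])
    chosen = set(top)
    # Exclude the top articles so that no title is picked twice.
    pool = [title for title in all_titles if title not in chosen]
    # Never ask for more random titles than the pool contains.
    k = min(n_random, len(pool))
    return top + rng.sample(pool, k)
```

For a small demo the same call works with tiny numbers, e.g. mixed_sample(ranked, everything, n_top=50, n_random=50). Since most articles are stubs, the random half will mostly be stubs too, which covers the low end of the quality range.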