Jimmy (Jimbo) Wales wrote:
samuel wrote:
Yes, I am quite familiar with the nature of Wikipedia and how it works. The object of the study is to measure the quality of the current online version of Wikipedia (by "current" I mean the Wikipedia that exists at a certain, as yet undetermined, moment in time), since that is what people actually use.
Excellent. For this, one slight improvement I suggest, rather than selecting pages uniformly at random, is to select random pages with a weighting adjusted for the traffic to those pages. What we're really interested in is the quality of the content that people actually use, as opposed to obscure "dead-end", orphan or near-orphan pages that end users seldom see.
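A rough sketch of what such traffic weighting could look like, assuming per-page view counts are available as a plain mapping from title to views (the function name, the `page_views` mapping and the sample data are hypothetical, not anything that exists in the Wikipedia software). It uses the exponential-key form of weighted sampling without replacement, so heavily visited pages are more likely to be drawn but no page is picked twice:

```python
import random


def sample_by_traffic(page_views, k, seed=None):
    """Pick up to k distinct titles, favoring pages with more traffic.

    page_views: dict mapping page title -> view count (assumed input).
    """
    rng = random.Random(seed)
    keyed = []
    for title, views in page_views.items():
        if views <= 0:
            continue  # pages with no recorded traffic are never drawn
        # Exponential key divided by the weight: the more views a page
        # has, the smaller its key tends to be.
        keyed.append((rng.expovariate(1.0) / views, title))
    keyed.sort()
    # The k smallest keys form a weighted sample without replacement.
    return [title for _, title in keyed[:k]]


if __name__ == "__main__":
    # Made-up view counts, for illustration only.
    views = {"Main topic": 5000, "Niche topic": 40, "Orphan page": 1}
    print(sample_by_traffic(views, k=2, seed=1))
```

Whether view counts per page can actually be obtained at the needed granularity is a separate question; the sketch only shows the selection step.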
I agree. The study will (hopefully -- this project cannot be allowed to grow arbitrarily large) make quality measurements of Wikipedia sample pages collected in several different ways. Some examples of categories that would be interesting to look at with respect to quality:
1) Completely random pages
2) Most popular pages
3) Least popular pages
4) Most edited pages
5) Least edited pages
6) Most recently edited pages
7) Least recently edited pages
These are not the only possible selections. An obvious omission is selection by the usual categories of human knowledge (technology, culture, etc.); it would be interesting to see, for example, whether technology articles are less likely to contain factual errors than articles about art. However, the first category is important since there is no obvious analogy to, say, "most recently edited pages" in other encyclopedias, and some comparison to other encyclopedias is necessary if the scale on which Wikipedia's quality is judged is to be truly meaningful. (By the way, if anyone has ideas for methods to select articles from these other categories as well, by all means, let me know! Some of them are rather straightforward, such as "most recently edited pages", but others much less so.)
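To make the seven selections concrete, here is a minimal sketch that assumes each page comes with a view count, an edit count and a last-edit timestamp. The `PageStats` record and its field names are hypothetical; how these numbers would be extracted from the database or the logs is left open.

```python
import random
from dataclasses import dataclass


@dataclass
class PageStats:
    title: str
    views: int          # assumed traffic figure
    edit_count: int     # assumed number of revisions
    last_edited: float  # assumed Unix timestamp of the latest edit


def build_samples(pages, k, seed=None):
    """Return up to k page titles for each of the seven selections."""
    rng = random.Random(seed)

    def pick(key, reverse):
        ranked = sorted(pages, key=key, reverse=reverse)
        return [p.title for p in ranked[:k]]

    return {
        "random": [p.title for p in rng.sample(pages, min(k, len(pages)))],
        "most_popular": pick(lambda p: p.views, True),
        "least_popular": pick(lambda p: p.views, False),
        "most_edited": pick(lambda p: p.edit_count, True),
        "least_edited": pick(lambda p: p.edit_count, False),
        "most_recently_edited": pick(lambda p: p.last_edited, True),
        "least_recently_edited": pick(lambda p: p.last_edited, False),
    }
```

The same page can land in several selections (a popular page is often also a heavily edited one), so any comparison between the groups would have to account for that overlap.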
--Samuel
This is indeed what will be done. It is for the "vice-versa" part that I considered using the "random page" feature to select articles.
It's a good idea -- some form of randomization is a great way to look at it. Probably the random pages feature is too crude to really answer the most interesting questions, though.
--Jimbo
_______________________________________________
Wikitech-l mailing list
Wikitech-l@wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikitech-l