Hi all.
This is just to announce that the final draft of my PhD thesis, "Wikipedia: A
quantitative analysis", is finished. Only minor appendices remain, covering general
background on some of the statistical methods I applied.
It will (hopefully) be approved for presentation in just a few days, though bureaucracy will
delay the "viva voce" until mid-March (more or less).
It includes the first quantitative analysis comparing the top 10 language versions of
Wikipedia, as of Dec. 2007 (to allow fair comparison of EN with other languages). Among
other interesting insights, it presents a complete study of the activity of logged
authors, articles and talk pages, and of the evolution over time of the distributions of
key parameters (distinct authors per article, articles per author, revisions per
author/article, etc.).
It also offers a more in-depth study of the inequality of contributions, both by logged
authors and across articles. Likewise, it presents a complete survival analysis to examine
the average lifetime of Wikipedia contributors, focusing on the transitions: first
contribution --> joining the core --> core membership --> leaving the core --> abandoning
the project.
Finally, we also examine some very basic metrics for quality, analyze the common
quantitative patterns of reputable authors and high-quality content, and try to infer
the implications of all these findings for the sustainability of the Wikipedia workflow
model in the coming years.
If any of you is interested in having a look at the (still draft) manuscript, I can grant
access to the repo on request :).
I'll wait until after the public defense and the reviewers' comments before publishing a
summary of our conclusions.
Best,
F.