Dear Wikitech list members,
This is my first post here. Alfio redirected me to this list, saying you might have some answers regarding my research. My original question (http://it.wikipedia.org/wiki/Discussioni_utente:Alfio#Long_Tail_of_Wikipedia_Usage) and Alfio's answer (http://en.wikipedia.org/wiki/User_talk:Junjulien) follow below:

Dear Alfio,
I am part of an organization that tries, amongst other things, to promote the use of Wikipedias in native languages. I believe you take an active part in compiling these statistics: http://en.wikipedia.org/wiki/Wikipedia:Multilingual_statistics, and I hope you might point me in the right direction for my research. I am interested in establishing a matrix which would give the number of users for each Wikipedia with fewer than 100,000 articles (from #16 onward in this list: http://meta.wikimedia.org/wiki/List_of_Wikipedias), against the countries where the visitors' traffic originates, as well as against where the editors are editing from. Obviously it would be great to have time as a third dimension to follow trends...
Where should I start? Whom should I ask?
Please contact me on my talk page http://en.wikipedia.org/wiki/User_talk:Junjulien
Thanks a lot for your time,
Jun Julien Matsushita Project Coordinator Internews Europe
-----------------------
Hello, sorry for the late answer (holidays...). It is true that I compile part of the Multilingual statistics, but my contribution is limited to getting the current copy of http://meta.wikimedia.org/wiki/List_of_Wikipedias and feeding it to a script which generates the table. The list of Wikipedias itself, as far as I know, is bot-generated, but I only have the foggiest idea of how (Wikipedia's ways can be strange at times... :-)

Your project would need a great deal of data about editors and readers. Data about the readers is probably unavailable, as it would require collecting server logs, and Wikimedia servers do not have the capability of recording visitor logs at our current load. I remember seeing on wikitech-l that someone is recording decimated data, e.g. one in 10 or 100 visitors, but deleting personal info like the originating IP, which would defeat geolocation.

As for the editors, the IP addresses of logged-in users are, again, not collected. For anonymous editors, however, the IP is recorded in the history, so you could download a full history dump from http://download.wikimedia.org/ and see what you can recover.

In short, I don't really know how to help you. Try writing to wikitech-l (see http://lists.wikimedia.org/mailman/listinfo/wikitech-l) and see if someone has the data you need.
Cheers, Alfio
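
For what it is worth, the history-dump route for anonymous editors could be sketched roughly as follows. This is only an illustration, not an existing tool: the function names and the sample XML are made up, and the element names (`revision`, `timestamp`, `contributor`, `ip`) are assumed to match the MediaWiki XML export format, which should be checked against a real dump before relying on it.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}ip' -> 'ip'."""
    return tag.rsplit("}", 1)[-1]


def anonymous_edits(xml_stream):
    """Yield (YYYY-MM, ip) for every revision made by a logged-out editor."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        # Collect the text of every element inside this revision by tag name.
        fields = {local_name(child.tag): child.text for child in elem.iter()}
        # Only logged-out edits carry an <ip> element (assumption, see above).
        if fields.get("ip") and fields.get("timestamp"):
            yield fields["timestamp"][:7], fields["ip"]
        elem.clear()  # drop the revision text once it has been inspected


# Hypothetical miniature dump; the IPs are reserved documentation addresses.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Example</title>
    <revision>
      <id>1</id>
      <timestamp>2007-11-02T10:00:00Z</timestamp>
      <contributor><ip>192.0.2.10</ip></contributor>
      <text>first</text>
    </revision>
    <revision>
      <id>2</id>
      <timestamp>2007-12-27T19:30:00Z</timestamp>
      <contributor><username>SomeUser</username><id>42</id></contributor>
      <text>second</text>
    </revision>
    <revision>
      <id>3</id>
      <timestamp>2007-12-28T08:15:00Z</timestamp>
      <contributor><ip>198.51.100.7</ip></contributor>
      <text>third</text>
    </revision>
  </page>
</mediawiki>"""

edits = list(anonymous_edits(io.BytesIO(SAMPLE.encode("utf-8"))))
per_month = Counter(month for month, _ip in edits)
print(edits)
print(per_month)
```

Bucketing by month is one way to get the time dimension asked for in the original question. Logged-in edits are skipped entirely, so whatever comes out of this covers anonymous editing only.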
-------------------------
Has anyone a clue as to where to direct my efforts?
Thanks a lot for your time,
Jun Julien Matsushita Radio Connect Project Coordinator Internews Europe 14, cité Griset - 75011 Paris France - www.internews.eu skype: junjulien
Jun Matsushita wrote:
Has anyone a clue as to where to direct my efforts?
Similar data was collected in 2006: http://meta.wikimedia.org/wiki/Edits_by_project_and_country_of_origin. Greg Maxwell and Kelly Martin were involved in this. I don't know whether they are reading this list, but Kelly Martin can be contacted through http://meta.wikimedia.org/wiki/Special:Emailuser/Kelly_Martin.
I really would appreciate some statistics about our lesser-used language projects. We are often left out in statistics (in the 2006 sample too).
Marcus Buck
Marcus, thanks a lot: your advice was spot on.
Kelly Martin redirected me to Greg Maxwell, who had already started an extremely similar project here: http://myrandomnode.dyndns.org/wikipedia-viewer-matrix.html (raw text here: http://myrandomnode.dyndns.org/wikipedia-viewer-matrix.txt)
Quoting him: "I've not done more updated versions because I was dissatisfied with the quality of freely available geolocation databases (notice the large number of unknown)."
Does anyone here have advice on getting better geolocation?
Thanks again.
Jun.
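
To make the "large number of unknown" problem concrete, here is a rough sketch of a wiki-by-country matrix that also reports the share of edits that could not be geolocated. Everything in it is an assumption for illustration: the prefix table stands in for a real geolocation database, the country codes attached to the ranges are invented, and the wiki codes and IPs are sample values, not real data.

```python
import ipaddress
from collections import Counter, defaultdict

# Hypothetical prefix table standing in for a real geolocation database.
# The ranges are reserved documentation networks; the country codes are made up.
PREFIX_TO_COUNTRY = [
    (ipaddress.ip_network("192.0.2.0/24"), "FR"),
    (ipaddress.ip_network("198.51.100.0/24"), "DE"),
]


def country_of(ip_text):
    """Return the country code for an IP, or 'unknown' if no prefix matches."""
    try:
        address = ipaddress.ip_address(ip_text)
    except ValueError:
        return "unknown"
    for network, country in PREFIX_TO_COUNTRY:
        if address in network:
            return country
    return "unknown"


def build_matrix(records):
    """Turn (wiki, ip) pairs into {wiki: Counter({country: edit_count})}."""
    matrix = defaultdict(Counter)
    for wiki, ip_text in records:
        matrix[wiki][country_of(ip_text)] += 1
    return matrix


def unknown_share(matrix):
    """Fraction of all counted edits whose country could not be resolved."""
    total = sum(sum(row.values()) for row in matrix.values())
    unknown = sum(row["unknown"] for row in matrix.values())
    return unknown / total if total else 0.0


# Illustrative input only: (wiki language code, editor IP).
records = [
    ("nds", "192.0.2.10"),
    ("nds", "198.51.100.7"),
    ("nds", "203.0.113.5"),  # not covered by the table above
    ("br", "192.0.2.99"),
]
matrix = build_matrix(records)
for wiki, row in sorted(matrix.items()):
    print(wiki, dict(row))
print("unknown share:", unknown_share(matrix))
```

Tracking the unknown share per run gives a simple way to compare two geolocation sources on the same input before committing to one.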
-----Original Message-----
From: wikitech-l-bounces@lists.wikimedia.org [mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of Marcus Buck
Sent: Thursday, 27 December 2007 19:30
To: Wikimedia developers
Subject: Re: [Wikitech-l] Long tail of wikipedia usage

Jun Matsushita wrote:
Has anyone a clue as to where to direct my efforts?
Similar data was collected in 2006: http://meta.wikimedia.org/wiki/Edits_by_project_and_country_of_origin. Greg Maxwell and Kelly Martin were involved in this. I don't know whether they are reading this list, but Kelly Martin can be contacted through http://meta.wikimedia.org/wiki/Special:Emailuser/Kelly_Martin.
I really would appreciate some statistics about our lesser-used language projects. We are often left out in statistics (in the 2006 sample too).
Marcus Buck
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitech-l