Dear Wikitech list members,
This is my first post here. Alfio redirected me to this list, saying you
might have some answers regarding my research.
Here's my original question
(http://it.wikipedia.org/wiki/Discussioni_utente:Alfio#Long_Tail_of_Wikipedia_Usage)
and Alfio's answer (http://en.wikipedia.org/wiki/User_talk:Junjulien)
right below:
Dear Alfio,
I am part of an organization that tries, amongst other things, to promote
the use of wikipedias in native languages. I believe you take an active part
in compiling these statistics :
http://en.wikipedia.org/wiki/Wikipedia:Multilingual_statistics, and I hope
you might point me in the right direction for my research. I am interested
in establishing a matrix which would give the number of users for each
"below 100 000 articles" wikipedias (from #16 onward in this list
http://meta.wikimedia.org/wiki/List_of_Wikipedias), against the countries
where the visitors' traffic originates from, as well as against where the
editors are editing from. Obviously it would be great to have time as a 3rd
dimension to follow trends...
Where to start? Whom to ask?
Please contact me on my talk page
http://en.wikipedia.org/wiki/User_talk:Junjulien
Thanks a lot for your time,
Jun Julien Matsushita Project Coordinator Internews Europe
-----------------------
Hello,
sorry for the late answer (holidays...). It is true that I compile part of
the Multilingual statistics, but my contribution is limited to getting the
current copy of http://meta.wikimedia.org/wiki/List_of_Wikipedias and
feeding it to a script which generates the table. The list of wikipedias
itself, as far as I know, is bot-generated, but I have only a vague idea of
how (wikipedia's ways can be strange at times... :-)
Your project would need a great deal of data about editors and readers, and
data about the readers is probably unavailable as it would require
collecting server logs, and Wikimedia servers do not have the capability of
recording visitor logs at our current load. I remember seeing on wikitech-l
that someone is recording decimated data, e.g. one in 10 or 100 visitors,
but deleting personal info like the originating IP, which would defeat
geolocation.
As for the editors, the IP addresses of logged-in users are, again, not
collected. For anonymous editors, however, the IP is recorded in the history,
and you could download a full history dump from
http://download.wikimedia.org and see what you can recover. In short, I
don't really know how to help you. Try writing to wikitech-l (see
http://lists.wikimedia.org/mailman/listinfo/wikitech-l), and see if someone
has the data you need.
Cheers,
Alfio
-------------------------
Has anyone a clue as to where to direct my efforts?
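To make the question more concrete, here is a rough Python sketch of the
editor half of the matrix, following Alfio's suggestion about the full
history dump. It is only an illustration under assumptions: it supposes the
dump is in the MediaWiki XML export format, where an anonymous revision
carries an <ip> element inside <contributor>; count_anonymous_edits,
country_of, the wiki code "oc" and the sample data are all made up for the
example, and the IP-to-country lookup is left to whatever geolocation
database one has at hand.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}revision' -> 'revision'."""
    return tag.rsplit("}", 1)[-1]


def count_anonymous_edits(xml_stream, wiki, country_of):
    """Tally anonymous edits as (wiki, country, 'YYYY-MM') -> count.

    country_of is a caller-supplied (hypothetical) function mapping an IP
    string to a country code; no real geolocation service is assumed here.
    """
    matrix = Counter()
    # iterparse streams the file, so a large dump is never fully in memory.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local(elem.tag) != "revision":
            continue
        timestamp = ip = None
        for child in elem.iter():
            name = local(child.tag)
            if name == "timestamp":
                timestamp = child.text
            elif name == "ip":
                ip = child.text
        # Logged-in edits have <username>/<id> instead of <ip>: skipped.
        if ip and timestamp:
            matrix[(wiki, country_of(ip), timestamp[:7])] += 1
        elem.clear()  # free the revision text once it has been counted
    return matrix


# Invented sample: one anonymous edit and one logged-in edit.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Example</title>
    <revision>
      <timestamp>2006-08-14T10:00:00Z</timestamp>
      <contributor><ip>192.0.2.1</ip></contributor>
      <text>anonymous edit</text>
    </revision>
    <revision>
      <timestamp>2006-08-15T11:00:00Z</timestamp>
      <contributor><username>SomeUser</username><id>42</id></contributor>
      <text>logged-in edit</text>
    </revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    # Placeholder lookup: every IP maps to the dummy country code "ZZ".
    result = count_anonymous_edits(io.BytesIO(SAMPLE), "oc", lambda ip: "ZZ")
    print(result)
```

Running the same function over the dump of each smaller wikipedia and adding
the resulting counters together would give the wiki x country x month counts
for anonymous editors only; it says nothing about logged-in editors or about
readers.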
Thanks a lot for your time,
Jun Julien Matsushita
Radio Connect Project Coordinator
Internews Europe
14, cité Griset - 75011 Paris
France -
www.internews.eu
skype: junjulien