Dear researchers,
Recently, we started the Editor Trends Study (http://strategy.wikimedia.org/wiki/Editor_Trends_Study). The goal of this study is to get a better understanding of the community dynamics within the different Wikipedia projects.
Part of this project consists of developing a tool (http://strategy.wikimedia.org/wiki/Editor_Trends_Study/Software) that parses a Wikipedia dump file, extracts the required information, stores it in a database and exports it to a CSV file. This CSV file can then be used in a statistical program such as R, Stata or SAS.
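As a rough illustration of the parse, store and export steps described above, a minimal Python sketch could look like the following. This is not the tool's actual code: the function names, the SQLite table layout, the per-month aggregation and the sample data are all assumptions made up for the example, and the sample only imitates the general shape of a MediaWiki XML dump.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Tiny hand-written sample in the general shape of a MediaWiki XML dump
# (hypothetical data; real dumps are far larger and carry an XML namespace).
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Example</title>
    <revision>
      <timestamp>2010-01-05T10:00:00Z</timestamp>
      <contributor><username>Alice</username></contributor>
    </revision>
    <revision>
      <timestamp>2010-01-20T12:30:00Z</timestamp>
      <contributor><username>Bob</username></contributor>
    </revision>
    <revision>
      <timestamp>2010-02-02T08:15:00Z</timestamp>
      <contributor><username>Alice</username></contributor>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Drop any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_edits(dump_file):
    """Yield (username, timestamp) per revision while streaming the file."""
    for _event, elem in ET.iterparse(dump_file, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        username = timestamp = None
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "username":
                username = child.text
            elif name == "timestamp":
                timestamp = child.text
        # Revisions without a username (e.g. anonymous edits) are skipped here.
        if username and timestamp:
            yield username, timestamp
        elem.clear()  # release the revision's children once processed


def store_edits(conn, edits):
    """Store the extracted edits in a (hypothetical) 'edits' table."""
    conn.execute("CREATE TABLE IF NOT EXISTS edits (editor TEXT, ts TEXT)")
    conn.executemany("INSERT INTO edits VALUES (?, ?)", edits)
    conn.commit()


def export_csv(conn, out):
    """Write edits per editor per month as CSV to the file object 'out'."""
    writer = csv.writer(out)
    writer.writerow(["editor", "month", "edits"])
    rows = conn.execute(
        "SELECT editor, substr(ts, 1, 7) AS month, COUNT(*) "
        "FROM edits GROUP BY editor, month ORDER BY editor, month"
    )
    writer.writerows(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_edits(conn, iter_edits(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
    out = io.StringIO()
    export_csv(conn, out)
    print(out.getvalue())
```

The resulting CSV has one row per editor and month, which is the kind of file that can be loaded directly into R, Stata or SAS.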
We are looking for some volunteers that would enjoy testing the tool. You don't need to be a software developer (although it helps :)) to help us; some patience, a bit of time and a fairly recent computer is all you need. You should be comfortable installing programs, working with a command-line interface and have basic Subversion experience. Python experience is a real bonus!
The testing will focus on getting the tool to run without any supervision. For more background information, have a look at: http://strategy.wikimedia.org/wiki/Editor_Trends_Study/Software
We are testing the tool with the largest Wikipedia projects, so if you would like to replicate the analysis on your own favorite Wikipedia project or help improve the quality of the tool then please contact me off-list.
Best,
Diederik
Diederik van Liere wrote:
We are looking for some volunteers that would enjoy testing the tool. You don't need to be a software developer (although it helps :)) to help us; some patience, a bit of time and a fairly recent computer is all you need. You should be comfortable installing programs, working with a command-line interface and have basic Subversion experience. Python experience is a real bonus!
Quick feedback:
* glad to see progress!
* the wiki pages you link seem well designed and how-to's appear to make sense :)
* as long as there is a need for a command-line interface and no graphical user interface, many would-be users will not be able to use it
* ditto for things like Python and Subversion (I never even heard of the latter...).
I assume that a GUI is planned for the foreseeable future?
--- On Wed, 10/11/10, Diederik van Liere dvanliere@gmail.com wrote:
From: Diederik van Liere dvanliere@gmail.com
Subject: [Wiki-research-l] Editor Trends Study - Improving the tool
To: wiki-research-l@lists.wikimedia.org
Date: Wednesday, 10 November 2010, 00:02
Hi, Diederik,
I'm also glad to see progress in this project. Some comments inline.
Dear researchers,
Recently, we started the Editor Trends Study (http://strategy.wikimedia.org/wiki/Editor_Trends_Study). The goal of this study is to get a better understanding of the community
dynamics within the different Wikipedia projects.
Part of this project consists of developing a tool (http://strategy.wikimedia.org/wiki/Editor_Trends_Study/Software)
that parses a Wikipedia dump file, extracts the required information, stores it in a database and exports it to a CSV file. This CSV file can then be used in a statistical program such as R, Stata or SAS.
Well, I would have expected the team to search first for open source code that is already available and implements at least some (if not all) of the planned functionality.
Some examples are my own tool, WikiXRay, and Pywikipediabot (which, AFAIK, now also includes a fast parser for Wikipedia dump files).
For my tool, I now use git for version control, and you can use either of the two available repos (the official one at libresoft, or the mirror at Gitorious):
http://git.libresoft.es/WikixRay/
http://gitorious.org/wikixray/wikixray
Well, they might not be the best possible software available, but I guess they can help to solve some problems, or at least help you to speed up the development and to avoid starting from scratch.
We are looking for some volunteers that would enjoy testing the tool. You don't need to be a software developer (although it helps :)) to help us; some patience, a bit of time and a fairly recent computer is all you need. You should be comfortable installing programs,
working with a command-line interface and have basic Subversion experience. Python experience is a real bonus!
The testing will focus on getting the tool to run without any supervision. For more background information, have a look at:
http://strategy.wikimedia.org/wiki/Editor_Trends_Study/Software
Perhaps you're going to provide this info later, but I don't see the links to your SVN repo (only [] ).
We are testing the tool with the largest Wikipedia projects, so if you would like to replicate
the analysis on your own favorite Wikipedia project or help improve the quality of the tool then please contact me off-list.
I think it would be more effective to have a separate public list that people specifically interested in this tool can subscribe to (for example, like the one we have exclusively for XML dumps).
This should noticeably reduce the number of duplicate bug reports and comments, since other people can learn about known issues.
Hope this helps.
Best, Felipe.
Best,
Diederik