Sure!
I was just going to ask on the list about such stuff when your mail arrived.
Regards,
Pavlo Shevelo
On Tue, Jul 13, 2010 at 1:30 PM, Jodi Schneider <jodi.schneider(a)deri.org> wrote:
We had a very useful collective notetaking effort during Felipe's Wikimania
session on Mining Wikipedia public data. To have a second copy, I've dumped
the contents into the Talk page for that session:
http://wikimania2010.wikimedia.org/wiki/Talk:Submissions/Mining_Wikipedia_p…
There are several interesting parts -- including a summary of Felipe's
recommendations.
I'll paste below just one section -- about tools/best practices -- because
I'd really like to see a central place to look up documentation on best
practices, tools, and methodologies. It could transclude from or point to
the existing documentation.
Would that be useful to anyone else? If so, this list might give a scope of
the tech aspects, as a starting place. If it already exists as a single
point of entry, I'd be delighted to know that instead!
-Jodi
==========
Here's part of that sync.in sheet -- worth looking at the whole thing, at
http://sync.in/60kOfEwBHA
What tools/best practices can we share/should we know about?
Tools for analyzing particular articles
http://toolserver.org/~daniel/WikiSense/Contributors.php - number of
contributors
http://toolserver.org/~mzmcbride/watcher/ - number of people who are
watching a page
http://stats.grok.se/en/201007/ - most viewed pages, largest # of editors in
a month, viewed page statistics
http://en.wikichecker.com/article/
http://wikidashboard.parc.com - visualization in place
Bots and code
http://meta.wikimedia.org/wiki/Pywikipediabot - pywikipediabot, queries the
Wikipedia API
Computer resources
http://toolserver.org/~daniel/ - talk to Daniel about Toolserver accounts
Tools for dealing with particular dumps
http://en.wikipedia.org/wiki/Wikipedia:Database_download - Information on
downloading the database
What are these good for? (classify me)
http://meta.wikimedia.org/wiki/WikiXRay - quantitative analysis tool (from
Felipe Ortega et al)
http://meta.wikimedia.org/wiki/User:Micke/WikiFind - search tool for database
dumps
http://www.cs.technion.ac.il/~gabr/resources/code/wikiprep/ - preprocessor
for XML dumps, "eliminates some information and adds other useful
information"
http://www.mediawiki.org/wiki/Alternative_parsers - List of parsers
http://static.wikipedia.org/ - Static HTML dumps
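
As a rough illustration of what working with the XML dumps listed above can
look like (this is not taken from any of the tools in the list), here is a
minimal Python sketch that streams through dump-shaped XML and counts
revisions per page. The inline SAMPLE data, the namespace URI in it, and the
function names are assumptions made up for the example; a real dump is far
larger and its export schema version may differ.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample in the general shape of a MediaWiki
# XML export. Real dumps carry many more fields per page and revision.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <page>
    <title>Alpha</title>
    <revision><id>1</id><contributor><username>Ann</username></contributor></revision>
    <revision><id>2</id><contributor><username>Bob</username></contributor></revision>
  </page>
  <page>
    <title>Beta</title>
    <revision><id>3</id><contributor><username>Ann</username></contributor></revision>
  </page>
</mediawiki>"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def revisions_per_page(stream):
    """Return {page title: number of revisions}, reading the stream incrementally."""
    counts = {}
    # An "end" event fires once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = None
        n_revisions = 0
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                n_revisions += 1
        counts[title] = n_revisions
        # Drop the page's children so memory use stays low on large files.
        elem.clear()
    return counts


if __name__ == "__main__":
    print(revisions_per_page(io.BytesIO(SAMPLE)))
```

For an actual dump file, the same function could be given an opened file
object (for example from bz2.open on a compressed dump) instead of the
in-memory sample.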
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l