I missed the WikiSym session, but I think merging the notes from it with
that page would be useful.
--
Piotr Konieczny
Jodi Schneider wrote:
We had a very useful collective notetaking effort during Felipe's
Wikimania session on Mining Wikipedia public data. To have a second
copy, I've dumped the contents into the Talk page for that session:
http://wikimania2010.wikimedia.org/wiki/Talk:Submissions/Mining_Wikipedia_p…
There are several interesting parts -- including a summary of Felipe's
recommendations.
I'll paste below just one section -- about tools/best practices --
because I'd really like to see a central place to look up documentation
on best practices, tools, and methodologies. It could transclude from or
point to the existing documentation.
Would that be useful to anyone else? If so, this list might give a scope
of the tech aspects, as a starting place. If such a single point of
entry already exists, I'd be delighted to know that instead!
-Jodi
==========
Here's part of that sync.in sheet -- worth looking at the whole thing, at
http://sync.in/60kOfEwBHA
*What tools/best practices can we share/should we know about?*
*Tools for analyzing particular articles*
http://toolserver.org/~daniel/WikiSense/Contributors.php - number of
contributors
http://toolserver.org/~mzmcbride/watcher/ - number of people who are
watching a page
http://stats.grok.se/en/201007/ - most viewed pages, largest # of
editors in a month, viewed page statistics
http://en.wikichecker.com/article/
http://wikidashboard.parc.com - visualization in place
*Bots and code*
http://meta.wikimedia.org/wiki/Pywikipediabot - pywikipediabot, queries
the Wikipedia API
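
For anyone who has not used the Wikipedia API directly, here is a minimal
sketch of the kind of query a bot framework sends. It uses only the Python
standard library, not pywikipediabot itself. The helper names
(build_revisions_query, distinct_editors, fetch) and the User-Agent string
are made up for illustration, and continuation of long histories is not
handled.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# English Wikipedia endpoint; other wikis expose the same api.php path.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_revisions_query(title, limit=50):
    """Return a URL asking for the most recent revisions of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "user|timestamp",  # who edited, and when
        "rvlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def distinct_editors(response):
    """Collect user names from a decoded action=query JSON response."""
    users = set()
    for page in response.get("query", {}).get("pages", {}).values():
        for rev in page.get("revisions", []):
            if "user" in rev:  # the user field can be hidden on some revisions
                users.add(rev["user"])
    return users


def fetch(url):
    """Download and decode one API response (needs network access)."""
    # Hypothetical User-Agent; use one that identifies your own project.
    req = Request(url, headers={"User-Agent": "wiki-research-example/0.1"})
    with urlopen(req) as resp:
        return json.load(resp)
```

With network access, distinct_editors(fetch(build_revisions_query("Wikimania")))
would give the editors of that page's latest 50 revisions.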
*Computer resources*
http://toolserver.org/~daniel/ - Talk to Daniel about Toolserver accounts
*Tools for dealing with particular dumps*
http://en.wikipedia.org/wiki/Wikipedia:Database_download - Information
on downloading the database
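
As a starting point for working with a downloaded XML dump, here is a
rough sketch that streams through the file and counts revisions per page
without loading everything into memory. It assumes the usual
<page>/<title>/<revision> layout of the export format. The function names
and the tiny SAMPLE document are invented for illustration, and the XML
namespace is stripped because it differs between dump versions.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, revision_count) for each <page> in a dump stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = None
        revisions = 0
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                revisions += 1
        yield title, revisions
        elem.clear()  # drop the page's children to keep memory use low


# Made-up miniature dump, only to show the expected input shape.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <page><title>A</title>
    <revision><id>1</id></revision>
    <revision><id>2</id></revision>
  </page>
  <page><title>B</title>
    <revision><id>3</id></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
```

For a real dump, pass an open file object (for example from bz2.open on the
compressed file) instead of the in-memory sample.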
*What are these good for? (classify me)*
http://meta.wikimedia.org/wiki/WikiXRay - quantitative analysis tool
(from Felipe Ortega et al)
http://meta.wikimedia.org/wiki/User:Micke/WikiFind - search tool for
database dumps
http://www.cs.technion.ac.il/~gabr/resources/code/wikiprep/ -
preprocessor for XML dumps, "eliminates some information and adds other
useful information"
http://www.mediawiki.org/wiki/Alternative_parsers - List of parsers
http://static.wikipedia.org/ - Static HTML dumps
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l