Heya,
Last week I resurrected Wikihadoop, a project from the Summer of Research
2011 in which we wrote a Hadoop-based input parser for bzipped XML dumps of
Wikipedia, along with the ability to create diffs between revisions. This
work was mainly done by Yusuke Matsubara, Aaron Halfaker and Fabian Kaelin.
It is working again, so if you have input or suggestions on how to expose
this data, please let me know!
Best,
Diederik