Edit history in an accessible form -- create a queryable NoSQL form of
data dumps
I'd like to get this started ASAP. I think we can set up a bridge that synchronizes data directly from MediaWiki into a store like Cassandra. That should give us a better source for both XML dumps and analysis.
Data dumps -- ongoing improvements of the data dump creation process
I think we can improve this process by building a queryable NoSQL system that syncs directly from MediaWiki. That should let us produce dumps in parallel and at higher throughput than we get by querying MySQL.
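To make the "dumps in parallel" idea a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the in-memory dict stands in for whatever NoSQL store we end up with, the function names are made up, and the XML is a toy format, not the real MediaWiki dump schema. The point is only that once revisions are keyed by page ID, shards can be rendered independently.

```python
from concurrent.futures import ThreadPoolExecutor
from xml.sax.saxutils import escape

# Hypothetical stand-in for the NoSQL store: page_id -> [(rev_id, text), ...].
STORE = {
    1: [(10, "first draft"), (11, "second <draft>")],
    2: [(20, "another page")],
    3: [(30, "third page & more")],
}


def partition(page_ids, n_shards):
    """Split page ids into n_shards lists using page_id modulo n_shards."""
    shards = [[] for _ in range(n_shards)]
    for page_id in sorted(page_ids):
        shards[page_id % n_shards].append(page_id)
    return shards


def dump_shard(page_ids):
    """Render one shard as a toy XML fragment; shards share no state."""
    lines = []
    for page_id in page_ids:
        lines.append(f'<page id="{page_id}">')
        for rev_id, text in STORE[page_id]:
            lines.append(f'  <revision id="{rev_id}">{escape(text)}</revision>')
        lines.append("</page>")
    return "\n".join(lines)


def parallel_dump(n_shards=2):
    """Render all shards concurrently and return one fragment per shard."""
    shards = partition(STORE.keys(), n_shards)
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        return list(pool.map(dump_shard, shards))


if __name__ == "__main__":
    for fragment in parallel_dump(2):
        print(fragment)
        print("---")
```

With a real cluster, each worker would read its own key range from the store rather than a shared dict, which is where I'd expect the gain over a single MySQL query stream to come from.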
Privacy -- making sure we act consistently with the letter and intent
of our privacy policy ( http://wikimediafoundation.org/wiki/Privacy_policy ) in developing new analytics solutions
I'm happy to share thoughts here and participate in discussions.
Big Data ad hoc mining infrastructure -- working through design
considerations for a NoSQL cluster
This seems to go hand in hand with the first two working groups.
Fundraiser Analytics & Testing -- group devoted to QA of existing
systems
I'm trying to ramp down my work here so I can move on to the other challenges.