Hi Sj
ScraperWiki is about playing with data (like a cool Excel), while WikiTeam extracts full page histories and images. The two are unrelated.
We surpassed 3,000 preserved wikis yesterday http://code.google.com/p/wikiteam/wiki/AvailableBackups and the number is growing quickly. We upload the dumps to the Internet Archive; those folks know a bit about long-term preservation.
Wiki preservation is part of my research on wikis, and later I want to compare these wiki communities with Wikipedia. I'm open to suggestions.
Regards,
emijrp
Just wow... Thank you WikiTeam and task force! Is ScraperWiki involved? SJ
On Tue, Aug 7, 2012 at 5:18 AM, emijrp <emijrp@gmail.com> wrote:
Hi;
I think this is the first time a full XML dump of Citizendium has been publicly available[1] (CZ offers dumps, but only with the last revision of each article[2], and our previous efforts generated corrupted and incomplete dumps). It contains 168,262 pages and 753,651 revisions (9 GB uncompressed, 99 MB in 7z). I think it may be useful for researchers, including for quality analysis.
It was generated using the WikiTeam tools.[3] This is part of our task force to back up thousands of wikis around the Internet.[4]
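For anyone curious, here is a rough Python sketch of the general approach: list every title through the MediaWiki API, then ask Special:Export for the full revision history of each page. This is not the actual dumpgenerator.py code, and the endpoint URLs, parameters and file names are only illustrative assumptions.

# Rough illustration only; not WikiTeam's dumpgenerator.py.
# Endpoint URLs, parameters and file names are assumptions for this example.
import requests

API = "http://en.citizendium.org/wiki/api.php"            # assumed api.php location
EXPORT = "http://en.citizendium.org/wiki/Special:Export"  # assumed export page

def all_titles(session):
    # Walk the 'allpages' list of the MediaWiki API, 500 titles per request.
    params = {"action": "query", "list": "allpages",
              "aplimit": "500", "format": "json"}
    while True:
        data = session.get(API, params=params).json()
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            break
        params.update(data["continue"])  # continuation token for the next batch

def export_history(session, title):
    # Ask Special:Export for every revision of one page (history=1).
    resp = session.post(EXPORT, data={"pages": title, "history": "1"})
    return resp.text

if __name__ == "__main__":
    with requests.Session() as s:
        for i, title in enumerate(all_titles(s)):
            # One XML file per page; a real dump merges these into a single
            # <mediawiki> document, which is omitted here for brevity.
            with open("page_%06d.xml" % i, "w", encoding="utf-8") as out:
                out.write(export_history(s, title))

The real tool also downloads images, recovers from errors, and merges everything into one XML dump; the sketch above only shows the core idea.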
Regards,
emijrp
[1] http://archive.org/details/wiki-encitizendiumorg
[2] http://en.citizendium.org/wiki/CZ:Downloads
[3] http://code.google.com/p/wikiteam/
[4] http://code.google.com/p/wikiteam/wiki/AvailableBackups
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Personal website: https://sites.google.com/site/emijrp/
--
Samuel Klein @metasj w:user:sj +1 617 529 4266