Hi all;
We know that websites are fragile and that broken links are common. Wikimedia (and a few other communities, like Wikia) publish dumps of their wikis, but that is not common. Most wiki communities don't publish any backups, so their users can do nothing when a disaster occurs (data loss, an attack), or if they want to fork. Of course they can use Special:Export, but that requires a huge manual effort, and it does not download images.
I'm working on WikiTeam,[1] a group inside Archive Team whose goal is to archive wikis, from Wikipedia down to the tiniest ones. As I said, Wikipedia publishes backups, so there is no problem there. But I have developed a script that downloads all the pages of a wiki (using Special:Export), merges them into a single XML file (like the pages-history dumps) and downloads all the images (if you enable that option). That is useful if you want a backup of your favorite wiki, if you want to clone a defunct wiki (one abandoned by its administrator), if you want to move your wiki from a free wiki farm to personal paid hosting, etc.
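In case it helps to see the idea, here is a minimal sketch in Python of how one page's full history can be fetched through Special:Export. This is not the actual script, just the core of the approach; the wiki URL and page title are placeholders:

    import urllib.parse
    import urllib.request

    # Placeholder: point this at the target wiki's index.php.
    WIKI_INDEX = "https://wiki.example.org/index.php"

    def export_page(title):
        # Special:Export returns the page as XML. The 'history'
        # parameter asks for every revision, not just the latest
        # (some wikis restrict full-history export over the web).
        params = urllib.parse.urlencode({
            "title": "Special:Export",
            "pages": title,
            "history": "1",
        })
        with urllib.request.urlopen(WIKI_INDEX + "?" + params) as resp:
            return resp.read().decode("utf-8")

    xml = export_page("Main Page")

A full backup repeats this for every title and merges the resulting <page> elements into one file, which is essentially what the script automates.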
Of course, you can also use this script to retrieve the full history of a wiki and do research on it, just as with a Wikipedia dump.
We are running this script on several wikis and uploading the complete histories to the download section,[2] building a little wiki library. Don't be fooled by their sizes: they are 7zip files which usually expand to many megabytes.
I hope you enjoy this script, make backups of your favorite wikis, and do research on them.
Regards, emijrp
[1] http://code.google.com/p/wikiteam/
[2] http://code.google.com/p/wikiteam/downloads/list?can=1
emijrp, 14/04/2011 12:52:
We know that websites are fragile and that broken links are common. Wikimedia (and a few other communities, like Wikia) publish dumps of their wikis, but that is not common. Most wiki communities don't publish any backups [...]
Would it make sense to add a feature to MediaWiki (disabled by default) that makes the regular generation of dumps automatic and creates a special page to access them, or even sends them to some central repository (we could create one for freely licensed projects)? Then people wouldn't need to care about dumps and we wouldn't need to go around looking for wikis.
Nemo
Hi Nemo. It is a great idea, but it needs a lot of resources (space and bandwidth).
By the way, MediaWiki includes dumpBackup.php to make wiki backups, but people do not use it and do not publish backups. If we make an extension or feature (disabled by default), it is the same problem: only a few people are going to care about enabling it and publishing backups.
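For reference, running it is a one-liner for anyone with shell access to the wiki's installation directory (--full writes every revision of every page to stdout):

    php maintenance/dumpBackup.php --full > dump.xml

The problem is exactly that almost nobody who could run it ever does.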
emijrp, 18/04/2011 20:57:
Hi Nemo. It is a great idea, but it needs a lot of resources (space and bandwidth).
By the way, MediaWiki includes dumpBackup.php to make wiki backups, but people do not use it and do not publish backups. If we make an extension or feature (disabled by default), it is the same problem: only a few people are going to care about enabling it and publishing backups.
Well, apart from the central repository, it could even be enabled by default if the wiki is under a free license, given that dumps of small wikis are so small after compression.
Nemo