On 14/09/05, Netocrat <netocrat@dodo.com.au> wrote:
So overall there's potential, but I need to work on some representative data to present a proper summary of the pros and cons. Which presents a David and Goliath problem: 56k dial-up vs. a 31 GB XML download. Can anyone suggest a source for a smaller data set in English with some representative multiple-revision articles, preferably including a few edit wars?
You can use the Special:Export page to export the complete histories of individual pages, so you could pick a few which seemed suitably representative and play with them.
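For anyone who wants to script that rather than use the form by hand, here is a rough sketch of one way to do it. It assumes the English Wikipedia URL and the `pages`, `history`, `curonly` and `action=submit` form fields as I understand them from the MediaWiki manual; the function names and the User-Agent string are made up for illustration. Some wikis cap how many revisions a single export returns, so check the revision counts before relying on the result.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: English Wikipedia; any MediaWiki site should have an equivalent page.
EXPORT_URL = "https://en.wikipedia.org/wiki/Special:Export"


def build_export_request(titles, full_history=True):
    """Build (but do not send) a POST request for the given page titles."""
    fields = {"pages": "\n".join(titles), "action": "submit"}
    if full_history:
        fields["history"] = "1"   # ask for every revision
    else:
        fields["curonly"] = "1"   # ask for the latest revision only
    data = urllib.parse.urlencode(fields).encode("utf-8")
    # Hypothetical User-Agent; replace with something that identifies you.
    headers = {"User-Agent": "history-sampler/0.1 (example)"}
    return urllib.request.Request(EXPORT_URL, data=data, headers=headers)


def _local(tag):
    """Strip the XML namespace so the code works across export schema versions."""
    return tag.rsplit("}", 1)[-1]


def count_revisions(xml_text):
    """Return {page title: number of revisions} for an export XML document."""
    counts = {}
    for page in ET.fromstring(xml_text).iter():
        if _local(page.tag) != "page":
            continue
        title = next(c.text for c in page if _local(c.tag) == "title")
        counts[title] = sum(1 for c in page if _local(c.tag) == "revision")
    return counts

# Usage (performs a network request, so it is left commented out):
#   req = build_export_request(["Some article", "Another article"])
#   with urllib.request.urlopen(req) as resp:
#       print(count_revisions(resp.read().decode("utf-8")))
```

On a slow link it is probably worth fetching one page at a time and saving each response to disk, so an interrupted download only costs you a single article.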