On 09/09/05, Dorożyński Janusz dorozynskij@poczta.onet.pl wrote:
But ... Fact 1. Until 23rd June anybody could download the whole content (or text). Fact 2. After 23rd June anybody can download only an excerpt of the whole content (yes, yes, because there are important technical reasons, etc., and I accept them without any doubt)
As far as I know, this is not the situation: the dumps still represent the same data, just in a different format. For instance, where previously you had the content of the 'cur' table, you now have an XML file (pages_current.xml.gz) which contains the same thing: the current version of each page. Similarly, the old 'old' table is available in a *more* friendly form (with the current revisions included too, which makes sense) as pages_full.xml.gz, or filtered to the main namespace as all_titles_in_ns0.gz.
The only downside is that the dumps are now harder to import, because the helper scripts are not yet well-developed. Once those scripts are working better, the new system will actually allow *improved* access to the data, since it becomes easy to produce a wider variety of dumps for different purposes.
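
For anyone who only needs to read the data rather than import it into a database, here is a rough sketch of streaming pages out of pages_current.xml.gz with Python's standard library. It is not one of the official helper scripts. It assumes the dump follows the usual MediaWiki export layout (<page> elements containing a <title> and a <revision> with a <text>), and the function names are my own. The XML namespace is stripped rather than hard-coded, since I am not sure which schema version a given dump uses.

```python
import gzip
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text) for each <page> in an XML dump stream.

    Assumes the layout <page><title/><revision><text/></revision></page>.
    If a page carries several revisions (as in pages_full.xml.gz), only
    the text of the last revision encountered is yielded.
    """
    title, text = None, None
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # drop the finished page's children to limit memory use


def iter_dump_file(path):
    """Open a gzip-compressed dump and stream its pages."""
    with gzip.open(path, "rb") as fh:
        yield from iter_pages(fh)
```

Something like `for title, text in iter_dump_file("pages_current.xml.gz"): ...` should then walk the current version of every page without loading the whole file into memory.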