On Mon, 2007-10-29 at 12:47 +0200, Osnat Etgar wrote:
> I don't want all the history. I just want the current articles, so I
> am downloading pages-meta-current.xml.bz2 and pages-articles.xml.bz2
You don't want the history, but you want all of the discussion and user
pages? Are you sure?
I'm testing a download of pages-meta-current.xml.bz2 right now to see if
it does indeed work, but it will take about half a day to get it all.
I'll post back and let you know what happens.
> Where else can I get the pages-meta-current? The previous dump? When I
> look for the previous one, I can only find a status.html file.
> Maybe I don't really need the pages-meta-current if I only want the
> current articles?
The server claims to have the right number of bytes, so let's see what
happens when my download completes:
Server: Wikimedia dump service 20050523 (lighttpd)
Content-Length: 5780471837
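
If you want to run the same check on your end once your download
finishes, here is a rough Python sketch. It only compares the size of
the file on disk against the Content-Length the server advertised, so
it catches a truncated download but says nothing about corruption. The
function names and the local filename are my own placeholders, not
anything the dump service provides.

```python
import os


def parse_content_length(raw_headers: str) -> int:
    """Return the Content-Length value from raw HTTP response headers."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-length":
            return int(value.strip())
    raise ValueError("no Content-Length header found")


def download_is_complete(path: str, expected_bytes: int) -> bool:
    """True if the file on disk is exactly as large as the server advertised."""
    return os.path.getsize(path) == expected_bytes


# Headers as reported by the server in this message.
headers = """Server: Wikimedia dump service 20050523 (lighttpd)
Content-Length: 5780471837"""
expected = parse_content_length(headers)

# After the download finishes (the local filename is an assumption):
# download_is_complete("pages-meta-current.xml.bz2", expected)
```

A size match is a necessary check, not a sufficient one. If a checksum
is published alongside the dump, comparing against that is the stronger
test.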
--
David A. Desrosiers
desrod(a)gnu-designs.com
setuid(a)gmail.com
http://projects.plkr.org/
Skype...: 860-967-3820