I notice the dumps currently seem to be frozen. Is this the best place to ask
for information, or is it publicly available somewhere else? (If so,
sorry for pestering.)
Conrad
Tomasz:
> Then I further split it for ops and general tech.
> Let me know how well you think that has worked.
I would favor combining the two. They are both very low traffic
and I noticed other users were also confused in the past.
But if the split is handier for ops, it's no big deal.
Erik Zachte
Jamie:
> I thought the file size would grow fairly linearly with the page count,
> but for the last 10% or so of the pages the file size hardly grew at all.
Pages in the dump are in order of page id, and thus more or less in order of
creation date.
Pages at the end of the dump tend to be small, more often stubs, with few
revisions, so they add little to the file size.
Erik Zachte
Tomasz Finc wrote:
> New full history en wiki snapshot is hot off the presses!
>
> It's currently being checksummed which will take a while for 280GB+ of
> compressed data but for those brave souls willing to test please grab it
> from
>
> http://download.wikipedia.org/enwiki/20100130/enwiki-20100130-pages-meta-hi…
>
>
> and give us feedback about its quality. This run took just over a month
> and gained a huge speedup after Tim's work on re-compressing ES. If we
> see no hiccups with this data snapshot, I'll start mirroring it to other
> locations (internet archive, amazon public data sets, etc).
>
> For those not familiar, the last successful run that we've seen of this
> data goes all the way back to 2008-10-03. That's over 1.5 years of
> people waiting to get access to these data bits.
>
> I'm excited to say that we seem to have it :)
>
> --tomasz
We now have an md5sum for enwiki-20100130-pages-meta-history.xml.bz2:
65677bc275442c7579857cc26b355ded
Please verify against it before filing issues.
--tomasz
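
For anyone wanting to check a download against the md5sum above, here is a minimal sketch in Python. It assumes the file has already been saved locally under the name given in the message; the helper names (md5_of_file, matches_expected) and the 1 MiB chunk size are illustrative choices, not part of any official tooling. The file is read in chunks because a 280GB+ archive will not fit in memory.

```python
import hashlib

# Checksum quoted in the message above.
EXPECTED_MD5 = "65677bc275442c7579857cc26b355ded"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.strip().lower()


if __name__ == "__main__":
    # Assumed local filename; adjust to wherever the dump was saved.
    dump_path = "enwiki-20100130-pages-meta-history.xml.bz2"
    print("checksum OK" if matches_expected(dump_path) else "checksum MISMATCH")
```

On a system with GNU coreutils, running md5sum on the file and comparing the output by eye should give the same answer.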