Please forward wherever you think appropriate.
For some time we have provided multiple numbered pages-articles bz2 files
for large wikis, as well as a single file with all of the contents combined
into one. Producing the combined file now takes so long for Wikidata that it
is no longer sustainable. For wikis where the total size of the files to
recombine is "too large", we will skip this recombine step. This means that
downloader scripts relying on the combined file will need to check whether it
exists and, if it does not, fall back to downloading the multiple numbered files.
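For downloader scripts, the fallback described above could look roughly like the sketch below. It is only an illustration: the function name choose_pages_articles_files is made up, the file name patterns are assumptions based on how current dump files appear to be named, and the list of available names is assumed to come from wherever your script already reads the dump directory contents.

```python
import re


def choose_pages_articles_files(available, wiki, date):
    """Pick the pages-articles files to download for one dump run.

    available: iterable of file names present in the dump directory
    wiki, date: e.g. "wikidatawiki" and "20180320" (example values)
    The naming patterns below are assumptions, not an official spec.
    """
    available = list(available)

    # Prefer the single combined file when the dump run produced it.
    combined = f"{wiki}-{date}-pages-articles.xml.bz2"
    if combined in available:
        return [combined]

    # Otherwise fall back to the numbered parts, e.g.
    # <wiki>-<date>-pages-articles1.xml-p1p235321.bz2
    pattern = re.compile(
        rf"^{re.escape(wiki)}-{re.escape(date)}-pages-articles"
        r"(\d+)\.xml(?:-p(\d+)p(\d+))?\.bz2$"
    )
    parts = []
    for name in available:
        match = pattern.match(name)
        if match:
            part_number = int(match.group(1))
            first_page = int(match.group(2) or 0)
            parts.append((part_number, first_page, name))

    # Sort by part number, then by first page id, so the order is stable.
    return [name for _, _, name in sorted(parts)]
```

An empty result means neither the combined file nor any numbered part was found, which a script should probably treat as "dump not ready yet" rather than as success.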
I expect to get this done and deployed by the March 20th dumps run. You
can follow along here: https://phabricator.wikimedia.org/T179059
Thanks!
Ariel
Thank you for the fix!
I confirmed that md5sums.txt under the /latest directory currently works fine.
--
Itsuki Toyota
Yahoo Japan Corporation
On 2018/02/13 15:04, "Xmldatadumps-l (on behalf of Itsuki Toyota)" <xmldatadumps-l-bounces(a)lists.wikimedia.org (on behalf of itoyota(a)yahoo-corp.jp)> wrote:
Hi everyone,
See: https://lists.wikimedia.org/pipermail/xmldatadumps-l/2018-January/001397.ht…
As mentioned in the post above, the /latest dir still contains outdated dumps in addition to the latest dumps.
Moreover, I noticed that this issue makes /latest/md5sums.txt unusable.
I think this file should list the md5 of
- all of the dumps under the /latest dir; or
- the latest dumps under the /latest dir.
However, the current /latest/md5sums.txt behaves differently; it lists the md5 of the oldest dumps under the /latest dir.
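To check which dumps md5sums.txt actually describes, one option is to compare its entries against the downloaded files. The sketch below assumes the file uses the usual md5sum layout (hex digest, whitespace, file name); the helper names parse_md5sums and md5_matches are hypothetical.

```python
import hashlib


def parse_md5sums(text):
    """Return {file name: expected md5 hex digest} from md5sums.txt content.

    Assumes one "<digest> <file name>" entry per line (md5sum layout).
    """
    sums = {}
    for line in text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2:
            digest, name = fields
            # md5sum marks binary mode with a leading "*" on the name.
            sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def md5_matches(path, expected, chunk_size=1 << 20):
    """Hash the file at path in chunks and compare with the expected digest."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest() == expected.lower()
```

A file name that is missing from the parsed mapping, or a digest that does not match, would both point to md5sums.txt describing a different dump than the one downloaded.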
My questions are as follows:
1) Is /latest/md5sums.txt obsolete or not?
2) If not, could you tell me how long it will take to fix this issue (or what its priority is)?
Thank you in advance.
--
Itsuki Toyota
Yahoo Japan Corporation
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l