Thanks for the quick fix! I'll verify it too with the next run.
I discovered this while building a link graph directly from the pages-articles dump, when I found more broken links (links whose target articles were missing) than expected.
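For anyone who wants to run a similar check, here is a minimal sketch of the broken-link count, not the exact code I used. It assumes the pages have already been parsed out of the dump into a dict of title to wikitext. The names `WIKILINK`, `normalize` and `broken_links` are made up for illustration, and it ignores redirects, namespaces and interwiki prefixes, so a real run would over-report.

```python
import re

# Capture the target of [[Target]], [[Target|label]] or [[Target#Section]].
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)")

def normalize(title):
    """Rough title normalization: underscores to spaces, first letter upper-cased."""
    title = title.strip().replace("_", " ")
    return title[:1].upper() + title[1:]

def broken_links(pages):
    """Map each source title to the sorted link targets that are not in `pages`."""
    known = {normalize(t) for t in pages}
    broken = {}
    for title, text in pages.items():
        targets = {normalize(m) for m in WIKILINK.findall(text)}
        missing = sorted(t for t in targets if t and t not in known)
        if missing:
            broken[title] = missing
    return broken
```

A sudden jump in the size of the returned dict between two dump runs is the kind of signal that pointed me at the missing pages.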
On Tue, Feb 27, 2018 at 4:10 AM, Ariel Glenn WMF <ariel@wikimedia.org> wrote:
It turns out that this happens for exactly 27 pages, those at the end of each enwiki-20180220-stub-articlesXX.xml.gz file. Tracking here: https://phabricator.wikimedia.org/T188388
Ariel
On Tue, Feb 27, 2018 at 10:45 AM, Ryan Hitchman <hitchmanr@gmail.com> wrote:
Multiple pages are missing from the enwiki pages-articles-multistream dumps from 20180201 and 20180220.
For example, page id 88444 ("Phosphor") doesn't appear in the index or in the data stream. The same is true of TARDIS, Psalm 132, and many others.
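A minimal sketch of how the index side of this can be checked, assuming the multistream index is a bz2-compressed text file with one `offset:pageid:title` entry per line. The function names and the `path` argument are hypothetical, and titles are compared exactly as they appear in the index, so any escaping in the file would have to be handled separately.

```python
import bz2

def parse_index_line(line):
    """Split one index line into (offset, page_id, title)."""
    # Titles may contain ':' themselves, so split at most twice.
    offset, page_id, title = line.rstrip("\n").split(":", 2)
    return int(offset), int(page_id), title

def missing_titles(index_lines, wanted):
    """Return the wanted titles that never appear in the index, sorted."""
    remaining = set(wanted)
    for line in index_lines:
        if not remaining:
            break  # everything was found, no need to read further
        if not line.strip():
            continue  # tolerate blank lines
        _, _, title = parse_index_line(line)
        remaining.discard(title)
    return sorted(remaining)

def missing_from_index_file(path, wanted):
    """Same check, reading the (hypothetical) index file at `path`."""
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        return missing_titles(fh, wanted)
```

Calling `missing_from_index_file` with a list such as `["Phosphor", "TARDIS", "Psalm 132"]` would return whichever of those titles have no index entry.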
Why would the dump be partial?
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l