These are still 404ing today. They were working for months prior to
yesterday.
Does the timing of the issue narrow it down to any recent change in the
dump system? I strongly suspect the "latest" directory is getting updated
before the dump run has completed. http://dumps.wikimedia.org/jawiki/20150703/
shows the files in the "latest" directory but also shows that the combined
current-revision articles dump hasn't been created yet:
- waiting *Recombine articles, templates, media/file descriptions, and
primary meta-pages.*
- jawiki-20150703-pages-articles.xml.bz2
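
In the meantime, a client could fall back to a dated run whenever the
"latest" URL returns 404. Below is a minimal Python sketch of that idea,
not anything the dump system itself provides. The helper names (dump_url,
head_status, first_available) are hypothetical, the URL layout is only
inferred from the URLs in this thread, and the file suffix is supplied by
the caller rather than assumed.

```python
import urllib.error
import urllib.request

# Base URL as it appears in this thread.
BASE = "http://dumps.wikimedia.org"


def dump_url(wiki, run, suffix):
    """Build a dump URL for a run such as '20150703' or 'latest'.

    Assumption: files are laid out as <wiki>/<run>/<wiki>-<run>-<suffix>,
    which matches the URLs quoted in this thread.
    """
    return f"{BASE}/{wiki}/{run}/{wiki}-{run}-{suffix}"


def head_status(url, timeout=30):
    """Return the HTTP status of a HEAD request (like `curl -I`)."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still available.
        return err.code


def first_available(wiki, runs, suffix, status_fn=head_status):
    """Return the first URL among `runs` that answers 200, else None."""
    for run in runs:
        url = dump_url(wiki, run, suffix)
        if status_fn(url) == 200:
            return url
    return None
```

A caller would pass "latest" first, followed by the dates of earlier runs
it already knows about, for example
first_available("jawiki", ["latest", "20150703"], "pages-articles.xml.bz2"),
where the suffix is taken from the dated file name listed above.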
--
Devesh
On Tue, Jul 7, 2015 at 5:45 PM, Devesh Parekh <dparekh@netflix.com> wrote:
> Good catch. Unfortunately, it doesn't exist in the jawiki directory either.
>
> $ curl -I "http://dumps.wikimedia.org/jawiki/latest/jawiki-latest-pages-articles.xml.bz..."
> HTTP/1.1 404 Not Found
> Server: nginx/1.1.19
> Date: Wed, 08 Jul 2015 00:44:31 GMT
> Content-Type: text/html; charset=utf-8
> Content-Length: 169
> Connection: keep-alive
>
> --
> Devesh
>
> On Tue, Jul 7, 2015 at 5:01 PM, Max Semenik <maxsem.wiki@gmail.com> wrote:
>
>> You're trying to download a Japanese file from an "enwiki" directory.
>>
>> On Tue, Jul 7, 2015 at 4:54 PM, Devesh Parekh <dparekh@netflix.com> wrote:
>>
>>> Hi data dumpers,
>>>
>>> Starting today, some of the URLs I've been using to find the latest
>>> dumps for current article revisions have begun 404ing.
>>>
>>> Japanese (failing):
>>> $ curl -I "http://dumps.wikimedia.org/enwiki/latest/jawiki-latest-pages-articles.xml.bz..."
>>> HTTP/1.1 404 Not Found
>>> Server: nginx/1.1.19
>>> Date: Tue, 07 Jul 2015 23:40:40 GMT
>>> Content-Type: text/html; charset=utf-8
>>> Content-Length: 169
>>> Connection: keep-alive
>>>
>>> English (working):
>>> $ curl -I "http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz..."
>>> HTTP/1.1 200 OK
>>> Server: nginx/1.1.19
>>> Date: Tue, 07 Jul 2015 23:39:36 GMT
>>> Content-Type: application/octet-stream
>>> Content-Length: 11984805689
>>> Last-Modified: Fri, 05 Jun 2015 23:45:33 GMT
>>> Connection: keep-alive
>>> Accept-Ranges: bytes
>>>
>>> Are these particular dump files going away, or is the "latest" symlink
>>> being updated before all dumps have completed?
>>>
>>> --
>>> Devesh
>>>
>>> _______________________________________________
>>> Xmldatadumps-l mailing list
>>> Xmldatadumps-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>>>
>>>
>>
>>
>> --
>> Best regards,
>> Max Semenik ([[User:MaxSem]])
>>
>
>