--
Devesh
On Wed, Jul 8, 2015 at 11:10 AM, Devesh Parekh <dparekh@netflix.com> wrote:
> These are still 404ing today. They were working for months prior to
> yesterday.
>
> Does the timing of the issue narrow it down to any recent changes in the
> dump system? I strongly suspect the latest directory is getting updated
> before the dump has completed.
> http://dumps.wikimedia.org/jawiki/20150703/
> shows the files in the "latest" directory but also shows that the combined
> latest revision articles dump hasn't been created yet:
>
> - waiting *Recombine articles, templates, media/file descriptions, and
> primary meta-pages.*
> - jawiki-20150703-pages-articles.xml.bz2
>
>
> --
> Devesh
>
> On Tue, Jul 7, 2015 at 5:45 PM, Devesh Parekh <dparekh@netflix.com> wrote:
>
>> Good catch. Unfortunately, it doesn't exist in the jawiki directory
>> either.
>>
>> $ curl -I "http://dumps.wikimedia.org/jawiki/latest/jawiki-latest-pages-articles.xml.bz..."
>> HTTP/1.1 404 Not Found
>> Server: nginx/1.1.19
>> Date: Wed, 08 Jul 2015 00:44:31 GMT
>> Content-Type: text/html; charset=utf-8
>> Content-Length: 169
>> Connection: keep-alive
>>
>> --
>> Devesh
>>
>> On Tue, Jul 7, 2015 at 5:01 PM, Max Semenik <maxsem.wiki@gmail.com>
>> wrote:
>>
>>> You're trying to download a Japanese file from an "enwiki" directory.
>>>
>>> On Tue, Jul 7, 2015 at 4:54 PM, Devesh Parekh <dparekh@netflix.com>
>>> wrote:
>>>
>>>> Hi data dumpers,
>>>>
>>>> Starting today, some of the URLs I've been using to find the latest
>>>> dumps for current article revisions have begun 404ing.
>>>>
>>>> Japanese (failing):
>>>> $ curl -I "http://dumps.wikimedia.org/enwiki/latest/jawiki-latest-pages-articles.xml.bz..."
>>>> HTTP/1.1 404 Not Found
>>>> Server: nginx/1.1.19
>>>> Date: Tue, 07 Jul 2015 23:40:40 GMT
>>>> Content-Type: text/html; charset=utf-8
>>>> Content-Length: 169
>>>> Connection: keep-alive
>>>>
>>>> English (working):
>>>> $ curl -I "http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz..."
>>>> HTTP/1.1 200 OK
>>>> Server: nginx/1.1.19
>>>> Date: Tue, 07 Jul 2015 23:39:36 GMT
>>>> Content-Type: application/octet-stream
>>>> Content-Length: 11984805689
>>>> Last-Modified: Fri, 05 Jun 2015 23:45:33 GMT
>>>> Connection: keep-alive
>>>> Accept-Ranges: bytes
>>>>
>>>> Are these particular dump files going away, or is the "latest" symlink
>>>> being updated before all dumps have completed?
>>>>
>>>> --
>>>> Devesh
>>>>
>>>> _______________________________________________
>>>> Xmldatadumps-l mailing list
>>>> Xmldatadumps-l@lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Max Semenik ([[User:MaxSem]])
>>>
>>
>>
>
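
The mismatch Max points out in the thread (a jawiki file requested from the enwiki directory) can be caught before any request is sent. The following is a minimal Python sketch and not part of the original thread. The helper name `dump_url_mismatch` is hypothetical, and the sketch assumes dump URLs follow the `/<wiki>/<date-or-latest>/<wiki>-...` layout seen in the URLs quoted above. It does not address the separate question of whether "latest" is updated before a dump run completes.

```python
from urllib.parse import urlparse


def dump_url_mismatch(url):
    """Return (directory_wiki, file_wiki) if they differ, else None.

    Assumes a path of the form /<wiki>/<date-or-latest>/<wiki>-<rest>,
    which is an assumption based on the URLs quoted in this thread.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 3:
        raise ValueError("expected /<wiki>/<date-or-latest>/<file>: " + url)
    directory_wiki = parts[0]
    # The file name starts with the wiki name, followed by a hyphen.
    file_wiki = parts[-1].split("-", 1)[0]
    if directory_wiki != file_wiki:
        return (directory_wiki, file_wiki)
    return None


if __name__ == "__main__":
    # Directory and file prefix agree, so nothing is reported.
    good = ("http://dumps.wikimedia.org/jawiki/20150703/"
            "jawiki-20150703-pages-articles.xml.bz2")
    print(dump_url_mismatch(good))  # None

    # Illustrative wrong URL: jawiki file under the enwiki directory.
    bad = ("http://dumps.wikimedia.org/enwiki/20150703/"
           "jawiki-20150703-pages-articles.xml.bz2")
    print(dump_url_mismatch(bad))  # ('enwiki', 'jawiki')
```

A check like this only rules out the directory typo. A 404 on a URL that passes it, as in the jawiki case above, still needs to be investigated on the server side.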