This run was taking a long time to complete (see
https://phabricator.wikimedia.org/T153345). The underlying problem has
been fixed, and the last steps are being run manually right now as we play
catch-up. Expect the run to show up as completed late today or early
tomorrow (UTC).
Ariel
Thanks for doing this. I would rather have all of the table dumps together. I parse the file names, so the order of the entries isn’t an issue for me.
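For anyone wondering what order-independent parsing might look like, here is a minimal sketch. It assumes each line of the checksum file has the usual "<digest> <file name>" layout; the function name, file names and digests below are made up for illustration and are not taken from the real dumps.

```python
def parse_checksums(text):
    """Map each dump file name to its checksum, ignoring line order."""
    checksums = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumed layout: "<hex digest><whitespace><file name>".
        digest, filename = line.split(None, 1)
        checksums[filename] = digest
    return checksums


# Hypothetical sample entries; the digests are placeholders, not real values.
sample_a = (
    "aaaa1111  examplewiki-20161201-pages-articles.xml.bz2\n"
    "bbbb2222  examplewiki-20161201-page.sql.gz\n"
)
sample_b = (
    "bbbb2222  examplewiki-20161201-page.sql.gz\n"
    "aaaa1111  examplewiki-20161201-pages-articles.xml.bz2\n"
)

# Both orderings yield the same mapping, so reordering the file is harmless.
same_result = parse_checksums(sample_a) == parse_checksums(sample_b)
```

Because the lookup is keyed on the file name, grouping the table dumps together would not change the result.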
--Scott
On 12/9/16, 7:00 AM, "Xmldatadumps-l on behalf of xmldatadumps-l-request(a)lists.wikimedia.org" <xmldatadumps-l-bounces(a)lists.wikimedia.org on behalf of xmldatadumps-l-request(a)lists.wikimedia.org> wrote:
Message: 1
Date: Thu, 8 Dec 2016 17:37:29 +0200
From: Ariel Glenn WMF <ariel(a)wikimedia.org>
To: Wikipedia Xmldatadumps-l <Xmldatadumps-l(a)lists.wikimedia.org>
Subject: [Xmldatadumps-l] changing order of dump steps in status and
checksum files
Message-ID:
<CALCvg_6qG1JC0Qq-biaH8zNwgK6LvatYjr=wGvazUdBivrym=Q(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Before I do this, I want to know if anyone here relies on the specific
order of the contents of the md5 or sha1 sum files for the dumps, or on the
order of the entries in the dumpruninfo file.
The reason I want to fiddle with the order is to have all the table dumps
together, rather than scattered around in these files. And the reason for
that is convenience; I'm about to update the code that adds these jobs as
steps to be run, and it's more readable/maintainable to add them all in one
group.
Anyone here who would be impacted? Please let me know; I'd like to roll
this out for the second monthly run.
Thanks,
Ariel
Hi all.
Firstly, apologies for any duplicates or for posting the question to
the wrong mailing list.
Secondly, could anybody kindly explain whether some Wikipedia pages
have changed their IDs over time? If so, could you point me to where this
might be documented?
I have Wikipedia pages-articles XML dumps from the years 2006 and 2008,
and while parsing them I ran across situations such as the following
one. In the dumps from 2006 and 2008, the South Africa page has the ID
68854, while in the most recent Wikipedia pages-articles XML dump
(i.e. 2016) the same article has the ID 17416221.
I am trying to match Wikipedia pages by ID across dumps from different
years, but cases like the example above break that matching.
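To make the comparison concrete, here is a minimal sketch of the kind of extraction involved. It assumes that each <page> element carries its <title> and page <id> as direct children, with revision IDs nested inside <revision>. The helper names and the tiny in-memory sample documents are hypothetical; only the two South Africa IDs come from the dumps mentioned above.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def title_to_id(source):
    """Stream a pages-articles dump and return {title: page id}."""
    mapping = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = page_id = None
        # Only direct children: the <id> inside <revision> is a revision ID.
        for child in elem:
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "id" and page_id is None:
                page_id = int(child.text)
        if title is not None and page_id is not None:
            mapping[title] = page_id
        elem.clear()  # free memory; real dumps are very large
    return mapping


def sample_dump(page_id):
    """Build a one-page stand-in for a dump (namespace URI is illustrative)."""
    xml = (
        '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
        "<page><title>South Africa</title><id>%d</id>"
        "<revision><id>1</id></revision></page>"
        "</mediawiki>" % page_id
    )
    return io.BytesIO(xml.encode("utf-8"))


old = title_to_id(sample_dump(68854))
new = title_to_id(sample_dump(17416221))
# Titles present in both dumps whose page ID differs.
changed = {t: (old[t], new[t]) for t in old.keys() & new.keys() if old[t] != new[t]}
```

Joining on the title like this at least lists which pages have a different ID in the two dumps, although it cannot tell a renamed page from a new one.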
Many thanks in advance for any help.
--
Renato Stoffalette Joao
- PhD Student -
L3S Research Center / Leibniz Uni.
15th Floor, Room:1519
Appelstraße 9a
30167 Hannover, Germany
+49.511.762-17759