Hi,
Perhaps there is documentation about this, but I have looked for the past hour and haven’t found anything.
I was wondering if it is guaranteed that all revisions given in the enwiki-latest-pages-meta-history files are in order of parent->child->grandchild->… In the few examples I checked, they appear to follow this pattern. I ask because I need them in order, and it would be nice if I didn’t have to sort them myself using the <parentid> field.
Thank you, Christopher
The queries to get page and revision metadata are ordered by page id, and within each page, by revision id. This is guaranteed. The behavior of rev_parent_id, however, is not guaranteed in certain edge cases. See e.g. https://phabricator.wikimedia.org/T193211
Anyone who uses this field care to weigh in?
Ariel
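For anyone who wants to check the ordering on their own copy of a dump, here is a minimal sketch in Python using only the standard library. It streams the XML and reports pages whose revision ids are not ascending. The helper names (local, child_text, scan_pages, pages_out_of_order) and the tiny SAMPLE document are made up for illustration. The element names (page, revision, id, parentid) are the ones used in the dump files, and the namespace is stripped so the export schema version does not matter.

```python
import io
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for a dump file (hypothetical ids, not real data).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>A</title><ns>0</ns><id>1</id>
    <revision><id>10</id>
      <contributor><username>u</username><id>99</id></contributor>
    </revision>
    <revision><id>12</id><parentid>10</parentid></revision>
  </page>
  <page><title>B</title><ns>0</ns><id>2</id>
    <revision><id>30</id></revision>
    <revision><id>20</id><parentid>30</parentid></revision>
  </page>
</mediawiki>"""


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Text of the first direct child called `name`, or None if absent."""
    for child in elem:
        if local(child.tag) == name:
            return child.text
    return None


def scan_pages(source):
    """Yield (page_id, [(rev_id, parent_id), ...]) in the order found in the dump."""
    revs = []
    for _, elem in ET.iterparse(source, events=("end",)):
        name = local(elem.tag)
        if name == "revision":
            parent = child_text(elem, "parentid")
            revs.append((int(child_text(elem, "id")),
                         int(parent) if parent else None))
            elem.clear()  # drop revision text so memory stays bounded
        elif name == "page":
            yield int(child_text(elem, "id")), revs
            revs = []
            elem.clear()


def pages_out_of_order(source):
    """Page ids whose revisions do not appear in ascending revision-id order."""
    bad = []
    for page_id, revs in scan_pages(source):
        ids = [rev_id for rev_id, _ in revs]
        if ids != sorted(ids):
            bad.append(page_id)
    return bad


print(pages_out_of_order(io.BytesIO(SAMPLE)))
```

For a bz2-compressed dump, the same functions should work on a file object from bz2.open(path, "rb") in place of the io.BytesIO wrapper. Going by the guarantee above, a real dump should yield an empty list. The second page in SAMPLE is deliberately out of order so the check has something to find.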
On Fri, Jan 17, 2020 at 10:52 AM Christopher Wolfram <chriscwolfram@gmail.com> wrote:
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
Thanks Ariel.
So the revisions are in order of revision id, and since revision ids are assigned sequentially, the revisions are in chronological order. Are there situations, however, where the parent id gives a history that isn’t chronological? That is, is there a continuous chain of edits from the creation of a page to its current version, or can there be forks in the history? I’m not too familiar with how reverts work, but maybe by reverting to a previous version of a page you can end up with a revision whose parent is not simply the chronologically previous revision.
Thanks, Christopher
On Jan 17, 2020, at 6:07 AM, Ariel Glenn WMF ariel@wikimedia.org wrote:
Normal editing won’t cause issues. But a delete/move/restore history merge can cause things to look out of order if you are relying on the child/parent relationship.
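To see where the parent chain departs from dump order on a given page, something like the following sketch may help. It takes one page's revisions as (rev_id, parent_id) pairs in the order they appear in the dump and lists every revision whose parent id is not the revision immediately before it. The function name and the ids in the example are hypothetical, and this only flags mismatches. It does not say which kind of operation caused them.

```python
def parent_chain_breaks(revisions):
    """Return (rev_id, parent_id, previous_rev_id) for each revision whose
    parent is not the revision directly before it in dump order.

    `revisions` is a list of (rev_id, parent_id) pairs for a single page,
    in dump order. The first revision is skipped because it has no
    predecessor to compare against.
    """
    breaks = []
    prev_id = None
    for rev_id, parent_id in revisions:
        if prev_id is not None and parent_id != prev_id:
            breaks.append((rev_id, parent_id, prev_id))
        prev_id = rev_id
    return breaks


# Hypothetical page: revision 18 names 12 as its parent, not 15.
example = [(10, None), (12, 10), (15, 12), (18, 12), (20, 18)]
print(parent_chain_breaks(example))
```

If this returns an empty list for a page, the parent ids form a single chain that matches the revision-id order. Otherwise it is probably safer to order by revision id, which is the guaranteed ordering, and treat the parent id as a hint.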
On Fri, Jan 17, 2020 at 8:52 PM Christopher Wolfram chriscwolfram@gmail.com wrote: