Hi all.
Firstly, apologies for any duplicates or for posting the question to the wrong mailing list.
Secondly, could anybody kindly explain whether some Wikipedia pages have changed their IDs over time, and if so, point me to where this is documented? I have Wikipedia pages-articles XML dumps from 2006 and 2008, and while parsing them I ran across situations such as the following. In the 2006 and 2008 dumps, the South Africa page has the ID 68854, while in the most recent pages-articles XML dump (i.e. 2016) the same article has the ID 17416221. I am trying to match Wikipedia pages by ID across time, but cases like the one above get in the way.
Many thanks in advance for any help.
It looks like the page was deleted and then restored, which gave it a new page ID. Originally, when a page was deleted its page_id was not kept, so a new page_id was issued when the page was restored. This behavior has since been fixed and should no longer happen.
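If it helps, here is a rough sketch of one way to spot such cases: build a title-to-ID map from each dump and list the titles whose IDs differ. It assumes the usual pages-articles layout, where each <page> element has <title> and <id> as direct children. The function names and the tiny inline samples are made up for illustration, and only the South Africa IDs come from this thread. Titles can change too when pages are moved, so treat title matching as a fallback rather than a guaranteed key.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, which differs between dump schema versions."""
    return tag.rsplit("}", 1)[-1]


def title_to_page_id(source):
    """Map each page title to its page ID from a pages-articles XML stream."""
    mapping = {}
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = page_id = None
        for child in elem:  # direct children only, so revision IDs are skipped
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "id":
                page_id = int(child.text)
        if title is not None and page_id is not None:
            mapping[title] = page_id
        elem.clear()  # release the page's children; real dumps are very large
    return mapping


def changed_ids(old_map, new_map):
    """Return {title: (old_id, new_id)} for shared titles whose IDs differ."""
    shared = old_map.keys() & new_map.keys()
    return {t: (old_map[t], new_map[t]) for t in shared if old_map[t] != new_map[t]}


# Hypothetical miniature dumps; "Example page" and its IDs are invented.
OLD_SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page><title>South Africa</title><id>68854</id>
    <revision><id>111</id></revision></page>
  <page><title>Example page</title><id>12</id>
    <revision><id>222</id></revision></page>
</mediawiki>"""

NEW_SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>South Africa</title><ns>0</ns><id>17416221</id>
    <revision><id>333</id></revision></page>
  <page><title>Example page</title><ns>0</ns><id>12</id>
    <revision><id>444</id></revision></page>
</mediawiki>"""

old_map = title_to_page_id(io.BytesIO(OLD_SAMPLE))
new_map = title_to_page_id(io.BytesIO(NEW_SAMPLE))
diff = changed_ids(old_map, new_map)
print(diff)
```

For a real dump you would pass an open file object (for example from bz2.open on the compressed file) instead of the in-memory samples.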
On Sat, Dec 3, 2016 at 8:47 AM, Renato Stoffalette Joao joao@l3s.de wrote:
-- Renato Stoffalette Joao
- PhD Student -
L3S Research Center / Leibniz Uni. 15th Floor, Room:1519 Appelstraße 9a 30167 Hannover, Germany +49.511.762-17759