On Tue, Nov 10, 2009 at 7:52 AM, Jeff Kubina <jeff.kubina(a)gmail.com> wrote:
1. What action will make an article get a new pageId? Is it only
move/rename, a redirect, or a deletion and recreation, or are there
other ways this could happen? Can any of these changes be detected from the
stub-meta-history.xml files?
Normal deletion/undeletion, moving, or similar things will not create
a new page_id. However, there are a couple of things to be aware of:
1) In the old days, deleting an article and recreating it would assign
it a new page_id. This hasn't been true for several years.
2) It's still possible to get revisions associated with a different
page_id than they were originally written for, by deleting a page,
moving another page over it, and undeleting one or more revisions.
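On detecting this from the dumps: one rough way is to compare two stub-meta-history dumps taken at different times and look for revisions that are listed under a different page id in the later one. The sketch below assumes the usual export layout (a <page> element whose direct children include <id> and <revision>, each revision having its own direct <id> child). The function names revision_to_page and moved_revisions are made up for illustration, and the whole mapping is held in memory, so treat it as a starting point rather than something tuned for a full-size dump.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, which differs between export schema versions."""
    return tag.rsplit("}", 1)[-1]


def revision_to_page(source):
    """Map each revision id to the id of the page it is listed under.

    `source` is a filename or a binary file object holding a stub dump.
    """
    mapping = {}
    for _event, page in ET.iterparse(source, events=("end",)):
        if local(page.tag) != "page":
            continue
        page_id = None
        rev_ids = []
        for child in page:
            name = local(child.tag)
            if name == "id":
                page_id = int(child.text)
            elif name == "revision":
                # Only direct children are inspected, so the contributor's
                # nested <id> is never mistaken for the revision id.
                for sub in child:
                    if local(sub.tag) == "id":
                        rev_ids.append(int(sub.text))
                        break
        for rev_id in rev_ids:
            mapping[rev_id] = page_id
        page.clear()  # release the finished page's children
    return mapping


def moved_revisions(old_dump, new_dump):
    """Return {rev_id: (old_page_id, new_page_id)} for revisions that changed page."""
    old = revision_to_page(old_dump)
    new = revision_to_page(new_dump)
    return {
        rev: (old[rev], new[rev])
        for rev in old.keys() & new.keys()
        if old[rev] != new[rev]
    }
```

Revisions present in the older dump but missing from the newer one do not show up here. They would need a separate set difference on the two mappings.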
2. Is it possible for just one particular revision of an article to be
deleted, maybe due to a copyright violation? If so, is just the content of
the revision deleted or would this include all the data associated with it,
so that the revision would not even appear in the stub-meta-history.xml
file?
Yes, an individual revision can be deleted. There are at least three
different ways to do this, last I checked. I would expect that the
old ways (oversight, and delete+selective undelete) would leave no
traces at all in the dump, while the new way (rev_deleted) might only
suppress certain fields. I'm not sure offhand, though.
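If you want to check the rev_deleted case against an actual dump, something like the following might help. It assumes that a hidden field is exported as an element carrying the attribute deleted="deleted" (for example on <text>, <comment>, or <contributor>), which is how the export schema describes it as far as I know. I have not verified this against every dump version, so check a sample first. The function name suppressed_fields is hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, which differs between export schema versions."""
    return tag.rsplit("}", 1)[-1]


def suppressed_fields(source):
    """Yield (page_id, rev_id, hidden_field_names) for revisions with hidden fields.

    Assumption: a hidden field appears as a direct child of <revision>
    with the attribute deleted="deleted".
    """
    for _event, page in ET.iterparse(source, events=("end",)):
        if local(page.tag) != "page":
            continue
        page_id = None
        for child in page:
            name = local(child.tag)
            if name == "id":
                page_id = int(child.text)
            elif name == "revision":
                rev_id = None
                hidden = []
                for sub in child:
                    sub_name = local(sub.tag)
                    if sub_name == "id":
                        rev_id = int(sub.text)
                    elif sub.get("deleted") == "deleted":
                        hidden.append(sub_name)
                if hidden:
                    yield page_id, rev_id, hidden
        page.clear()  # release the finished page's children
```

Revisions removed by the older methods would, if the expectation above is right, simply be absent, so this scan cannot see them.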
3. Are pageIds recycled? If a page is deleted, could its id number be used
for a completely new page in the future?
No. page_ids are handed out in strictly increasing order.