The talk page's ID may change over time (as may the page ID, causing the talk page to "change" association) when pages are deleted and undeleted, or moved. You should use the (namespace, page title) pair as the persistent unique identifier for associating talk pages with content pages. It's pretty easy to add an index on those columns, allowing you (at the cost of a bit of storage space) to look up pages in O(log n) time.
Conrad
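A minimal sketch of the index Conrad describes, assuming the dump has already been loaded into a local SQLite table. The table name `page`, its column names, the sample rows, and the helper `find_page_id` are all hypothetical, chosen only for illustration; they are not the layout of any particular dump or importer.

```python
import sqlite3

# Hypothetical table for pages parsed out of a dump. The table and column
# names are assumptions for this sketch, not a real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page (page_id INTEGER, namespace INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?)",
    [
        (10, 0, "Amsterdam"),  # article
        (11, 1, "Amsterdam"),  # its talk page
        (12, 2, "Example"),    # user page
        (13, 3, "Example"),    # its user talk page
    ],
)

# Composite index on (namespace, title): lookups by that pair no longer
# need to scan the whole table.
conn.execute("CREATE UNIQUE INDEX page_ns_title ON page (namespace, title)")


def find_page_id(conn, namespace, title):
    """Return the page_id stored for (namespace, title), or None if absent."""
    row = conn.execute(
        "SELECT page_id FROM page WHERE namespace = ? AND title = ?",
        (namespace, title),
    ).fetchone()
    return row[0] if row else None
```

With this in place, `find_page_id(conn, 1, "Amsterdam")` resolves the talk page by its (namespace, title) key rather than by a page ID that may have changed.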
pages of a Wikipedia) to find the talk page that belongs to a user page when using the dump files. Hence, if the dump file contained, for each article, a tag with the talk page ID, it would significantly reduce the processing time.
Diederik
On Sat, Jan 8, 2011 at 11:39 AM, Bryan Tong Minh bryan.tongminh@gmail.com wrote:
On Sat, Jan 8, 2011 at 5:32 PM, John phoenixoverride@gmail.com wrote:
It's just a matter of matching page titles: if there is a page in namespace 0 and a page in namespace 1 (article and article talk) with the same title, they go together. It's fairly simple.
To expand John's comment, the talk page is always the page with the same title, but with a namespace number 1 higher.
Bryan
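The rule John and Bryan describe can be sketched as a single pass over the pages with a dictionary keyed on (namespace, title), which avoids rescanning the dump for every page. This is an illustration under stated assumptions, not anyone's actual tooling: the function name `pair_pages` and the `(page_id, namespace, title)` tuple layout are hypothetical, and the sketch assumes the usual MediaWiki convention that content namespaces are even and non-negative while their talk namespaces are the next odd number.

```python
def pair_pages(pages):
    """Map each content page ID to its talk page ID (or None if it has none).

    `pages` is an iterable of (page_id, namespace, title) tuples; this
    layout is an assumption made for the sketch.
    """
    # Index every page by its (namespace, title) key.
    by_key = {(ns, title): page_id for page_id, ns, title in pages}

    pairs = {}
    for (ns, title), page_id in by_key.items():
        # Assumed convention: even, non-negative namespaces hold content;
        # the talk page has the same title in namespace ns + 1.
        if ns >= 0 and ns % 2 == 0:
            pairs[page_id] = by_key.get((ns + 1, title))
    return pairs
```

For example, given an article "Amsterdam" in namespace 0 with ID 10 and a page "Amsterdam" in namespace 1 with ID 11, the result maps 10 to 11; a content page with no talk page maps to None.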
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l