Yes, manual matching is fairly simple, but when working from the dump files you need, in the worst case, to iterate over n-1 talk pages (where n is the total number of talk pages in a Wikipedia) to find the talk page that belongs to a given user page. Hence, if the dump file contained, for each article, a tag with the talk page id, it would significantly reduce the processing time.
Diederik
On Sat, Jan 8, 2011 at 11:39 AM, Bryan Tong Minh bryan.tongminh@gmail.com wrote:
On Sat, Jan 8, 2011 at 5:32 PM, John phoenixoverride@gmail.com wrote:
It's just a matter of matching page titles: if there is a page in namespace 0 and a page in namespace 1 (article and article talk) with the same title, they go together. It's fairly simple.
To expand John's comment, the talk page is always the page with the same title, but with a namespace number 1 higher.
Bryan
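The rule described above (same title, namespace number 1 higher) can be sketched in Python as follows. This is only an illustration, not code from the thread: the (page_id, namespace, title) tuple layout, the function names, and the sample ids are all hypothetical, and titles are assumed to have had any namespace prefix removed already. Building a dictionary in one pass over the pages means each later lookup is a hash lookup rather than a scan over the talk pages.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record layout: (page_id, namespace, title without namespace prefix).
Page = Tuple[int, int, str]


def build_index(pages: Iterable[Page]) -> Dict[Tuple[int, str], int]:
    """Map (namespace, title) to page_id in a single pass over the pages."""
    return {(ns, title): page_id for page_id, ns, title in pages}


def talk_page_id(index: Dict[Tuple[int, str], int], ns: int, title: str) -> Optional[int]:
    """Return the id of the talk page for a subject page, or None if it has none.

    Subject namespaces are even (0 = article, 2 = user); the matching talk
    namespace is the next odd number (1 = article talk, 3 = user talk).
    """
    if ns < 0 or ns % 2 == 1:
        raise ValueError("expected a subject (even, non-negative) namespace")
    return index.get((ns + 1, title))


# Made-up sample data, for illustration only.
sample_pages = [
    (10, 0, "Amsterdam"),
    (11, 1, "Amsterdam"),
    (20, 2, "Example"),
    (21, 3, "Example"),
    (30, 0, "Orphan"),  # article with no talk page
]
sample_index = build_index(sample_pages)
```

With the sample data, talk_page_id(sample_index, 2, "Example") looks up the key (3, "Example"), and a page without a talk page yields None.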
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l