Yes, matching manually is fairly simple, but in the worst case you need
to iterate over n-1 talk pages (where n is the total number of talk
pages of a Wikipedia) to find the talk page that belongs to a user
page when using the dump files. Hence, if the dump file contained,
for each article, a tag with the talk page id, it would significantly
reduce the processing time.
On Sat, Jan 8, 2011 at 11:39 AM, Bryan Tong Minh wrote:
On Sat, Jan 8, 2011 at 5:32 PM, John wrote:
It's just a matter of matching page titles: if there is a page in
namespace 0 and a page in namespace 1 (article and article talk) with
the same title, they go together. It's fairly simple.
To expand John's comment, the talk page is always the page with the
same title, but with a namespace number 1 higher.
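As a rough sketch of that rule, the linear scan per page can be avoided by indexing every page once by (namespace, title) and then looking up namespace + 1 directly. Everything named here (Page, build_talk_index, the sample ids) is made up for illustration. It also assumes the titles have already been stripped of their namespace prefix, and that subject namespaces are the non-negative even numbers.

```python
from typing import Dict, List, NamedTuple, Optional


class Page(NamedTuple):
    """Hypothetical record for one page parsed from a dump."""
    page_id: int
    namespace: int
    title: str  # assumed to be stored WITHOUT the namespace prefix


def build_talk_index(pages: List[Page]) -> Dict[int, Optional[int]]:
    """Map each subject page id to its talk page id (None if no talk page)."""
    # One pass to index all pages by (namespace, title).
    by_key = {(p.namespace, p.title): p.page_id for p in pages}

    talk_of: Dict[int, Optional[int]] = {}
    for p in pages:
        # Assumption: subject namespaces are even and non-negative;
        # the matching talk namespace is the next odd number.
        if p.namespace >= 0 and p.namespace % 2 == 0:
            talk_of[p.page_id] = by_key.get((p.namespace + 1, p.title))
    return talk_of


# Made-up sample data: an article with a talk page, a user page with a
# user talk page, and an article without a talk page.
pages = [
    Page(1, 0, "Foo"),
    Page(2, 1, "Foo"),
    Page(3, 2, "Alice"),
    Page(4, 3, "Alice"),
    Page(5, 0, "Bar"),
]
talk_of = build_talk_index(pages)
print(talk_of)  # {1: 2, 3: 4, 5: None}
```

This costs two passes over the pages plus one dictionary lookup per subject page, at the price of holding the (namespace, title) index in memory.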
Wikitech-l mailing list
Check out my about.me: http://about.me/diederik