Yes, manual matching is fairly simple, but in the worst case you need
to iterate over n-1 talk pages (where n is the total number of talk
pages in a Wikipedia) to find the talk page that belongs to a user
page when working from the dump files. If the dump file contained,
for each article, a tag with the id of its talk page, processing
time would drop significantly.
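
For illustration only, the per-page scan described above can be avoided
without changing the dump format by indexing pages on (namespace,
title) first. The sketch below is a hypothetical example, not code from
this thread: the Page record and pair_with_talk_pages are made-up
names, and it assumes the parser already yields a numeric namespace and
a title with the namespace prefix stripped, which a given dump may not
provide directly.

```python
from collections import namedtuple

# Hypothetical record type: assumes the dump parser supplies a numeric
# namespace and a title without its namespace prefix.
Page = namedtuple("Page", ["page_id", "namespace", "title"])


def pair_with_talk_pages(pages):
    """Map each subject page id to its talk page id, or None if absent."""
    # Pass 1: index every page by (namespace, title) for O(1) lookups.
    index = {(p.namespace, p.title): p.page_id for p in pages}

    pairs = {}
    # Pass 2: for each subject page (even, non-negative namespace),
    # look up the page with the same title in namespace + 1.
    for p in pages:
        if p.namespace >= 0 and p.namespace % 2 == 0:
            pairs[p.page_id] = index.get((p.namespace + 1, p.title))
    return pairs


# `pages` must be a list (or other re-iterable), since it is read twice.
pages = [
    Page(10, 0, "Amsterdam"),
    Page(11, 1, "Amsterdam"),
    Page(20, 2, "Example"),
    Page(21, 3, "Example"),
    Page(30, 0, "Orphan"),
]
print(pair_with_talk_pages(pages))  # {10: 11, 20: 21, 30: None}
```

This trades memory for time: the index holds one entry per page, so
for a very large wiki it may need to live on disk rather than in a
dict.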
Diederik
On Sat, Jan 8, 2011 at 11:39 AM, Bryan Tong Minh
<bryan.tongminh(a)gmail.com> wrote:
On Sat, Jan 8, 2011 at 5:32 PM, John
<phoenixoverride(a)gmail.com> wrote:
It's just a matter of matching page titles: if
there is a page in namespace 0
and a page in namespace 1 (article and article talk) with the same title, they
go together. It's fairly simple.
To expand John's comment, the talk page is always the page with the
same title, but with a namespace number 1 higher.
Bryan
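
As a rough sketch of the rule stated above (same title, namespace
number one higher), the mapping can be written as a small function.
The name talk_namespace is hypothetical, and the guard assumes the
usual MediaWiki convention that subject namespaces are even and
non-negative while talk namespaces are odd.

```python
def talk_namespace(namespace):
    """Return the talk namespace number for a subject namespace."""
    # Assumed convention: subject namespaces are even (0, 2, 4, ...);
    # odd namespaces are already talk namespaces, and negative ones
    # (special pages) have no talk pages.
    if namespace < 0 or namespace % 2 != 0:
        raise ValueError(f"namespace {namespace} has no talk namespace")
    return namespace + 1


print(talk_namespace(0))  # article -> article talk: 1
print(talk_namespace(2))  # user -> user talk: 3
```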