[Xmldatadumps-l] inter-page links in the data dump
Platonides
platonides at gmail.com
Thu Nov 17 16:14:11 UTC 2011
On 03/10/11 23:18, Greg Morrison wrote:
> Hi all,
>
> Thanks for the help! The pagelink.sql file sounds like exactly what I
> need. I've downloaded it and expanded it, and am trying to sort out
> the contents. Sorry to be dense here! It appears that it's a
> sequence of INSERT INTO `pagelinks` VALUES (n1,n2,name), where none of
> these values are unique. What do these elements correspond to?
>
> For a quick test, the randomly chosen "Bely_Iyus_River" occurs three
> times in the sql file, but the wiki page for Bely_Iyus_River has at
> least 5 outgoing links. So my guess is that the pagelinks table
> element (n1,n2,name) corresponds to an incoming link to the file name.
> Where's that link coming from, though? How do I use n1 and n2 to
> find the source?
>
> Thanks again!
You have probably worked this out by now, but just in case, and
for the record:
Pages are identified in the database by a tuple (namespace, title).
So (0, 'Foo') is the article [[Foo]], but (1, 'Foo') is [[Talk:Foo]].
In your question above, (n2, name) is the page the link *points to*
(i.e. the target of an outgoing link, which may be a missing page).
n1 is the page_id of the page containing that link. You need the page.sql
file to find out which (page_namespace, page_title) n1 corresponds to.
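
For the record, here is a minimal sketch of that lookup in Python, using an
in-memory SQLite database in place of the real dumps. It assumes the usual
MediaWiki column names: pl_from, pl_namespace and pl_title for
(n1, n2, name), and page_id, page_namespace and page_title in the page
table. The rows and the two helper functions are invented for
illustration; the real page table has more columns than shown here.

```python
import sqlite3

# Toy stand-ins for page.sql and pagelinks.sql. All rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (
        page_id INTEGER PRIMARY KEY,
        page_namespace INTEGER,
        page_title TEXT
    );
    CREATE TABLE pagelinks (
        pl_from INTEGER,       -- n1: page_id of the page containing the link
        pl_namespace INTEGER,  -- n2: namespace of the link target
        pl_title TEXT          -- name: title of the link target
    );
""")
conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
    (10, 0, "Foo"),   # [[Foo]]
    (11, 1, "Foo"),   # [[Talk:Foo]]
    (12, 0, "Bar"),   # [[Bar]]
])
conn.executemany("INSERT INTO pagelinks VALUES (?, ?, ?)", [
    (10, 0, "Bar"),      # [[Foo]] links to [[Bar]]
    (11, 0, "Bar"),      # [[Talk:Foo]] links to [[Bar]]
    (12, 0, "Missing"),  # [[Bar]] links to a page that does not exist
])


def incoming_links(conn, namespace, title):
    """Return (page_namespace, page_title) of each page linking to the target."""
    rows = conn.execute(
        """
        SELECT p.page_namespace, p.page_title
        FROM pagelinks AS pl
        JOIN page AS p ON p.page_id = pl.pl_from
        WHERE pl.pl_namespace = ? AND pl.pl_title = ?
        ORDER BY p.page_id
        """,
        (namespace, title),
    )
    return rows.fetchall()


def outgoing_links(conn, namespace, title):
    """Return (pl_namespace, pl_title) of each link found on the given page."""
    rows = conn.execute(
        """
        SELECT pl.pl_namespace, pl.pl_title
        FROM page AS p
        JOIN pagelinks AS pl ON pl.pl_from = p.page_id
        WHERE p.page_namespace = ? AND p.page_title = ?
        ORDER BY pl.pl_namespace, pl.pl_title
        """,
        (namespace, title),
    )
    return rows.fetchall()


print(incoming_links(conn, 0, "Bar"))
print(outgoing_links(conn, 0, "Bar"))
```

Searching the dump for a title therefore finds the links pointing *to*
that page; to list the links *from* a page, look up its page_id first
and match it against n1.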