[Xmldatadumps-l] inter-page links in the data dump

Platonides platonides at gmail.com
Thu Nov 17 16:14:11 UTC 2011


On 03/10/11 23:18, Greg Morrison wrote:
> Hi all,
> 
> Thanks for the help!  The pagelink.sql file sounds like exactly what I
> need.  I've downloaded it and expanded it, and am trying to sort out
> the contents.  Sorry to be dense here!  It appears that it's a
> sequence of INSERT INTO `pagelinks` VALUES (n1,n2,name), where none of
> these values are unique.  What do these elements correspond to?
> 
> For a quick test, the randomly chosen "Bely_Iyus_River" occurs three
> times in the sql file, but the wiki page for Bely_Iyus_River has at
> least 5 outgoing links.  So my guess is that the pagelinks table
> element (n1,n2,name) corresponds to an incoming link to the file name.
>  Where's that link coming from, though?  How do I use n1 and n2 to
> find the source?
> 
> Thanks again!

You have probably already worked this out yourself, but just in case,
and for the record:

Pages are identified in the database by a tuple (namespace, title).
So (0, 'Foo') is the article [[Foo]], but (1, 'Foo') is [[Talk:Foo]].
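
To make the tuple convention concrete, here is a minimal Python sketch.
The helper name link_text and the NAMESPACE_PREFIX table are
hypothetical, made up for illustration, and only the two namespaces
mentioned above are mapped:

```python
# Hypothetical helper: render a (namespace, title) tuple as wiki link text.
# Only namespaces 0 (articles) and 1 (talk pages) are covered here.
NAMESPACE_PREFIX = {0: "", 1: "Talk:"}


def link_text(namespace, title):
    """Return the [[...]] form for a (namespace, title) tuple."""
    prefix = NAMESPACE_PREFIX[namespace]
    return "[[" + prefix + title + "]]"


print(link_text(0, "Foo"))  # [[Foo]]
print(link_text(1, "Foo"))  # [[Talk:Foo]]
```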

In your question above, (n2, name) is the page the link *points to*
(i.e. an outgoing link to a possibly missing page), and n1 is the
page_id of the page containing that link. You need the page.sql file
to find out which (page_namespace, page_title) n1 corresponds to.
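
The lookup described above amounts to a join between the two tables.
Below is a rough sketch using Python's sqlite3 module with an in-memory
database. It assumes the pagelinks columns are named pl_from,
pl_namespace and pl_title (n1, n2 and name in your question). The
sample rows and the function names are invented for illustration, and
the real dumps are MySQL files that would need importing first.

```python
import sqlite3

# In-memory stand-in for the two dump tables (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (
    page_id INTEGER PRIMARY KEY,
    page_namespace INTEGER,
    page_title TEXT
);
-- Assumed column names: pl_from = n1, pl_namespace = n2, pl_title = name
CREATE TABLE pagelinks (
    pl_from INTEGER,
    pl_namespace INTEGER,
    pl_title TEXT
);
""")
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?)",
    [(10, 0, "Foo"), (11, 1, "Foo"), (12, 0, "Bar")],
)
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, ?, ?)",
    [(10, 0, "Bar"), (11, 0, "Bar"), (12, 0, "Missing_page")],
)


def incoming_links(conn, namespace, title):
    """Pages that link TO (namespace, title): resolve pl_from via page."""
    return conn.execute(
        """SELECT p.page_namespace, p.page_title
           FROM pagelinks AS pl
           JOIN page AS p ON p.page_id = pl.pl_from
           WHERE pl.pl_namespace = ? AND pl.pl_title = ?
           ORDER BY p.page_id""",
        (namespace, title),
    ).fetchall()


def outgoing_links(conn, namespace, title):
    """Targets linked FROM (namespace, title); targets may not exist."""
    return conn.execute(
        """SELECT pl.pl_namespace, pl.pl_title
           FROM page AS p
           JOIN pagelinks AS pl ON pl.pl_from = p.page_id
           WHERE p.page_namespace = ? AND p.page_title = ?
           ORDER BY pl.pl_title""",
        (namespace, title),
    ).fetchall()


print(incoming_links(conn, 0, "Bar"))  # [(0, 'Foo'), (1, 'Foo')]
print(outgoing_links(conn, 0, "Bar"))  # [(0, 'Missing_page')]
```

Note that the outgoing target here has no row in the page table, which
is the "possibly missing page" case: pagelinks records the link even
when the target does not exist.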


