priyank bagrecha wrote:
I am trying to pull up a list of page IDs from the snapshot which belong to
a specific category. Basically, I am trying to pull up pages which are on
book portal pages. I looked at the snapshot MySQL db to see which tables I
can use, but the fields of the tables category and categorylinks didn't make
any sense to me in regard to what they stand for. So I was wondering if
somebody could help me with the SQL.
Example category: "Mathematics"
Example namespace: "Portal" (100)
SELECT
  page_id,
  page_namespace,
  page_title
FROM page
JOIN categorylinks
ON cl_from = page_id
WHERE cl_to = 'Mathematics'
AND page_namespace = 100
AND page_is_redirect = 0 /* optional */;
* http://www.mediawiki.org/wiki/Manual:Page_table
* http://www.mediawiki.org/wiki/Manual:Categorylinks_table
* http://www.mediawiki.org/wiki/Manual:Category_table
You don't really need the category table in this case because you just want
relationships (links). If you wanted metadata about particular categories
(for example, the member count in cat_pages), you could (left) join against
the category table on cat_title = cl_to.
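
As a rough sketch of that optional join, here is a small Python script that
runs the query against a toy in-memory SQLite database. The three tables are
cut down to the columns used here, and the sample rows are made up for
illustration; the real snapshot runs on MySQL and has more columns. The join
column is cat_title, as documented at Manual:Category_table.

```python
import sqlite3

# Toy stand-in for the snapshot: only the columns this query touches.
# All rows below are invented sample data, not real wiki content.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_is_redirect INTEGER);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
CREATE TABLE category (cat_title TEXT, cat_pages INTEGER);

INSERT INTO page VALUES (1, 100, 'Mathematics', 0);        -- portal page
INSERT INTO page VALUES (2, 0, 'Algebra', 0);              -- article, wrong namespace
INSERT INTO page VALUES (3, 100, 'Mathematics/Intro', 1);  -- portal redirect
INSERT INTO categorylinks VALUES (1, 'Mathematics');
INSERT INTO categorylinks VALUES (2, 'Mathematics');
INSERT INTO categorylinks VALUES (3, 'Mathematics');
INSERT INTO category VALUES ('Mathematics', 3);
""")

# Same filter as the query above, plus a LEFT JOIN that pulls the
# category's member count (cat_pages) alongside each matching page.
QUERY = """
SELECT page_id, page_title, cat_pages
FROM page
JOIN categorylinks ON cl_from = page_id
LEFT JOIN category ON cat_title = cl_to
WHERE cl_to = ?
  AND page_namespace = ?
  AND page_is_redirect = 0
"""

rows = conn.execute(QUERY, ("Mathematics", 100)).fetchall()
print(rows)
```

The LEFT JOIN keeps pages whose category has no row in the category table;
in that case cat_pages simply comes back as NULL (None in Python).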
In order to figure out which namespace ID (integer) corresponds to your
target namespace, you can check the wiki configuration (LocalSettings.php
and DefaultSettings.php) or the wiki's API.
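
For the API route, the namespace list is available from the siteinfo
query (action=query, meta=siteinfo, siprop=namespaces). A minimal Python
sketch follows; the helper names and the User-Agent string are my own
inventions, and the endpoint URL you pass in depends on the wiki you are
working with.

```python
import json
import urllib.parse
import urllib.request


def siteinfo_url(api_endpoint):
    """Build the siteinfo request URL for a wiki's api.php endpoint."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "namespaces",
        "format": "json",
    }
    return api_endpoint + "?" + urllib.parse.urlencode(params)


def namespace_id(siteinfo, name):
    """Return the integer ID for a namespace name, or None if not found.

    The local name is under "*" in the default JSON format and under
    "name" with formatversion=2, so both keys are checked.
    """
    for ns in siteinfo["query"]["namespaces"].values():
        local = ns.get("name", ns.get("*", ""))
        if name in (local, ns.get("canonical")):
            return ns["id"]
    return None


def fetch_namespace_id(api_endpoint, name):
    """Ask a live wiki for the namespace ID (performs a network request)."""
    request = urllib.request.Request(
        siteinfo_url(api_endpoint),
        headers={"User-Agent": "namespace-lookup-example/0.1"},  # hypothetical
    )
    with urllib.request.urlopen(request) as response:
        return namespace_id(json.load(response), name)
```

Calling fetch_namespace_id with your wiki's api.php URL and "Portal" should
return the integer to use for page_namespace, provided the wiki defines that
namespace.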
MZMcBride