Hi
I am trying to pull up a list of pageids from the snapshot, which belong to a specific category. Basically I am trying to pull up pages which are on book portal pages. I looked at the snapshot mysql db to see which tables i can use. but the fields of tables category and category_links didnt make any sense to me in regard to what they stand for. so i was wondering if somebody could help me with the sql.
thanks priyank
On Fri, Sep 16, 2011 at 1:33 AM, priyank bagrecha bagi.priyank@gmail.com wrote:
Hi
I am trying to pull up a list of pageids from the snapshot, which belong to a specific category. Basically I am trying to pull up pages which are on book portal pages. I looked at the snapshot mysql db to see which tables i can use. but the fields of tables category and category_links didnt make any sense to me in regard to what they stand for. so i was wondering if somebody could help me with the sql.
categorylinks is fairly simple.
cl_to is the name of a category cl_from is the ID of a page. This refers to page_id in the page table, where you can find the page name (by combining page_namespace and page_title) The other fields are various things you probably don't need
Roan Kattouw (Catrope)
priyank bagrecha wrote:
I am trying to pull up a list of pageids from the snapshot, which belong to a specific category. Basically I am trying to pull up pages which are on book portal pages. I looked at the snapshot mysql db to see which tables i can use. but the fields of tables category and category_links didnt make any sense to me in regard to what they stand for. so i was wondering if somebody could help me with the sql.
Example category: "Mathematics" Example namespace: "Portal" (100)
SELECT page_namespace, page_title FROM page JOIN categorylinks ON cl_from = page_id WHERE cl_to = 'Mathematics' AND page_namespace = 100 AND page_is_redirect = 0 /* optional */;
* http://www.mediawiki.org/wiki/Manual:Page_table * http://www.mediawiki.org/wiki/Manual:Categorylinks_table * http://www.mediawiki.org/wiki/Manual:Category_table
You don't really need the category table in this case because you just want relationships (links). If you wanted meta-data about particular categories, you could (left) join against the category table on cat_name = cl_to.
In order to figure out which namespace ID (integer) corresponds to your target namespace, you can check the wiki configuration (LocalSettings.php and DefaultSettings.php) or the wiki's API.
MZMcBride
wikitech-l@lists.wikimedia.org