You want the page_props table (*-page_props.sql.gz); look for entries whose
pp_propname is the string 'hiddencat'.
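
As a rough sketch of that lookup, the snippet below builds a toy in-memory SQLite copy of page and page_props and lists the category pages carrying the 'hiddencat' property. The rows are invented and the tables are cut down to the columns used here; it also assumes pp_page holds the page id of the category page and that category pages live in namespace 14, so check both against the schema of the dump you actually import.

```python
import sqlite3

# Toy stand-in for the imported dump tables; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER, page_namespace INTEGER, page_title TEXT);
CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT);
INSERT INTO page VALUES
  (1, 14, 'Physics'),
  (2, 14, 'All_stub_articles'),
  (3, 0, 'Physics');
INSERT INTO page_props VALUES
  (2, 'hiddencat', ''),
  (3, 'wikibase_item', 'Q413');
""")

def hidden_category_titles(conn):
    """Return titles of category pages flagged with pp_propname = 'hiddencat'."""
    rows = conn.execute("""
        SELECT p.page_title
        FROM page AS p
        JOIN page_props AS pp ON pp.pp_page = p.page_id
        WHERE pp.pp_propname = 'hiddencat'
          AND p.page_namespace = 14   -- assumed: 14 is the Category namespace
        ORDER BY p.page_title
    """)
    return [title for (title,) in rows]

hidden = hidden_category_titles(conn)
print(hidden)
```

The resulting titles can then be used to filter hidden categories out of whatever you build from categorylinks.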
Ariel
On 10-01-2013, Thu, at 09:58 -0800, Robert Crowe wrote:
Perfect! Thanks Ariel. What is the best way to distinguish hidden categories?
I see that the category table used to have a cat_hidden column, but that's
been removed.
Robert
-----Original Message-----
From: Ariel T. Glenn [mailto:ariel@wikimedia.org]
Sent: Thursday, January 10, 2013 3:34 AM
To: Robert Crowe
Cc: xmldatadumps-l(a)lists.wikimedia.org
Subject: Re: [Xmldatadumps-l] Which files do I need?
If you are just trying to get at the category structure from the dump files,
you need three tables. The page table has page ids, titles, and whether each
page is a redirect (*-page.sql.gz). The category table has category names,
ids, and summary information (*-category.sql.gz). The categorylinks table
lists every category link for each page, giving the page id and the category
name (*-categorylinks.sql.gz). You can find details on the tables here:
http://www.mediawiki.org/wiki/Manual:Categorylinks_table
(the other tables are listed in this category:
http://www.mediawiki.org/wiki/Category:MediaWiki_database_tables )
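
To illustrate how these tables fit together, here is a minimal sketch that joins categorylinks to page and skips redirects. It runs against a toy in-memory SQLite database with invented rows, and the tables are reduced to the columns needed; the column names (page_is_redirect, cl_from, cl_to) follow the schema as described on the manual pages above and should be verified against the dump you import.

```python
import sqlite3

# Toy stand-in for the imported dump tables; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (
  page_id INTEGER, page_namespace INTEGER,
  page_title TEXT, page_is_redirect INTEGER
);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
INSERT INTO page VALUES
  (10, 0, 'Quark', 0),
  (11, 0, 'Quarks', 1),   -- a redirect page
  (12, 0, 'Lepton', 0);
INSERT INTO categorylinks VALUES
  (10, 'Particles'), (11, 'Particles'), (12, 'Particles');
""")

def category_members(conn, category):
    """Return titles of non-redirect pages linked to the given category name."""
    rows = conn.execute("""
        SELECT p.page_title
        FROM categorylinks AS cl
        JOIN page AS p ON p.page_id = cl.cl_from   -- cl_from is the member's page id
        WHERE cl.cl_to = ?                         -- cl_to is the category name
          AND p.page_is_redirect = 0
        ORDER BY p.page_title
    """, (category,))
    return [title for (title,) in rows]

members = category_members(conn, "Particles")
print(members)
```

Subcategories show up the same way: they are member pages whose namespace is the Category namespace, so filtering on page_namespace separates them from ordinary pages.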
Hopefully this should get you started.
Ariel
On 09-01-2013, Wed, at 10:51 -0800, Robert Crowe wrote:
I'd like to mirror just the category structure of the English
Wikipedia, and I'm wondering which of the dump files I need to start
with.
I don't need the page content, just the page names, and only for the
most current revision. I need the categories and category members,
and I'd like to exclude hidden categories. I also need to distinguish
redirects, because I don't want to treat them as separate pages. As
much as possible I'd like to work with SQL files, but I can crunch
through XML if necessary.
So which files do I need to download? I may also need some help in
understanding the schemas.
Thanks,
Robert
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l