Hi again guys! I am still digging into the wiki categories. As Brad said, there is not a hierarchy. I basically extracted the categories from categorylinks.sql.
I have questions about the results I obtained, which are in the form #id_source_category title_target_category. The problem I have is that the directed graph is actually "mixed": sometimes a real sub-category points to its parent, sometimes vice versa, so I can't work out their relative importance.
- Is there (maybe in another dump) a mark or parameter (something like "ns", maybe?) telling you which category is the head with respect to another? As an example, take http://en.wikipedia.org/wiki/Category:World_War_II. It says "this is root category....".
- The other question is about the results: http://en.wikipedia.org/wiki/Category:World_War_II collects sub-categories which I have not found in categorylinks.sql, e.g. for WWII I obtain the conflicts but not the other ones.
Am I missing something?
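
One way to measure how "mixed" the extracted pairs are, sketched under assumptions: each pair in the form above is read as an edge from the source category to the target category, and each source id has already been resolved to a title through a hypothetical `id_to_title` lookup (for instance built from the page table). The function names and the sample data are illustrative only, not taken from a real dump.

```python
from collections import defaultdict


def build_parent_map(pairs, id_to_title):
    """Map each source category title to the set of categories it points to.

    pairs: iterable of (source_id, target_title) tuples, as extracted above.
    id_to_title: hypothetical lookup from page id to category title.
    Pairs whose source id is unknown are skipped.
    """
    parents = defaultdict(set)
    for source_id, target_title in pairs:
        source_title = id_to_title.get(source_id)
        if source_title is not None:
            parents[source_title].add(target_title)
    return parents


def mutual_links(parents):
    """Return sorted pairs (a, b) where a points to b and b points back to a."""
    found = set()
    for child, targets in parents.items():
        for target in targets:
            if child != target and child in parents.get(target, ()):
                found.add(tuple(sorted((child, target))))
    return sorted(found)


# Made-up sample data, not taken from a real dump.
id_to_title = {1: "A", 2: "B", 3: "C"}
pairs = [(1, "B"), (2, "A"), (3, "A"), (4, "A")]
parents = build_parent_map(pairs, id_to_title)
print(mutual_links(parents))
```

Pairs reported by `mutual_links` are the places where two categories point at each other, which is one concrete symptom of the "mixed" direction described above.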
On Thu, May 30, 2013 at 7:49 PM, Pavan Kapanipathi pavan@knoesis.org wrote:
Hi Yury, I already tried this and there are quite a few links missing from some categories to their parent categories. So I opted not to use DBpedia.
Thanks, Pavan
On Thu, May 30, 2013 at 11:51 AM, Yury Katkov katkov.juriy@gmail.com wrote:
The simplest way is of course to use DBpedia for that and create a SPARQL query that will give you all the skos:broader terms, e.g. http://dbpedia.org/page/Category:Italian_Roman_Catholics
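
A minimal sketch of such a query, assuming the public endpoint at http://dbpedia.org/sparql and the usual `http://dbpedia.org/resource/Category:...` resource naming. The helper names `broader_query` and `request_url` are made up for illustration, and the `format` parameter value is an assumption about what the endpoint accepts. The code only builds the query and the URL; it sends no request.

```python
from urllib.parse import urlencode

# Public DBpedia endpoint (assumed reachable; not contacted by this sketch).
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"


def broader_query(category_title):
    """Build a SPARQL query for the skos:broader terms of one category."""
    resource = "http://dbpedia.org/resource/Category:" + category_title
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?broader WHERE {\n"
        f"  <{resource}> skos:broader ?broader .\n"
        "}"
    )


def request_url(category_title):
    """Build the URL asking the endpoint for JSON results."""
    params = {
        "query": broader_query(category_title),
        "format": "application/sparql-results+json",  # assumed to be accepted
    }
    return DBPEDIA_ENDPOINT + "?" + urlencode(params)


print(broader_query("Italian_Roman_Catholics"))
```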
Yury Katkov, WikiVote
On Thu, May 30, 2013 at 1:05 PM, Luigi Assom luigi.assom@gmail.com wrote:
Thank you Mark,
I was indeed looking for the tables to construct that structure. Can you also confirm that categorylinks and categorypages are enough?
On Thu, May 30, 2013 at 10:49 AM, Luigi Assom luigi.assom@gmail.com wrote:
Oh I see! (50K+ was based on an old paper I read... 1M+ hehe wiki is growing :) )
Thank you for the suggestions about the tables!
On Thu, May 30, 2013 at 1:33 AM, Petr Onderka gsvick@gmail.com wrote:
> Actually, is there a schema of the graph? You said "directed", hence there should be some hierarchy in it too.
I'm not sure I understand what you're asking, but the hierarchy is based on the relation between a category and its subcategories. In more technical terms, in the directed graph of categories, edges are from a category to its subcategories.
> I read there are 50K+ categories; maybe there is a list of "directions", aka "links", showing how the categories are connected?
Actually, there seem to be 1M+ categories (though not all of them are normal article categories). And category-subcategory links are exactly what the categorylinks dump contains, along with category-page links.
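
A minimal sketch of separating the two kinds of links, assuming the rows have already been parsed out of the categorylinks dump into (cl_from, cl_to, cl_type) tuples, with cl_type equal to "subcat" for category-subcategory links and "page" for category-page links. The column names follow the MediaWiki categorylinks table as I understand it, so check them against the schema of your dump. The `id_to_title` lookup and the sample rows are made up for illustration.

```python
from collections import defaultdict


def split_links(rows, id_to_title):
    """Return (subcategories, pages) maps keyed by parent category title.

    rows: iterable of (cl_from, cl_to, cl_type) tuples.
    id_to_title: hypothetical lookup from page id to title.
    Edges run from a category to its members.
    """
    subcategories = defaultdict(set)
    pages = defaultdict(set)
    for cl_from, cl_to, cl_type in rows:
        member = id_to_title.get(cl_from)
        if member is None:
            continue  # member page id not present in the lookup
        if cl_type == "subcat":
            subcategories[cl_to].add(member)
        elif cl_type == "page":
            pages[cl_to].add(member)
    return subcategories, pages


# Illustrative rows only, not real dump contents.
id_to_title = {10: "Battles_of_X", 11: "Some_article"}
rows = [
    (10, "Wars_of_X", "subcat"),
    (11, "Wars_of_X", "page"),
    (11, "Battles_of_X", "page"),
]
subcategories, pages = split_links(rows, id_to_title)
print(dict(subcategories))
print(dict(pages))
```

Keeping only the `subcategories` map gives the category graph with edges from a category to its subcategories; the `pages` map holds the category-page links.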
Petr Onderka [[en:User:Svick]]
Mediawiki-api mailing list Mediawiki-api@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
-- Luigi Assom
Skype contact: oggigigi