I have questions about the results I obtained, which
are in the form
#id_source_category title_target_category
The problem I have is that the directed graph is actually "mixed":
sometimes you have a real sub-category pointing to a parent, sometimes
vice versa, so I can't understand the importance
That's not true. Each link in categorylinks is from a page (cl_from, this
can be a category page) to its parent category (cl_to).
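
To illustrate the direction of the links, here is a minimal sketch in Python. It assumes you have already parsed the rows out of the dumps; the sample rows and the helper name category_parents are made up for illustration. Since cl_from is a page ID, the sketch also assumes you have the page table (page_id, page_namespace, page_title), where namespace 14 means the page is a category.

```python
from collections import defaultdict

# Hypothetical sample data, not taken from a real dump.
# page rows: (page_id, page_namespace, page_title); namespace 14 = Category.
pages = [
    (1, 14, "Child_category"),
    (2, 14, "Parent_category"),
    (3, 0, "Some_article"),
]
# categorylinks rows: (cl_from, cl_to) = (member page ID, parent category title).
categorylinks = [
    (1, "Parent_category"),
    (3, "Child_category"),
]


def category_parents(pages, categorylinks):
    """Map each category title to the set of its parent category titles."""
    # Keep only pages in the Category namespace, keyed by page ID.
    category_title = {pid: title for pid, ns, title in pages if ns == 14}
    parents = defaultdict(set)
    for cl_from, cl_to in categorylinks:
        child = category_title.get(cl_from)
        if child is not None:  # skip articles and other non-category pages
            parents[child].add(cl_to)  # the edge always goes child -> parent
    return dict(parents)


parents = category_parents(pages, categorylinks)
print(parents)  # {'Child_category': {'Parent_category'}}
```

Every edge built this way goes from the member to its parent, so the graph is not mixed.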
- is there (maybe in another dump) a mark or parameter
(something like "ns",
maybe?) telling you which directory is the head with respect to another?
As an example, take "http://en.wikipedia.org/wiki/Category:World_War_II"
It is written "this is root category...."
That text just describes what belongs to this category. It has nothing to
do with the structure, that's always the same.
- the other question is about results:
http://en.wikipedia.org/wiki/Category:World_War_II lists
sub-categories which I have not found in categorylinks.sql,
e.g. for WWII I obtain the conflicts but not the other ones.
If I look at the categorylinks for that category (cl_from = 690451) in the
latest dump, I get some “conflicts” categories, then some “Wars involving
X” categories and also a few others, like “Modern Europe”. I don't see any
parent category that would be missing.
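
For reference, the lookup I mean can be sketched like this. It is only a sketch: it assumes the table has been loaded into SQLite (the dump itself is for MySQL), the rows below are illustrative rather than real dump contents, and parents_of and members_of are hypothetical helper names. Note that the two directions are different queries: the parents of a page are the rows with that cl_from, while the members of a category (including its sub-categories) are the rows with that cl_to.

```python
import sqlite3

# In-memory stand-in for the real table; only the two relevant columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
conn.executemany(
    "INSERT INTO categorylinks VALUES (?, ?)",
    [
        # Illustrative rows only; 12345 and the second title are made up.
        (690451, "Modern_Europe"),
        (690451, "Hypothetical_parent_category"),
        (12345, "World_War_II"),
    ],
)


def parents_of(conn, page_id):
    """Titles of the categories that the given page belongs to."""
    rows = conn.execute(
        "SELECT cl_to FROM categorylinks WHERE cl_from = ? ORDER BY cl_to",
        (page_id,),
    )
    return [title for (title,) in rows]


def members_of(conn, category_title):
    """Page IDs of everything placed directly in the given category."""
    rows = conn.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_to = ? ORDER BY cl_from",
        (category_title,),
    )
    return [page_id for (page_id,) in rows]


print(parents_of(conn, 690451))
print(members_of(conn, "World_War_II"))
```

If the sub-categories shown on the category page are what you are after, the second query is the one to compare against.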
Maybe it would help if you described the issue in more detail: what results
exactly are you getting, what results are you expecting and how do the two
differ.
Petr Onderka
[[en:User:Svick]]