Hello Petr,
Thank you.
I want to obtain this result: a list of categories such as:
parent category -> subcategory
That's not true. Each link in categorylinks is from a page (cl_from, this can
be a category page) to its parent category (cl_to).
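To illustrate the direction of the links, here is a minimal Python sketch that turns (cl_from, cl_to) rows into "parent category -> subcategory" pairs. It assumes the rows have already been read from the dump into tuples, and that an id-to-title map for category pages (namespace 14 in the page table) is available. The names categorylinks, category_titles and parent_to_subcategories are made up for this example, and the row with page id 1 is an invented non-category page. With this layout, the subcategories of a category are the rows whose cl_to is that category's title, not the rows whose cl_from is its page id.

```python
from collections import defaultdict

# Sample rows shaped like (cl_from, cl_to): cl_from is the id of the linking
# page, cl_to is the title of a category that page belongs to (its parent).
categorylinks = [
    (690070, "Comic_science_fiction"),
    (690451, "Conflicts_in_1939"),
    (690451, "Modern_Europe"),
    (1, "Modern_Europe"),  # invented id standing in for a non-category page
]

# Assumed id -> title map containing category pages only.
category_titles = {
    690070: "Futurama",
    690451: "World_War_II",
}


def parent_to_subcategories(links, titles):
    """Group subcategory titles under the title of each parent category."""
    tree = defaultdict(list)
    for cl_from, cl_to in links:
        child = titles.get(cl_from)
        if child is not None:  # skip links that do not come from a category page
            tree[cl_to].append(child)
    return dict(tree)


for parent, children in parent_to_subcategories(categorylinks, category_titles).items():
    for child in children:
        print(f"{parent} -> {child}")
```

Read this way, a row such as (690451, "Conflicts_in_1939") lists World_War_II under Conflicts_in_1939, because the category page is the source of the link and cl_to names one of its parents.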
Indeed I followed this structure, selecting only category pages and I obtained
e.g. 690070 (sub category "Futurama", cl_from) to parent category
"Comic_science_fiction"
e.g. 690451 (sub category "World War II", cl_from) to parent category
"Conflicts_in_1939"
So my perception is that the first example is correct, whereas in the second
one World War II should be the parent of Conflicts_in_1939.
Am I right or am I missing / misunderstood something?
If I look at the categorylinks for that category (cl_from = 690451) in the
latest dump, I get some “conflicts” categories, then some “Wars involving X”
categories and also a few others, like “Modern Europe”. I don't see any parent
category that would be missing.
That's correct for Modern Europe but if you compare with the page
http://en.wikipedia.org/wiki/Category:World_War_II
you see some links are missing in the sql dump, such as:
http://en.wikipedia.org/wiki/Category:Military_deception_during_World_War_II
or
http://en.wikipedia.org/wiki/Category:Sociology_of_World_War_II
This is the list I obtain; the subcategories above are not present:
690451 1930s_conflicts
690451 1940s_conflicts
690451 20th-century_conflicts
690451 Categories_requiring_diffusion
690451 Conflicts_in_1939
690451 Conflicts_in_1940
690451 Conflicts_in_1941
690451 Conflicts_in_1942
690451 Conflicts_in_1943
690451 Conflicts_in_1944
690451 Conflicts_in_1945
690451 Global_conflicts
690451 Modern_Europe
690451 Wars_involving_Albania
690451 Wars_involving_Argentina
690451 Wars_involving_Australia
690451 Wars_involving_Austria
690451 Wars_involving_ .... [etc.]