On 23-08-2013 17:45, A B wrote:
Apart from the problems I have with the speed of inserting page table
data from the SQL dump, there is one thing I don't understand. Why
are there about 30 million rows in the "INSERT INTO page" statements
in page.sql, but only about 10 million in the page-articles.sql dump?
The page.sql file has a table row for each page in the wiki.
I do not know of any page-articles.sql file; maybe you mean the
pages-articles.xml file, which has info about pages,
including the actual page content, but only for pages in certain
namespaces (articles and some others), in XML format. The
pages-meta-current.xml file is similar but contains info about all pages.
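A quick way to see this namespace difference yourself is to count the page rows per namespace in page.sql. A minimal sketch (not from this thread; the tuple-prefix regex is an assumption that works for counting, though a real SQL parser would be more robust against unusual titles):

```python
import re
from collections import Counter

# Each INSERT line in page.sql packs many tuples of the form
# (page_id,page_namespace,'page_title',...); we only need the
# numeric prefix of each tuple to tally namespaces.
ROW = re.compile(r"\((\d+),(-?\d+),'")

def namespace_counts(lines):
    """Count page rows per namespace in an iterable of page.sql lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("INSERT INTO"):
            for _page_id, ns in ROW.findall(line):
                counts[int(ns)] += 1
    return counts
```

Namespace 0 is the article namespace; the other ~20 million rows sit in talk, user, category, and similar namespaces, which is where the 30 vs. 10 million discrepancy comes from.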
My goal is to have all pages with their correct
Then it is easiest to use the page_len field in the page.sql file. The
much larger pages-meta-current.xml dump, which includes all page content,
can also be used, but you will have to compute the lengths yourself.