Hi Ryan,
pages-meta-history hasn't been generated for enwiki in a while (it's gotten too big), so I can't tell you anything about it. We're importing pages-articles.xml (currently about 20 GB, 5 GB as bzip2) using mwdumper. We're using MyISAM, not InnoDB. The import takes about 8 hours, most of it (80%) for creating the indexes.
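In case it's useful, the pipeline we run looks roughly like this (the file name, database name and user here are just placeholders, adjust them to your setup):

  bzcat enwiki-pages-articles.xml.bz2 | java -jar mwdumper.jar --format=sql:1.5 | mysql -u wikiuser -p wikidb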
Besides pages-articles.xml, we also import categorylinks.sql, imagelinks.sql, image.sql, langlinks.sql and templatelinks.sql.
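Those link tables come as gzipped SQL from the download site, so importing each one is just a matter of piping it into mysql, something along these lines (again, names are illustrative):

  gunzip -c enwiki-categorylinks.sql.gz | mysql -u wikiuser -p wikidb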
The MySQL database filled from all these files takes up 39 GB of hard drive space. The largest file is text.MYD, at about 20 GB.
With the indexes defined in tables.sql, query performance is OK. For example, selecting the titles of all articles that are not redirects takes five or ten minutes (I didn't profile it exactly).
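For reference, a query along those lines would be something like the following, run from the shell (the page table and its columns are the standard ones from tables.sql):

  mysql -u wikiuser -p -e "SELECT page_title FROM page WHERE page_namespace = 0 AND page_is_redirect = 0" wikidb > article_titles.txt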
Hope that helps.
Christopher
On Fri, Nov 20, 2009 at 14:13, Ryan Chan ryanchan404@gmail.com wrote:
Hello,
Does anyone have experience importing the enwiki database dump at http://download.wikimedia.org/backup-index.html into a real MySQL server?
1. It seems pages-meta-history is the largest download; how much storage space does it take when imported into a table (including indexes)?
2. How much total storage is needed to import the whole enwiki?
3. Do you experience performance problems when querying the database? I think most tables are over 10 GB in size. Any suggestions?
I need this information to prepare a budget plan for buying a proper server to do the job.
Thank you.
Ryan