Chad wrote:
On Sat, Mar 14, 2009 at 1:33 AM, jidanni@jidanni.org wrote:
Gentlemen, it occurred to me that under close examination one finds that when making a backup of one's wiki's database, some of the tables dumped have various degrees of temporariness, and thus though needing to be present in a proper dump, could perhaps be emptied of their values, saving much space in the SQL.bz2 etc. file produced.
Looking at the mysqldump man page, one finds no perfect options to do so, so instead makes one's own script:
$ mysqldump my_database| perl -nwle 'BEGIN{$dontdump="wiki_(objectcache|searchindex)"} s/(^-- )(Dumping data for table `$dontdump`$)/$1NOT $2/; next if /^LOCK TABLES `$dontdump` WRITE;$/../^UNLOCK TABLES;$/; print;'
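The same effect can probably also be had without post-processing the dump, by running mysqldump twice: once with --no-data to get the structure of every table, and once with --no-create-info plus one --ignore-table=db.table per table whose rows should be left out. A minimal sketch, assuming the database and table names from the one-liner above; print_dump_commands is a hypothetical helper that only prints the two commands, it does not run them:

```shell
# Hypothetical helper: print (do not run) a two-pass mysqldump.
# Pass 1 dumps the structure of every table.
# Pass 2 dumps row data, skipping the tables listed after the database name.
print_dump_commands() {
  db=$1
  shift
  ignore=""
  for table in "$@"; do
    # mysqldump expects the database-qualified name here.
    ignore="$ignore --ignore-table=$db.$table"
  done
  echo "mysqldump --no-data $db"
  echo "mysqldump --no-create-info$ignore $db"
}

print_dump_commands my_database wiki_objectcache wiki_searchindex
```

The output of both commands would have to be concatenated (structure first) to get a dump that recreates the skipped tables empty.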
Though not myself daring to make any recommendations on http://www.mediawiki.org/wiki/Manual:Backing_up_a_wiki#Tables I am still curious which tables can be emptied always, which can be emptied if one is willing to remember to run a maintenance script to resurrect their contents, etc.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Really, the only 3 that are needed are page, revision and text. Of course, to keep old versions of stuff you'll need archive, oldimage and filearchive too.
That would be sufficient to keep page content. However, you would probably also want to keep the user and user_groups tables, as well as the interwiki table, because it cannot be restored and determines the interpretation of link prefixes. The log cannot be restored either, but whether you need it is another question. The image table can generally be restored by looking at the files, though the result may not be exactly the same as the original.
I think it's better to put it this way: tables with "cache" in their name can safely be truncated; the same is true for the profiling table (if used at all). Tables with "link" in their name can always be rebuilt, though it may take a while; the same goes for the searchindex.
-- daniel
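The rule of thumb above can be sketched as a small shell function. This is only an illustration of the name-based heuristic, not a complete list of MediaWiki tables: classify_table and its three labels are hypothetical, the wiki_ prefix is taken from the one-liner earlier in the thread, and the match is purely on the name, so a table prefix that itself contains "cache" or "link" would be misclassified.

```shell
# Hypothetical helper: classify a table by name, following the rule of thumb.
#   truncate = contents can be dropped from the dump
#   rebuild  = contents can be dropped, but must be regenerated afterwards
#   keep     = contents must be in the dump
classify_table() {
  case "$1" in
    *cache*|*profiling)  echo "truncate" ;;
    *link*|*searchindex) echo "rebuild" ;;
    *)                   echo "keep" ;;
  esac
}

for t in wiki_page wiki_interwiki wiki_objectcache wiki_pagelinks wiki_searchindex; do
  printf '%s\t%s\n' "$t" "$(classify_table "$t")"
done
```

Anything the function labels "keep" is simply a table the heuristic says nothing about, so tables such as image or logging still need the case-by-case judgment described above.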