On Sat, Mar 14, 2009 at 1:33 AM, jidanni@jidanni.org wrote:
Gentlemen, it occurred to me that on close examination, some of the tables dumped when backing up one's wiki's database are temporary to varying degrees. Though they need to be present in a proper dump, they could perhaps be emptied of their rows, saving much space in the resulting SQL.bz2 etc. file.
Looking at the mysqldump man page, one finds no option that does exactly this, so one makes one's own script instead:
$ mysqldump my_database | perl -nwle 'BEGIN{$dontdump="wiki_(objectcache|searchindex)"} s/(^-- )(Dumping data for table `$dontdump`$)/$1NOT $2/; next if /^LOCK TABLES `$dontdump` WRITE;$/../^UNLOCK TABLES;$/; print;'
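For readers who don't speak Perl flip-flop operators, here is a rough Python sketch of the same filter: it drops the LOCK TABLES ... UNLOCK TABLES data blocks for the unwanted tables while keeping their schema. The `wiki_` prefix and table names are just the examples from the one-liner above; adjust them to your own $wgDBprefix.

```python
import re

# Tables whose data we don't want in the backup (example names from
# the one-liner above; adjust to your wiki's table prefix).
SKIP = re.compile(r"wiki_(objectcache|searchindex)")

def strip_table_data(lines):
    """Yield mysqldump lines, dropping the LOCK..UNLOCK data block
    for skipped tables; CREATE TABLE statements are left intact."""
    skipping = False
    for line in lines:
        if skipping:
            # Drop everything up to and including UNLOCK TABLES;
            if line == "UNLOCK TABLES;":
                skipping = False
            continue
        m = re.match(r"LOCK TABLES `(\S+)` WRITE;$", line)
        if m and SKIP.fullmatch(m.group(1)):
            skipping = True  # start dropping this table's data block
            continue
        yield line
```

Like the Perl version, this filters a dump streamed on stdin, so it can sit in the same pipeline before bzip2.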
Though not myself daring to make any recommendations on http://www.mediawiki.org/wiki/Manual:Backing_up_a_wiki#Tables I am still curious which tables can always be emptied, which can be emptied if one is willing to remember to run a maintenance script to resurrect their contents, etc.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Really, the only three that are needed are page, revision and text. Of course, to keep old versions of stuff you'll need archive, oldimage and filearchive too.
For each table you remove from a dump, that's data you're losing. With some tables (recentchanges, the *links tables) you can repopulate the data; with others you can't. I guess it comes down to deciding which data is important for you to back up.
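To make the "repopulatable vs. not" distinction concrete, here is a small hedged summary table in Python, mapping each commonly emptied table to the MediaWiki maintenance script that can rebuild it. The script names are my reading of the MediaWiki manual, not something stated in this thread; verify them against your MediaWiki version before relying on them.

```python
# Which emptied tables can be rebuilt, and by what (an assumption
# drawn from the MediaWiki maintenance/ directory; double-check for
# your version). None means the table refills itself in normal use.
REBUILD_SCRIPT = {
    "objectcache": None,                         # pure cache, refills itself
    "searchindex": "rebuildtextindex.php",       # rebuilds the search index
    "recentchanges": "rebuildrecentchanges.php",
    "pagelinks": "refreshLinks.php",             # the *links tables
    "templatelinks": "refreshLinks.php",
    "categorylinks": "refreshLinks.php",
    "imagelinks": "refreshLinks.php",
    "externallinks": "refreshLinks.php",
}

def rebuild_command(table):
    """Return the command that repopulates `table`, None if it refills
    itself, or raise KeyError if its data cannot be regenerated."""
    if table not in REBUILD_SCRIPT:
        raise KeyError("%s cannot be repopulated; it must be backed up" % table)
    script = REBUILD_SCRIPT[table]
    return "php maintenance/%s" % script if script else None
```

Anything not in the mapping (page, revision, text, archive, oldimage, filearchive) is primary data: if it is dropped from the dump, it is simply gone.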
-Chad