The wiki is now about 10 GB. It does compress to about 1 GB. Although
text pages are saved as their differences, as far as I know new versions
of images are saved in their entirety. Even saving only diffs, hundreds
of revisions of thousands of pages does add up.
I want to do a daily off-site backup. This wiki is on a shared virtual
host without command line access, thus no ability to use rsync over ssh,
which would allow only the changes to be moved. An entire image of the
system needs to be saved so that it can be restored fairly painlessly. I
want to shrink this as much as possible, since I am already running into
problems with the archiving of the system on the host due to size. For
example, I have had to compress each branch of the images folder
individually; gzipping the whole folder into a single archive fails, even
though the hosting company has bumped the timeouts and memory allowances.
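The per-branch approach can be sketched roughly as follows. This assumes some way to run Python on the host, which may not hold on a shared host without shell access; the function name `archive_branches` and both directory arguments are hypothetical.

```python
import tarfile
from pathlib import Path


def archive_branches(images_dir, out_dir):
    """Write one .tar.gz per top-level subdirectory of images_dir.

    Files sitting directly in images_dir (outside any subdirectory)
    are not archived by this sketch.
    """
    images_dir = Path(images_dir)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    archives = []
    # Sort so the archives are produced in a predictable order.
    for branch in sorted(p for p in images_dir.iterdir() if p.is_dir()):
        target = out_dir / (branch.name + ".tar.gz")
        # Each branch gets its own gzip-compressed tar, so no single
        # job has to process the whole images folder at once.
        with tarfile.open(str(target), "w:gz") as tar:
            tar.add(str(branch), arcname=branch.name)
        archives.append(target)
    return archives
```

Keeping the archives small and separate also means a failed or timed-out run only has to redo one branch.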
Moving to a host with complete command line access might be a solution,
but currently the hosting company deals with security issues (beyond
allowing only authorized users to log in of course). If we go to
something like Rackspace, then security becomes our problem, and I don't
pretend to have sufficient expertise in that.
Suggestions and alternatives welcome.
----------------------------------------------------------------------
Message: 1
Date: Fri, 5 Feb 2016 16:28:49 +0000
From: Daniel Barrett <danb(a)cimpress.com>
To: MediaWiki announcements and site admin list
<mediawiki-l(a)lists.wikimedia.org>
Subject: Re: [MediaWiki-l] Permanently remove old revisions and unused
files?
Message-ID:
<AM3PR01MB3424C911FCB6549DCD3717ABCD20(a)AM3PR01MB342.eurprd01.prod.exchangelabs.com>
Content-Type: text/plain; charset=UTF-8
Mickey,
What business problem are you trying to solve by deleting old revisions of articles? They
don’t take up much disk space, and they aren’t visible unless you intentionally go looking
for them (with the View History tab). Is the problem just personal taste -- you don't
like seeing so many revisions -- or is there some other business reason? Note: If you
don’t want users to see revisions at all, you could hide the View History tab with a line
in MediaWiki:Vector.css (or Common.css), at least as a first step:
#ca-history { display:none; }
If the old versions are a security risk, there is a feature to hide (not delete) particular
revisions:
https://www.mediawiki.org/wiki/Manual:RevisionDelete
Regarding removal of unused, uploaded files, here is a SQL query that (I believe) lists
all unused files that are more than 90 days old. (Critiques are welcome.) You can then
feed the list to the script "maintenance/deleteBatch.php" supplied with
MediaWiki to delete them.
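-- wp_ is the table prefix on this wiki; adjust it to match yours.
select
  concat('File:', p.page_title) as 'unused file'
from
  wp_page p
  left outer join wp_imagelinks il on (il.il_to = p.page_title)
  inner join wp_image i on (i.img_name = p.page_title)
where
  p.page_namespace = 6   -- File namespace only, so same-titled pages elsewhere do not match
  and il.il_to is null   -- no page embeds the file
  and datediff(now(), i.img_timestamp) > 90;   -- uploaded more than 90 days ago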
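As a rough sketch of the hand-off to deleteBatch.php, the query results could be written to a list file with one title per line, which is the input that script reads. The function name `write_delete_list` and the file path are hypothetical, and how you export the query results is up to you.

```python
def write_delete_list(titles, list_path):
    """Write page titles to list_path, one per line, for deleteBatch.php.

    Blank entries and duplicates are skipped; titles lacking the
    'File:' prefix get it added (an assumption of this sketch).
    """
    seen = set()
    lines = []
    for title in titles:
        title = title.strip()
        if not title:
            continue
        if not title.startswith("File:"):
            title = "File:" + title
        if title not in seen:
            seen.add(title)
            lines.append(title)

    with open(list_path, "w", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line + "\n")
    return lines
```

Review the resulting file by hand before passing it to the script, since the deletions are hard to undo in bulk.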
DanB
================
From: MediaWiki-l [mailto:mediawiki-l-bounces@lists.wikimedia.org] On Behalf Of Mickey
Feldman
Sent: Thursday, February 04, 2016 3:38 PM
To: mediawiki-l(a)lists.wikimedia.org
Subject: [MediaWiki-l] Permanently remove old revisions and unused files?
I have been looking for an extension or process to remove all revisions
of pages "older than _date_" or "all but the last _n_", but have not
found anything close.
This is a private corporate wiki used for internal documentation. Pages
evolve, but then generally stabilize and are used only for reference and
rarely edited. There is no need to keep the hundreds of revisions that grew
them to their final form.
Likewise, there are older and unused versions of uploaded files that are
just clutter.
Extension:Nuke does not meet this need.
Extension:DeleteBatch doesn't either.
Extension:DeletePagePermanently - nope.
There are maintenance scripts for Deleting Archived revisions and
purging old text - also not what I'm looking for.
So far I'm finding no way to do this other than manually, one page at a
time, which is a no-go. There are tens of thousands of pages.
I may have to write a new extension from scratch, but I'm finding it
hard to believe this functionality does not already exist.
Have I overlooked something obvious? Am I the only one who has wanted
something like this?
Thanks in advance.