Uploaded files which are deleted can now be undeleted; admins can also view the deleted files without actually undeleting them. This is integrated into the existing Special:Undelete in what I hope is a fairly clear and intuitive manner.
It's my hope that this will encourage admins to tackle the deletion backlogs a little more aggressively, since mistakes will be easier to undo.
I've tested it both offline and on the live servers and everything seems fine so far, but if you do encounter problems please report them at http://bugzilla.wikimedia.org/
-- brion vibber (brion @ pobox.com)
Do you have a handle on how large the deleted content will be? Seems we could end up hosting a *lot* of deletions = lot of space.
On 6/16/06, Brion Vibber brion@pobox.com wrote:
[snip]
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
Brad Patrick wrote:
Do you have a handle on how large the deleted content will be? Seems we could end up hosting a *lot* of deletions = lot of space.
Deleted text has always been much, much smaller than the non-deleted text, and the old image directories have always been much smaller than the current image directories. Storage space for commons is growing at a phenomenal rate, and deletable images are only a small part of that. We need to make sure we have an architecture which allows us to expand our capacity to multiple terabytes of files, while keeping costs down and maintaining some redundancy. Commons is going to get that big regardless of deletion policy.
We haven't reached consensus on exactly what software design we're going to use, but I believe we're largely in agreement that more amane-type machines would be useful. I've asked for another one or two depending on budget; I'm not sure what the status of that proposed order is.
-- Tim Starling
On 6/16/06, Tim Starling t.starling@physics.unimelb.edu.au wrote:
Deleted text has always been much, much smaller than the non-deleted text, and the old image directories have always been much smaller than the current image directories.
[snip]
Just to interject some data into the discussion.
There are 489678 image pages on enwiki. There are 207320 distinct image page titles which have been deleted. If we presume that deleted image pages have the same number of image versions and similar image sizes as live ones, then we can say roughly that preserving deleted files would have increased our enwiki storage requirements by 50%. Because a substantial chunk of these may have been deletions of objects moved to Commons, past performance may not indicate future results. Still, with 73,214 image pages deleted on enwiki in the first three months of the year (vs 148k uploads), we can expect the future storage requirements to be non-trivial.
Commons, on the other hand, has a far smaller number of deleted image pages (~40k), no doubt due to the site's lower profile among uninformed users leading to fewer problem images, the lack of a migration causing deletions, and less vigorous copyright enforcement.
In any case, even if keeping deleted files were to double our image capacity requirements, it may well be worth it (consider detecting reuploads of previously deleted images, even modified copies of previously deleted images) ... but let's not forget to calculate the real cost of storage (it's not just the $120 that a 320 GB disk costs; think of backup, etc).
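The ratios behind the estimate above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the message; the variable names are made up for illustration, and "148k" is taken as exactly 148,000. Under the message's own assumption (deleted and live image pages have similar version counts and sizes), the extra storage works out to a bit over 40%, and deletions in the first quarter come to just under half of uploads.

```python
# Figures quoted in the message above (enwiki, mid-2006).
live_image_pages = 489_678
deleted_image_titles = 207_320
deleted_q1 = 73_214    # image pages deleted in the first three months
uploads_q1 = 148_000   # "148k" in the message, treated here as exact

# Assumption taken from the message: deleted pages resemble live pages
# in number of versions and file size, so page counts stand in for bytes.
extra_storage = deleted_image_titles / live_image_pages
q1_ratio = deleted_q1 / uploads_q1

print(f"extra storage if deleted files are kept: {extra_storage:.1%}")
print(f"deletions relative to uploads, Q1: {q1_ratio:.1%}")
```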
On 16/06/06, Gregory Maxwell gmaxwell@gmail.com wrote:
In any case, even if keeping deleted files were to double our image capacity requirements, it may well be worth it (consider detecting reuploads of previously deleted images, even modified copies of previously deleted images) ... but let's not forget to calculate the real cost of storage (it's not just the $120 that a 320 GB disk costs; think of backup, etc).
Maybe they should just be deleted automatically after a 3 or 6 month inactive period or something.
~Mark
That is what I was thinking. I was imagining cleanup after no touches for a period of time. But this is a great advance!
Brad Patrick wrote:
That is what I was thinking. I was imagining cleanup after no touches for a period of time. But this is a great advance!
It's likely that there will not be a reason to actively delete archived files. The order of magnitude of kept images will likely remain much larger, making the usage insignificant.
(Among other things, deleted junk is often small images copied from other web sites, while original work is often multi-megapixel digital photos.)
If we ever do need to dump older files, of course, we can do so. Note that I've included the timestamp of the deletion in the file archive, so it would be relatively straightforward to, say, dump all files deleted more than one year ago.
-- brion vibber (brion @ pobox.com)
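Since each archived file carries its deletion timestamp, picking out files deleted more than a year ago is a simple filter. The sketch below is only an illustration of that idea: `ArchivedFile`, `files_to_dump`, and the sample records are hypothetical and do not reflect the actual archive layout or any MediaWiki code.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class ArchivedFile(NamedTuple):
    """Hypothetical record for one archived (deleted) file."""
    name: str
    deleted_at: datetime


def files_to_dump(archive, now, max_age=timedelta(days=365)):
    """Return archived files whose deletion is older than max_age."""
    cutoff = now - max_age
    return [f for f in archive if f.deleted_at < cutoff]


# Made-up sample data, for illustration only.
archive = [
    ArchivedFile("old_logo.png", datetime(2005, 1, 10)),
    ArchivedFile("recent_scan.jpg", datetime(2006, 5, 2)),
]
stale = files_to_dump(archive, now=datetime(2006, 6, 16))
print([f.name for f in stale])  # only the file deleted in January 2005
```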
On 6/16/06, Mark Ryan ultrablue@gmail.com wrote:
Maybe they should just be deleted automatically after a 3 or 6 month inactive period or something.
~Mark
Perhaps we could do auto-expire type deletion, MythTV style. Images wouldn't be deleted until a lack of space is detected (or there is less than 100G free or so). When there is a lack of space, the oldest unused images are deleted until the space condition is resolved.
Kyle
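The auto-expire idea can be sketched as a selection step: nothing is purged while free space is above a threshold, and otherwise the least recently used archived files are chosen, oldest first, until the threshold is met. Everything here is an assumption made for illustration, including the `ArchivedFile` record, the `last_used` field, and reading "100G" as 100 GiB. It is not how MythTV or MediaWiki implements expiry.

```python
from typing import NamedTuple

GIB = 1024 ** 3


class ArchivedFile(NamedTuple):
    """Hypothetical record for one archived (deleted) file."""
    name: str
    size_bytes: int
    last_used: float  # Unix timestamp of the last access or undelete view


def select_for_expiry(files, free_bytes, min_free_bytes=100 * GIB):
    """Choose the oldest files to purge until free space reaches the threshold."""
    to_purge = []
    for f in sorted(files, key=lambda f: f.last_used):
        if free_bytes >= min_free_bytes:
            break  # enough space; stop purging
        to_purge.append(f)
        free_bytes += f.size_bytes  # space reclaimed by purging this file
    return to_purge


# Made-up sample data: 50 GiB free, three 30 GiB files.
files = [
    ArchivedFile("a.jpg", 30 * GIB, last_used=1_000.0),
    ArchivedFile("b.jpg", 30 * GIB, last_used=3_000.0),
    ArchivedFile("c.jpg", 30 * GIB, last_used=2_000.0),
]
purged = select_for_expiry(files, free_bytes=50 * GIB)
print([f.name for f in purged])  # the two least recently used files
```

With plenty of free space the function returns an empty list, so archived files are kept for as long as the disks allow.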
On 6/16/06, Tim Starling t.starling@physics.unimelb.edu.au wrote: ....
We haven't reached consensus on exactly what software design we're going to use, but I believe we're largely in agreement that more amane-type machines would be useful. I've asked for another one or two depending on budget; I'm not sure what the status of that proposed order is.
-- Tim Starling
Pardon my cluelessness, but what is an "amane-type machine"? Checking [[Amane]] shows anime characters, but no sort of clustering scheme...
~maru
On 6/19/06, maru dubshinki marudubshinki@gmail.com wrote:
Pardon my cluelessness, but what is an "amane-type machine"? Checking [[Amane]] shows anime characters, but no sort of clustering scheme...
You're looking for [[Nishi Amane]]. Actually, you're looking for http://meta.wikimedia.org/wiki/Wikimedia_servers which lists the current Wikimedia servers. Amane is simply the name given to one of them, which has the following specifications, according to the table:
amane 11/2005 NFS storage server Fedora Core 3U, 2x Xeon 3.4GHz, 8GB RAM, 16 x 400GB SATA disk
Erik
wikipedia-l@lists.wikimedia.org