[Wikipedia-l] Image uploads now undeletable

Gregory Maxwell gmaxwell at gmail.com
Fri Jun 16 06:06:37 UTC 2006


On 6/16/06, Tim Starling <t.starling at physics.unimelb.edu.au> wrote:
> Deleted text has always been much, much smaller than the non-deleted text,
> and the old image directories have always been much smaller than the current
> image directories.
[snip]

Just to interject some data into the discussion.

There are 489678 image pages on enwiki. There are 207320 distinct
image page titles which have been deleted. If we presume that deleted
image pages have the same number of image versions and similar image
sizes as current ones, then we can say roughly that preserving deleted
images would have increased our enwiki storage requirements by 40 to
50% (207320 / 489678 is about 42%). Because a substantial chunk of
these may have been deletions of objects moved to commons, past
performance may not indicate future results. Still, with 73,214 image
pages deleted on enwiki in the first three months of the year (vs 148k
uploads), we can expect the future storage requirements to be
non-trivial.
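
For anyone who wants to check the arithmetic, here is a minimal sketch
using only the figures quoted above. The variable names are made up for
illustration, and the estimate rests on the same assumption as the
paragraph: deleted image pages carry about as many versions, at about
the same sizes, as current ones.

```python
# Figures quoted in this message.
current_image_pages = 489_678    # image pages on enwiki
deleted_image_titles = 207_320   # distinct deleted image page titles

# Assumption: deleted and current pages are similar in version count
# and file size, so the page-count ratio approximates the storage ratio.
storage_growth = deleted_image_titles / current_image_pages
print(f"Extra storage if deleted images were kept: {storage_growth:.1%}")

# First three months of the year; "148k" is rounded in the source.
deleted_q1 = 73_214
uploads_q1 = 148_000
deletion_rate = deleted_q1 / uploads_q1
print(f"Image pages deleted per upload: {deletion_rate:.1%}")
```

This prints about 42.3% for the historical ratio and about 49.5% for
the recent deletions-per-upload ratio.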

Commons on the other hand has a far smaller number of deleted image
pages (~40k), no doubt due to the site's lower profile among uninformed
users leading to fewer problem images, the lack of a migration causing
deletions, and less vigorous copyright enforcement.

In any case, even if keeping deleted images were to double our image
capacity requirements, it may well be worth it (consider detecting
reuploads of previously deleted images, even modified copies of
previously deleted images) ... but let's not forget to calculate the
real cost of storage (it's not just the $120 that a 320 GB disk
costs... think of backup, etc).
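
To make the reupload idea concrete, here is a rough sketch of the
simplest case: spotting a byte-identical reupload by comparing content
digests. This is not how MediaWiki does it; DeletedImageIndex and its
methods are hypothetical names, and the choice of SHA-1 is an
assumption for illustration only.

```python
import hashlib
from typing import Optional


def content_digest(data: bytes) -> str:
    """Return a hex digest identifying the exact bytes of a file."""
    return hashlib.sha1(data).hexdigest()


class DeletedImageIndex:
    """Hypothetical in-memory index of deleted images, keyed by digest."""

    def __init__(self) -> None:
        self._titles_by_digest: dict[str, str] = {}

    def record_deletion(self, title: str, data: bytes) -> None:
        # Remember which title the deleted bytes were stored under.
        self._titles_by_digest[content_digest(data)] = title

    def find_reupload(self, data: bytes) -> Optional[str]:
        # Return the deleted title if these exact bytes were seen before.
        return self._titles_by_digest.get(content_digest(data))


index = DeletedImageIndex()
index.record_deletion("Example.jpg", b"original image bytes")
same = index.find_reupload(b"original image bytes")
changed = index.find_reupload(b"original image bytes, cropped")
```

Here `same` is "Example.jpg" and `changed` is None. A digest only
catches exact copies; catching modified copies would need some
similarity comparison against the retained files, which is not shown
here.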


