Nick Jenkins wrote:
Brion Vibber wrote:
One possibility is to embed the timestamp into the URL. So the goatse version might be: http://upload.wikimedia.org/wikipedia/en/2005/10/23/074223/Puppy.jpg
and the reverted image would get a different URL, a few minutes later: http://upload.wikimedia.org/wikipedia/en/2005/10/23/074506/Puppy.jpg
An alternative but very similar idea would be to embed the revision number in the URL, instead of the upload timestamp:
Example original: http://upload.wikimedia.org/wikipedia/en/P/1/Puppy.jpg
Example revised: http://upload.wikimedia.org/wikipedia/en/P/2/Puppy.jpg
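For comparison, here is a rough sketch of how the two URL schemes above could be generated. The function names and the BASE constant are made up for illustration; only the URL shapes come from the examples.

```python
from datetime import datetime

# Base taken from the example URLs above; everything else is hypothetical.
BASE = "http://upload.wikimedia.org/wikipedia/en"


def timestamp_url(name: str, uploaded: datetime) -> str:
    """Embed the upload time as YYYY/MM/DD/HHMMSS ahead of the filename."""
    return f"{BASE}/{uploaded.strftime('%Y/%m/%d/%H%M%S')}/{name}"


def revision_url(name: str, revision: int) -> str:
    """Embed a revision number after a first-letter directory."""
    return f"{BASE}/{name[0]}/{revision}/{name}"


print(timestamp_url("Puppy.jpg", datetime(2005, 10, 23, 7, 42, 23)))
print(revision_url("Puppy.jpg", 2))
```

Either way, a revert produces a new URL, so a stale cached copy of the old version is never served under the new one.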
We had a lively discussion in #wikimedia-tech on this subject; besides revision ID numbers, another possibility discussed was using a content hash.
A content hash has the additional advantage that duplicate file versions only need to be stored once; for instance, reverting a file currently makes a new copy of the file on the filesystem, which wastes space. (However, you then need to be careful about deleting, since several versions may point at the same stored file.)
So you might have something like: http://upload.wikimedia.org/584/590/5845907fdfc6eb1125129c4ce0da0704c496a7e4...
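A minimal sketch of the idea, with some assumptions: the 40 hex digits in the example look like SHA-1, and the two leading directories look like the first and second groups of three hex digits of the hash. The ContentStore class and its reference counting are purely hypothetical, just to show why deletion needs care.

```python
import hashlib


def hash_path(data: bytes) -> str:
    """Build a sharded path from the content hash (SHA-1 is an assumption)."""
    digest = hashlib.sha1(data).hexdigest()
    return f"{digest[0:3]}/{digest[3:6]}/{digest}"


class ContentStore:
    """Hypothetical in-memory store: one copy per distinct content."""

    def __init__(self):
        self.blobs = {}  # path -> file contents
        self.refs = {}   # path -> number of file versions using it

    def put(self, data: bytes) -> str:
        path = hash_path(data)
        if path not in self.blobs:
            self.blobs[path] = data  # first time this content is seen
        self.refs[path] = self.refs.get(path, 0) + 1
        return path

    def release(self, path: str) -> None:
        """Drop one reference; remove the blob only when nothing uses it."""
        self.refs[path] -= 1
        if self.refs[path] == 0:
            del self.refs[path]
            del self.blobs[path]


store = ContentStore()
original = store.put(b"puppy image bytes")
reverted = store.put(b"puppy image bytes")  # a revert re-uses the same path
assert original == reverted and len(store.blobs) == 1
```

Deleting one version here only decrements the count; the stored file goes away when the last version referencing it is released.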
Obviously a disadvantage is that the filenames are ugly. One might tack a 'pretty' but ignored filename on the end, using rewrites or whatever tool to drop it on the backend:
http://upload.wikimedia.org/584/590/5845907fdfc6eb1125129c4ce0da0704c496a7e4...
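To illustrate what dropping the name on the backend might involve, here is a sketch in Python rather than actual rewrite rules. It assumes the pretty name is a single extra path segment after the hash, which the example does not confirm; backend_path is a made-up name.

```python
import re

# Two 3-digit shard directories, a 40-digit hex hash, and an optional
# trailing segment (the ignored "pretty" filename; assumed layout).
_HASHED = re.compile(r"^/([0-9a-f]{3})/([0-9a-f]{3})/([0-9a-f]{40})(?:/[^/]*)?$")


def backend_path(request_path: str):
    """Return the stored path for a request, or None if it is malformed."""
    m = _HASHED.match(request_path)
    if m is None:
        return None
    first, second, digest = m.groups()
    # The shard directories must agree with the hash itself.
    if digest[:3] != first or digest[3:6] != second:
        return None
    return f"/{first}/{second}/{digest}"


h = "5845907fdfc6eb1125129c4ce0da0704c496a7e4"
print(backend_path(f"/584/590/{h}/Puppy.jpg"))
print(backend_path(f"/584/590/{h}"))
```

Both requests map to the same stored file, so whatever follows the hash is purely cosmetic.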
This does, though, complicate the server configuration; I think a goal should be making it very easy to set up a file mirror that we can actually send requests to. Arbitrary filename additions may also have security implications for broken browsers like Internet Explorer, which like to interpret filetype information from the "extension" on the URL.
Lastly, it's easy for a human with the URL to see what revisions come before/after by incrementing/decrementing the revision number in the URL, whereas the date and time of the upload of a previous revision cannot be predicted just from the image name.
That might be kind of neat, but it requires maintaining a consistent revision sequence _within_ each image. If using revision numbers, it's easier to work with the global row ID numbers, as the database can guarantee their uniqueness.
-- brion vibber (brion @ pobox.com)