Hi folks,
Exec summary: I'm looking for help on a possible change needed to FileRepo::publish() API, since it currently makes some assumptions that seem to be incompatible with fixing bug 4421 ("Image file extension should not be part of the name"): https://bugzilla.wikimedia.org/show_bug.cgi?id=4421
I'm also more generally looking for feedback on my approach to 4421.
Details: A while back, I decided to try my hand at fixing bug 4421. It's a longstanding usability wart, and I figured it'd be a good way to re-familiarize myself with the codebase. It's mostly working, though working through all of the edge cases is a bit of a game of whack-a-mole. Most of the complexity comes from needing to store the files on the filesystem with appropriate file extensions, since these get served directly from the filesystem from Apache. Thus, there's a lot of convoluted logic for tacking on the file extension in the appropriate spots. You can read the comments on 4421 to understand why things are the way they are now.
An example of something that isn't working that I need advice on is this: with my modified version, it's possible to upload a jpeg to a location without an extension, then upload a png to that same location. The problem comes in LocalFile::publish(). Here's the call it makes from that function: $status = $this->repo->publish( $srcPath, $dstRel, $archiveRel, $flags );
This causes two things to happen: 1. copy $dstRel to $archiveRel 2. copy $srcPath to $dstRel.
The problem here is with uploading a png over the top of a jpg. For example, if the title name is "File:Foo", then the filename for the first version of the file will be "Foo.jpg", and the replacement will be "Foo.png". So, if we pass "Foo.jpg" to $dstRel, then step 1 works, but step 2 fails. If we pass "Foo.png", then the opposite problem occurs.
The only solution it would seem is to rework FileRepo::publish() to accommodate this case (adding another parameter). That doesn't seem *so* bad, though being an interface change. Here's what I'm hoping folks will weigh in on: a. If reworking FileRepo::publish() is the way to do it. b. If my general approach on 4421 seems sound, and to get a sense if there's going to be objections to landing this in trunk soon (presumably after 1.16). I've just updated my latest version here: https://bugzilla.wikimedia.org/attachment.cgi?id=6885&action=diff
Feel free to answer here or to ping me on IRC ("robla" on Freenode #mediawiki)
Thanks! Rob
Heh I've got this "bug", I suggested users not do stupid stuff like upload the same file name or they screw up things. Paying attention before you edit is a good thing IMO.
I like the idea of "obfuscating" the file names with a md5sum/timestamp/random_stuff replacing the file name. Then the whole issue is moot, you just need to add a pointer layer storing the new filesystem name to the original in-wiki name. It's not perfect but it sure cures a lot of headache.
FYI. I have to use the wgVerifyMimeType and wgMimeDetectorCommand="file -bi" on images. For whatever reason php's built in mime typing is busted on one of wiki's so to make images work I used these two variables. Contrary to comments in the bug stating otherwise it doesn't matter what the image file extension is, the browser gets the mime type sent to it and it processes it appropriately. If you go into your db and change the image mime type (major/minor fields in the image table) to text/plain the browser doesn't display the image regardless of the filename extension.
-----Original Message----- From: mediawiki-l-bounces@lists.wikimedia.org [mailto:mediawiki-l-bounces@lists.wikimedia.org] On Behalf Of Rob Lanphier Sent: Tuesday, December 15, 2009 1:43 PM To: MediaWiki Mailing List Subject: [Mediawiki-l] Bug 4421 - Image file extension should not be part of the name
Hi folks,
Exec summary: I'm looking for help on a possible change needed to FileRepo::publish() API, since it currently makes some assumptions that seem to be incompatible with fixing bug 4421 ("Image file extension should not be part of the name"): https://bugzilla.wikimedia.org/show_bug.cgi?id=4421
I'm also more generally looking for feedback on my approach to 4421.
Details: A while back, I decided to try my hand at fixing bug 4421. It's a longstanding usability wart, and I figured it'd be a good way to re-familiarize myself with the codebase. It's mostly working, though working through all of the edge cases is a bit of a game of whack-a-mole. Most of the complexity comes from needing to store the files on the filesystem with appropriate file extensions, since these get served directly from the filesystem from Apache. Thus, there's a lot of convoluted logic for tacking on the file extension in the appropriate spots. You can read the comments on 4421 to understand why things are the way they are now.
An example of something that isn't working that I need advice on is this: with my modified version, it's possible to upload a jpeg to a location without an extension, then upload a png to that same location. The problem comes in LocalFile::publish(). Here's the call it makes from that function: $status = $this->repo->publish( $srcPath, $dstRel, $archiveRel, $flags );
This causes two things to happen: 1. copy $dstRel to $archiveRel 2. copy $srcPath to $dstRel.
The problem here is with uploading a png over the top of a jpg. For example, if the title name is "File:Foo", then the filename for the first version of the file will be "Foo.jpg", and the replacement will be "Foo.png". So, if we pass "Foo.jpg" to $dstRel, then step 1 works, but step 2 fails. If we pass "Foo.png", then the opposite problem occurs.
The only solution it would seem is to rework FileRepo::publish() to accommodate this case (adding another parameter). That doesn't seem *so* bad, though being an interface change. Here's what I'm hoping folks will weigh in on: a. If reworking FileRepo::publish() is the way to do it. b. If my general approach on 4421 seems sound, and to get a sense if there's going to be objections to landing this in trunk soon (presumably after 1.16). I've just updated my latest version here: https://bugzilla.wikimedia.org/attachment.cgi?id=6885&action=diff
Feel free to answer here or to ping me on IRC ("robla" on Freenode #mediawiki)
Thanks! Rob
_______________________________________________ MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
On Tue, Dec 15, 2009 at 11:11 AM, St. Laurent, Steven Steven.StLaurent@arrisi.com wrote:
Contrary to comments in the bug stating otherwise it doesn't matter what the image file extension is, the browser gets the mime type sent to it and it processes it appropriately. If you go into your db and change the image mime type (major/minor fields in the image table) to text/plain the browser doesn't display the image regardless of the filename extension.
Actually, the file extension matters in some cases, but it depends on context. When the URL to the image is included via img tag, most browsers will compensate for an incorrect media type, and display the image regardless of format. When deep linking to a URL (e.g. http://upload.wikimedia.org/wikipedia/commons/3/34/HTTP-Cookie-Google.png ), browsers won't do said compensation anymore.
The MIME type will be correctly stored in the database, but the database doesn't get read by the webserver that's actually serving the files.
Rob
Steven, I don't think you understood the original post.
Rob, I'd add another parameter to give current and archive filenames and temp and new filenames. You could do some magic based on the parameters extension, but that would be too ugly.
Currently, if a new image is uploaded, views of cached pages including the image (before rerendering) get the other image. If the new filename is different, they will give a 404 until those pages are reparsed. What about changing thumb.php to give a 301 to the new name on such case?
Hi Platonides,
Thanks for the tips. More below:
On Tue, Dec 15, 2009 at 1:55 PM, Platonides Platonides@gmail.com wrote:
Rob, I'd add another parameter to give current and archive filenames and temp and new filenames. You could do some magic based on the parameters extension, but that would be too ugly.
Yeah, I'm pretty much resigned to this. Just wanted to make sure there wasn't going to be feedback to make me rethink the whole approach.
Currently, if a new image is uploaded, views of cached pages including the image (before rerendering) get the other image. If the new filename is different, they will give a 404 until those pages are reparsed. What about changing thumb.php to give a 301 to the new name on such case?
Oh...good idea, and thanks for reminding me of something else I need to make sure to test (I might not be thanking you after the next raft of bugs I find) ;)
Rob
mediawiki-l@lists.wikimedia.org