[Mediawiki-l] Uploading office docs

McHale, Nina Nina.McHale at ucdenver.edu
Thu Mar 10 19:19:05 UTC 2011


Jim, thanks so much for this thorough explanation, very helpful!

Does anyone have any experience with Microsoft's Word Add-in?

http://www.microsoft.com/downloads/en/details.aspx?FamilyID=8e519637-afb0-4134-a91f-7b0ebea8d933

Nina

Nina McHale, MA/MSLS
Assistant Professor, Web Librarian
University of Colorado, Auraria Library
Facebook & Twitter: ninermac
http://milehighbrarian.net



On 3/10/11 12:11 PM, "Sullivan, James (NIH/CIT) [C]" <sullivan at mail.nih.gov> wrote:

This is a known problem, but I'm not sure where MediaWiki.org is on fixing it, or whether they are attempting to fix it at all, though I have heard rumors of a fix in 1.17.  The problem stems from Microsoft (surprise!) having changed its file format in Office 2007.  A docx file is packaged as a zip archive, so when MediaWiki checks the file it sees a docx extension, but when it detects the MIME type from the file's contents (the signature bytes at the beginning of the file) it sees what appears to be a zip file.  Thus the "corruption".  These checks help prevent users from uploading malicious files with innocent-looking extensions, such as a nasty zip file with a jpg extension.
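
To illustrate the mismatch, here is a minimal Python sketch. It is not MediaWiki's actual detection code, only a stand-in for the idea of sniffing a file's leading bytes. It builds a tiny zip archive in memory to play the role of a docx file and checks whether it starts with the standard zip signature. The helper names make_fake_docx and looks_like_zip are made up for this example.

```python
import io
import zipfile

# Signature of a zip local file header, found at the start of a non-empty zip archive.
ZIP_MAGIC = b"PK\x03\x04"


def make_fake_docx() -> bytes:
    """Build a small in-memory zip archive standing in for a .docx file.

    The entry names mimic the layout of a real docx package, but the
    contents are placeholders, so Word would not open this file.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<document/>")
    return buf.getvalue()


def looks_like_zip(data: bytes) -> bool:
    """Return True if the data begins with the zip signature bytes."""
    return data.startswith(ZIP_MAGIC)


if __name__ == "__main__":
    data = make_fake_docx()
    # The name says docx, but a content sniffer only sees a zip archive.
    print("report.docx looks like zip:", looks_like_zip(data))
```

A checker that trusts only these leading bytes has no way to tell such a file from any other zip archive, which is why the extension and the detected type disagree.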

You can turn off MIME type checking with a switch in LocalSettings.php:
$wgVerifyMimeType = false;
and then MediaWiki will check only the extension, docx, see that it is allowed, and allow the upload without checking the MIME type.  Of course, this also turns off a useful security check, so if you do this you should trust whoever is uploading documents to your wiki.  In my case, as a workaround, I ask users to email me when they want to upload Office 2007 files; I turn off MIME type checking, then turn it back on after the upload.

Now, this gets the file uploaded but does not fix the underlying problem: once uploaded, your docx file still looks like a zip file to MediaWiki.  So if someone links to the docx file on a wiki page and someone on a PC clicks the link, then instead of Word opening the docx file, whatever program handles zip files, such as WinZip, will come up.  Your users will need to download the file to their PCs and then open it in Word.  And saving docx files in "doc" format does not help.  They still have a zip MIME type for some reason.

So, that is the long and short of it.  Dealing with Office 2007 documents uploaded to MediaWiki is problematic.  Assuming these files are not being put there to be edited by others (a bad method of document development anyway), I suggest having your users save their files in PDF format and then upload the PDF to the wiki, which avoids the whole Office-2007-zip-MIME-type problem.

