Hello everyone, I am trying to do a batch upload using the GW Toolset, but the tool will not recognize the file extension from the image URLs provided. Here is an example of a URL from the XML file:
http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg
Any ideas as to why this doesn't work? (The site is whitelisted, etc.)
Thanks
Jason
Not only is the site whitelisted, but I've also manually tested upload by URL using [[Special:Upload]], and it works.
The URL path contains two elements with non-alphanumeric characters, "2.0" and "1000,", so I'd guess that is where the explanation lies.
Hypothesis 1: GWT considers the extension to be ".0/image/1293358/full/1000,/0/default.jpg", which is not ".jpg".
Hypothesis 2: GWT stops parsing the URL before the comma and tries to handle http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000, which has no extension and returns a 404.
Dots in URLs are rather frequent, commas less so, so the second hypothesis is more plausible.
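To make the two hypotheses concrete, here is a small Python sketch of what each parsing strategy would extract from the example URL. It is not GWToolset code (GWToolset is written in PHP), and the three function names are hypothetical; it only illustrates the string handling each hypothesis assumes.

```python
from posixpath import splitext
from urllib.parse import urlsplit

URL = "http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg"


def extension_last_segment(url: str) -> str:
    """Extension of the final path segment, which is what a well-behaved parser reports."""
    return splitext(urlsplit(url).path)[1]


def extension_first_dot(url: str) -> str:
    """Hypothesis 1: everything from the first dot in the path onwards."""
    path = urlsplit(url).path
    return path[path.index("."):] if "." in path else ""


def extension_cut_at_comma(url: str) -> str:
    """Hypothesis 2: the URL is truncated at the first comma before it is parsed."""
    return extension_last_segment(url.split(",", 1)[0])


print(repr(extension_last_segment(URL)))   # '.jpg'
print(repr(extension_first_dot(URL)))      # '.0/image/1293358/full/1000,/0/default.jpg'
print(repr(extension_cut_at_comma(URL)))   # ''
```

Only the first strategy yields ".jpg"; either of the two hypothesised behaviours would leave the tool without a usable extension.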
On Thu, Feb 4, 2016 at 6:08 PM, Jason J. Evans jason.evans@llgc.org.uk wrote:
[...]
Jason Evans Wicipediwr Preswyl / Wikipedian in Residence Llyfrgell Genedlaethol Cymru / National Library of Wales jason.evans@llgc.org.uk Ffon/Tel: +44 (0)1970 632405
Glamtools mailing list Glamtools@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/glamtools
That's very helpful, thank you very much.
On Thursday, February 4, 2016 17:14 GMT, Sébastien Santoro dereckson@espace-win.org wrote:
[...]
-- Sébastien Santoro aka Dereckson http://www.dereckson.be/
Thanks for the report, Jason. I filed this on Phabricator: <https://phabricator.wikimedia.org/T125846>
2016-02-04 17:18 GMT+00:00 Jason J. Evans jason.evans@llgc.org.uk:
[...]
I think it's more likely that GWToolset looks at the MIME type set in the Content-Type header sent by the web server.
-- Bawolff
2016-02-04 19:31 GMT+00:00 Brian Wolff bawolff@gmail.com:
I think it's more likely that GWToolset looks at the MIME type set in the Content-Type header sent by the web server.
(Disclaimer: I checked out the GWToolset for the first time today).
According to the documentation of *getFileExtension* in *includes/Handlers/UploadHandler.php* (assuming I’m looking at the right place), it is doing both:
attempts to get the file extension of a media file url using the
$options provided. it will first look for a valid file extension in the
url; if none is found it will fallback to an appropriate file extention
based on the content-type
On Thu, Feb 4, 2016 at 2:44 PM, Jean-Frédéric jeanfrederic.wiki@gmail.com wrote:
[...]
I just checked this file. The web server is indeed misconfigured and is returning a MIME type of image/jpg (the correct MIME type is image/jpeg).
I believe the code you're referencing is used to determine the extension to give the image when uploading it. For example, JPEG files are allowed to have the extension jpg, jpeg or even jpe, so the code uses the URL to decide which of those three alternatives to use when the MIME type sent is image/jpeg. But if a different MIME type is sent, it won't consider .jpg to be a valid extension for that type.
The code comment itself is a bit misleading. If you look at the actual code [simplifying things to make it clearer]:

$result = null;
...
$pathinfo['extension'] = <extension from url>
if (
    in_array( $pathinfo['extension'], $wgFileExtensions ) &&
    strpos( $MimeMagic->getTypesForExtension( $pathinfo['extension'] ), $options['content-type'] ) !== false
) {
    // So, if the extension from url is in $wgFileExtensions (Allowed upload extensions)
    // And, when we get the list of possible mime types for that extension, one of them matches what was sent by the server
    // then use the extension from the url
    $result = $pathinfo['extension'];
} elseif ( !empty( $options['content-type'] ) ) {
    // Otherwise, just use the default extension for this mime type.
    $result = explode( ' ', $MimeMagic->getExtensionsForType( $options['content-type'] ) );

    if ( !empty( $result ) ) {
        $result = $result[0];
    }
}
----
So here is what happens in this case. The image has a MIME type of image/jpg, which is not a real MIME type. First the code tries to use the extension from the URL (.jpg), but when it checks whether .jpg is a valid extension for that MIME type, it determines that it isn't (since there are no valid extensions for a non-existent MIME type), so it discards that option. Then it tries to use the default extension for the given MIME type, but that fails too, since there are no default extensions for a non-existent MIME type. And thus the check fails overall.
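The same two-step check can be sketched in Python to show why image/jpg falls through both branches. This is an illustration only, not the GWToolset implementation: the lookup tables are hypothetical, heavily reduced stand-ins for $wgFileExtensions and the MimeMagic data, and the function name and the example.org URL are made up for the example.

```python
from posixpath import splitext
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical, reduced stand-ins for $wgFileExtensions and the MimeMagic tables.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}
TYPES_FOR_EXTENSION = {
    "jpg": ["image/jpeg"],
    "jpeg": ["image/jpeg"],
    "jpe": ["image/jpeg"],
    "png": ["image/png"],
    "gif": ["image/gif"],
}
EXTENSIONS_FOR_TYPE = {
    "image/jpeg": ["jpeg", "jpg", "jpe"],
    "image/png": ["png"],
    "image/gif": ["gif"],
}


def get_file_extension(url: str, content_type: str) -> Optional[str]:
    """Return the extension to use for the upload, or None if none can be determined."""
    url_ext = splitext(urlsplit(url).path)[1].lstrip(".").lower()

    # Step 1: use the URL's extension only if it is allowed AND the
    # server's Content-Type is one of the MIME types known for it.
    if url_ext in ALLOWED_EXTENSIONS and content_type in TYPES_FOR_EXTENSION.get(url_ext, []):
        return url_ext

    # Step 2: otherwise fall back to the default extension for the MIME type.
    candidates = EXTENSIONS_FOR_TYPE.get(content_type, []) if content_type else []
    return candidates[0] if candidates else None


URL = "http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg"
print(get_file_extension(URL, "image/jpeg"))  # jpg  (step 1 succeeds)
print(get_file_extension(URL, "image/jpg"))   # None (both steps fail)
print(get_file_extension("http://example.org/image/123", "image/jpeg"))  # jpeg (step 2 fallback)
```

With the correct image/jpeg header the URL's .jpg is accepted; with image/jpg neither table has an entry, so no extension is returned.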
-- -bawolff