Hi,
I'm preparing an image donation of some 350 picture books from 1810 to 1880 (taken from the collection http://www.geheugenvannederland.nl/?/en/collecties/prentenboeken_van_1810_t…).
For every book I've constructed an XML file describing its pages (metadata). So, for example, a book of 20 pages gets an XML file with 20 records. I can upload these in the normal way via the GWToolset web interface, also assigning a Commons category to the book.
For one book that's doable, but for 350 books I would need to upload 350 XML files, one by one, through the GWT web interface (using the same JSON mapping file for all uploads). That would take me a lot of time (and it's rather boring)...
So I'm wondering if and how I could automate this. Is there a more direct or efficient way?
I can imagine doing this through some command-line interface (Pywiki?), with the XML file, the JSON mapping and the target Commons category name as input parameters. Would that be an option?
Any tricks, tips and directions are very welcome.
With kind regards
Olaf Janssen
Wikipedia & open data coordinator
Koninklijke Bibliotheek - National Library of the Netherlands
olaf.janssen(a)kb.nl
+31 (0)70 3140 388
@ookgezellig
www.slideshare.net/OlafJanssenNL
Prins Willem-Alexanderhof 5 | 2595 BE Den Haag
Postbus 90407 | 2509 LK Den Haag | (070) 314 09 11 | www.kb.nl
Hi Fae,
Thanks for this insight, this is very helpful! I wasn't sufficiently aware of that option in the regular GWT interface. I'll modify my code to create one large XML dump.
Best,
Olaf
-----Original Message-----
From: Glamtools [mailto:glamtools-bounces@lists.wikimedia.org] On Behalf Of glamtools-request(a)lists.wikimedia.org
Sent: Tuesday, 16 February 2016 13:00
To: glamtools(a)lists.wikimedia.org
Subject: Glamtools Digest, Vol 35, Issue 5
Today's Topics:
1. Re: [GWToolset] Uploading 350 books - via command line or
API? (Fæ)
----------------------------------------------------------------------
Message: 1
Date: Mon, 15 Feb 2016 12:53:18 +0000
From: Fæ <faewik(a)gmail.com>
To: Conversations revolving around the development of GLAM Digital
Tools <glamtools(a)lists.wikimedia.org>
Subject: Re: [Glamtools] [GWToolset] Uploading 350 books - via command
line or API?
Message-ID:
<CAH7nnD2MrU0GRA93j5oJCGfSfmn3288zJNeDQiYeqDFO1vrZNQ(a)mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
The easiest way is to merge your XML files into one large XML, then put this through the GWT. Though I can imagine ways of automating the front end of GWT, it would be a clumsy way of going about it.
If your concern is that you want to create separate book categories, then add a category field in the XML that can vary by book. You can add several variable categories as an option on the mappings page. For example <https://commons.wikimedia.org/wiki/File:DAILY_MENU_%28held_by%29_REVERE_HOU…>
was uploaded with the category "NYPL Rare Book Division" automatically generated from the NYPL metadata. (To be fair, I'm not using the GWT for most of the NYPL material for reasons mentioned on the related project page.)
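As a rough illustration of the merge suggested above, here is a minimal Python sketch. It is not part of GWToolset. It assumes that each per-book file has a single root element whose direct children are the page records, and it uses hypothetical element names (`records` for the merged root, `commons_category` for the per-book category field). The category field would still have to be mapped to a category on the GWT mappings page.

```python
import xml.etree.ElementTree as ET


def merge_books(books, root_tag="records", category_tag="commons_category"):
    """Merge per-book XML documents into one tree.

    `books` is an iterable of (xml_text, category) pairs. Every direct child
    of a book's root element is treated as one record (an assumption about
    the file layout) and gets an extra child element holding the category.
    The tag names are hypothetical; adapt them to your own schema.
    """
    merged = ET.Element(root_tag)
    for xml_text, category in books:
        book_root = ET.fromstring(xml_text)
        for record in list(book_root):
            # Add the per-book category to each page record.
            ET.SubElement(record, category_tag).text = category
            merged.append(record)
    return merged


if __name__ == "__main__":
    # Two tiny made-up books, standing in for the real per-book files.
    book_a = ("<records><record><title>Page 1</title></record>"
              "<record><title>Page 2</title></record></records>")
    book_b = "<records><record><title>Page 1</title></record></records>"
    merged = merge_books([(book_a, "Book A"), (book_b, "Book B")])
    print(ET.tostring(merged, encoding="unicode"))
```

In practice the `(xml_text, category)` pairs would come from reading the 350 files from disk, and the merged tree could be written out with `ET.ElementTree(merged).write(...)` before uploading it once through the GWT.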
Responding to J Hayes's comment in this thread: you can mass upload to IA with off-the-shelf Python modules. However, just as much care should be taken in mapping out the metadata with IA's metadata options; because they are incredibly open and flexible, the archives tend to be confusingly inconsistent. This would still leave the challenge of finding a good mapping to Commons templates if you then wanted to upload from IA to Commons rather than from somewhere else.
Fae
On 15 February 2016 at 12:18, Olaf Janssen <Olaf.Janssen(a)kb.nl> wrote:
> [...]
--
faewik(a)gmail.com https://commons.wikimedia.org/wiki/User:Fae
Hi all,
I would like to ask for Beta GWToolset user rights to test a bulk upload of
images from the Catharijneconvent in the Netherlands.
My username on Beta is AWossink.
Regards,
Arne Wossink
Project Lead, Wikimedia Nederland
(Office hours: Monday, Tuesday, Thursday)
Tel. +31 (0)6 11000505
E-mail: wossink(a)wikimedia.nl
Postal address: Postbus 167, 3500 AD Utrecht
Visiting address: Mariaplaats 3, Utrecht
I can supply a control condition. The Bodleian Library URLs, which work without a problem, have a similar form:
http://iiif.bodleian.ox.ac.uk/iiif/image/e998e3d6-17e2-40ca-bf23-bc4278feb1…
Note that there is a comma but no dot before the filename. This suggests that commas are not the problem.
-----Original Message-----
------------------------------
Message: 2
Date: Thu, 4 Feb 2016 18:14:26 +0100
From: Sébastien Santoro <dereckson(a)espace-win.org>
To: Conversations revolving around the development of GLAM Digital
Tools <glamtools(a)lists.wikimedia.org>
Subject: Re: [Glamtools] GW Toolset problems
Message-ID:
<CAKg6iAEi1dFqSZDXAhRAHFeybTjCKnDb5fqsVX34dLQ=pb4GEA(a)mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Not only is the site whitelisted, but I've also manually tested upload by
URL using [[Special:Upload]], and it works.
The URL path contains two elements with non-alphanumeric characters,
"2.0" and "1000,", so I'd guess that's where the explanation lies.
Hypothesis 1: GWT considers the extension to be
".0/image/1293358/full/1000,/0/default.jpg", which is not ".jpg".
Hypothesis 2: GWT stops parsing the URL at the comma and tries to
handle http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,
which has no extension and returns a 404.
Dots in URLs are rather frequent, commas less so, so the second
hypothesis is more plausible.
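To make the two hypotheses concrete, here is a small Python sketch of what each would yield for the example URL. It only illustrates the guesses above and is not GWToolset's actual code. The function names are made up for this example, and the third function shows the behaviour one would expect from a correct parser.

```python
from urllib.parse import urlparse

URL = "http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg"


def ext_from_first_dot(url):
    """Hypothesis 1: everything from the first dot in the path is the extension."""
    path = urlparse(url).path
    dot = path.find(".")
    return path[dot:] if dot != -1 else ""


def ext_after_comma_cut(url):
    """Hypothesis 2: the URL is cut at the first comma before the extension is read."""
    truncated = url.split(",", 1)[0]
    last_segment = urlparse(truncated).path.rsplit("/", 1)[-1]
    dot = last_segment.rfind(".")
    return last_segment[dot:] if dot != -1 else ""


def ext_from_last_segment(url):
    """Expected behaviour: the extension of the final path segment."""
    last_segment = urlparse(url).path.rsplit("/", 1)[-1]
    dot = last_segment.rfind(".")
    return last_segment[dot:] if dot != -1 else ""


if __name__ == "__main__":
    print(repr(ext_from_first_dot(URL)))     # '.0/image/1293358/full/1000,/0/default.jpg'
    print(repr(ext_after_comma_cut(URL)))    # ''
    print(repr(ext_from_last_segment(URL)))  # '.jpg'
```

Under either hypothesis the result is not ".jpg", which would match the reported symptom. Telling the two apart needs a URL that contains only one of the two characters.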
On Thu, Feb 4, 2016 at 6:08 PM, Jason J. Evans <jason.evans(a)llgc.org.uk> wrote:
> Hello everyone, I am trying to do a batch upload using the GW Toolset, but the tool will not recognize the file extension from the image URLs provided. Here is an example of a URL from the XML file:
>
> http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg
>
> Any ideas as to why this doesn't work? (The site is whitelisted, etc.)
>
> Thanks
>
> Jason
> --
> Jason Evans
> Wicipediwr Preswyl / Wikipedian in Residence
> Llyfrgell Genedlaethol Cymru / National Library of Wales
> jason.evans(a)llgc.org.uk
> Ffon/Tel: +44 (0)1970 632405
--
Sébastien Santoro aka Dereckson
http://www.dereckson.be/
Hello everyone, I am trying to do a batch upload using the GW Toolset, but the tool will not recognize the file extension from the image URLs provided. Here is an example of a URL from the XML file:
http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg
Any ideas as to why this doesn't work? (The site is whitelisted, etc.)
Thanks
Jason
--
Jason Evans
Wicipediwr Preswyl / Wikipedian in Residence
Llyfrgell Genedlaethol Cymru / National Library of Wales
jason.evans(a)llgc.org.uk
Ffon/Tel: +44 (0)1970 632405