user Alex Brollo has developed a very interesting tool which I think many
of you will find very useful.
(afaiu he developed this from a previous tool, but he can explain it better)
To test it, log in to your oldwikisource account
and copy this into User:YOU/vector.js:
The tool will appear under Tools, and it's called "Ritaglio immagini"
(crop images). It opens a window which allows you to select and crop an
image directly on the page scan. It is extremely useful for placing
illustrations directly on the page.
It is usable both in View and Edit mode.
On the Italian Wikisource we have made it a gadget with a button, so
it's active for everyone.
The volcanic User:Alex brollo is also working on "dictionaries", i.e.
generating lists of the words used in Page: pages and works, such as a
word list for the book Il cavallarizzo.
I bet that in the next few years (with more books, more users, Wikidata,
and the world domination led by Wikisource) we will have more and more
of these. A list of the words used in an ancient book could help
customize OCR and typo-correction tools, for example.
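
Just to make the idea concrete, here is a minimal sketch of how such a
word list could be built from a plain-text dump of a work (the file name
text.txt and the whole pipeline are my assumptions, not Alex's actual
tool):

  tr -cs '[:alpha:]' '\n' < text.txt |       # one word per line
  tr '[:upper:]' '[:lower:]' |               # fold case
  sort | uniq -c | sort -rn > wordlist.txt   # frequency-sorted list

The output pairs each word with its frequency, which is exactly the kind
of data an OCR or typo-correction tool could be tuned against.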
Moreover, we will have Wikidata, and maybe we will need to store some
metadata (e.g. page numbers, or metadata about images and scans)
somewhere structured. Lua could help us build tools for creating
automatic indexes, or textual versions in ns0 (e.g. precompiling the
pagelist tag...)
The question is: do we, the Wikisource communities, want a new Data
namespace?
How do you like the idea? Would you want to have the Wikibase extension in
it, or just a normal namespace?
I'm sure you will find this mail confusing, but I think we are in need
of something here. I just don't know what it is :-)
I'd like to announce the US National Archives' new virtual internship
program for Wikipedians. We are offering unpaid internships at the
National Archives for experienced Wikipedians with technical or
community skills.
This is intended to be a way for Wikipedians interested in working on
NARA's GLAM efforts to formalize their affiliation with NARA, and receive
academic credit, work experience, and a reference. The interns will have a
staff mentor (me) to guide their work, and the chance to have a real impact
on the state of Wikipedia and public access to cultural heritage.
We are initially offering internships for Wikipedians with technical
skills, who would help us with Commons image uploads, analytics, etc., and
those skilled at organizing the Wikimedia community, to help coordinate our
WikiProject and communicate our activities. There is no required time
commitment or start date, and these sorts of details can be negotiated. I
would encourage anyone on this list with interest to apply, or to share
the posting with other members of the Wikimedia community who might be a
good fit.
Please feel free to reply here or contact me personally if you have any
questions. More information and instructions for applying can be found
in the posting.
Digital Content Specialist, Wikipedian in Residence
National Archives and Records Administration
On 12/20/2013 10:23 PM, Lars Aronsson wrote:
> where some fine print is no longer legible. What I
> want is one that has only been reduced down
> to 300 dpi or so. How can I get that?
With a little help from okfn-labs (Open Knowledge
Foundation), here is a script that works for my book:
pag=1
while [ $pag -le 400 ]    # 400 is a placeholder for the last page number
do
  hex=`printf "%04X" $pag`
  dec=`printf "%04d" $pag`
  if [ ! -s $dec.jpg ]
  then
    echo -n .
    # the tile URL is omitted in this archive; it is built from $hex
    wget -q -O $dec.jpg "$tileurl"
    echo -n :
  fi
  pag=`expr $pag + 1`
done
That is the URL for one tile, but the tile that I
request starts at 0,0 and is 10000 pixels wide,
so it contains the full page 1800x2400 pixels,
in full (pct:100) = 300 dpi resolution.
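
If I read those parameters right, the URL follows the IIIF Image API
pattern, where region, size, rotation and quality are path segments:

  {server}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
  e.g. .../0,0,10000,10000/pct:100/0/native.jpg

so requesting an oversized region at pct:100 simply returns the whole
page at full resolution.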
This was faster than waiting for BL's webmaster's
response on Monday.
In my case, I want the JPEGs. But if you want to
use a book in Wikisource, you might want to
create a DjVu or PDF bundle of all the JPEGs for
the entire book.
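
For instance, here is a minimal sketch of that bundling step, assuming
DjVuLibre (c44, djvm) or ImageMagick is installed and the JPEGs are
named as above:

  # DjVu: encode each JPEG, then bundle the pages (DjVuLibre)
  for f in [0-9][0-9][0-9][0-9].jpg; do
    c44 "$f" "${f%.jpg}.djvu"
  done
  djvm -c book.djvu [0-9][0-9][0-9][0-9].djvu
  # PDF alternative (ImageMagick)
  convert [0-9][0-9][0-9][0-9].jpg book.pdf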
Lars Aronsson (lars(a)aronsson.se)
Project Runeberg - free Nordic literature - http://runeberg.org/
I'm really sorry for all the issues affecting the Wikisources. A fix for
most of them has just been deployed (it has been hard work) and I'll try
to fix all the remaining ones for next Tuesday's deployment.
Here is a list of the fixed issues:
* The image didn't appear for Page: pages that belong to a multipage
file but whose Index: page is not called Index:NAME_OF_THE_DJVU (this
mostly affected pl.wikisource)
* Part of the body content of some Page: pages appeared in their footer
in edit mode when those pages contain a <noinclude> tag
* The link to a Page: page appeared red in the Index: page when its
proofreading level category name contained a whitespace (to apply this
fix, a purge of each affected Page: page is required; see the sketch
right after this list)
* The creation of a Page: page through the API caused a fatal error
* An unwanted indentation was displayed on the first paragraph of a
Page: page
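
Since the red link fix above requires a purge of every affected Page:
page, here is a minimal sketch of how one could do it with the MediaWiki
API (the wiki URL and the pages.txt file of affected titles are
assumptions):

  # purge each title listed in pages.txt (one per line)
  while read -r title; do
    curl -s -d "action=purge" -d "format=json" \
      --data-urlencode "titles=$title" \
      "https://pl.wikisource.org/w/api.php" > /dev/null
  done < pages.txt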
Here are the remaining known issues (please report any that aren't
listed here):
* The Page: page edit summary contains the proofreading level change tag
twice. A fix is under review that will make the software add the tag
when the Page: page is saved (the tag won't be visible during the
editing process).
* The addition of default header and footer content adds a strange
string instead of tags like <references />. A fix is under review.
* The body textarea when editing Page: pages is too big. I'm working on
a fix that will use the size defined in the user preferences instead.
* Fatal error on submit for a very few pages (a fix is under review).
* It is no longer possible to zoom in with the mouse. I'm working on a
fix.
* The gadget issue that Aubrey reported here (I think the solution lies
more on the gadget side than on the extension side).
* It's not possible to edit only the body of a page through the API.
I'm going to work on automated tests in the coming weeks in order to
avoid such a large number of bugs next time.
PS: For those who don't know it, the maintenance work on the
ProofreadPage extension is mostly done by volunteers like you. So,
please be kind.
On Mon, Dec 9, 2013 at 12:18 PM, Tom Morris <tfmorris(a)gmail.com> wrote:
> I'm not sure I agree. There's a lot of good data in OpenLibrary, but
> there's also a lot of junk. Freebase imported a bunch of OpenLibrary data,
> after winnowing it to what they thought was the good stuff, and still ended
> up deleting a bunch of the supposedly "good" stuff later because they found
> their goodness criteria hadn't been strict enough.
> One of the reasons OpenLibrary is such a mess is because *they*
> arbitrarily imported junky data (e.g. Amazon scraped records). The last
> thing the world needs is more duplicate copies of random junk. We've
> already got the DPLA for that. :-)
> Another issue with the OpenLibrary metadata is that there's no clear
> license associated with it. IA's position is that they got it from
> wherever they got it from and you're on your own if you want to reuse it,
> which isn't very helpful. The provenance for major chunks of it is
> traceable and new stuff by users is nominally being contributed under CC0,
> so they could probably be sorted out with enough effort (although the same
> thing is true of the data quality issues too).
Gosh, I withdraw my support for fully reusing Open Library data.
That was probably the best effort they could make in past years, before
well-known library catalogs started releasing their data dumps en masse,
but now we are in a very different scenario.
Even a simple mass import from the already mentioned datahub into the
openlibrary engine (open source software), without further editing,
would generate better quality data.
 - http://datahub.io/group/bibliographic
Edward Summers, 09/12/2013 12:18:
> If OpenLibrary gets active again, [...]
Definition of active? The fact that there's no software
development/investment doesn't mean it's inactive. Are there stats on
user activity there, and can it be compared in some way to ours as
regards that kind of data?