In a separate thread (sorry--digest mode bit me), Dominic wrote:
Many cultural institutions are developing their own crowdsourced transcription
projects. I think Wikisource can be a much more robust platform than these
one-off projects, with a more well-developed community that aggregates the
transcription efforts of texts from many institutions in a single place
with a proven process.
I'm a big fan of Wikisource, and have recommended it, but I don't think
that data extraction is the biggest barrier to adoption the GLAM sector
faces. Branding is a much, much bigger deal. I talked about this at the ALA
this summer (
http://manuscripttranscription.blogspot.com/2014/07/collaborative-digitizat…
-- see the slide with a screenshot of Wikisource next to one of Letters
1916, which uses DIY History/Scripto as its platform):
"The first one is the French-language version of Wikisource. Wikisource
is a sister project to Wikipedia that was spun off around 2003 that allows
people to transcribe documents and do OCR correction both. This is being
used by the Departmental Archives of Alpes-Maritimes to transcribe a
set of journals
of episcopal visits
<http://fr.wikisource.org/wiki/Livre:FRAD006_001J201.pdf>. The bishop in
the sixteenth century would go around and report on all the villages [in
his diocese], so there's all this local history, but it's also got some
difficult paleography.
"So they're using Wikisource
<http://manuscripttranscription.blogspot.com/2012/04/french-departmental-arc…>,
which is a great tool! It has all kinds of version control. It has ways to
track proofreading. It does an elegant job of putting together individual
pages into larger documents. But, do you see "Departmental Archives of
Alpes-Maritimes" on this page? No! You have no idea [who the institution
is]. Now, if they're using this internally, that may be fine -- it's a
powerful tool.
"By contrast, look at the Letters of 1916
<http://dh.tcd.ie/letters1916/diyhistory/>. [Three sentences inaudible.]
This is public engagement in a public-facing site. "
There were a lot of nods in the room, and even more when I revisited the
slide in a crowdsourcing workshop a month later.
If an institution were able to attach a custom stylesheet to pages
displaying its 'project', if it were able to send users to an attractive
homepage for its 'project', showing the project's materials, and recent
activity on them, with ways for admins to monitor their volunteers'
questions or discussions on talk pages, or announce news -- that would drop
that barrier to entry. At the moment, a GLAM that points its users to
Wikisource effectively 'loses' them -- it sends them off to a different
community and a different site that just happens to contain copies of the
institution's material, with no easy way for users to get back to the
institution.
That said, I think bulk export of transcripts would help, especially if
there were an easy way for the institution to match each transcript to the
identifier in its own system. Plaintext may be good enough for, say, a
library that's using a CMS and just wants its documents to be searchable.
I've seen TEI recommended in the past, and while I'm a big fan, I suspect
it's of secondary importance.
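For what it's worth, the matching step could be as small as a lookup table
the institution maintains itself; a hypothetical sketch (the CSV columns and
the local identifiers are invented for illustration, not an existing standard):

```python
import csv
import io

# Hypothetical mapping the institution would keep: its own identifier
# next to the Wikisource Page: title holding the transcript.
MAPPING_CSV = """local_id,wikisource_title
AD06-001J201-f005,Page:FRAD006_001J201.pdf/5
AD06-001J201-f006,Page:FRAD006_001J201.pdf/6
"""

def load_mapping(text):
    # Map Wikisource page title -> the institution's own identifier.
    return {row["wikisource_title"]: row["local_id"]
            for row in csv.DictReader(io.StringIO(text))}

def attach_ids(transcripts, mapping):
    # transcripts: {wikisource_title: plaintext}.
    # Returns {local_id: plaintext}, dropping pages with no local identifier.
    return {mapping[title]: body
            for title, body in transcripts.items()
            if title in mapping}

mapping = load_mapping(MAPPING_CSV)
export = attach_ids({"Page:FRAD006_001J201.pdf/5": "transcribed text"}, mapping)
```

With that in place, a bulk plaintext export keyed by `local_id` drops straight
into whatever CMS the library already runs.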
Ben
What do we see as the next components for Wikisource?
What are our major hurdles for system development?
If we were offered development help, where do people think we should be
making use of it? Is it incremental fixes, transactional changes, or are we
wanting transformational changes, completely new features, and new
opportunities?
Regards, Billinghurst
On Wed, Nov 19, 2014 at 10:18 AM, Emmanuel Engelhart <kelson(a)kiwix.org>
wrote:
> [...]
>
> We also plan to use this code base to aggregate other online PD/free books
> libraries. Wikisource is one of the first we would love to add, this might
> be done pretty easily as soon as an OPDS feed is available.
>
> How can we develop an OPDS feed for Wikisource?
Maybe this is the big chance for us to start ignoring some misunderstandings
we've inherited from Wikipedia ("local community autonomy") and remember
that in the library world, standardization of practices is a big plus?
I mean especially ways of delivering content to users by subject or author,
regardless of its language of production...
Hi,
The Kiwix team is happy to release the whole Project Gutenberg
(http://www.gutenberg.org/) library in a ZIM format:
http://download.kiwix.org/zim/gutenberg/gutenberg_mul_all_2014-11.zim.torre….
We also provide a few language specific versions here
http://download.kiwix.org/zim/gutenberg.
This file is intended for offline usage (no Internet connection) and is
readable with Kiwix (http://www.kiwix.org). This allows anybody with a
computer or a smartphone to own their own copy of this 50,000-book library.
You can also make it available for other people to read on your network;
they only need a web browser.
In this ZIM file, you will find all the books available in HTML (directly
readable), but also in EPUB (and from time to time in PDF). We have created
a custom user interface which is really simple to use: in a few clicks you
can find your book, read it or download it. What is also unique is that
Kiwix provides a full-text search engine over all the books' content. You
can see for yourself using this demonstration web site:
http://library.kiwix.org/gutenberg_mul_all_2014-11/
Most of the work was done during a week long hackathon in Lyon, France
by four Kiwix volunteer developers. This hackathon was funded by the
Fondation Orange with the administrative help of Framasoft and Wikimedia
CH. The Fondation Orange is the first beneficiary of this work and already
uses it for its own deployments in Africa.
The solution to build this ZIM file is 100% free software and is available
here: https://github.com/kiwix/gutenberg. It makes it easy to release new,
up-to-date versions. This is not a "one shot" project; we will periodically
release new versions of this offline edition of the Project Gutenberg
library.
We also plan to use this code base to aggregate other online PD/free book
libraries. Wikisource is one of the first we would love to add; this might
be done pretty easily as soon as an OPDS feed is available.
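For anyone wondering what such a feed involves: OPDS catalogs are Atom-based,
so each book is an Atom <entry> carrying an acquisition link. A minimal
sketch of one entry built with the standard library (the title and URL are
placeholders, and a real feed also needs the surrounding <feed>, ids, and
updated timestamps):

```python
from xml.etree import ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# OPDS link relation for freely downloadable books.
OPEN_ACCESS = "http://opds-spec.org/acquisition/open-access"

def opds_entry(title, epub_url):
    """Build one OPDS catalog entry (an Atom <entry>) for a free book."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    link = ET.SubElement(entry, f"{{{ATOM}}}link")
    link.set("rel", OPEN_ACCESS)
    link.set("type", "application/epub+zip")
    link.set("href", epub_url)
    return ET.tostring(entry, encoding="unicode")

print(opds_entry("Candide", "https://example.org/candide.epub"))
```

Since WSExport already produces EPUBs per work, a Wikisource feed would
mostly be a matter of enumerating works and emitting entries like this one.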
We hope to see this work deployed by other third-party organisations in
places where the Internet is not available, expensive or censored. We also
need more developers (mostly Python) for the next steps; a hackathon for
this purpose will hopefully be organised in 2015 (sponsor needed). Last but
not least: users, please report any problems here:
https://github.com/kiwix/gutenberg/issues
Regards
Emmanuel
--
Kiwix - Wikipedia Offline & more
* Web: http://www.kiwix.org
* Twitter: https://twitter.com/KiwixOffline
* more: http://www.kiwix.org/wiki/Communication
Hello guys,
are you preparing for the contest?
Can we use this mailing list to coordinate and understand how many contests
there will be?
It would be nice if we could coordinate, or at least use shared scripts and
tools.
Maybe it's better to have an overview of what is needed to run the contest.
WHAT DO YOU NEED
* a collection of books to be proofread
this is really easy :-)
* a Wikisource contest page
like this one:
https://it.wikisource.org/wiki/Wikisource:Undicesimo_compleanno_di_Wikisour…
* some social media coverage
You can use social media etc.; we always try to convince it.wikipedia to
use their SiteNotice... Of course, you must also use your own Wikisource
SiteNotice.
* some awards
If you have a national Wikimedia chapter, it's worth asking for a few bucks.
In Italy, we awarded 3 prizes with just 100 euros (a 50-euro book voucher
as 1st prize, 30 and 20 euros for 2nd and 3rd).
* a way to count validated and proofread pages.
If I'm not mistaken, the code is here: http://pastebin.com/Vk6ikCUg
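For reference, the counting step can be sketched from the ProofreadPage
markup itself: each Page: page carries a quality marker in its header, e.g.
<pagequality level="3" user="Alice" /> (level 3 = proofread, 4 = validated).
A minimal, hypothetical tally over raw wikitexts (real contest scripts would
also track *who* changed the level, via page history):

```python
import re

# Match the ProofreadPage quality marker in a Page: page's header.
QUALITY_RE = re.compile(r'<pagequality\s+level="(\d)"\s+user="([^"]*)"')

def tally(wikitexts):
    """Count proofread and validated pages per user from raw wikitexts."""
    scores = {}
    for text in wikitexts:
        m = QUALITY_RE.search(text)
        if not m:
            continue
        level, user = int(m.group(1)), m.group(2)
        s = scores.setdefault(user, {"proofread": 0, "validated": 0})
        if level == 3:
            s["proofread"] += 1
        elif level == 4:
            s["validated"] += 1
    return scores
```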
WHAT YOU NEED TO DECIDE
* time
In Italy, we wanted to go from 24 November at 00.01 till 1st December at
23.59.
* scoring
In it.source, we will probably award 3 points for every proofread page and
1 point for every validated page.
* awards
Last year, it.source only allowed users to validate pages (not to proofread
them). We awarded the first prize to the user who validated the most pages.
The second and third prizes were instead drawn randomly from the other
participants: but the more pages a user had validated, the more chances they
had. Every validated page (or point) counts as a lottery ticket: the more
tickets I have, the more chances I have.
Cristian Cantoro also made this awesome tool to pick the 3 winners; we
should adapt it for every contest: http://balist.es/wscontest/
I believe everything can be easily adapted if you allow both proofreading
and validating at the same time.
I really hope many Wikisources will be present this year :-)
Aubrey
I'd like to get the text layer of a djvu page, just as the Proofread
extension does, via an API call or any exotic trick, in settings different
from the usual trigger condition (the creation of a new page).
Is this possible? I browsed the API docs but failed.
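(One possible workaround, if the API really doesn't expose the layer: the
OCR text lives in the djvu file itself, so DjVuLibre's djvutxt can extract
it offline. A hedged sketch, assuming djvutxt is on the PATH:)

```python
import shutil
import subprocess

def djvu_text_layer(path, page):
    """Extract the raw OCR text layer of one page of a djvu file using
    DjVuLibre's djvutxt -- the same layer ProofreadPage preloads when a
    new Page: page is created.

    Returns the page text, or None when djvutxt is not installed.
    """
    if shutil.which("djvutxt") is None:
        return None
    cmd = ["djvutxt", f"--page={page}", path]
    return subprocess.run(cmd, capture_output=True, text=True).stdout
```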
Alex brollo
Dear all,
on 26 October I made a little test on the Italian Wikisource.
I've always been enthusiastic about WSExport, Tpt's tool for EPUB
conversion. But I always thought the link was not visible enough...
So I've been bold and added the link to the converter directly in the
Header template.
The result, I think, is quite stunning:
on the 26th we had 4738 downloads; now we are at 8700!
In ten days we doubled the number of downloads we had accumulated in a few
years...
I think it is a simple but powerful edit. You just have to put something
like this in your header:
<div class="noprint" style="text-align: left;">
<small>{{epub|{{PAGENAME}}|testo=Scarica questo testo come
EPUB}}</small></div>
("Scarica questo testo come EPUB" is Italian for "Download this text as
EPUB".)
I suggest you keep track of the stats *before* and *after* the edit here
http://wsexport.wmflabs.org/tool/stat.php
Let me know how it goes! :-)
Aubrey
The Fifth International Conference on Digital Information and
Communication Technology and its Applications (DICTAP2015)
Faculty of Engineering - Lebanese University, Beirut, Lebanon
April 29 – May 01, 2015
http://sdiwc.net/conferences/dictap2015/
The conference is technically co-sponsored by IEEE Lebanon Section. All
registered papers will be submitted to IEEE for inclusion to IEEE Xplore
as well as other Abstracting and Indexing (A&I) databases.
=================================================================
You are invited to submit your papers to the conference. The DICTAP2015
welcomes submissions on any topic in the field of digital information,
communications technology and any related topics:
- Security in Information and Telecommunication System
- Network Systems and Devices
- Wireless and Optical Communications
- Algorithms, Architecture, and Infrastructures
- Information Content Security
- Cloud Computing and Computer Networks
- Sensor Networks and Embedded System
- E-Learning, E-Commerce, E-Business and E-Government
- Data Exchange Issues and Supply Chain
- Information Retrieval
- Web Services, Web based Application
- Data Grids, Data and Information Quality
- Data Warehouses and Data Mining
- Image Analysis and Image Processing
- Management and Diffusion of Multimedia Applications
- Mobile, Ad Hoc and Sensor Network Security
- Video Search and Video Mining
- Enterprise Computing
- Web Mining including Web Intelligence and Web 3.0
- Knowledge Management
- Compression and Coding
- XML and other extensible languages
- Intelligent and Robust System
- ICT for Social and Humanity
- Security and Access Control
- Constraint Programming
- Ubiquitous Systems
- Semantic Web, Ontologies and Rules
- Communication Protocols, Communication Systems
- Network Management Techniques
- Telecommunication Business & Regulation
- Modeling, Algorithm, and Optimization
- Information Theory, System, and Technology
- Scientific Computing and Multimedia Processing
- Transmission, Antenna & Propagation
- Artificial Intelligence and Decision Support Systems
- Data Life Cycle in Products and Processes
- Information Visualization
- Web Metrics and its Applications
- Data Models for Production Systems and Services
- Data, Text, and Web Content Mining
- Multimedia and Interactive Multimedia
- Case Studies on Data Management, Monitoring and Analysis
- Mobile Data Management
- Computer Graphics
- Soft Computing
- Networks Security, Encryption and Cryptography
- Peer to Peer Data Management
- Natural Language Processing
- Human-Computer Interaction
- Distributed Information Systems
- Temporal and Spatial Databases
- Digital Rights Management
- Quality of Service Issues
- Interoperability
Papers should be submitted electronically in PDF format without the
author(s)' names. You can submit your research paper at
http://sdiwc.net/conferences/dictap2015/paper-submission/
IMPORTANT DATES
===============
Submission Deadline: March 1, 2015
Notification of Acceptance: March 22, 2015
Camera Ready Submission: March 30, 2015
Registration: March 30, 2015
Conference Dates: April 29 – May 01, 2015