Wikisource-l April 2018

wikisource-l@lists.wikimedia.org

14 participants
16 discussions

Fwd: [Wikimedia-l] Wikimedia Conference 2018 - Registration closes in one month!
by mathieu stumpf guntz 07 Nov '18


Hi everybody,

I just wanted to make sure there will be some wikisourcerers at this event. Is that so?

-------- Forwarded message --------
Subject: [Wikimedia-l] Wikimedia Conference 2018 - Registration closes in one month!
Date: Fri, 15 Dec 2017 14:23:18 +0100
From: Michelle Poltier <michelle.poltier(a)wikimedia.de>
Reply-To: Wikimedia Mailing List <wikimedia-l(a)lists.wikimedia.org>
To: wikimedia-l(a)lists.wikimedia.org

Dear Wikimedians,

Thanks to those of you who have already registered for the Wikimedia Conference 2018 – we are already very excited to host the conference from April 20 to April 22, 2018 in Berlin!

== Registration information ==

You have not registered for the conference yet? Your affiliation is eligible [1] to attend the conference and you have been selected to represent your affiliation in Berlin? Then please be kindly reminded to register by Monday, January 15, 2018. To register for the conference, please find the link to the registration form below. [2]

== Holiday break of the organizing team ==

In view of the upcoming holidays, we would like to inform you that the WMCON18 organizing team will be out of the office from December 22, 2017 until January 3, 2018. If you need assistance or have any questions, we recommend contacting us before December 22. Otherwise, we will respond to your emails as soon as possible upon our return.

== Visa information ==

We strongly advise all those of you who need a visa to register by Monday, December 18, 2017. We will do our best to send the documents needed for the visa application process (letter of invitation, foreign travel health insurance, copy of registration of association of WMDE) by December 22, 2017 to everyone who registers by Monday, December 18, 2017. Should you register after this date, we will prepare and send the documents to you at the beginning of January. Further information on the visa process and assistance can be found on Meta. [3]

Should you have any questions, please do not hesitate to contact us.

Best regards,
Daniela & Michelle
on behalf of the organizing team
Wikimedia Deutschland
wmcon(a)wikimedia.de

[1] https://meta.wikimedia.org/wiki/Wikimedia_Conference_2018/Eligibility_Crite…
[2] https://meta.wikimedia.org/wiki/Wikimedia_Conference_2018/Registration_Info…
[3] https://meta.wikimedia.org/wiki/Wikimedia_Conference_2018/Visa_Process_and_…


Carl Malamud is doing India
by Federico Leva (Nemo) 10 May '18


«Received 36 books today. Totally thrilled with the selection. I’m paying $10 or even $20 for books that sell for 150 rupees, but many of these are old and I’d never be able to find them except on the market. Going to have the definitive set of high-quality scans of GOI books.»

«5 scrapes of sites are going well. State library crawl of 1903-2014 southern gazettes is about to hit year 2000, the crawl of 23k books from Bengal half-way done (so 3 more months), the download of videos up to 700 films out of 5k, about 200 gbytes so far.»

https://twitter.com/carlmalamud/status/989975678207455232
https://twitter.com/carlmalamud/status/989974421476491264
https://twitter.com/carlmalamud/status/990096301361545216

Federico


WMCON2018 report
by Nicolas VIGNERON 26 Apr '18


Hi,

I'm quite busy right now, so sadly we won't be able to give a full report of the Wikimedia Conference before next week (at best; there is so much to tell!).

In the meantime, here are two things to tide you over until the full report:
- the notes taken during the meetup: https://meta.wikimedia.org/wiki/Wikimedia_Conference_2018/Thematic,_regiona… (thank you Balajijagadesh! also Jayantanth, Econterms and everyone who was there)
- the poster: https://commons.wikimedia.org/wiki/File:Wikisource_-_WMCON2018_poster.pdf (feel free to send me comments)

And no matter how busy I am, don't hesitate to contact me or to ask questions ;)

Cheers,
~nicolas


Wikisource workshop, Silchar
by Bodhisattwa Mandal 26 Apr '18


Dear all,

On 18 May, a Wikisource workshop will be held in Silchar with the support of the Bengali Department of Assam University. Details here - https://bn.wikisource.org/s/fw5o

The West Bengal User Group is covering the necessary expenses for this workshop.

With thanks,
Bodhisattwa


Translation namespaces
by Bodhisattwa Mandal 25 Apr '18


Hi,

The translation namespace pages need to be linked with page numbers and source index files. Any idea how to do so?

Regards,
-- Bodhisattwa


Wikisource meetup at WMCon
by Bodhisattwa Mandal 20 Apr '18


Hi, I hope there will be a Wikisource meetup at Wikimedia Conference 2018. https://meta.wikimedia.org/wiki/Wikimedia_Conference_2018/Thematic,_regiona… Regards, Bodhisattwa


Re: [Wikisource-l] Do we have tools for offline collaboration?
by mathieu stumpf guntz 13 Apr '18


Good to know. I consulted the ABBYY website and it says one option is an "Open license for local use on workstations", but I guess it's not a FLOSS license, unfortunately.

By the way, what is the state of affairs regarding Indic languages? Do we have a central page documenting the existing OCR pipelines used by the Wikisource community? What should I say to a contributor who comes to me asking "I have this old PD book in my personal library that I would like to digitize, share and proofread in Wikisource, where should I start?" Do we have an online service, for example on Tool Labs, which lets one either upload a facsimile or simply input its URL, and which then launches the OCR, for example backed by Tesseract?

Shouldn't we update our roadmap [1], or is there a more up-to-date document elsewhere?

[1] https://meta.wikimedia.org/wiki/Wikisource_roadmap

On 13/04/2018 at 08:28, Nahum Wengrov wrote:
> I use ABBYY FineReader, don't remember the exact version (probably 12 or 11). I bought it a few years ago and it works perfectly for my language (Hebrew).
>
> On Fri, Apr 13, 2018 at 2:22 AM, mathieu stumpf guntz <psychoslave(a)culture-libre.org> wrote:
>
> Thank you Nahum,
>
> Could you indicate which OCR solution you are using?
>
> On 26/03/2018 at 17:27, Nahum Wengrov wrote:
>> I frequently work offline on he.wikisource. I download the entire pdf file from commons to my hard drive, and OCR the page I need myself. One can use the OCR of wikisource and download the text too, I guess, page by page. Then I proof the text in a Word document, open to the lower half of my screen, with the pdf open on the upper half of the screen, where I go to the page I need with acrobat reader, and scroll both windows down or up as needed.
>>
>> On Mon, Mar 26, 2018 at 11:21 AM, mathieu stumpf guntz <psychoslave(a)culture-libre.org> wrote:
>>
>> On 24/03/2018 at 16:22, billinghurst wrote:
>>> Though that would defeat the purpose of online proofreading with account verification. Some of the true value of our online process is that contribution builds a level of trust and knowledge and that is reflected in both our patrolling and the allocation of autopatrolled status.
>> How would providing tools for offline batch work interfere in any way with that? Once the work is done, it can be uploaded to Wikisource with whichever account the user wants.
>>
>> Actually, to my mind, the main benefit of the online aspect is the peer to peer production model. Also there is no need of a central node carrying accounts to take into account the trust given to a particular contributor. There are digital signature technologies such as GPG, for example. Having a central node with a web interface just makes things easier for most users, it doesn't improve the trustability of the environment. On the contrary, with a single point of failure, we actually rely on a weaker solution in this regard.
>>
>>> Also how would you have access to templates, and components like that from off-line?
>> Well, that just shows how inefficient these tools are for contributing while offline. It's always possible to install MediaWiki and download the required templates, but currently this process seems way too complicated, doesn't it?
>>
>>> Also we generally cannot download the images separately as that is usually part of the later clean-up where people have the technical skills.
>> I'm afraid the term "image" misguided your answer. It seems you interpreted that as picture elements from files, while I was talking about these files themselves.
>>
>>> So yes, there is the capacity to have the text and proofread the text, that actual checking the text against the image is not the sole component of proofreading, and further it would not be at all helpful for validation.
>> There is nothing magic about working directly in a browser. People do download and upload all the required material anyway, but on a page-by-page basis. The result is just as valid as when transactions are operated at a file repository level.
>>
>> Cheers


Do we have tools for offline collaboration?
by mathieu stumpf guntz 13 Apr '18


Hello,

A person at a local Wikisource workshop asked me if we could download all the material of a specific work to proofread it offline, that is, both the pictures and the OCRed text. Additionally, I think it would be good to provide a tool that at least shows the plain text and the pictures side by side.

So, are you aware of anything close to such a tool? :)

Cheers


Bold try running into it.source
by Alex Brollo 11 Apr '18


We are testing a trick, useful for IA items where there's no djvu file but there is a _djvu.xml file. The _djvu.xml file is split into pages and uploaded "as it is" as page text. A jQuery script can parse the xml and convert it into an excellent plain text. The same trick runs both in djvu and in pdf based Index pages.

Another advantage is that the mapped text is saved as the first version of the page content, so it can be recovered and used with no external tool.

While parsing the xml, the same script can also fix some severe FineReader mistakes that come from a wrong analysis of the text layout (wrong splitting of text into columns/regions), using the word coordinates.

Alex brollo
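For readers who want to try the XML-to-text step outside the browser, here is a minimal sketch in Python (not the jQuery script mentioned above, whose code is not shown in this thread). It assumes the usual DjVu hidden-text nesting of PARAGRAPH, LINE and WORD elements. The sample page and the function name `djvu_xml_page_to_text` are made up for illustration, and the sketch ignores word coordinates, so it does not attempt the layout repair described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-page sample; real _djvu.xml files carry more attributes
# and wrapper elements, and their coords values may differ in layout.
SAMPLE_PAGE = """<OBJECT>
  <HIDDENTEXT>
    <PAGECOLUMN>
      <REGION>
        <PARAGRAPH>
          <LINE>
            <WORD coords="100,220,180,200">Hello</WORD>
            <WORD coords="190,220,260,200">world</WORD>
          </LINE>
          <LINE>
            <WORD coords="100,250,170,230">second</WORD>
            <WORD coords="180,250,230,230">line</WORD>
          </LINE>
        </PARAGRAPH>
        <PARAGRAPH>
          <LINE>
            <WORD coords="100,300,160,280">New</WORD>
            <WORD coords="170,300,290,280">paragraph</WORD>
          </LINE>
        </PARAGRAPH>
      </REGION>
    </PAGECOLUMN>
  </HIDDENTEXT>
</OBJECT>"""


def djvu_xml_page_to_text(page_xml: str) -> str:
    """Convert one page of DjVu XML into plain text.

    Words are joined with spaces, lines with newlines, and paragraphs
    with a blank line. Coordinates are not used in this sketch.
    """
    root = ET.fromstring(page_xml)
    paragraphs = []
    for para in root.iter("PARAGRAPH"):
        lines = []
        for line in para.iter("LINE"):
            # Collect the non-empty word texts of this line, in document order.
            words = [(w.text or "").strip() for w in line.iter("WORD")]
            words = [w for w in words if w]
            if words:
                lines.append(" ".join(words))
        if lines:
            paragraphs.append("\n".join(lines))
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    print(djvu_xml_page_to_text(SAMPLE_PAGE))
```

Keeping the line breaks from the XML makes it easier to compare the text against the scan; whether to keep or merge them is a choice for each Wikisource community.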


Pages where template include size is exceeded
by balaji 07 Apr '18


Hi,

I have created a transclusion page in ta.wikisource where some pages are not displayed. The page has been placed in a localised category whose name in English means "Pages where template include size is exceeded".

https://translatewiki.net/wiki/MediaWiki:Post-expand-template-inclusion-cat…
https://translatewiki.net/wiki/MediaWiki:Post-expand-template-inclusion-cat…

The page is https://ta.wikisource.org/s/8wuc

What should be done so that all the pages are displayed?

Regards,
J. Balaji (User:Balajijagadesh)

