Scripto is an alternative to the ProofreadPage extension used
by Wikisource. It is based on MediaWiki but also on OpenLayers,
the software used to zoom and pan in OpenStreetMap.
The only website I have seen that uses Scripto is the U.S.
War Department papers, and in many ways it is clumsier
than ProofreadPage. But there might be a few ideas worth
picking up. Take a look.
The software is described at http://scripto.org/
As for reference installations, they mention
http://wardepartmentpapers.org/transcribe.php
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se
I've been preparing a document that explains how the three GSoC-related
projects will affect Wikisource and how book metadata could be connected
with Wikidata:
https://meta.wikimedia.org/wiki/User:Micru/Wikisource_across_projects
All the tools are supposed to be opt-in, so no community will be forced to
take any tool or way of working they don't want to.
I would appreciate your feedback on the draft, because we would like to
send a message to the most active users on all Wikisources and invite them
to join this mailing list and the proposed Wikisource User Group [1].
The tentative list that Andrea has been preparing is here. Please
expand or reduce it as you see fit. You know better who could be
interested!
https://meta.wikimedia.org/wiki/Global_message_delivery/Targets/Wikisource_…
Normally we would have preferred to use only the Central Discussions pages,
but experience shows that these messages tend to be ignored; maybe there
are too many of them.
Since in this case the changes/improvements are quite big, we believe that
it is important to reach out to as many users as possible to give them the
opportunity to participate in the discussions and voice their opinion.
Would anyone be available to help write the invitation or translate it
into other languages?
Cheers
David ---Micru
[1] https://meta.wikimedia.org/wiki/Wikisource_User_Group
Hi!
As part of the Google Summer of Code 2013, Aarti Kumari Dwivedi (User:Rtdwivedi), Thibaut Horel (User:Zaran) and I are working on a refactoring of the Proofread Page extension that will allow us to add the ability to edit Page: pages using the Visual Editor. For more information, see https://www.mediawiki.org/wiki/User:Rtdwivedi
We are currently rewriting a lot of code inside the extension, changes that may cause bugs, like the {{{pagenum}}} one that will be fixed next Monday. We are trying our best to avoid bugs by increasing the test coverage of the extension, but other ones may occur. Sorry in advance for the inconvenience.
Thomas PT
User:Tpt
PS: Last Monday we changed the canonical namespace names for the Page: and Index: namespaces from internationalized ones to English ones ("Page" and "Index") in order to be consistent with MediaWiki core and the other extensions. This allows easier sharing of JavaScript gadgets (to test whether a page is an Index: page, you now just do mw.config.get( "wgCanonicalNamespace" ) === "Index", a test that works on every wiki), but it breaks some scripts that are based on the internationalized namespace names. This change also adds "Page" and "Index" as aliases for the Page: and Index: namespaces on every wiki.
Hi guys,
I'm in Geneva (with fellow Wikimedians) at an OA conference and we are
talking *a lot* about Wikisource.
We have found a very high quality publisher of OA books
(http://www.openbookpublishers.com/, released under CC-BY) that would be
most happy to have their books in Wikisource.
I think the first issue is technical:
* do we have a tool that easily takes an EPUB/HTML file and converts it
into books on Wikisource? I'm thinking now about ns0, not nspage.
I think that if we can take an HTML/EPUB index, transform it into a draft
Wikisource index of links, and upload all the chapters, formatted, we would
have done 90% of the work of uploading a book.
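The index-conversion step could be sketched in a few lines of Python: pull the link texts out of an HTML table of contents and emit a draft wikitext list of subpage links. This is only a sketch under assumptions; the subpage naming scheme ([[Book/Chapter]]) and the input markup are invented for illustration, and a real tool would also have to fetch and convert each chapter's content.

```python
from html.parser import HTMLParser

class TocLinks(HTMLParser):
    """Collect the text of every <a> element in an HTML table of contents."""
    def __init__(self):
        super().__init__()
        self.in_link = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.items[-1] += data

def toc_to_wikitext(html, book_title):
    """Turn an HTML TOC into a draft wikitext list of subpage links."""
    parser = TocLinks()
    parser.feed(html)
    return "\n".join(
        "* [[%s/%s|%s]]" % (book_title, t.strip(), t.strip())
        for t in parser.items if t.strip()
    )
```

For example, feeding it `<ul><li><a href="ch1.html">Chapter 1</a></li></ul>` with the (hypothetical) title "My Book" yields `* [[My Book/Chapter 1|Chapter 1]]`, which an uploader could then paste into an ns0 index page.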
This would be really important for inserting up-to-date, high-quality OA
content into Wikisource, easily accessible to Wikipedians too.
And, moreover, Open Access books are more relevant to Wikisource than Open
Access articles (IMHO).
Aubrey
I'm just back from the LODLAM summit in Montreal, Canada, and here is
a short report.
==About LODLAM and why I was there==
LODLAM (http://lodlam.net) is a gathering of people interested in LOD
(linked open data) and LAM (Libraries, Archives, and Museums), so I thought
it would be interesting to find partners and raise awareness about the
Wikisource revitalization effort, all this thanks to the Grants:IEG
support. The audience was very diverse, not only from cultural
institutions, but also from some research centers and private companies.
OKFN, Europeana, DPLA, and other big players had representatives there.
AFAIK, I was the only person from the Wikimedia movement, so I ended up
representing "all things wiki", especially Wikidata. These spontaneous
activities are briefly described here [1].
The format of the event was that of an [[open-space technology]] gathering,
similar to unconferences.
Some information and reflections to share:
== Rewards & contributor retention ==
During a talk about licenses (which dealt with the difficulties of having
content under different licenses), there was some mention of Datahub
[2], a recently launched project for sharing datasets, formerly known as
CKAN. The discussion revolved around the reward that contributors get for
releasing their datasets. There was some consensus that "the use of the
released data is the reward", which led to another debate about how to
convey data use to contributors. That can be complicated, or as simple as
a gratitude comment left by the person using the dataset.
All this led me to think about the emotional vs rational rewards that users
(or institutions) obtain from contributing content to Wikipedia, Commons,
Wikisource, etc. Are "active thanks", as currently implemented, really
sustainable and scalable? Will all the contributors who deserve a thanks
get one some day? Could personalized view counts/ratings reports about
uploaded pictures, major contributions to WP articles, etc. have some
impact on contributor satisfaction/retention? Would "automated personal
impact reports" free collaborators from the duty of thanking one another,
or would that mean less personal interactions?
These are some questions that I leave open here.
==Semantic annotations==
As you might know, there is a GSoC project [3] which aims to convert the
OKFN Annotator [4] into a MediaWiki extension. That is a great project that
will enable inline comments in MediaWiki projects, but it shouldn't be seen
as the end, only as a step in the direction of semantic annotations.
What could semantic annotations mean for Wikipedia? More precise answers to
questions. Instead of just having "millions of articles" there would be the
possibility of answering "trillions of questions" (or at least pointing to
the text fragment(s) that has/have the answer). This kind of paradigm shift
might need some pondering and broad community discussion.
What could semantic annotations mean for Wikisource? Text
interconnectedness: being able to relate concepts, authors, fragments...
and then being able to query those relationships.
==Input interfaces for linked data==
The best linked data is the data that is invisible to the user. But then,
how do we enable end users to "write" linked data? Of the several
approaches, the most convincing seemed to be using a text symbol (#, +, !,
or others) to indicate that the text following it represents a linked entity.
In the case of the VisualEditor in Wikipedia, one could write
"#article_name", and right after entering the "#" and the first letters, a
list of options (from Wikidata) would show up to autocomplete/disambiguate.
After selecting the right item, one could continue writing or type a dot to
select a property (as in some object-oriented programming languages).
This approach simplifies the interlinking and also the data inclusion.
==Other news==
- The Getty vocabularies will be published as linked open data (late 2013,
ODC-BY 1.0 license) [6]
- Pund.it [5] - an open source semantic annotation project that won the
LODLAM Challenge award
- Karma, tools for mapping data to ontologies [7]
Cheers,
Micru
[1] http://lists.wikimedia.org/pipermail/wikidata-l/2013-June/002388.html
[2] http://datahub.io/
[3]
https://www.mediawiki.org/wiki/User:Rjain/Proposal-Prototyping-inline-comme…
[4] http://okfnlabs.org/annotator/
[5] http://www.thepund.it/
[6] http://www.getty.edu/research/tools/vocabularies/index.html
[7]
http://summit2013.lodlam.net/2013/06/20/karma-tools-for-mapping-data-to-ont…
IA provides ABBYY XML files too (as .gz files); I opened one of them after
a suggestion from Phe, and I'm dreaming about extracting anything useful to
help proofreading. The only "small" problem is that I barely know what XML
is, only that it is similar to HTML in its (well-formed) structure, and
that something called XSLT exists. :-(
Are any of you working on ABBYY XML files with a "little bit" more
skill?
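For what it's worth, Python's standard library is enough to get started without XSLT. Here is a minimal sketch, assuming the ABBYY FineReader 6 XML schema (the namespace, element, and attribute names below should be checked against an actual file from IA): walk the line elements, join their per-character charParams nodes, and count the characters the OCR engine flagged as suspicious, which are good candidates for proofreading attention.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the ABBYY FineReader 6 XML schema; verify it
# against the xmlns declared in the actual .xml file.
NS = "{http://www.abbyy.com/FineReader_xml/FineReader6-schema-v1.xml}"

def extract_lines(xml_text):
    """Return (text, n_suspicious) for every OCR line in the document."""
    root = ET.fromstring(xml_text)
    result = []
    for line in root.iter(NS + "line"):
        chars, suspicious = [], 0
        for cp in line.iter(NS + "charParams"):
            chars.append(cp.text or "")       # one character per element
            if cp.get("suspicious") == "1":   # OCR engine was unsure here
                suspicious += 1
        result.append(("".join(chars), suspicious))
    return result

# Tiny hand-made sample in the same (assumed) schema:
sample = (
    '<document xmlns="http://www.abbyy.com/FineReader_xml/'
    'FineReader6-schema-v1.xml">'
    '<page><block blockType="Text"><text><par><line>'
    '<formatting lang="EnglishUnitedStates">'
    '<charParams suspicious="1">H</charParams>'
    '<charParams>i</charParams>'
    '</formatting></line></par></text></block></page></document>'
)
```

On this sample, `extract_lines(sample)` gives one line, "Hi", with one suspicious character; on a real scan the suspicious counts could be used to highlight the lines most in need of human proofreading.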
Alex brollo
Some research libraries in Stockholm (at archives and
museums) have put up book scanners that the public
can use. They have the same function as a public
copier, but you get your copies on a USB stick rather
than on paper.
This opens an interesting opportunity for Wikisource and
similar volunteer book scanning projects. Instead of
buying expensive equipment, experimenting with
cameras and lighting, or building your own scanner,
you can just visit such a library. I guess you can even
bring your own book and scan it there, instead of just
using the library's books. (Of course you still need to
consider copyright. That goes without saying.)
Wikimedia Sverige, the Swedish chapter of the WMF,
started a wiki page to document some experience
from this kind of use, in Swedish of course,
https://se.wikimedia.org/wiki/Allm%C3%A4nhetens_bokscanner
Here is an example of a book that was scanned this way,
http://runeberg.org/nordmuseet/1897/0001.html
(Ironically, it is the annual report for 1897 of the museum
where it was scanned. They have the scanner standing in
their own library, but they have not scanned their own
reports.)
Are you familiar with anything similar? Any other pages
that we should link to?
--
Lars Aronsson (lars(a)aronsson.se)
Wikimedia Sverige - support free knowledge - http://wikimedia.se/
Project Runeberg - free Nordic literature - http://runeberg.org/
It is not a trivial matter. The best bet would be to take an existing PDF
import tool for a word processor and try to write a similar tool for
wikitext.
There is the Oracle PDF Import Extension for OpenOffice; the code can be
browsed, and maybe it can give you some ideas:
http://extensions.services.openoffice.org/project/pdfimport
Micru
On Wed, Jun 12, 2013 at 12:38 PM, Alex Brollo <alex.brollo(a)gmail.com> wrote:
> When we tried to convert a PDF file into wiki code (a needed step to add
> links and to convert files into a "wiki hypertext"), since PDF is an
> opaque, closed format, the work turned into a nightmare. If we simply
> load free PDF books "as they are", I don't see any advantage other than
> "feeding Wikisource numbers/statistics", and this is presently far from
> my personal interest.
>
> As you guess, I'm one of users who don't support Aubrey's enthusiasm about
> texts born digital, even if free. :-)
>
> Alex
>
>
> 2013/6/12 David Cuenca <dacuetu(a)gmail.com>
>
>> Nobody is saying anything about using copyrighted works; there are many
>> books with an open license that would allow them to be included in
>> Wikisource.
>>
>> For instance in ca-ws we have this translation from 2009:
>>
>> http://ca.wikisource.org/wiki/Llibre:El_secret_de_l%E2%80%99or_que_creix_%2…
>>
>> The original is in the PD, and the translator gave away his rights. It
>> would have been much easier to work directly with the PDF instead of
>> converting it to DjVu.
>>
>> Micru
>>
>>
>> On Wed, Jun 12, 2013 at 10:47 AM, Aarti K. Dwivedi <
>> ellydwivedi2093(a)gmail.com> wrote:
>>
>>> If I am not wrong, as of today most books that were born digital are
>>> still under copyright. Of course, they are available freely on the
>>> internet. But we can't use the pirated copies. How would we go about
>>> the procurement of these books?
>>> If we procure these copyrighted books, then the only thing we would
>>> have to do is check for proper formatting. Isn't it?
>>>
>>>
>>> On Wed, Jun 12, 2013 at 7:58 PM, Lars Aronsson <lars(a)aronsson.se> wrote:
>>>
>>>> On 06/12/2013 02:48 PM, Andrea Zanni wrote:
>>>>
>>>>> We could define some tasks as
>>>>> * corrected the page
>>>>> * OPTIONAL added optional templates/links/annotations
>>>>> *...
>>>>>
>>>>
>>>> Geotagged all the photos, ...
>>>>
>>>> The list doesn't end. You need a generic mechanism
>>>> for any new feature you can invent. But aren't our
>>>> existing templates and categories the best way to
>>>> do this? You could just add to each page:
>>>> {{done|proofread=user1|validated=user2|geotagged=user4|...}}
>>>>
>>>>
>>>> --
>>>> Lars Aronsson (lars(a)aronsson.se)
>>>> Project Runeberg - free Nordic literature - http://runeberg.org/
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Wikisource-l mailing list
>>>> Wikisource-l(a)lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>>>>
>>>
>>>
>>>
>>> --
>>> Aarti K. Dwivedi
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Etiamsi omnes, ego non
>>
>>
>
>
>
--
Etiamsi omnes, ego non