Scripto is an alternative to the ProofreadPage extension used
by Wikisource. It is based on MediaWiki but also on OpenLayers,
the software used to zoom and pan in OpenStreetMap.
The only website I have seen that uses Scripto is the U.S.
War Department papers, and in many ways it is clumsier
than ProofreadPage. But there might be a few ideas worth
picking up. Take a look.
The software is described at http://scripto.org/
As for reference installations, they mention
http://wardepartmentpapers.org/transcribe.php
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se
I've been preparing a document that explains how the three GSoC-related
projects will affect Wikisource and how book metadata could be connected
with Wikidata:
https://meta.wikimedia.org/wiki/User:Micru/Wikisource_across_projects
All the tools are supposed to be opt-in, so no community will be forced to
take any tool or way of working they don't want to.
I would appreciate your feedback on the draft, because we would like to
send a message to the most active users on all Wikisources and invite them
to join this mailing list and the proposed Wikisource User Group [1].
The tentative list that Andrea has been preparing is here. Please
expand or reduce it as you see fit. You know better who could be
interested!
https://meta.wikimedia.org/wiki/Global_message_delivery/Targets/Wikisource_…
Normally we would have preferred to use only the Central Discussions pages,
but experience shows that these messages tend to be ignored; maybe there
are too many of them.
Since in this case the changes/improvements are quite big, we believe that
it is important to reach out to as many users as possible to give them the
opportunity to participate in the discussions and voice their opinion.
Would anyone be available to help write the invitation or translate it
into other languages?
Cheers
David ---Micru
[1] https://meta.wikimedia.org/wiki/Wikisource_User_Group
Hi!
As part of the Google Summer of Code 2013, Aarti Kumari Dwivedi (User:Rtdwivedi), Thibaut Horel (User:Zaran) and I are working on a refactoring of the Proofread Page extension that will allow us to add the ability to edit Page: pages using the Visual Editor. For more information, see https://www.mediawiki.org/wiki/User:Rtdwivedi
We are currently rewriting a lot of code inside the extension, changes that may cause bugs, like the {{{pagenum}}} one that will be fixed next Monday. We are trying our best to avoid bugs by increasing the test coverage of the extension, but other ones may occur. Sorry in advance for the inconvenience.
Thomas PT
User:Tpt
PS: Last Monday we changed the canonical namespace names for the Page: and Index: namespaces from internationalized ones to English ones ("Page" and "Index") in order to be consistent with MediaWiki core and the other extensions. This allows easier sharing of JavaScript gadgets (to test whether a page is an Index: page, you now just do mw.config.get( "wgCanonicalNamespace" ) === "Index", a test that works on every wiki), but it breaks some scripts that are based on the internationalized namespace names. This change also adds "Page" and "Index" as aliases for the Page: and Index: namespaces on every wiki.
Hi guys,
I'm in Geneva (with fellow Wikimedians) at an OA conference and we are
talking *a lot* about Wikisource.
We have found a very high quality publisher of OA books
(http://www.openbookpublishers.com/, released under CC-BY) that would be
most happy to have their books in Wikisource.
I think the first issue is technical:
* do we have a tool that easily takes an EPUB/HTML file and converts it
into books on Wikisource? I'm thinking now about ns0, not nspage.
I think that if we can take an HTML/EPUB index, transform it into a draft
Wikisource index of links, and upload all the chapters, formatted, we would
have done 90% of the work of uploading a book.
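The index-conversion step could be sketched in a few lines of Python: pull the link texts out of an HTML table of contents and emit a draft wikitext list of subpage links. This is only a sketch under assumptions; the subpage naming scheme ([[Book/Chapter]]) and the input markup are invented for illustration, and a real tool would also have to fetch and convert each chapter's content.

```python
from html.parser import HTMLParser

class TocLinks(HTMLParser):
    """Collect the text of every <a> element in an HTML table of contents."""
    def __init__(self):
        super().__init__()
        self.in_link = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.items[-1] += data

def toc_to_wikitext(html, book_title):
    """Turn an HTML TOC into a draft wikitext list of subpage links."""
    parser = TocLinks()
    parser.feed(html)
    return "\n".join(
        "* [[%s/%s|%s]]" % (book_title, t.strip(), t.strip())
        for t in parser.items if t.strip()
    )
```

For example, feeding it `<ul><li><a href="ch1.html">Chapter 1</a></li></ul>` with the (hypothetical) title "My Book" yields `* [[My Book/Chapter 1|Chapter 1]]`, which an uploader could then paste into an ns0 index page.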
This would be really important for inserting up-to-date, high-quality OA
content into Wikisource, easily accessible to Wikipedians too.
And, moreover, Open Access books are more relevant to Wikisource than Open
Access articles (IMHO).
Aubrey
I'm just back from the LODLAM summit in Montreal, Canada, and here is
a short report.
==About LODLAM and why I was there==
LODLAM (http://lodlam.net) is a gathering of people interested in LOD
(linked open data) and LAM (Libraries, Archives, and Museums), so I thought
it would be interesting to find partners and raise awareness about the
Wikisource revitalization effort, all this thanks to the Grants:IEG
support. The audience was very diverse, not only from cultural
institutions, but also from some research centers and private companies.
OKFN, Europeana, DPLA, and other big players had representatives there.
AFAIK, I was the only person from the Wikimedia movement, so I ended up
representing "all things wiki", especially Wikidata. These spontaneous
activities are briefly described here [1].
The format of the event was that of an [[open-space technology]] gathering,
similar to unconferences.
Some information and reflections to share:
== Rewards & contributor retention ==
During a talk about licenses (which dealt with the difficulties of having
content under different licenses), there was some mention of Datahub
[2], a recently launched project for sharing datasets, formerly known as
CKAN. The discussion revolved around the reward that contributors get for
releasing their datasets. There was some consensus that "the use of the
released data is the reward", which led to another debate about how to
convey data use to contributors. That can be complicated, or as simple as
a gratitude comment left by the person using the dataset.
All this led me to think about the emotional vs rational rewards that users
(or institutions) obtain from contributing content to Wikipedia, Commons,
Wikisource, etc. Are "active thanks", as currently implemented, really
sustainable and scalable? Will all the contributors who deserve a thanks
get one some day? Could personalized view counts/ratings reports about
uploaded pictures, major contributions to WP articles, etc. have some
impact on contributor satisfaction/retention? Would "automated personal
impact reports" free collaborators from the duty of thanking one another,
or would that mean less personal interactions?
These are some questions that I leave open here.
==Semantic annotations==
As you might know, there is a GSoC project [3] which aims to convert the
OKFN Annotator [4] into a MediaWiki extension. That is a great project that
will enable inline comments in MediaWiki projects, but it shouldn't be seen
as the end, only as a step in the direction of semantic annotations.
What could semantic annotations mean for Wikipedia? More precise answers to
questions. Instead of just having "millions of articles" there would be the
possibility of answering "trillions of questions" (or at least pointing to
the text fragment(s) that has/have the answer). This kind of paradigm shift
might need some pondering and broad community discussion.
What could semantic annotations mean for Wikisource? Text
interconnectedness: being able to relate concepts, authors, fragments...
and then being able to query those relationships.
==Input interfaces for linked data==
The best linked data is the data that is invisible to the user. But then,
how do we enable end users to "write" linked data? Of the several
approaches, the most convincing seemed to be using a text symbol (#, +, !,
or others) to indicate that the text following it represents a linked entity.
In the case of the VisualEditor in Wikipedia, one could write
"#article_name", and right after entering the "#" and the first letters, a
list of options (from Wikidata) would show up to autocomplete/disambiguate.
After selecting the right item, one could continue writing or type a dot to
select a property (as in some object-oriented programming languages).
This approach simplifies the interlinking and also the data inclusion.
==Other news==
- The Getty vocabularies will be published as linked open data (late 2013,
ODC-BY 1.0 license) [6]
- Pund.it [5] - an open source semantic annotation project that won the
LODLAM Challenge award
- Karma, tools for mapping data to ontologies [7]
Cheers,
Micru
[1] http://lists.wikimedia.org/pipermail/wikidata-l/2013-June/002388.html
[2] http://datahub.io/
[3]
https://www.mediawiki.org/wiki/User:Rjain/Proposal-Prototyping-inline-comme…
[4] http://okfnlabs.org/annotator/
[5] http://www.thepund.it/
[6] http://www.getty.edu/research/tools/vocabularies/index.html
[7]
http://summit2013.lodlam.net/2013/06/20/karma-tools-for-mapping-data-to-ont…
IA provides ABBYY XML files too (as .gz files); I opened one of them after
a suggestion from Phe, and I'm dreaming about extracting anything useful to
help proofreading. The only "small" problem is that I barely know what XML
is, only that it is similar to HTML in its (well-formed) structure, and
that something called XSLT exists. :-(
Are any of you working on ABBYY XML files with a "little bit" more
skill?
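For what it's worth, Python's standard library is enough to get started without XSLT. Here is a minimal sketch, assuming the ABBYY FineReader 6 XML schema (the namespace, element, and attribute names below should be checked against an actual file from IA): walk the line elements, join their per-character charParams nodes, and count the characters the OCR engine flagged as suspicious, which are good candidates for proofreading attention.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the ABBYY FineReader 6 XML schema; verify it
# against the xmlns declared in the actual .xml file.
NS = "{http://www.abbyy.com/FineReader_xml/FineReader6-schema-v1.xml}"

def extract_lines(xml_text):
    """Return (text, n_suspicious) for every OCR line in the document."""
    root = ET.fromstring(xml_text)
    result = []
    for line in root.iter(NS + "line"):
        chars, suspicious = [], 0
        for cp in line.iter(NS + "charParams"):
            chars.append(cp.text or "")       # one character per element
            if cp.get("suspicious") == "1":   # OCR engine was unsure here
                suspicious += 1
        result.append(("".join(chars), suspicious))
    return result

# Tiny hand-made sample in the same (assumed) schema:
sample = (
    '<document xmlns="http://www.abbyy.com/FineReader_xml/'
    'FineReader6-schema-v1.xml">'
    '<page><block blockType="Text"><text><par><line>'
    '<formatting lang="EnglishUnitedStates">'
    '<charParams suspicious="1">H</charParams>'
    '<charParams>i</charParams>'
    '</formatting></line></par></text></block></page></document>'
)
```

On this sample, `extract_lines(sample)` gives one line, "Hi", with one suspicious character; on a real scan the suspicious counts could be used to highlight the lines most in need of human proofreading.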
Alex brollo
Some research libraries in Stockholm (at archives and
museums) have put up book scanners that the public
can use. They have the same function as a public
copier, but you get your copies on a USB stick rather
than on paper.
This opens an interesting opportunity for Wikisource and
similar volunteer book scanning projects. Instead of
buying expensive equipment, experimenting with
cameras and lighting, or building your own scanner,
you can just visit such a library. I guess you can even
bring your own book and scan it there, instead of just
using the library's books. (Of course you still need to
consider copyright. That goes without saying.)
Wikimedia Sverige, the Swedish chapter of the WMF,
started a wiki page to document some experience
from this kind of use, in Swedish of course,
https://se.wikimedia.org/wiki/Allm%C3%A4nhetens_bokscanner
Here is an example of a book that was scanned this way,
http://runeberg.org/nordmuseet/1897/0001.html
(Ironically, it is the annual report for 1897 of the museum
where it was scanned. They have the scanner standing in
their own library, but they have not scanned their own
reports.)
Are you familiar with anything similar? Any other pages
that we should link to?
--
Lars Aronsson (lars(a)aronsson.se)
Wikimedia Sverige - support free knowledge - http://wikimedia.se/
Project Runeberg - free Nordic literature - http://runeberg.org/
It is not a trivial matter. The best bet would be to take an existing PDF
import tool for a word processor and try to write a similar tool for
wikitext.
There is the Oracle PDF Import Extension for OpenOffice; the code can be
browsed, and maybe it can give you some ideas:
http://extensions.services.openoffice.org/project/pdfimport
Micru
On Wed, Jun 12, 2013 at 12:38 PM, Alex Brollo <alex.brollo(a)gmail.com> wrote:
> When we tried to convert a PDF file into wiki code (a needed step to add
> links and to convert files into a "wiki hypertext"), since PDF is an
> opaque, closed format, the work turned into a nightmare. If we simply
> load free PDF books "as they are", I don't see any advantage other than
> "feeding Wikisource numbers/statistics", and this is presently far from
> my personal interest.
>
> As you guess, I'm one of users who don't support Aubrey's enthusiasm about
> texts born digital, even if free. :-)
>
> Alex
>
>
> 2013/6/12 David Cuenca <dacuetu(a)gmail.com>
>
>> Nobody is saying anything about using copyrighted works; there are many
>> books with an open license that would allow them to be included in
>> Wikisource.
>>
>> For instance in ca-ws we have this translation from 2009:
>>
>> http://ca.wikisource.org/wiki/Llibre:El_secret_de_l%E2%80%99or_que_creix_%2…
>>
>> The original is in the PD, and the translator gave away his rights. It
>> would have been much easier to work directly with the PDF instead of
>> converting it to DjVu.
>>
>> Micru
>>
>>
>> On Wed, Jun 12, 2013 at 10:47 AM, Aarti K. Dwivedi <
>> ellydwivedi2093(a)gmail.com> wrote:
>>
>>> If I am not wrong, as of today most books that were born digital are
>>> still under copyright. Of course, they are available freely on the
>>> internet. But we can't use the pirated copies. How would we go about
>>> the procurement of these books?
>>> If we procure these copyrighted books, then the only thing we would
>>> have to do is check for proper formatting. Isn't it?
>>>
>>>
>>> On Wed, Jun 12, 2013 at 7:58 PM, Lars Aronsson <lars(a)aronsson.se> wrote:
>>>
>>>> On 06/12/2013 02:48 PM, Andrea Zanni wrote:
>>>>
>>>>> We could define some tasks as
>>>>> * corrected the page
>>>>> * OPTIONAL added optional templates/links/annotations
>>>>> *...
>>>>>
>>>>
>>>> Geotagged all the photos, ...
>>>>
>>>> The list doesn't end. You need a generic mechanism
>>>> for any new feature you can invent. But aren't our
>>>> existing templates and categories the best way to
>>>> do this? You could just add to each page:
>>>> {{done|proofread=user1|validated=user2|geotagged=user4|...}}
>>>>
>>>>
>>>> --
>>>> Lars Aronsson (lars(a)aronsson.se)
>>>> Project Runeberg - free Nordic literature - http://runeberg.org/
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Wikisource-l mailing list
>>>> Wikisource-l(a)lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>>>>
>>>
>>>
>>>
>>> --
>>> Aarti K. Dwivedi
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Etiamsi omnes, ego non
>>
>>
>
>
>
--
Etiamsi omnes, ego non