On Jul 7, 2011, at 2:50 AM, Andrea Zanni <zanni.andrea84(a)gmail.com> wrote:
2011/7/7 Ting Chen <wing.philopp(a)gmx.de>
On de.wikisource.org they scan every page of the original text, upload
the scan on Commons and show the scan on the right part of every page as
an image. It is even obligatory to have the original scan of the text.
The following page is an example:
http://de.wikisource.org/wiki/Seite:Oberamt_Tettnang_231.jpg (I just hit
the random page)
I know - in fact, it was exactly what I wanted to explain :-)
I think this system is perfect for digitized documents, aka paper documents
which have been scanned and need transcription.
MVHO is that the same system is redundant for born-digital documents.
If we use the Proofread Extension (that's what it's called), you need to
re-transcribe the whole text, or at least have it formatted. Then you
transclude the text in ns0.
The text is reliable, but it is a lot of work, and a lot of it is just
redundant (why write by hand something that has just been written in a good
PDF?).
If we use plain ns0 (many Wikisources are not as strict as de.wikisource
in this regard) you need to do the same (transform into wikitext, format). So
the issues remain.
Now, I was wondering if we can find another (technical? organizational?
political?) solution for born-digital documents, such as PDFs, scientific
articles, etc.
You hardly need to re-transcribe the digital document. You just need to re-format the
images and special text within the paste, edit in appropriate wikilinks, and proofread it
to ensure nothing was misplaced. Proofreading is not at all redundant for documents that
have been re-formatted with only the lightest editing. I am certain you will find
something to correct in any document of length, no matter how little editing you feel you
have done. Having a corpus with some depth on Wikisource will open up a much different
reading experience than an index of PDFs, even though the words all match. Just look at
what is being done with the SCOTUS documents. Wikis simply offer a richer study
environment for documents that are properly linked together than other sorts of digital
libraries. For all that born-digital documents emphasize the "digital", they
often treat the text as if printed on a page by regularly using hypertext only in
footnoted references. It is worth putting such things on Wikisource, if you can anticipate
being able to get a decent sized corpus of scholarship of some field under a free license.
And that will vary by field and maybe even sub-specialty.
BirgitteSB