On Jan 4, 2008 9:28 AM, Yann Forget yann@forget-me.net wrote:
Hello,
This is now off topic for foundation-l. Better to continue this on wikisource-l.
Klaus Graf wrote:
Hello,
I agree with Ray here, and I think that Klaus' mail does not accurately reflect reality. The French Wikisource has the greatest number of scanned texts so far,
Is there a proof for this claim?
http://wikisource.org/wiki/Wikisource:ProofreadPage_Statistics lists 40,043 pages for fr.ws and 16,939 for de.ws.
I am surprised how few page scans are in the en namespace:
4883, of which 2792 are the NSRW. Of the remaining 2091:
* 620 : Maxwell's Treatise on Electricity and Magnetism
* 368 : H.R. Rep. No. 94-1476
* 289 : The How and Why Library
* 200 : [[History of Iowa/Volume 4]] (google book; no images)
* 66 : Spirella Manual (1913)
* 59 : Faraday's Experimental researches in electricity
* 45 : Secret history of the French court under Richelieu and Mazarin (google book; images on google)
+ 444 misc. others
Even those numbers are inflated, as I know that many of the pages in Maxwell's treatise are empty shells that once contained only a header before the days of Index: pages.
http://fr.wikisource.org/wiki/Wikisource:Livres_disponibles_en_mode_page lists 62,326 scanned pages (not all of them OCRed and proofread yet, and I am not sure that this page is up to date).
That number must include images that do not have Page:s, as there are 40046 articles in ns104 on frwikisource.
http://tools.wikimedia.de/~st47/cgi-bin/pagesinns.pl?server=3&db=frwikis...
If en.ws were to include those, all of the EB1911 images could be included :-)
but does not make it mandatory to have them in order to publish a text there. It is only a suggestion, which many contributors follow.
40 thousand pages is a testament to the success of the fr.ws approach.
I think that the important point is not scanned texts, but notation on whether and how the texts are proofread by editors, whatever means the editors use to proofread the texts.
I have been monitoring discussions on digitization projects as an archival professional for years. It's standard to give not only e-texts but scans. Wikisource demands no scans when a permanent web address (e.g. library project) for the scans outside Commons is given.
I think the average quality of other Wikisource branches is very poor. In most cases there is no source given: one cannot know which source is used, and for scholarly purposes the e-text is worthless.
While indicating a source is critical for scholarly purposes where many editions exist, it is also often completely irrelevant when only one edition is likely to exist. There are many newspaper and journal articles that are _only_ available online by way of Wikisource, and even when not proof-read these are invaluable for scholarly purposes.
For examples of high-quality transcription that is occurring without page scans, see Journal of Discourses and EB1911.
http://en.wikisource.org/wiki/Journal_of_Discourses
http://en.wikisource.org/wiki/EB1911
Even in the case where many editions exist, a copy on Wikisource of poor provenance is _not_ worthless - it is merely a work in progress - anyone can improve it by pinning it down to one edition, and because Wikisource is a WMF project, any scholar knows without asking that the Wikisource community will welcome their involvement in our project. As a result, poor copies that put Wikisource into Google search results do bring in new and valuable contributors.
Stipulating that imperfection is intolerable is inhumane. Eradicating the imperfection is the path to perfection.
I think we already have had this discussion earlier, but the misunderstanding continues. There are two different issues here. It is important not to mix them:
- Scans provided alongside texts.
- Notation of quality.
Quality is not an absolute value. It is relative to the sources available for a given text. Quality does not have the same meaning for a text from 1920 and a text from the 15th century. So one should not talk about quality, but about notation of quality.
I agree that giving the source is important and should be part of a quality notation. The most important thing is to have a clear notation so that readers know how and by whom the texts have been proofread. Scans alone are not a proof of quality, but they help in getting better quality. They are not the only way to get good quality texts. Some texts may be proofread by several contributors, and so be of very good quality, but Wikisource might not be able to have scanned images if a public domain edition is not easily available.
Page scans are, for the most part, orthogonal to quality. Meticulous transcription of page scans of a poorly collated collection, or of a rushed translation, is nothing short of wasted effort.
On en.ws, we do need to increase the percentage of texts with page scans. Any suggestions on how to achieve this?
-- John