Hi everybody,
Here is my point of view, along with an attempt to summarize the
discussion:
1. I think the role of Index: pages should be to present the
*source* of a work. This is true whether the source is a scanned
edition (as is most often the case at the moment), or a digital
PDF (that is, containing text and not images) as is the case for
most "digital-born" documents. I think it is good to have a neat
separation between the original source and how Wikisource presents
the work in the main namespace. Indeed, even if Wikisource tries
to be as true as possible to the original content, there are very
often some changes in the way it is presented in the main
namespace.
2. Ideally, the metadata about the source of a work (author, date
of printing, etc.) should be located in Wikidata. But metadata
related to proofreading (e.g. the proofreading level of each
individual page), being specific to the mission of Wikisource,
should be located in Wikisource. How to do this while keeping the
interface simple (i.e. hide it from the user so that she doesn't
have to go from Wikisource to Wikidata to Wikisource) is a valid
and very important concern, but is also beyond my current
understanding of Wikidata and its integration into Wikimedia
projects.
3. The current system with 4 quality levels to represent the
proofreading state of a page is not sufficient to represent the
diversity of proofreading scenarios. Indeed, there is a
distinction to make between the *correctness* of the text and its
*formatting*. In the case of a scanned edition which has been
OCRed, we do need several passes before reaching a satisfactory
level of confidence about the correctness of the text as well as a
suitable formatting (proper use of the wikicode, etc.). For
digital-born documents however, as billinghurst said, we can
automatically assume that the extracted text is correct, but that
still doesn't mean that the text is correctly formatted and ready
to be transcluded in the main namespace. Maybe we should add
another level meaning "text is correct, still needs formatting"?
Ideally, we should have two scales of quality levels: one dealing
with the correctness of the text, and one dealing with its
formatting. This would probably be too heavy and confusing
though...
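To make the two-scale idea in point 3 a bit more concrete, here is
a rough sketch of how a page status could be modelled. Everything
in it is hypothetical: the names TextCorrectness, Formatting,
PageStatus and initial_status are mine, not part of the
ProofreadPage extension, and the level names and the choice to
start digital-born text at the highest correctness level are
assumptions made only for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class TextCorrectness(IntEnum):
    """Hypothetical scale: how much we trust the text itself."""
    UNCHECKED = 0   # e.g. raw OCR output, not yet compared with the source
    PROOFREAD = 1   # checked once against the source
    VALIDATED = 2   # confirmed by a second pass


class Formatting(IntEnum):
    """Hypothetical scale: how far the wikicode formatting has progressed."""
    RAW = 0         # plain extracted text, no wikicode applied
    FORMATTED = 1   # wikicode applied
    REVIEWED = 2    # formatting checked by someone else


@dataclass
class PageStatus:
    correctness: TextCorrectness
    formatting: Formatting

    def ready_for_transclusion(self) -> bool:
        # Assumption: a page needs at least one pass on BOTH scales
        # before it is transcluded in the main namespace.
        return (self.correctness >= TextCorrectness.PROOFREAD
                and self.formatting >= Formatting.FORMATTED)


def initial_status(digital_born: bool) -> PageStatus:
    # Assumption: text extracted from a digital-born PDF is taken as
    # correct, while OCRed text starts unchecked. In both cases the
    # formatting still has to be done.
    if digital_born:
        correctness = TextCorrectness.VALIDATED
    else:
        correctness = TextCorrectness.UNCHECKED
    return PageStatus(correctness, Formatting.RAW)
```

With this model, a digital-born page starts as "text is correct,
still needs formatting", which is exactly the case the single
scale cannot express today.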
Thibaut (user:Zaran on Wikisource)
On 06/12/2013 01:35 PM, Andrea Zanni wrote:
_______________________________________________
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l