On 7 November 2013 08:00:18, Michael Peel <michael.peel at manchester.ac.uk> wrote:

> Something that WMUK could support?

>> The contest would be from Nov 24th till Dec 1st. During that time the
>> participants would proofread a selection of books and they would get points
>> per page. The one with the most points would win an ebook reader.


I think it might be a little too short notice to get something going. 

(FYI, the next similar date would be the tenth anniversary of English Wikisource as its own project, in September 2015.)

Still, if you were to do it, the Proofread of the Month model seems the easiest.



On 8 November 2013 10:52:56, Fæ <faewik@gmail.com> wrote:

> Having 'been around' for quite a while, dabbled in Wikisource and
> lurked around its back passages, I still find it comparatively hard to
> understand. If this is to attract newcomers, then it would be nice to
> see this go hand-in-hand with improving both the guidelines on exactly
> how to proofread (there's a complex multi-stage process that could do
> with a simpler work-flow)


I've heard this before and have tried to write a lot of new help pages to solve it. I'm hampered by the fact that I don't actually find any problem with the work-flow; it's really straightforward to me.


> and the rather convoluted underpinning process for turning a
> document/book into a djvu file, loading it on Commons and then setting
> it up as a book on Wikisource (phew).


Most users would never need to create a DjVu themselves. I've tried scanning from scratch and I wouldn't recommend it to anyone who wasn't already committed to the project. Nothing Wikisource or Wikimedia can do is likely to change that, however.


> (e.g. how do you mark up ... "this word is missing from the original"?)


That, at least, is easily solved: you don't. A missing word is shown by leaving the word out. English Wikisource is about making faithful reproductions of texts as they are. Even an occasional wikilink can be contentious.

Anyway, to my point for these quotes:


On 8 November 2013 11:33:43, Richard Nevell <richard.nevell at wikimedia.org.uk> wrote:

> But, returning to the point of this thread: proofreading Wikisource. If
> there is an appetite to take part, how would it be best organised? Checking
> for typos and using a spellchecker seems like the simplest approach, would
> it be one which results in a significant impact?


For the same reason as above, I wouldn't recommend correcting typos with a spellchecker as the challenge.  People are likely to interpret that as altering the original text, rather than correcting a previous user's mistakes.  That's just going to cause aggravation all round as edits get reverted.

(To be clear, typos in the original text are left as they are, to be preserved for all time, although they can be marked with a SIC template.)


On 8 November 2013 11:05:13, Charles Matthews <charles.r.matthews at ntlworld.com> wrote:

> ProofReadPage, the MediaWiki extension that allows
> proofing via "text opposite scan", should become the USP, but needs to be
> supplemented by sound policies on annotation and translation.


I'm working on it! :)

Actually getting either annotation or translation agreed as acceptable in a general sense was hard. Getting wikilinks allowed in any mainspace page at all was harder than I thought it would be. I have been meaning to finish the annotation policy after the RfC but I've been busy; fortunately it doesn't come up much. The new Translation namespace was the solution to allowing user-translated works; the existing ones are still being migrated.

That's not important right now though.


On 8 November 2013 11:42:24, Charles Matthews <charles.r.matthews at ntlworld.com> wrote:

> English Wikisource runs a "Proofread of the Month" and one model would be
> to adapt that to the needs of the competition ("all must have prizes" in
> PoTM, namely a template on your userpage). So there could be a definite
> bunch of works selected, where people proofread and validate them page by
> page.


This would be the easiest to judge and the fairest to run, as all texts could be of equal-ish clarity. I am currently working on a Georgian-era book and it can be quite awkward. Victorian to 20th-century texts would be easiest as an entry point.

On the other hand, this does require someone actually coming up with a list of books to proofread.

That someone would need to make the list, organise everything and advertise the challenge in the next fortnight.  Multiple languages would need multiple lists.

Nevertheless, it would be easiest to score by completed pages, where "page" means a page from the original book/magazine/whatever and "completed" means fully transcribed (which would be the yellow status, "proofread", for those who understand what I'm talking about) rather than OCR gibberish.
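
As a rough illustration of the page-count scoring described above, here is a minimal Python sketch. It is only an assumption about how a tally might work, not an existing Wikisource tool: the event-log format, the function name score_contest, and the user and page names are all hypothetical. Each page is credited once, to whoever first brings it to the "proofread" status.

```python
from collections import Counter

# Status label assumed for illustration: the yellow "proofread" status.
PROOFREAD = "proofread"


def score_contest(events):
    """Return a Counter mapping each user to the pages they completed.

    `events` is a hypothetical chronological log of
    (user, page, new_status) tuples. A page earns one point for the
    first user who sets it to "proofread"; later edits to the same
    page earn nothing.
    """
    credited_pages = set()
    scores = Counter()
    for user, page, new_status in events:
        if new_status == PROOFREAD and page not in credited_pages:
            credited_pages.add(page)
            scores[user] += 1
    return scores


# Made-up example log (names and pages are placeholders).
example_events = [
    ("Alice", "Page:Book.djvu/1", "proofread"),
    ("Bob", "Page:Book.djvu/2", "not proofread"),
    ("Bob", "Page:Book.djvu/3", "proofread"),
    ("Alice", "Page:Book.djvu/4", "proofread"),
    ("Bob", "Page:Book.djvu/1", "proofread"),  # already credited to Alice
]

example_scores = score_contest(example_events)
```

Crediting only the first move to "proofread" is one possible design choice; it avoids counting the same page twice if someone re-saves it.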

It could be done.  It's not as if there is any great shortage of scanned texts available.

- Adam