Thank you Nahum,
Could you indicate which OCR solution you are using?
On 26/03/2018 at 17:27, Nahum Wengrov wrote:
I frequently work offline on he.wikisource. I download the entire PDF
file from Commons to my hard drive, and OCR the page I need myself.
One can use the OCR of Wikisource and download the text too, I guess,
page by page. Then I proof the text in a Word document, open to the
lower half of my screen, with the PDF open on the upper half of the
screen, where I go to the page I need with Acrobat Reader, and scroll
both windows down or up as needed.
On Mon, Mar 26, 2018 at 11:21 AM, mathieu stumpf guntz wrote:
On 24/03/2018 at 16:22, billinghurst wrote:
Though that would defeat the purpose of online proofreading with
account verification. Some of the true value of our online
process is that contribution builds a level of trust and
knowledge and that is reflected in both our patrolling and the
allocation of autopatrolled status.
How would providing tools for offline batch work interfere with
that in any way? Once the work is done, it can be uploaded to
Wikisource with whichever account the user wants.
Actually, to my mind, the main benefit of the online aspect is the
peer-to-peer production model. Also, there is no need for a central
node carrying accounts in order to take into account the trust given
to a particular contributor. There are digital signature technologies
for that, such as GPG. Having a central node with a web
interface just makes things easier for most users; it doesn't
improve the trustworthiness of the environment. On the contrary, with
a single point of failure, we actually rely on a weaker solution
in this regard.
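To make the GPG point concrete, here is a minimal sketch of how an offline
contribution could be signed and later checked without any central account.
It only builds the gpg command lines (detached, ASCII-armored signature and
its verification); the file name page_0042.txt and the helper names are
hypothetical, and nothing here is an existing Wikisource workflow.

```python
import shlex


def sign_command(path, key_id=None):
    """Return the argv that writes an armored detached signature (path + '.asc')."""
    cmd = ["gpg", "--armor", "--detach-sign"]
    if key_id:
        # Pick a specific secret key instead of gpg's default one.
        cmd += ["--local-user", key_id]
    cmd.append(path)
    return cmd


def verify_command(path, signature=None):
    """Return the argv that checks `path` against its detached signature."""
    signature = signature or path + ".asc"
    return ["gpg", "--verify", signature, path]


if __name__ == "__main__":
    # Hypothetical file name, printed as shell-ready command lines.
    print(shlex.join(sign_command("page_0042.txt")))
    print(shlex.join(verify_command("page_0042.txt")))
```

Anyone holding the contributor's public key could run the second command on
the uploaded text, so the trust would follow the key rather than the account.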
Also, how would you have access to templates, and components like
that, while offline?
Well, that just shows how inefficient these tools are for
contributing while offline. It's always possible to install
MediaWiki and download the required templates, but currently this
process seems way too complicated, doesn't it?
Also we generally cannot download the images separately as that
is usually part of the later clean-up where people have the
I'm afraid the term "image" misled your answer. It seems you
interpreted it as picture elements within the files, while I was
talking about the files themselves.
So yes, there is the capacity to have the text and proofread the
text, but actually checking the text against the image is not the
sole component of proofreading, and further it would not be at
all helpful for validation.
There is nothing magic about working directly in a browser. People
do download and upload all the required material anyway, but on a
page-by-page basis. The result is just as valid when
transactions are performed at the file repository level.
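As an illustration of what a batch download could look like, here is a rough
sketch that builds MediaWiki Action API query URLs fetching the wikitext of
many Page: subpages at once instead of one browser tab per page. The endpoint
(en.wikisource.org), the file name Example.pdf and the helper names are
assumptions for the example; the batch size of 50 is the usual per-request
title limit for ordinary accounts.

```python
from urllib.parse import urlencode

# Assumed endpoint; each Wikisource language subdomain has its own api.php.
API = "https://en.wikisource.org/w/api.php"


def page_titles(index_file, first, last):
    """Titles of the Page: subpages for scans `first`..`last` of one file."""
    return [f"Page:{index_file}/{n}" for n in range(first, last + 1)]


def chunks(items, size=50):
    """Split titles into batches small enough for a single API request."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def batch_query_url(titles, api=API):
    """URL returning the latest wikitext of every title in one request."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": "|".join(titles),
        "format": "json",
    }
    return f"{api}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical 120-page scan: three requests instead of 120 page loads.
    for batch in chunks(page_titles("Example.pdf", 1, 120)):
        print(batch_query_url(batch))
```

The responses could be saved to disk, proofread offline, and the results sent
back in one go, which is the same transfer people already do page by page.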
Wikisource-l mailing list