I use ABBYY FineReader; I don't remember the exact version (probably
12 or 11). I bought it a few years ago and it works perfectly for my
On Fri, Apr 13, 2018 at 2:22 AM, mathieu stumpf guntz wrote:
Thank you Nahum,
Could you indicate which OCR solution you are using?
On 26/03/2018 at 17:27, Nahum Wengrov wrote:
I frequently work offline on he.wikisource. I download the entire
PDF file from Commons to my hard drive and OCR the page I need
myself. One can use the OCR of Wikisource and download the text
too, I guess, page by page. Then I proof the text in a Word
document, open on the lower half of my screen, with the PDF open
on the upper half of the screen, where I go to the page I need
with Acrobat Reader, and scroll both windows down or up as needed.
On Mon, Mar 26, 2018 at 11:21 AM, mathieu stumpf guntz wrote:
On 24/03/2018 at 16:22, billinghurst wrote:
Though that would defeat the purpose of
with account verification. Some of the true value of our
online process is that contribution builds a level of trust
and knowledge and that is reflected in both our patrolling
and the allocation of autopatrolled status.
How would providing tools for offline batch work interfere in any
way with that? Once the work is done, it can be uploaded to
Wikisource with whichever account the user wants.
Actually, to my mind, the main benefit of the online aspect
is the peer-to-peer production model. Also, there is no need
for a central node carrying accounts in order to take into account
the trust given to a particular contributor. There are digital
signature technologies, such as GPG, for example. Having a
central node with a web interface just makes things easier
for most users; it doesn't improve the trustworthiness of the
environment. On the contrary, with a single point of failure,
we actually rely on a weaker solution in this regard.
Also, how would you have access to templates and components
like that from offline?
Well, that just shows how inefficient these tools are for
continuing to contribute while offline. It's always
possible to install MediaWiki and download the required
templates, but currently this process seems way too
complicated, doesn't it?
Also we generally cannot download the images separately as
that is usually part of the later clean-up where people have
the technical skills.
I'm afraid the term "image" misled you. It seems you
interpreted it as picture elements from files, while I
was talking about the files themselves.
So yes, there is the capacity to have the text and proofread
the text, but actually checking the text against the image is
not the sole component of proofreading, and further it would
not be at all helpful for validation.
There is nothing magic about working directly in a browser.
People do download and upload all the required material
anyway, but on a page-by-page basis. The result is just as
valid when transactions are operated on a file
Wikisource-l mailing list