2011/2/5 River Tarnell r.tarnell@ieee.org
In article AANLkTikWLU5Y8C2UokYRN=v1-zwhb1ktHNXi4xtbmXja@mail.gmail.com, David Gerard dgerard@gmail.com wrote:
On 5 February 2011 15:12, Alex Brollo alex.brollo@gmail.com wrote:
Just to let you know that Aubrey just presented the it.source idea for wikicaptcha on wikisource-l.
What would it take to get this into place? What's the captcha load on WMF sites? Would e.g. the toolserver melt under the load? Perhaps on one project at a time?
I don't think this should be hosted on the Toolserver; as CAPTCHAs are a core part of the site, they should not rely on the TS to work.
- river.
IMHO, this could be an opportunity to rethink the role of Commons as a central library. I imagine something like this:
1. as soon as a djvu file with a text layer is uploaded, the complete set of per-page text layers is extracted, saving word coordinates too;
2. these text layers could be scanned by a script, extracting all words marked as doubtful (usually with a ^ character), as well as words that don't match a good dictionary;
3. a dynamic recaptcha database is updated and word images are submitted to wiki contributors, both as a formal captcha for edits by logged-out users and as a volunteer job to help wikisource projects; the answers will fix the text files;
4. a tool should be built to upload "pure text" from these text files into any wikisource project;
5. finally, the refined text could be re-uploaded into the djvu file, turning it into a "djvu file with a wiki text layer".
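
To make step 2 a bit more concrete, here is a rough Python sketch of how the doubtful words might be picked out once the text layer has been extracted. It is only an illustration under assumptions: the `Word` record, the `doubtful_words` function and the bounding-box layout are hypothetical names invented for this example, and the dictionary is just a set of lowercase words. Extracting the words and their coordinates from the djvu file is assumed to have happened already.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

DOUBT_MARK = "^"  # marker the OCR is assumed to put in uncertain words


@dataclass(frozen=True)
class Word:
    """One word of the text layer (hypothetical record, not a djvu API)."""
    text: str
    page: int
    box: Tuple[int, int, int, int]  # assumed layout: (xmin, ymin, xmax, ymax)


def normalise(token: str) -> str:
    """Strip surrounding punctuation and lowercase the token."""
    return token.strip(".,;:!?\"'()[]").lower()


def doubtful_words(words: Iterable[Word], dictionary: Set[str]) -> List[Word]:
    """Return words flagged by the OCR or missing from the dictionary."""
    selected = []
    for word in words:
        if DOUBT_MARK in word.text:
            # Explicitly marked as doubtful by the OCR.
            selected.append(word)
            continue
        core = normalise(word.text)
        # Only purely alphabetic tokens are checked against the dictionary,
        # so numbers and bare punctuation are skipped.
        if core and core.isalpha() and core not in dictionary:
            selected.append(word)
    return selected


if __name__ == "__main__":
    sample = [
        Word("The", 1, (10, 10, 40, 25)),
        Word("qu^ck", 1, (45, 10, 90, 25)),
        Word("brovvn", 1, (95, 10, 150, 25)),
        Word("fox.", 1, (155, 10, 185, 25)),
    ]
    known = {"the", "quick", "brown", "fox"}
    for w in doubtful_words(sample, known):
        print(w.page, w.box, w.text)
```

Because each selected word keeps its page and coordinates, the matching image snippet could later be cropped for step 3, and the corrected word written back to the same position for step 5.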
Alex