A Project Gutenberg support group runs a wiki system where proofreaders can fix OCR text from scans. This speeds up the digitization process of public domain books since image files of the book pages can be uploaded and published alongside the text itself.
An example of what it looks like can be found on PG's Nordic sister project, Project Runeberg:
http://runeberg.org/pictswed/0257.html
Could these features be included in MW? What's needed is:
* Batch uploading image files while OCR read text will be included in the page body.
* A split screen where the image can be viewed in a column on the left while the text may be edited on the right hand side.
* A check box where the page can be flagged as "proofread"
This feature could be really useful, especially in Wikisource.
Links:
Distributed Proofreaders: http://www.pgdp.net/c/
SourceForge page for the wiki web application: http://dproofreaders.sourceforge.net/
--
Harald Groven
University of Tromsø
web: www.groven.no/harald
On 5/21/06, Harald Groven haraldg@sv.uit.no wrote:
Could these features be included in MW? What's needed is:
- Batch uploading image files while OCR read text will be included in
the page body.
- A split screen where the image can be viewed in a column on the left
while the text may be edited on the right hand side.
- A check box where the page can be flagged as "proofread"
Hmm.
Batch uploading is something Rob recently whipped up, though it was already possible through bot software. Images and text can go in the same wiki page without a problem.
A split screen.. Open up two browser windows, with one showing the picture at the top of the article, and the other in editing-mode.
A checkbox? Just add some sort of note to the top of the page, such as a {{proofread}} template. Possibly even have several, to describe that item's status (who's reading it, how much they have done, etc.)
A clean interface would have to be dreamed up.. to make it more straightforward to have collections of pages. A bot uploader would need to be tweaked to do something like.. add pages as subpages like so:
* [[article]]
* [[article/page1]]
* [[article/page2]]
then pictures would need to be uploaded as:
[[image:article--page1]]
[[image:article--page2]]
then automatically add the images to the [[article/page1]] etc items.
This functionality sounds like something which doesn't need official MediaWiki support.. it would just need a nice upload helper which would upload the pictures and prepare the stub articles in the wiki.
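To make the naming scheme above a bit more concrete, here is a rough Python sketch of the "prepare the stubs" half of such an upload helper. It only builds titles and wikitext in memory; it does not talk to any wiki or bot framework. The names PageStub and build_stubs are made up for illustration, and the layout of the stub (image link first, then the OCR text) and of the index page (a bullet list of subpage links) are assumptions, not something MediaWiki prescribes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PageStub:
    """One scanned page, ready to be created in the wiki (hypothetical type)."""
    page_title: str   # e.g. "article/page1"
    image_title: str  # e.g. "image:article--page1"
    wikitext: str     # stub body: image link followed by the OCR text


def build_stubs(article: str, ocr_pages: List[str]) -> Tuple[str, List[PageStub]]:
    """Return (index wikitext, stubs) for an article and its OCR'd pages.

    Assumption: pages are numbered from 1 in the order they are given.
    """
    stubs = []
    for number, ocr_text in enumerate(ocr_pages, start=1):
        page_title = f"{article}/page{number}"
        image_title = f"image:{article}--page{number}"
        # Put the scan at the top of the page, then the text to be proofread.
        wikitext = f"[[{image_title}]]\n\n{ocr_text.strip()}\n"
        stubs.append(PageStub(page_title, image_title, wikitext))

    # Assumed layout for [[article]] itself: one bullet per subpage.
    index = "\n".join(f"* [[{stub.page_title}]]" for stub in stubs) + "\n"
    return index, stubs


if __name__ == "__main__":
    index, stubs = build_stubs("article", ["OCR text of page one", "OCR text of page two"])
    print(index)
    for stub in stubs:
        print(stub.page_title, "<-", stub.image_title)
```

A bot uploader would then upload each scan under its image title and create each page with the matching wikitext; that part depends on whichever bot software is used and is left out here.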
mediawiki-l@lists.wikimedia.org