--- Ray Saintonge saintonge@telus.net wrote:
Jesse Martin (Pathoschild) wrote:
That's a good point. How about a much cleaner syntax that can be used to generate the OCR markup? With your example text:
{{ocr line| The first experiments were made on the absorption of carbonic     }}
{{ocr line| acid gas by water: and here a singular disagreement was observed  }}
{{ocr line| in the first trials made under exactly the same circumstances. It }}
This is much easier to read, you know where the line breaks go, and it's immediately clear even to someone stumbling across the text that we're specifically keeping track of lines (so they don't helpfully remove unneeded line breaks). Since single line breaks are ignored by MediaWiki, we can just use the same line width so the template syntax lines up for easier ignoring.
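As a rough illustration of the proposal above, here is a minimal Python sketch that wraps each OCR line in the suggested template and pads every line to the same width so the closing braces line up. The {{ocr line}} template is only the name proposed in this thread, and the function name to_ocr_markup is invented for this example.

```python
def to_ocr_markup(lines):
    """Wrap each OCR line in the proposed {{ocr line| ... }} template.

    Assumption: "ocr line" is the hypothetical template name from this
    thread, not an existing MediaWiki template.
    """
    # Pad every line to the width of the longest one so the closing
    # braces form a straight column.
    width = max((len(line) for line in lines), default=0)
    return "\n".join(
        "{{ocr line| " + line.ljust(width) + " }}" for line in lines
    )


if __name__ == "__main__":
    page = [
        "The first experiments were made on the absorption of carbonic",
        "acid gas by water: and here a singular disagreement was observed",
        "in the first trials made under exactly the same circumstances. It",
    ]
    print(to_ocr_markup(page))
```

Padding with trailing spaces is harmless only if the template trims its parameter; that is assumed here rather than known.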
I'm still skeptical about what this will accomplish, but will address that later. The above does not address the treatment of hyphens. When MediaWiki joins lines separated by single line breaks, it takes no account of a hyphen that splits a word at the end of a line, so the two halves are treated as though they were separate words.
Ec
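The hyphen problem described above can be sketched in a few lines of Python. The first function only imitates the effect of joining lines at single line breaks (it is not MediaWiki's actual parser), and the second shows one possible way to rejoin words split at the end of a line. The function names and the sample text are made up for illustration.

```python
import re


def join_like_wikitext(text):
    """Imitate the effect of single line breaks being rendered as spaces."""
    return " ".join(line.strip() for line in text.splitlines())


def join_dehyphenated(text):
    """Rejoin words split by a hyphen at the end of a line, then join lines.

    Caveat: this cannot tell an end-of-line hyphen from a genuine one,
    so "well-" / "known" would wrongly become "wellknown".
    """
    rejoined = re.sub(r"(?<=\w)-\s*\n\s*(?=\w)", "", text)
    return join_like_wikitext(rejoined)


if __name__ == "__main__":
    sample = "a singular disagree-\nment was observed"  # hypothetical OCR text
    print(join_like_wikitext(sample))
    print(join_dehyphenated(sample))
```

The first call leaves "disagree- ment" as two tokens, which is the behaviour complained about; the second produces a single word at the cost of no longer matching the printed line exactly.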
I agree with Ec here. I think you are trying to do far too much with the same piece of text. Perfectly readable and editable wikimarkup and exactly matching OCR text are not possible in the same text. I suggest you find a way to hack the proofreading page so that the text exists twice. Something like below:
<!-- Here is text with OCR breaks and hyphens which matches the printed page-->
Here is the wikimarkup text that is transcluded to the WS page
Of course this means both sets of text need to be proofread, but I think a script should be able to highlight all the differences between them, making it simple to proofread one from the other. If you really want to have only one version of the text, the exactness of the OCR will have to be sacrificed. People will always go through the markup "fixing" the hyphens.
Birgitte SB
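A script of the kind suggested above could look roughly like the following Python sketch, which uses the standard difflib module to compare the two copies word by word while ignoring where the line breaks fall. The function name word_differences and the sample strings are invented for this example; a real tool would still need to extract both copies from the page.

```python
import difflib


def word_differences(ocr_text, wiki_text):
    """List the word-level differences between the two copies of a page.

    Splitting on whitespace means line breaks alone never count as a
    difference; only changed, added, or removed words are reported.
    """
    ocr_words = ocr_text.split()
    wiki_words = wiki_text.split()
    matcher = difflib.SequenceMatcher(a=ocr_words, b=wiki_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (tag, " ".join(ocr_words[i1:i2]), " ".join(wiki_words[j1:j2]))
            )
    return differences


if __name__ == "__main__":
    ocr = "The first experi-\nments were made"   # hypothetical OCR copy
    wiki = "The first experiments were made"     # hypothetical wiki copy
    for tag, old, new in word_differences(ocr, wiki):
        print(f"{tag}: {old!r} -> {new!r}")
```

As written, every end-of-line hyphen shows up as a difference. Whether that is noise or exactly what a proofreader wants to see depends on how the two copies are meant to relate.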