Aryeh Gregor wrote:
On Wed, Aug 11, 2010 at 6:05 PM, thomasV1@gmx.de wrote:
I guess you also mean "see multiple scans at once". This would be another workaround, but it would not deter contributors from trying to leave the text where it was in the first place, and where we think it belongs.
Why do you think it belongs split across separate pages, instead of in one place, when it's logically one unit?
I am not sure I really understand this question; you said earlier that you agree with the idea of splitting books into physical pages. When we do this, the logical organization of the book (chapters, etc.) is achieved through transclusion.
And why do you think there'd be any big added risk that people won't obey Wikisource conventions in transcription?
It is not a risk, it is a fact. I try to promote the use of <ref> instead of templates, because it is the only way for the software to know that a footnote is a footnote. However, proofreaders are reluctant to move the text of footnotes to a different page. They do whatever they can to keep the text in front of the scan.
Here are a few examples of what they do.
1. They often do something like this: http://fr.wikisource.org/wiki/Page:Sima_qian_chavannes_memoires_historiques_... In this example, the user did not use <ref> at all. The page uses a combination of labeled sections and templates. The page is transcluded here, in the corresponding chapter: http://fr.wikisource.org/wiki/M%C3%A9moires_historiques/Introduction/Chapitr... Because of the sections, the user is unable to use the <pages/> tag; he wrote a bunch of section transclusions manually. This user tried to do it in a consistent way, by adding two sections to all pages, regardless of whether they actually contain footnotes or not. Other users do the same thing but are messier.
2. Another example is here, on the English Wikisource: http://en.wikisource.org/wiki/Page:Geology_and_Mineralogy_considered_with_re... In this case, the user decided to replicate the text in two places: on the second page of the footnote, and on the page where the footnote begins: http://en.wikisource.org/w/index.php?title=Page:Geology_and_Mineralogy_consi...
3. Here is a third example, on the Italian Wikisource: http://it.wikisource.org/wiki/Pagina:Manzoni.djvu/22 This page uses a template and labeled sections. However, this template fails when the footnote spans more than 2 pages.
I can give you many more examples; the point is that proofreaders do not want to move the text of a footnote away from the page where it is printed.
To fully appreciate the situation, you have to know that:
1. When a page is created, it comes preloaded with OCR text. Thus, the text of the footnote is initially on the page where the footnote is written. Moving it to the previous page involves some extra work. See here: http://fr.wikisource.org/w/index.php?title=Page:S%C3%A9vign%C3%A9_-_Lettres,...
2. Footnotes are not small. Footnotes can be very long. Footnotes can be spread over more than 2 pages. In some books, footnotes can make up more than half of the total text. Moving around 50% of the text of a book is a real pain. Please have a look at this example, which is 3 pages long: http://en.wikisource.org/wiki/Page:Geology_and_Mineralogy_considered_with_re... While I agree that hyphenated words can easily be moved to the previous or next page, it is not practical to do so with footnotes.
3. Even if we had a way to move the text of footnotes without pain, I do not think that this convention would be adopted. Some users absolutely want to preserve the page-by-page structure of the book, even if it involves using LST and the complicated constructs that you saw.
To the contrary: if you add a new magic <ref> attribute, *nobody* will be able to figure out the right way to do it unless they're told, because this will be the only place in any wiki anywhere where that attribute is actually used.
"Nobody" outside of Wikisource? I can understand that this feature is of no direct relevance to Wikipedia, but is this a reason to reject it? It sounds a bit like "WMF==Wikipedia". Note that the proposed feature does not interfere with the way Cite currently works, so Wikipedia users do not need to figure out how it works.
Also, for Wikisource users, compare the complexity of the solutions that have been deployed in the above examples to the complexity of writing <ref follow="foo"></ref> around the text of a reference. Which one do you find more complex?
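To make the comparison concrete, here is a rough Python sketch of what the proposed attribute would let the software do when pages are transcluded. It is only a toy model, not the Cite implementation: the function name merge_footnotes, the sample page texts, and the assumption that the first part of the footnote is tagged with name="foo" are all made up for illustration.

```python
import re

# Toy model only: this is NOT how the Cite extension is implemented.
# Assumption: the first part of a footnote carries name="..." and each
# continuation on a later page carries follow="..." with the same value.
REF_RE = re.compile(r'<ref (name|follow)="([^"]+)">(.*?)</ref>', re.DOTALL)


def merge_footnotes(pages):
    """Collect footnotes from transcluded pages, appending each
    follow="x" fragment to the footnote opened with name="x"."""
    notes = {}  # dicts keep insertion order: key -> list of fragments
    for page in pages:
        for attr, key, text in REF_RE.findall(page):
            if attr == "name":
                notes[key] = [text.strip()]
            else:
                # A continuation with no matching opener becomes its own note.
                notes.setdefault(key, []).append(text.strip())
    return {key: " ".join(parts) for key, parts in notes.items()}


# Hypothetical page contents: each fragment stays on the page it was scanned from.
pages = [
    'Body text of page 1.<ref name="foo">The footnote begins here</ref>',
    'Body text of page 2.<ref follow="foo">and continues on the next page.</ref>',
]
merged = merge_footnotes(pages)
```

In this sketch each proofreader leaves the footnote text in front of its own scan, and the two fragments are only joined at transclusion time.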
If you do a multi-page approach, then at least proofreaders don't have to remember anything extra on a technical level.
No, but they would need to move around big chunks of text. And when they want to proofread it, they would need to search for the page it was moved from (which is not always the previous one). In addition, the examples I provided above demonstrate that proofreaders do whatever they can to keep the text on the page it comes from, even if it means insanely complicated work; I do not think that adding the capability to display multiple scans in front of the text will change anything about that.
You've made a reasonable case that *some* software change is needed. However, I think you've got the wrong one. Trying to add this weird special-case feature to Cite, which is totally useless unless you're using ProofreadPage in the particular way Wikisource is using it, loses major points for inelegance, complexity, and mixing extensions together. If the use-case can be adequately addressed by just having ProofreadPage display multiple scans and edit boxes on one page, that would be a much simpler and more intuitive solution. Not only that, but you could also stop using magic templates to split words across pages and things like that, so it would be considerably easier to use.
A few years ago, I introduced the <pages/> tag, which replaces manual transclusions. This command adds a new line between all transcluded pages. Before that, with manual transclusions, it was possible to transclude 2 pages containing a hyphenated word next to each other, not separated by a newline, so that the word was not split. This was a bad practice, at least from a programmer's point of view, because the information needed for text rendering is not embedded in the pages; it is much better if the target page (where text is transcluded) does not need to know anything about the content of individual pages. This is why I recommended using <pages/> and keeping hyphenated words unsplit, on a single page.
However, users did not like that solution; they wanted to reproduce exactly what is on the scan. In order to adapt to the new <pages/> tag, they designed those magic templates for hyphenated words. Using these templates is in no way mandatory. However, it turned out that users prefer to use them, because they make it possible to display exactly what is on the scan. And after some time, I too became convinced that these templates are a good thing, in part because they create a convention that is unambiguous. This is just another fact that illustrates how we wikisourcers are: we want to be able to display the text in a way that faithfully reproduces the scan. And this is why I doubt that adding the possibility to display multiple scans in front of the text will change anything about the way footnotes are handled.
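The difference between the old manual transclusions and the <pages/> behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two function names and the sample page texts are hypothetical, and the sketch assumes the transcriber dropped the line-end hyphen when splitting a word across two pages.

```python
def transclude_with_pages_tag(pages):
    # Models the behaviour stated above: a newline is added between pages.
    return "\n".join(pages)


def transclude_manually(pages):
    # Models back-to-back manual transclusion with no separator.
    return "".join(pages)


# Hypothetical sample: "proofreaders" is broken across two scanned pages.
split_word = ["to the proof", "readers of the book"]
# Recommended convention: the whole word is kept on a single page.
whole_word = ["to the proofreaders", "of the book"]

manual = transclude_manually(split_word)              # word is rejoined
tagged_split = transclude_with_pages_tag(split_word)  # word stays broken
tagged_whole = transclude_with_pages_tag(whole_word)  # word is intact
```

With the newline separator, the word only survives if it sits entirely on one page, which is the gap the hyphenation templates were designed to fill.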
Thomas