Thanks for illustrating this point of view so clearly.
Nevertheless: are we digitizing works, or are we digitizing books? The two are different.
Apart from the theoretical difference between a work and a book, there are also practical consequences. If nsPage is central, as it is in my opinion, then all data, both the author's words and anything else needed to use them, quote them, and assemble them with illustrations, references, the logical structure of chapters and so on, should be contained in nsPage, whereas at present such data are split across many different containers (nsIndex, nsPage, ns0 and their infoboxes). I vaguely feel that there is something to fix in this approach; and when I recently discovered TEI and looked a little deeper into how OCR text is actually represented, i.e., into abbyy.xml files, that feeling became stronger.
In brief, my proposal is: can we consider bringing into nsPage all the structural/logical data needed to build any possible non-paged representation of a work?
Alex
2013/8/26 billinghurst billinghurst@gmail.com
On Sun, 25 Aug 2013 07:46:22 +0200, Alex Brollo alex.brollo@gmail.com wrote:
In a recent discussion at the en.source Scriptorium, it was said that nsPage can be viewed merely as a proofreading tool, the ns0 transclusion/text being the real core of source content.
I have a different opinion: I see the nsPage code as the real core of source content, and ns0 as merely derived content that could be obtained fully automatically from a set of structural data wrapped in a Lua/Scribunto module (holding any data needed for the header template and for the pages tag), so that any ns0 page/subpage could be obtained with a template {{Derive|index base page name}}.
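To make the idea concrete, here is a minimal sketch of such a derivation. It is written in Python only for illustration (on-wiki it would be Lua/Scribunto), and everything in it is an assumption: the WORK data set, its field names, and the derive_subpage function are hypothetical, and the header parameters are simply the usual title/author/section/previous/next set.

```python
# Hypothetical structural data for one work. In the proposal this would live
# in a Lua/Scribunto data module; all names and values here are illustrative.
WORK = {
    "index": "Example book.djvu",
    "title": "Example Book",
    "author": "Example Author",
    "chapters": [
        {"name": "Chapter 1", "from": 9, "to": 20},
        {"name": "Chapter 2", "from": 21, "to": 35},
    ],
}


def derive_subpage(work, chapter_no):
    """Build the wikitext of one ns0 subpage purely from the structural data."""
    chapters = work["chapters"]
    chapter = chapters[chapter_no]
    # Neighbouring chapters supply the navigation links; empty at either end.
    previous = chapters[chapter_no - 1]["name"] if chapter_no > 0 else ""
    following = (
        chapters[chapter_no + 1]["name"] if chapter_no + 1 < len(chapters) else ""
    )
    lines = [
        "{{header",
        " | title    = " + work["title"],
        " | author   = " + work["author"],
        " | section  = " + chapter["name"],
        " | previous = " + previous,
        " | next     = " + following,
        "}}",
        # The pages tag transcludes the nsPage range that holds this chapter.
        '<pages index="{}" from={} to={} />'.format(
            work["index"], chapter["from"], chapter["to"]
        ),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(derive_subpage(WORK, 0))
```

The point of the sketch is only that nothing in the ns0 page is typed by hand: the header and the pages tag both come from the same data set, so a different chapter layout means a different data set and the nsPage text stays the same.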
If nsPage is given such a core content role, it will be much simpler to wrap TEI data into it, and any POV about different styles of chapter/section structure and naming could be avoided; the HTML rendering would be unchanged, so, IMHO, conversion to ePub would still work.
What do you think?
Alex brollo
I am fairly certain that 95% of our transcribers would have little or no idea what you are talking about, and I am not certain that I do either. Once we get out of the scope of the obvious, further suggestions start to be difficult.
The concept that we utilise at enWS is that
- Page: ns is a working, non-presentation area. It is a means for formatting text for transclusion to the main ns (for straight transcription) and for translation (for WS sourced translations).
- Main ns is the presentation layer of the work produced by the author.
We are not into the slavish concept of "the page" as produced by the printer as its own entity beyond it being a carriage for the text. I would think that any further interpretation about structural data is getting too weighed down in other considerations, not the concept of the capturing of the words of an author.
Regards, Billinghurst
Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l