Interesting.
You might want to have a look at Microsoft's Seadragon technology: http://www.dailymotion.com/video/x2738e_sea-dragon-and-photosynth-demo_tech (skip to 1:20 if you don't want to watch the whole video).
Now, getting back to your proposal: a JavaScript interface similar to the ones at the Internet Archive or Google Books, one that downloads only the few scans that need to be shown to the user, would be fairly easy to write using the API. We could even do it for text, as long as it is rendered as well-separated physical pages.
However, it would be more complicated to apply the same principle if the text is to be rendered without page separations, preserving its logical structure. We would need either to pre-parse the whole document and develop an API that lets us download small pieces of it, or to parse the current page together with the previous and next pages. I am not sure it is really worth the effort; the bandwidth savings would be less significant than for scans.
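To make the scan case concrete, here is a rough sketch of what the fetching part could look like. The page title, the element id and the exact response fields are only illustrative, not working Wikisource code; it simply asks action=parse for the rendered HTML of one Page: and appends it to a container.

    // Sketch only: load one Page: on demand through the MediaWiki API.
    // Title and container id are made up for the example.
    function loadPage(title, containerId, done) {
        var url = '/w/api.php?action=parse&format=json'
                + '&page=' + encodeURIComponent(title);
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url, true);
        xhr.onreadystatechange = function () {
            if (xhr.readyState === 4 && xhr.status === 200) {
                var data = JSON.parse(xhr.responseText);
                var div = document.createElement('div');
                div.innerHTML = data.parse.text['*'];   // rendered HTML of that page
                document.getElementById(containerId).appendChild(div);
                if (done) { done(); }
            }
        };
        xhr.send();
    }

    // e.g. show the 14th page of a hypothetical index in a "reader" div:
    loadPage('Page:Example.djvu/14', 'reader');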
Thomas
Lars Aronsson wrote:
On 08/11/2010 09:46 PM, Aryeh Gregor wrote:
This seems like a very weird way to do things. Why is the book being split up by page to begin with? For optimal reading, you should put a lot more than one book-page's worth of content on each web page.
ThomasV will give the introduction to ProofreadPage and its purpose. I will take a step back. A book is typically 40-400 pages, because that is how much you can comfortably bind in one volume (one spine) and sell as a commercial product. A web 1.0 (plain HTML + HTTP) page is typically a smaller chunk of information, say 1-100 kbytes. To match the idea of a book with web technology (either in Wikisource or Wikibooks), the book needs to be split up, either according to physical book pages (Wikisource with the ProofreadPage extension) or chapters (Wikisource without ProofreadPage, or Wikibooks).
In either case, the individual pages have a sequential relationship. If you print the pages, you can glue them together and the sequence makes sense, which is not the case with Wikipedia. Such pages have links to the previous and next page in the sequence, which Wikipedia articles don't have.
Wikipedia, Wikibooks and Wikisource mostly use web 1.0 technology. A very different approach to web browsing was taken when Google Maps was launched in 2005, the poster project for the "web 2.0". You arrive at the map site with a coordinate. From there, you can pan in any direction, and new parts of the map (called "tiles") are downloaded by asynchronous JavaScript and XML (AJAX) calls as you go. Your browser never holds the entire map. It doesn't matter how big the entire map is, just like it doesn't matter how big the entire Wikipedia website is. The unit of information to fetch is the "tile", just like the web 1.0 unit was the HTML page.
If we applied this web 2.0 principle to Wikibooks and Wikisource, we wouldn't need to have pages with previous/next links. We could just have smooth, continuous scrolling in one long sequence. Readers could still arrive at a given coordinate (chapter or page), but continue from there in any direction.
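As a sketch only of what that continuous scrolling could mean in practice (the page numbers are made up, and loadPage is assumed to be a helper like the one sketched earlier in this thread, fetching one Page: through the API and appending it): when the reader nears the bottom of the window, the next page is requested, and nothing outside the neighbourhood of the viewport is ever downloaded.

    // Sketch only: the "tile" principle applied to a vertical page sequence.
    var nextPage = 15;        // first page not yet loaded (made-up number)
    var loading = false;      // avoid firing several requests while scrolling
    window.onscroll = function () {
        var nearBottom = window.innerHeight + window.pageYOffset
                         >= document.body.offsetHeight - 200;
        if (nearBottom && !loading) {
            loading = true;
            loadPage('Page:Example.djvu/' + nextPage, 'reader',
                     function () { loading = false; });
            nextPage += 1;
        }
    };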
Examples of such user interfaces for books are Google Books and the Internet Archive online reader. You can link to page 14 like this: http://books.google.com/books?id=Z_ZLAAAAMAAJ&pg=PA14 and then scroll up (to page 13) or down (to page 15). The whole book is never in your browser. New pages are AJAX-loaded as they are needed. It's like Google Maps except that you can only pan in two directions (one dimension), not in the four cardinal directions. And the zoom is more primitive here. After you have scrolled to page 19, you need to use the "Link" tool to find the new URL to link to.
At the Internet Archive, the user interface is similar, but the URL in your browser is updated as you scroll (for better or worse): http://www.archive.org/stream/devisesetembleme00lafeu#page/58/mode/1up
If we only have scanned images of book pages, this is simple enough, because each scanned image is like a "tile" in Google Maps. But in Wikisource, we have also run OCR software to extract a text layer for each page, and we have proofread that text to make it searchable. I still have not learned JavaScript, but I guess you could make AJAX calls for a chunk of text and add that to the scrollable web page, just like you can add tiled images. Google has not done this, however. If you switch to "plain text" viewing mode, http://books.google.com/books?pg=PA14&id=Z_ZLAAAAMAAJ&output=text you get traditional web 1.0 "pages" with links to the previous and next web page. (Each of Google's text pages contains text from 5 book pages, e.g. pages 11-15, only to make things more confusing.)
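Going back to the AJAX idea for text: as a sketch only (I am guessing at the details; the title and element id are made up), fetching a chunk of proofread wikitext through the MediaWiki API and appending it to the scrollable view might look something like this.

    // Sketch only: pull the raw wikitext of one proofread page and append it
    // to the scrollable view as a text "tile".
    function loadText(title, containerId) {
        var url = '/w/api.php?action=query&prop=revisions&rvprop=content'
                + '&format=json&titles=' + encodeURIComponent(title);
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url, true);
        xhr.onreadystatechange = function () {
            if (xhr.readyState === 4 && xhr.status === 200) {
                var pages = JSON.parse(xhr.responseText).query.pages;
                for (var id in pages) {        // only one page in this result
                    var wikitext = pages[id].revisions[0]['*'];
                    var pre = document.createElement('pre');
                    pre.appendChild(document.createTextNode(wikitext));
                    document.getElementById(containerId).appendChild(pre);
                }
            }
        };
        xhr.send();
    }

    loadText('Page:Example.djvu/14', 'reader');

A real reader would of course render the text rather than dump raw wikitext, but the transport mechanism would be the same.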
But the real challenge comes when you want to wiki-edit one such chunk of scrollable text. I think it could work similarly to our section editing of a long Wikipedia article. But to be really elegant, I should be able, when editing a section, to scroll up or down beyond the current section, in an eternal textarea.
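A very rough sketch of one piece of that eternal textarea, assuming the standard wpTextbox1 edit box and the rvsection parameter of the API (the article title and section number are made up): when the cursor reaches the top of the box, the wikitext of the previous section is fetched and spliced in front of the current text.

    // Sketch only: splice the previous section's wikitext into the edit box.
    function prependSection(title, section, textareaId) {
        var url = '/w/api.php?action=query&prop=revisions&rvprop=content'
                + '&rvsection=' + section
                + '&format=json&titles=' + encodeURIComponent(title);
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url, true);
        xhr.onreadystatechange = function () {
            if (xhr.readyState === 4 && xhr.status === 200) {
                var pages = JSON.parse(xhr.responseText).query.pages;
                for (var id in pages) {
                    var box = document.getElementById(textareaId);
                    box.value = pages[id].revisions[0]['*'] + '\n' + box.value;
                }
            }
        };
        xhr.send();
    }

    // e.g. pull section 2 of a hypothetical article into the edit box:
    prependSection('Example article', 2, 'wpTextbox1');

Saving is the hard part: the spliced text would have to be mapped back onto the right sections on submit, which is exactly where this stops being simple.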
If we can solve this "section editing 2.0" that goes outside the box (or maybe we should skip directly to WYSIWYG editing), then we have the beginning of a whole new Wikisource interface.