On Sat, Aug 14, 2010 at 8:49 PM, Thomas Voegtlin thomasV1@gmx.de wrote:
Also, in on_body_scroll, you could avoid the for loop: divide
$('#body').position()['scrollTop'] by the height of an image.
'fraid not - sometimes the rendered text runs longer than the image, so the "row" can be higher than the image. Example: http://toolserver.org/~magnus/book2scroll/index.html (scroll down and you'll see it)
Hmm, you are right; I had a "pure scan" version in mind.
But it would be nice to have a version that does not load the text, just to see whether the WMF servers are fast enough to provide the same fluidity as the Google Books interface.
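When rows can be taller than their image, the division shortcut no longer works, but the per-scroll for loop can still be avoided by keeping a sorted list of row top offsets and binary-searching it. The following is only a minimal sketch under that assumption; `rowAtOffset` and `rowTops` are hypothetical names, not taken from the book2scroll source.

```javascript
// Sketch only: rowTops[i] is assumed to hold the top offset (in px) of row i,
// sorted ascending, measured once after the rows have been laid out.
function rowAtOffset(rowTops, scrollTop) {
  if (rowTops.length === 0) {
    return -1; // no rows rendered yet
  }
  let lo = 0;
  let hi = rowTops.length - 1;
  // Find the last row whose top is at or above the scroll position.
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (rowTops[mid] <= scrollTop) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}
```

The offsets would have to be recomputed whenever a row changes height, for example after its text has loaded.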
I don't think the text retrieval is the slow step here...
For the size quantization, I think it is better to request a desired width than a desired height; the API does not give you exactly the height you request. In addition, if you quantize the width, you are more likely to request thumbs that ProofreadPage has already created.
I've switched to specifying the width rounded to the nearest 100; however, the API still gives me off-by-one images (599 instead of 600 px). I could hack the API thumbnail URL, though. Better yet, I can probably skip that step entirely after the first one...
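For illustration, rounding a requested width to the nearest 100 px could look like the sketch below. The function name `quantizeWidth` and the lower bound of one step are assumptions made for this example, not details from the tool itself.

```javascript
// Sketch only: round a desired thumbnail width to the nearest multiple of
// `step` (100 px in this thread), so that repeated requests tend to hit
// sizes that have already been rendered.
function quantizeWidth(desiredWidth, step = 100) {
  const rounded = Math.round(desiredWidth / step) * step;
  // Assumption: never ask for a zero-width thumbnail.
  return Math.max(step, rounded);
}
```

For example, `quantizeWidth(640)` gives 600 and `quantizeWidth(650)` gives 700.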
Also, for the text, I just had a crazy idea: instead of requesting the text of each page, you can do a single request for the whole book, using &action=parse (pass the <pages/> command to it, as in this script: http://wikisource.org/wiki/MediaWiki:Dictionary.js).
Then we can split the returned string with a regexp that detects the page breaks (they are in a special span element), and place each piece in the corresponding div. Things will break whenever an HTML formatting element ends on a different page than where it begins, but we could write a function that balances the missing elements.
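As a rough sketch of the splitting step only: the marker format below (`<span class="pagebreak" data-page="N"></span>`) is a made-up placeholder, since the thread does not show the actual span that marks page breaks, and `splitByPageBreaks` is a hypothetical name. Balancing unclosed elements is not attempted here.

```javascript
// Sketch only. PAGE_BREAK is a hypothetical marker; the real page-break span
// in the parsed output would need to be inspected and the regexp adapted.
const PAGE_BREAK = /<span class="pagebreak" data-page="(\d+)"><\/span>/;

function splitByPageBreaks(html) {
  // With a capture group, split() interleaves the captured page numbers:
  // [textBeforeFirstMarker, page1, html1, page2, html2, ...]
  const parts = html.split(PAGE_BREAK);
  const pages = [];
  for (let i = 1; i < parts.length; i += 2) {
    pages.push({ page: Number(parts[i]), html: parts[i + 1] });
  }
  return pages;
}
```

Each returned fragment may still contain tags opened on one page and closed on another, which is exactly the breakage described above.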
Why load a giant text and then hack around on broken HTML, when I can just query each page individually? It's not really slow, at least not in Google Chrome.
Meanwhile, I added a feature to hide "header elements" like the proofread line, which kind of disrupts the reading flow. There's a checkbox to toggle header display.
And for Klaus, I added de.wikisource: http://toolserver.org/~magnus/book2scroll/index.html?lang=de&numlen=3&am...
Cheers, Magnus