On Sat, Aug 14, 2010 at 8:49 PM, Thomas Voegtlin <thomasV1@gmx.de> wrote:
> Also, in on_body_scroll, you could avoid the for loop by dividing
> $('#body').position()['scrollTop'] by the height of an image.
'fraid not - sometimes the rendered text runs longer than the image,
so the "row" can be taller than the image. Example:
(scroll down and you'll see it)
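For what it's worth, the loop in question boils down to a cumulative-height lookup. A minimal sketch (not the reader's actual code; in the real page the per-row heights would be measured from the DOM):

```javascript
// Map a scroll offset to the index of the row currently in view.
// Heights are per-row, because a row can be taller than its scan
// image when the rendered text runs longer than the image.
function rowAtScrollTop(rowHeights, scrollTop) {
    var offset = 0;
    for (var i = 0; i < rowHeights.length; i++) {
        offset += rowHeights[i];
        if (scrollTop < offset) {
            return i;
        }
    }
    return rowHeights.length - 1; // scrolled past the last row
}
```

Dividing scrollTop by a fixed image height would only work if every row had the same height, which is exactly what fails here.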
> hmm, you are right ; I had a "pure scan" version in mind.
> But it would be nice to have a version that does not load
> the text, just in order to see if the WMF servers are fast
> enough to provide the same fluidity as the Google Books reader.
I don't think the text retrieval is the slow step here...
> For the size quantization, I think it is better to specify
> a desired width than a desired height ; the API does not
> give you exactly the height you request. In addition, if
> you quantize the width you are likely to request thumbs
> that are already created by ProofreadPage.
I've switched to specifying the width rounded to the nearest 100; however,
the API still gives me off-by-one images (599 instead of 600 px). I could hack
the API thumbnail URL, though. Better yet, I can probably skip that
step entirely after the first one...
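The rounding step itself is trivial, but worth pinning down since it decides which thumbs get reused. A sketch under the assumptions above (100 px step; the function name is mine):

```javascript
// Quantize a requested thumbnail width to a multiple of 100 px, so
// repeated requests are likely to hit thumbs the wiki has already
// rendered (e.g. by ProofreadPage). The API may still return an
// image a pixel off the requested size, so callers should not
// assume an exact match.
function quantizeWidth(width, step) {
    step = step || 100;
    return Math.max(step, Math.round(width / step) * step);
}
```

The quantized value would then go into the thumbnail request as a width (e.g. imageinfo's iiurlwidth) rather than a height.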
> Also, for the text, I just had a crazy idea : instead of
> requesting the text of each page, you can do a single
> request for the whole book, using &action=parse (pass the
> <pages/> command to it, as in this script :
> Then we can split the returned string with a regexp that detects
> the page breaks (they are in a special span element), and place
> it in the corresponding divs ; things will break whenever an HTML
> formatting element ends on a different page than where it
> begins, but we could write a function that balances the tags.
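The split step of that idea might look like this. The marker pattern is an assumption on my part: I'm guessing the page break is a span carrying a "pagenum" class, which should be checked against real &action=parse output, and this sketch does nothing about tags left unbalanced across a page boundary:

```javascript
// Split parsed book HTML at assumed ProofreadPage-style page-break
// markers. ASSUMPTION: breaks are <span> elements with a "pagenum"
// class; adjust the pattern to whatever the parser actually emits.
function splitAtPageBreaks(html) {
    return html.split(/<span[^>]*\bclass="[^"]*pagenum[^"]*"[^>]*>[\s\S]*?<\/span>/);
}
```

Each fragment would then be dropped into the div of the corresponding page.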
Why load a giant text and then hack around on broken HTML, when I can
just query each page individually? It's not really slow, at least not
in Google Chrome.
Meanwhile, I added a feature to hide "header elements" like the
proofread line, which kind of disrupts the reading flow. There's a
checkbox to toggle header display.
And for Klaus, I added de.wikisource: