This seems like a very weird way to do things. Why is the book being split up by page to begin with? For optimal reading, you should put a lot more than one book-page's worth of content on each web page. It's hard to say what an appropriate fix is if I don't know why this is being done to begin with.
This is being done so that the text can be proofread, and so that it is more trustworthy than text without a scan.
Is the idea that the pages should later be transcluded into one big page, and they're only temporarily on separate pages for proofreading purposes? If so, why not just have the extension that displays the wikitext and DjVu pages side by side (ProofreadPage?) display a bunch of pages at once?
Yes, this is precisely what the extension does. It has a tag that transcludes a contiguous range of pages. Single pages displayed alongside their scans are used for proofreading, but they are not the final result.
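For illustration, a transclusion of two consecutive pages with the ProofreadPage <pages> tag might look roughly like the sketch below. The index name "Example.djvu" and the page numbers are hypothetical and only serve as placeholders.

```
<!-- Hypothetical example: transclude pages 12 to 13 of the index "Example.djvu" into one wiki page -->
<pages index="Example.djvu" from=12 to=13 />
```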
You could then put all indivisible content on the page where it begins, so put the full ref text on the first page.
This is one of the workarounds that we have been using; in the example that I have posted, you can see that we did exactly that.
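As a rough sketch of this workaround (a made-up illustration, not the example posted earlier; the page names and the text are hypothetical), the whole footnote is placed on the page where it begins, and its continuation is left out of the following page:

```
<!-- Hypothetical Page:Example.djvu/12 -->
Body text of page 12.<ref>Beginning of the footnote as printed on page 12, followed by its continuation, which the book prints at the bottom of page 13.</ref>

<!-- Hypothetical Page:Example.djvu/13 -->
Body text of page 13. The continuation of the footnote is not repeated here.
```

The footnote then renders as a single reference once the pages are transcluded, but on page 13 the proofread text no longer matches what the scan shows.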
However, this solution is not satisfactory, and it is too difficult for many users. Even without the multiple-pages problem, it is difficult to convince contributors to use the <ref> tag, because it requires moving the footnote's text away from its original location (remember that we start from OCR text). When they see a footnote that spans two or more pages, they tend to reject <ref> and to favor templates instead, combined with section transclusion (which makes the transclusion work very complicated).
Some contributors have devised tricks that combine the <ref> tag with section transclusion, in order to keep the footnote text alongside the scan. However, this results in complicated formulas that are unacceptably difficult for most users.
If you can see multiple pages at once, this isn't much harder to proofread, since you can just look down a bit.
I guess you also mean "see multiple scans at once". This would be another workaround, but it would not deter contributors from trying to leave the text where it was in the first place, and where we think it belongs.
We have been dealing with this problem for several years now, and all the solutions that we have found have drawbacks. I do not think that we can solve this without extending the tool that manages references.
Thomas