Just to pin down our present thoughts/"discoveries":
1. the ABBYY OCR procedure outputs an _abbyy.xml file, containing full detail about the multi-level text structure and character-by-character information about formatting and recognition quality (a sketch of this data follows the list below); IA publishes the _abbyy.xml file as an _abbyy.gz file;
2. some of the _abbyy.xml data is wrapped into the IA djvu text layer; the multi-level structure is preserved, but the per-character details are discarded;
3. MediaWiki extracts only the "pure text" from the djvu text layer, discarding all the other multi-level data of the djvu layer, and loads that text into new nsPage pages;
4. finally & painfully, wikisource users add formatting back into the raw text; to a large extent, they rebuild from scratch data that was already present in the original, source _abbyy.xml file and, in part, in the djvu text layer. :-(
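For concreteness, here is a minimal sketch in Python of the kind of per-character data that sits in an _abbyy.gz file and never reaches the nsPage text. It assumes the usual ABBYY FineReader XML schema with <formatting> and <charParams> elements; the exact namespace URI and attribute names vary between ABBYY versions, and the filename is just an example.

<pre>
# Minimal sketch, not a definitive implementation: walk an IA *_abbyy.gz file
# and print each formatting run (font, size, bold/italic) together with the
# count of characters ABBYY flagged as "suspicious" (low recognition quality).
import gzip
import xml.etree.ElementTree as ET

def dump_formatting(abbyy_gz_path):
    with gzip.open(abbyy_gz_path, "rb") as fh:
        for _event, elem in ET.iterparse(fh):
            tag = elem.tag.rsplit("}", 1)[-1]          # strip XML namespace
            if tag == "formatting":
                # Style attributes carried by the run (assumed names: ff, fs, bold, italic)
                style = {k: v for k, v in elem.attrib.items()
                         if k in ("ff", "fs", "bold", "italic")}
                chars, suspicious = [], 0
                for char in elem:
                    if char.tag.rsplit("}", 1)[-1] == "charParams":
                        chars.append(char.text or "")
                        if char.get("suspicious") == "true":
                            suspicious += 1
                text = "".join(chars)
                if text.strip():
                    print(f"{style!r:40} {suspicious:2d} suspicious  {text}")
                elem.clear()                            # keep memory bounded

if __name__ == "__main__":
    dump_formatting("example_abbyy.gz")   # hypothetical filename
</pre>

None of the style or confidence information printed above survives into the djvu text layer's word boxes, let alone into the plain text MediaWiki loads.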
This seems deeply unsound IMHO, doesn't it?
Alex