On Tue, Nov 25, 2014 at 6:34 PM, Dominic McDevitt-Parks <mcdevitd@gmail.com> wrote:
You have a good point, though. One of the differences between Wikisource and most other platforms is that it is actually richly formatted. It's kind of a shame to strip all that formatting information out when extracting the transcriptions. (Though many destinations wouldn't know what to do with formatted text anyway.) 

I think this is a crucial issue.
Many projects do give you the possibility to download a .txt file, which is OK for digital preservation, but I challenge anyone to actually read a book as plain text. :-)
I believe that accessibility means having good ebooks that are readable on many devices.
IMHO, what Tpt has done with his EPUB tool is remarkable: a nice, quick tool for generating fairly well-formatted ebooks, allowing readers to actually read an ebook on a Kindle or a Kobo (or a tablet). It also works with both Index and ns0 books. The problem is that the output is not perfectly formatted, of course, and the tool is not integrated into MediaWiki.
When I put the link to the ebook converter directly in the Header template, the stats skyrocketed: in a few weeks we had thousands of downloads (see it here: http://wsexport.wmflabs.org/tool/stat.php).

So, readability is one big issue. We are here to be read.

Structured formats are good for export, integration with different libraries, standardisation, and so on. They are fundamental, I think, for the development of the whole project. If we convince the WMF to put in some permanent staff time, many things could be achieved :-)


PS: the script is bad, I warn you, but that's what I have come up with so far. I hope to improve it in the coming weeks. If you can make it better, please do :-)