OK, I'll do that. I hate to move many MB around the web without a real and strong need... but I hope to build some tools to help users while they contribute, and this, IMHO, is one of the best justifications for using bandwidth and server time.
Alex
2013/11/29 Thomas Tanon thomaspt@hotmail.fr
For these use cases I think that downloading the file is the best way to do it. It's very quick because the connection between Labs and the other Wikimedia clusters is very good.
Thomas
On 29 Nov 2013 at 18:25, Alex Brollo alex.brollo@gmail.com wrote:
Thanks Thomas, but I'm looking for something much subtler: I'm looking for the mapped OCR text with every possible detail, i.e. I need at least the output of djvutxt, djvudump and djvused, and obviously a copy of the djvu file.
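For reference, a minimal sketch of the djvulibre invocations that produce those three outputs (this only builds the command strings; it assumes the djvulibre CLI tools are installed on the machine doing the processing, and the file name is a placeholder):

```python
# Sketch: the djvulibre command lines that dump the OCR/text data
# Alex mentions. shlex.quote keeps file names with spaces safe.
import shlex

def djvu_commands(path):
    """Return the djvulibre invocations for a given DjVu file."""
    p = shlex.quote(path)
    return {
        "text": f"djvutxt {p}",                    # plain text layer
        "structure": f"djvudump {p}",              # component/page layout
        "mapped_text": f"djvused -e print-txt {p}" # hidden text with coordinates
    }

for name, cmd in djvu_commands("book.djvu").items():
    print(name, "->", cmd)
```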
Presently I can't follow the Wikidata adventure nor the metadata flow; I'm focusing my interest on tools to help users while editing/formatting the text of pages.
Alex
2013/11/29 Thomas Tanon thomaspt@hotmail.fr
There is already a lot of data in the img_metadata field of the image table. I hope all the data you are looking for is in it.
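That img_metadata blob is also exposed through the MediaWiki API as the `metadata` property of `imageinfo`, so it can be fetched without database access. A small sketch that builds such a query URL (the file name is a placeholder):

```python
# Sketch: build a Commons API query for a file's stored metadata
# (the img_metadata field), using prop=imageinfo with iiprop=metadata.
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"

def imageinfo_url(filename):
    """URL that returns the stored metadata for File:<filename>."""
    params = {
        "action": "query",
        "titles": f"File:{filename}",
        "prop": "imageinfo",
        "iiprop": "metadata",
        "format": "json",
    }
    return API + "?" + urlencode(params)

print(imageinfo_url("Example.djvu"))
```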
Thomas
On 28 Nov 2013 at 23:57, Alex Brollo alex.brollo@gmail.com wrote:
I feel uncomfortable about downloading large files just to use a little bit of their data... I presume that djvu files are saved as "bundled" files; do you have any news about saving them as "indirect" files, i.e. as single pages plus an index file and some other rather small pieces? Who could give me some details about djvu file storage, and about projects to develop their management?
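For context, djvulibre ships a tool, djvmcvt, that does exactly this bundled-to-indirect conversion locally. A minimal sketch that builds the command line (file and directory names are placeholders; it assumes djvmcvt is available):

```python
# Sketch: the djvmcvt invocation that splits a bundled DjVu into
# "indirect" form: one small file per page plus an index file.
def bundled_to_indirect(bundled_path, out_dir, index_name="index.djvu"):
    """Command list for: djvmcvt -i <bundled> <dir> <index>."""
    return ["djvmcvt", "-i", bundled_path, out_dir, index_name]

# This could then be run with subprocess.run(cmd, check=True).
cmd = bundled_to_indirect("book.djvu", "pages")
print(" ".join(cmd))
```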
I opened a "bold bug" in Bugzilla asking for some API actions bridging the API and the djvulibre routines; who, in your opinion, is an API developer who might be interested in such a rough idea?
Alex
2013/11/28 Federico Leva (Nemo) nemowiki@gmail.com
Alex Brollo, 28/11/2013 20:36:
I'll try to test some routines to manage both the image and text layers of
our it.source djvu files. My question: do I have to download them from Commons, or is there a way to access them (or a copy of them) in some folder on Labs without having to (painfully) fetch a copy?
No, Labs doesn't have media; only XML dumps and pageview stats. Downloading from upload.wikimedia.org is supposed to be rather fast, though.
Nemo
Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l