Here's an example of a remarkable publication that we should support capturing, in its elements and its final layout, to support reuse and sharing in other sorts of documents:
http://skateistan.org/skateistan_blog/out-now-student-mag-arts-skateboarding
http://www.skateistan.org/PDFs/Bridge-Final.pdf
We need to improve automation for adding these sorts of things to Wikisource: scripts to request and capture license information, to batch-upload PDFs, and to extract individual images and text from source files, upload them separately, and approximate the original layout.
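One small piece of this could look like the Python sketch below. It assumes the PDF text has already been extracted with a tool such as `pdftotext`, which separates pages with form feed characters, and it only plans one `Page:` title per PDF page; it uploads nothing. The function name `plan_pages` is hypothetical, and license capture, image extraction and layout are left out.

```python
def plan_pages(pdf_name, extracted_text):
    """Map Wikisource-style 'Page:<file>/<n>' titles to the text of each PDF page.

    Assumption: pages in extracted_text are separated by form feed characters.
    """
    chunks = extracted_text.split("\f")
    # A form feed after the last page leaves one empty trailing chunk; drop it.
    if chunks and chunks[-1].strip() == "":
        chunks.pop()
    # Blank pages in the middle are kept so that page numbers stay aligned with the PDF.
    return {
        f"Page:{pdf_name}/{number}": chunk.strip()
        for number, chunk in enumerate(chunks, start=1)
    }


if __name__ == "__main__":
    sample = "Cover text\fSecond page text\f"
    for title, text in plan_pages("Bridge-Final.pdf", sample).items():
        print(title, "->", text)
```

The resulting dictionary could then be handed to whatever upload script is used, one wiki page per entry.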
Sam.
I think it's time to bring this issue up: how can we efficiently manage born-digital documents?
I think Wikisource has developed a pretty amazing workflow for digitized documents: transcribing and proofreading them with the Proofread Page extension. But every time we use this extension on Wikisource to transcribe a PDF back into text (for theses, CC-BY-SA books, etc.), I feel something is wrong.
We don't have many tools (or, if we do, they are scattered across different places) to automatically extract formatted text from such files, and afaik we can't simply take a LaTeX source and upload it to Wikisource.
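To make the LaTeX case concrete, here is a very rough Python sketch of what such a tool might start from. It is only an illustration under strong assumptions: it handles four commands (`\section`, `\subsection`, `\textbf`, `\emph`/`\textit`) with regular expressions, and the names `RULES` and `latex_to_wikitext` are made up for this example. Real LaTeX sources would need a proper parser.

```python
import re

# Each rule maps one simple LaTeX command to its wikitext equivalent.
# The argument must not contain braces; nesting is handled by repeating the pass.
RULES = [
    (re.compile(r"\\section\{([^{}]*)\}"), r"== \1 =="),
    (re.compile(r"\\subsection\{([^{}]*)\}"), r"=== \1 ==="),
    (re.compile(r"\\textbf\{([^{}]*)\}"), r"'''\1'''"),
    (re.compile(r"\\(?:emph|textit)\{([^{}]*)\}"), r"''\1''"),
]


def latex_to_wikitext(source):
    """Convert a handful of LaTeX commands to wikitext, innermost first."""
    previous = None
    # Repeat until nothing changes, so commands nested inside others get converted too.
    while previous != source:
        previous = source
        for pattern, replacement in RULES:
            source = pattern.sub(replacement, source)
    return source


if __name__ == "__main__":
    print(latex_to_wikitext(r"\section{Born-digital \emph{texts}}"))
```

Anything the rules do not recognise is passed through unchanged, so unconverted commands remain visible for manual cleanup.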
What do you think? Could we discuss the issue?
Aubrey
2011/6/26 Samuel Klein meta.sj@gmail.com:
-- identi.ca:sj w:user:sj +1 617 529 4266
Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l