Maybe this is of interest?
Aubrey
---------- Forwarded message ----------
From: Seth Woodworth <seth@sethish.com>
Date: Tue, Aug 19, 2014 at 8:09 PM
Subject: [open-humanities] Forking Project Gutenberg to Github
To: A list for people interested in the use of open source tools and open access in humanities teaching and research <open-humanities@lists.okfn.org>, okfn-labs <okfn-labs@lists.okfn.org>
Hello Humanities!
I've been working on a project called GITenberg (http://gitenberg.github.io).
The aim is to move Project Gutenberg's books to GitHub.
As you probably know, Project Gutenberg (PG) is an amazing organization that has been digitizing public domain books since the 1970s. They have around 45,000 books.
But PG is hesitant to upgrade its tools and has limited resources to work on new projects. Meanwhile, there are issues with the current collection: some typos and transcription errors remain, and many books use old encoding formats (PG predates Unicode).
I want to help with that, and along the way, produce something that more developers, OKFN hackers, digital humanists and other groups can readily build upon.
Enter GITenberg.
GITenberg uses git and GitHub to keep track of books. This adds a number of features right out of the gate, including:
- version control via git
- public bug tracking (PG uses a private RT instance to track reported issues)
- public collaboration (pull requests under public review)
PG's metadata is provided in RDF/XML, in a 230 MB zip file. While this is a wonderful resource, RDF isn't the easiest format for most developers to pick up and use. In fact, the .zip file has so many top-level folders that it can't be completely unpacked on some filesystems (ext3).
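For readers who hit the unpacking problem, one workaround is to read the metadata straight out of the archive instead of extracting it. The following is a minimal sketch, not part of the original message: it assumes the catalog is a plain zip whose members are RDF/XML files ending in `.rdf`, and that each record carries a Dublin Core `dcterms:title` element. The function name `iter_titles` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Dublin Core "title" in ElementTree's {namespace}tag notation.
# Assumption: the catalog records use this element for book titles.
DCTERMS_TITLE = "{http://purl.org/dc/terms/}title"


def iter_titles(zip_path):
    """Yield (member_name, title) for each .rdf member of the archive.

    Members are parsed directly from the zip, so no folders are
    ever created on disk.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".rdf"):
                continue  # skip directories and non-RDF members
            with archive.open(name) as member:
                root = ET.parse(member).getroot()
            title = root.find(f".//{DCTERMS_TITLE}")
            # title is None when the record has no dcterms:title element
            yield name, (title.text if title is not None else None)
```

Usage might look like `for name, title in iter_titles("catalog.zip"): print(name, title)`, where `catalog.zip` is a placeholder path. Because nothing is written to disk, the filesystem's limit on entries per directory never comes into play.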
I've created repos and included the book source files (often including images!) for 43,000 of PG's books and put them on GitHub.
There is a lot more that I hope to do, but I would love to get OKFN's feedback, requests, or assistance!
Uploading script: https://github.com/sethwoodworth/GITenberg
Mailing list: https://groups.google.com/forum/#!forum/gitenberg-project
All the best,
Seth
_______________________________________________
open-humanities mailing list
open-humanities@lists.okfn.org
https://lists.okfn.org/mailman/listinfo/open-humanities
Unsubscribe: https://lists.okfn.org/mailman/options/open-humanities
It definitely is. It would be interesting to keep an eye on the project and see how it develops.
On Wed, Aug 20, 2014 at 5:37 PM, Andrea Zanni <aubreymcfato@gmail.com> wrote:
Maybe this is of interest?
Aubrey
-- http://aubreymcfato.wordpress.com
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l