Re: [Wikisource-l] Importing books from Project Gutenberg

16 Oct 2016


Ok, I'll use https://www.wikidata.org/wiki/Q27245478 as an example and I'll
submit it to it.source WD specialists to see if we can retrieve, or add
data for a test work.
Alex
2016-10-16 1:28 GMT+02:00 Sam Wilson sam@samwilson.id.au:
...
Hm, it should work fine for it.ws too. Can you give me a WD item for a
book with a PG ID and an it.ws Index page? I'll investigate further... :-)
One cool thing that I've only recently found is this list of PG's sources:
http://www.pgdp.net/c/tools/project_manager/show_image_sources.php (you
need to log in)
It's not very structured, but it's the only place I've found that links a
PG ID to a scan on the Internet Archive or elsewhere. I'm thinking of
writing a scraper to get the data so that it can at least link more PG IDs
and IA identifiers on Wikidata.
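
A minimal sketch of the extraction step such a scraper might need, assuming the page text contains ordinary archive.org URLs (details, stream, or download links). The function name and the URL pattern are illustrative assumptions; how each identifier is paired with a PG ID depends on the page's actual layout, which is not described here.

```python
import re

# Assumption: scans are referenced by plain archive.org URLs of these forms.
IA_URL_RE = re.compile(
    r"archive\.org/(?:details|stream|download)/([A-Za-z0-9._-]+)"
)


def extract_ia_identifiers(text):
    """Return the distinct Internet Archive identifiers found in text, in order."""
    seen = []
    for match in IA_URL_RE.finditer(text):
        # Drop a trailing full stop picked up from sentence punctuation.
        identifier = match.group(1).rstrip(".")
        if identifier and identifier not in seen:
            seen.append(identifier)
    return seen
```

Each identifier found this way could then be recorded next to the PG ID of the project it was listed under, once that pairing is known.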
—Sam
On 13/10/16 23:27, Andrea Zanni wrote:
I think the idea is good,
but I would like to try it on my Wikisource:
could you also include the few Italian books that PG has?
Thanks!
On Fri, Oct 14, 2016 at 8:23 AM, Anika Born WikiAnika@wikipedia.de
wrote:
...
corr1: [...] does not ha*ve*/show the scans, [...]
Anika
2016-10-14 8:18 GMT+02:00 Anika Born WikiAnika@wikipedia.de:
...
Hi Sam,
that would be good, because PG does not have/show the scans.
But
as I remember, there was/is a policy at de.ws not to use texts from
other projects (that is: if there is a text A in PG, there won't be a similar
text A in de.WS),
because back when de.WS did use PG texts, Google treated WS as a mirror
of PG, and all the other (non-PG) texts were left out of Google search results
as well. The (small) visibility of WS was lost completely. That is
the reason why there are no new projects on de.WS for texts that are
available in a (nearly) similar project
(besides the effort: why spend so much time on a text that is already
available? You'd have to proofread it at least twice).
But that is a specifically German thing.
What do the others think about it?
Anika
2016-10-14 3:20 GMT+02:00 Sam Wilson sam@samwilson.id.au:
...
Hi all,
I've been tinkering with an idea I've had for importing Project
Gutenberg books into Wikisource: http://tools.wmflabs.org/pg2ws/
The idea is that, if Wikidata makes a link between a PG ID number and a
Wikisource Index page, then we can go through that Index page one page at a
time, and copy the page's text from the PG book to the WS page.
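
As a rough illustration of the lookup described above, here is a sketch that asks the Wikidata Query Service for items carrying both a PG ID and a Wikisource Index page. It assumes the relevant properties are P2034 (Project Gutenberg ebook ID) and P1957 (Wikisource index page URL), which should be double-checked on Wikidata; the function names and the User-Agent string are made up for this example and are not taken from the pg2ws tool.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=50):
    """Build a SPARQL query for items with a PG ID and an Index page URL."""
    return (
        "SELECT ?item ?pgId ?indexUrl WHERE {\n"
        "  ?item wdt:P2034 ?pgId .\n"      # assumed: Project Gutenberg ebook ID
        "  ?item wdt:P1957 ?indexUrl .\n"  # assumed: Wikisource index page URL
        "}\n"
        f"LIMIT {int(limit)}"
    )


def parse_bindings(result):
    """Flatten a SPARQL JSON result into a list of plain dicts."""
    rows = []
    for binding in result["results"]["bindings"]:
        rows.append({
            # The entity URI ends in the Q-number, e.g. .../entity/Q123
            "item": binding["item"]["value"].rsplit("/", 1)[-1],
            "pg_id": binding["pgId"]["value"],
            "index_url": binding["indexUrl"]["value"],
        })
    return rows


def fetch_links(limit=50):
    """Run the query over HTTP and return the parsed rows."""
    params = urllib.parse.urlencode(
        {"query": build_query(limit), "format": "json"}
    )
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        headers={"User-Agent": "pg2ws-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_links(10)` would return up to ten candidate books, each with the Q-number, the PG ID, and the Index page to walk through.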
The interface so far isn't very brilliant, but I'm just trying to
figure out if this is worthwhile or not. Basically, it's a matter of
selecting the right chunk of text in the right-most text box (the full PG
text) and hitting the button to move it left into the centre box. Then
cleaning it up (manually and with the magic cleaning button) to make it
match the image, and then uploading it to Wikisource.
It's a bad tool though, because it doesn't handle the running header,
and the copy-across button doesn't do nice things with {{hws}} etc. — not
to mention all the other things it doesn't do.
Anyway, just thought I'd mention it. :-) Anyone think this is an avenue
worth exploring? Certainly I'd love to be able to say we've got everything
PG has *and more*!
—Sam
PS changes made by this tool are all tagged as "OAuth CID: 638" —
https://en.wikisource.org/w/index.php?title=Special:RecentChanges&tagfilter=OAuth+CID%3A+638

Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

