Importing books from Project Gutenberg - Wikisource-l

14 Oct 2016

Hi Alex,

My comment was not about whether to spend some time on a PG project or
no time at all.

The point/question (when it comes to de-WS) is a different one:

(A) Do we spend some of our valuable contributions on a project
that is already freely available (in another format), or do we spend
this time on a (related) project that is NOT yet freely available?
(And we do have a lot of those.)

    // Note: this is not about spending no time on proofreading
    or on the Wikisource project. It is about finding valuable
    projects/texts to invest our time in.

+ (B) Do we spend this time on a project that may cost us the
findability of the whole Wikisource project (and all other texts
on Wikisource), because Google/Bing/others tag us as a
fork/reuser/copy of ...? (This happened in the past, at least with
de, when we had some texts from the commercial
http://gutenberg.spiegel.de/, which is also supported by ABBYY with a
free software licence.)

Anika

2016-10-14 10:13 GMT+02:00 Alex Brollo <alex.brollo@gmail.com>:

    I too am very interested in both the idea and its
    technical implementation, but I need some more docs for dummies
    to understand it fully :-(

    About importing already-proofread texts into Wikisource: a
    text in Wikisource is different from a similar text on
    another website, since it is "a node in the wiki network", and
    this goal IMHO deserves some pain to proofread (and re-format)
    it again, adding lots of wiki cross-links.

    Alex

    2016-10-14 8:27 GMT+02:00 Andrea Zanni
    <zanni.andrea84@gmail.com>:

        I think the idea is good,
        but I would like to try it on my Wikisource:
        could you also manage to take the few Italian books that
        PG has?
        Thanks!

        On Fri, Oct 14, 2016 at 8:23 AM, Anika Born
        <WikiAnika@wikipedia.de> wrote:

            corr1: [...] does not ha*ve*/show the scans, [...]

            Anika

            2016-10-14 8:18 GMT+02:00 Anika Born
            <WikiAnika@wikipedia.de>:

                Hi Sam,

                would be good, cause PG does not hat/show the scans,

                But

                as I remember, there was/is a policy at de.ws
                not to use texts from other
                projects (say: if there is a text A in PG, there
                won't be a similar text A in de.WS),

                because at the time de.WS did use PG texts, Google
                decided that WS was a mirror of PG, and all the other
                (non-PG) texts were left out of Google search results
                as well. The (small) visibility of WS was lost
                completely. That is the reason why there are no
                new projects on de-WS for texts that are
                available in a (nearly) similar project

                (besides the effort: why spend so much time on
                a text that is already available? You'd have to
                proofread it at least two times)

                But that is a specifically German thing...

                What do the others think about it?
                Anika

                2016-10-14 3:20 GMT+02:00 Sam Wilson
                <sam@samwilson.id.au>:

                    Hi all,

                    I've been tinkering with an idea I've had for
                    importing Project Gutenberg books into
                    Wikisource: http://tools.wmflabs.org/pg2ws/

                    The idea is that, if Wikidata makes a link
                    between a PG ID number and a Wikisource Index
                    page, then we can go through that Index page
                    one page at a time, and copy the page's text
                    from the PG book to the WS page.
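
The Wikidata lookup described above could be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function name is hypothetical, and the property IDs (P2034 for the Project Gutenberg ebook ID, P1957 for the Wikisource index page) are assumptions that should be checked on Wikidata before use. The function only builds the query text; sending it to a SPARQL endpoint is left out.

```python
def build_pg_index_query(pg_id: str) -> str:
    """Build a SPARQL query for items that carry the given PG ID and
    also link to a Wikisource Index page.

    Assumed (not taken from the tool): P2034 = Project Gutenberg ebook ID,
    P1957 = Wikisource index page URL.
    """
    if not pg_id.isdigit():
        # PG ebook IDs are assumed to be plain numbers; this also keeps
        # quotes and other SPARQL syntax out of the query string.
        raise ValueError("expected a numeric PG ID")
    return (
        "SELECT ?item ?indexPage WHERE {\n"
        f'  ?item wdt:P2034 "{pg_id}" .\n'
        "  ?item wdt:P1957 ?indexPage .\n"
        "}"
    )
```

Each result row would then give one Index page whose pages could be filled from the matching PG text.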

                    The interface so far isn't very brilliant, but
                    I'm just trying to figure out if this is
                    worthwhile or not. Basically, it's a matter of
                    selecting the right chunk of text in the
                    right-most text box (the full PG text) and
                    hitting the button to move it left into the
                    centre box. Then cleaning it up (manually and
                    with the magic cleaning button) to make it
                    match the image, and then uploading it to
                    Wikisource.
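
One plausible cleaning step for a selected chunk, shown only as a sketch: PG plain text is hard-wrapped, so a cleanup could join the wrapped lines of each paragraph while keeping blank-line paragraph breaks. Whether the "magic cleaning button" does anything like this is an assumption; the function name is hypothetical.

```python
import re


def unwrap_chunk(chunk: str) -> str:
    """Join hard-wrapped lines within each paragraph of a text chunk.

    Paragraphs are assumed to be separated by one or more blank lines.
    """
    paragraphs = re.split(r"\n\s*\n", chunk.strip())
    joined = []
    for paragraph in paragraphs:
        # Collapse the paragraph's lines into one space-separated line.
        lines = [line.strip() for line in paragraph.splitlines()]
        joined.append(" ".join(line for line in lines if line))
    return "\n\n".join(joined)
```

Text that must keep its line breaks (poetry, tables) would need to be excluded from such a step by hand.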

                    It's a bad tool though, because it doesn't
                    handle the running header, and the copy-across
                    button doesn't do nice things with {{hws}}
                    etc. — not to mention all the other things it
                    doesn't do.
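
For the {{hws}} case, a copy-across step might handle a word broken across two pages roughly like this. It is a sketch under assumptions: the function name is hypothetical, it uses the {{hws|part|whole}} and {{hwe|part|whole}} parameter order used on English Wikisource, and it treats every page-final hyphen as a line-break hyphen, which is wrong for real compounds such as "well-known".

```python
import re


def mark_hyphenated_word(page_text: str, next_page_text: str) -> tuple[str, str]:
    """Wrap a word split across a page break in {{hws}} / {{hwe}}.

    Returns both texts unchanged if the first page does not end in a
    hyphenated word fragment.
    """
    end = re.search(r"(\w+)-\s*$", page_text)
    start = re.match(r"\s*(\w+)", next_page_text)
    if not end or not start:
        return page_text, next_page_text
    first, second = end.group(1), start.group(1)
    whole = first + second
    new_page = page_text[:end.start()] + "{{hws|" + first + "|" + whole + "}}"
    new_next = "{{hwe|" + second + "|" + whole + "}}" + next_page_text[start.end():]
    return new_page, new_next
```

A human check would still be needed for each case, since the code cannot tell a soft hyphen from a hard one.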

                    Anyway, just thought I'd mention it. :-)
                    Anyone think this is an avenue worth
                    exploring? Certainly I'd love to be able to
                    say we've got everything PG has /and more/!

                    —Sam

                    PS changes made by this tool are all tagged as
                    "OAuth CID: 638" —

                    https://en.wikisource.org/w/index.php?title=Special:RecentChanges&tagfilter=OAuth+CID%3A+638

                    _______________________________________________
                    Wikisource-l mailing list
                    Wikisource-l@lists.wikimedia.org
                    <mailto:Wikisource-l@lists.wikimedia.org>
                    https://lists.wikimedia.org/mailman/listinfo/wikisource-l
                    <https://lists.wikimedia.org/mailman/listinfo/wikisource-l>
