Wow, what's Wikipedia's policy about using a bot to scrape everything?
On Sat, Jun 20, 2009 at 2:47 PM, Brian <Brian.Mingus(a)colorado.edu> wrote:
That is against the law. It violates Google's ToS.
I'm mostly complaining that Google is being Very Evil. There is nothing we
can do about it except complain to them, which I don't know how to do. They
apparently believe that the plain text versions of their books are akin to
their intellectual property and are unwilling to give them away.
On Sat, Jun 20, 2009 at 12:34 PM, Falcorian
<alex.public.account+WikimediaMailingList@gmail.com> wrote:
So the bot just has to run at human speeds so it does not get banned. It
still won't get tired or make unpredictable mistakes. And you can run it
from different IPs to parallelize.
--Falcorian
On Sat, Jun 20, 2009 at 11:04 AM, Brian <Brian.Mingus(a)colorado.edu> wrote:
Not likely. I've been banned from Google's regular search at least a dozen
times during semi-frenetic search sprees in which I was identified as a
bot. There is no doubt that if you try to automate it you will be quickly
shot down.
On Sat, Jun 20, 2009 at 12:02 PM, Platonides <Platonides(a)gmail.com> wrote:
>
> > Brian wrote:
> > > Unfortunately the only way I've found to download the full text of a
> > > public domain book from Google is to flip through the book a page at a
> > > time, copying the text to your clipboard.
> > > There are roughly 2-3 million public domain books in Google Books.
> >
> > That's easy to fix :)
> >
> >
> > _______________________________________________
> > foundation-l mailing list
> > foundation-l(a)lists.wikimedia.org
> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l