[Foundation-l] excluding Wikipedia clones from searching
WJhonson at aol.com
Fri Dec 10 21:19:42 UTC 2010
In a message dated 12/10/2010 1:10:26 PM Pacific Standard Time,
jamesmikedupont at googlemail.com writes:
> My point is we should index them ourselves. We should have the pages
> used as references first listed in an easy to use manner and if
> possible we should cache them. If they are not cacheable because of
> some restrictions, the references should be marked somehow as not as
> good and people might find better references. In the end, like
> citeseer you will find that pages that are available and open and
> cacheable will be cited and used more than pages that are not.
>
> Right now, I don't know of a simple way to even get this list of
> references from wp. There is a lot of work to do, and if we do this, it
> will benefit the wikipedia. Another thing to do is to translate the
> pages referenced.
>
> mike
>
I understand your point, but you're avoiding answering the points I raised.
They are archived at archive.org by permission. You tell archive.org to
archive your site, and they do. You tell them to stop, and they do.
What advantage would we gain by repeating the caching that
archive.org is already doing? You haven't answered that.
No matter what occurs, you're going to have trouble retrieving the list of
refs from a WP page (or any web page), without knowing some programming
language like PHP. Using PHP it's a fairly trivial parsing task. If
that's your only problem, I can write you a script to do it, for twenty bucks.
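
To show how small the parsing job is, here is a minimal sketch in Python
rather than PHP, using only the standard library. It assumes you already
have the rendered HTML of the page as a string, and that reference links
are anchors carrying the "external" class, which is how MediaWiki marks
off-site links as far as I know. The names ExternalLinkCollector and
extract_reference_links are made up for this illustration.

```python
from html.parser import HTMLParser


class ExternalLinkCollector(HTMLParser):
    """Collect href values of <a> tags whose class list contains 'external'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Assumption: off-site links are marked with class="external ...".
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if "external" in classes and href:
            self.links.append(href)


def extract_reference_links(html):
    """Return the external link targets found in html, without duplicates."""
    parser = ExternalLinkCollector()
    parser.feed(html)
    parser.close()
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(parser.links))
```

Fetching the page is left out on purpose; feed the function whatever HTML
you have saved. Note that it picks up every external link on the page, not
only those inside the reference list, so it may need narrowing.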
You cannot translate a work that is under copyright protection without the
rights holder's permission; doing so violates their copyright. Copyright
extends to any effort that substantially mimics the underlying work, and a
translation is a derivative work of the original.
You could however make a parody :)
W