Well, let's backtrack.
The original question was: how can we exclude Wikipedia clones from the search?
My idea was to create a search engine that includes only the references
cited on Wikipedia.
Then the idea grew into making our own engine instead of relying only on Google.
Let's agree that we first need a list of references; we can discuss
the details of the searching later.
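As a rough sketch of the clone-exclusion step: one simple approach is to filter search results against a maintained blocklist of known mirror domains. The domain names and the function below are hypothetical placeholders, not an existing tool.

```python
from urllib.parse import urlparse

# Hypothetical blocklist of known Wikipedia mirror/clone domains.
# A real deployment would maintain a much longer, curated list.
MIRROR_DOMAINS = {"wiki-mirror.example", "wpclone.example"}

def filter_mirrors(result_urls):
    """Drop search results hosted on known Wikipedia clone domains."""
    kept = []
    for url in result_urls:
        host = urlparse(url).netloc.lower()
        # Match the blocked domain itself and any subdomain of it.
        if any(host == d or host.endswith("." + d) for d in MIRROR_DOMAINS):
            continue
        kept.append(url)
    return kept
```

The same check could run either as a post-filter on an existing engine's results or inside a crawler that builds the reference index.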
thanks,
mike
On Fri, Dec 10, 2010 at 11:02 PM, <WJhonson(a)aol.com> wrote:
In a message dated 12/10/2010 1:31:20 PM Pacific
Standard Time,
jamesmikedupont(a)googlemail.com writes:
If we prefer pages that can be cached and translated, and mark those
that cannot, then by natural selection we will, in the long term,
replace the pages that are not allowed to be cached with ones that
can be.
My suggestion is for a Wikipedia project, something to be supported
and run on the Toolserver or similar.
I think if you were to propose that we should "prefer" pages that "can be
cached and translated" you'd get a firestorm of opposition.
The majority of our refs, IMHO, are still under copyright. This is because
they are either web pages created by various authors who do not specify a
free license (and which therefore automatically enjoy copyright protection
under U.S. law), or references to relatively current works that are cited,
for example, in Google Books preview mode or in Amazon look-inside pages.
I still cannot see any reason why we would want to cache anything like
this. You haven't addressed what benefit caching refs gives us.
My last question here is not about whether we can or how, but how does it
help the project?
How?
W
--
James Michael DuPont
Member of Free Libre Open Source Software Kosova and Albania
flossk.org flossal.org