I know this is not new, but I'm rather pissed off to see Google return several commercial sites carrying all Wikipedia articles for a query that explicitly includes "Wikipedia" as a search term, and rank them above the real thing.
I figure more and more of these sites will pop up when people realize how easy it is to make money this way, and probably loads of it.
E.g. "rembrandt wikipedia" http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&am... a+rembrandt shows Wikipedia only behind two other sites, one of which uses "Rembrandt - Wikipedia" even in the <title>..</title> tags, which is shown prominently in the Google response.
I know the GFDL is very permissive, but is there nothing we can do about this? Even if we don't want the money, which is also an old discussion, we might at least attract more contributors if parasitic sites (not all mirrors are in this category) were less successful.
Could we not strike a deal with Google, similar to the one with Yahoo, by which Google favours the original content over outdated copies? That should be in their interest too.
Erik Zachte
ErikZ-
Could we not strike a deal with Google, similar to the one with Yahoo, by which Google favours the original content over outdated copies? That should be in their interest too.
That seems like the most reasonable course of action. Note, however, that Google is notoriously unresponsive to any queries regarding search result ranking (and who could blame them). Approaching this from the bottom up could be difficult. Do we have anyone with contacts among reasonably high-level Google personnel? Has anyone at Google ever said something about Wikipedia?
There are other options aside from downranking Wikipedia mirrors. Google does have a deal with reference.com, whereby when you enter a word like "elephant", it provides a link to a dictionary [definition]. I'm not sure if reference.com pays them for that, and if so, how much. I could imagine a similar link [encyclopedia article] for searches which match Wikipedia titles.
We do need to enforce the FDL requirements more systematically. phatnav.com, for example, leeches our images, but does *not* provide a backlink to Wikipedia articles and even calls itself "a Wikipedia". This is a violation of the FDL and of our trademark.
While these mirrors do hurt our rank, they have the effect of getting multiple Wikipedia articles into many search queries. I hope this will not be perceived as search engine spamming.
Regards,
Erik
Hi,
We do need to enforce the FDL requirements more systematically. phatnav.com, for example, leeches our images, but does *not* provide a backlink to Wikipedia articles and even calls itself "a Wikipedia". This is a violation of the FDL and of our trademark.
BTW: I wrote them a short notice that I don't like them using my user page without proper GFDL compliance, and I only got back a one-sentence mail stating they will update their encyclopedia part in summer.
Till Westermayer
On Apr 11, 2004, at 11:04 PM, Erik Moeller wrote:
There are other options aside from downranking Wikipedia mirrors. Google does have a deal with reference.com, whereby when you enter a word like "elephant", it provides a link to a dictionary [definition]. I'm not sure if reference.com pays them for that, and if so, how much. I could imagine a similar link [encyclopedia article] for searches which match Wikipedia titles.
I've thought about this possibility for a while. Personally, I think Reference.com is becoming a much poorer resource as it tries to become a richer company. First it was advertising, which was a pain, but not too bad. Now it seems to often favor its "premium" definitions over the standard ones, meaning if you don't subscribe you can't find definitions for certain words. I'd like to see Wiktionary one day take over this function in Google. Obviously, if Reference.com pays for the placement, which they probably do, this is pretty far-fetched, at least for a while.
Adding an encyclopedia link might be feasible/plausible, since we're not trying to replace the dictionary link.
Peter
-- ---<>--- -- A house without walls cannot fall. Help build the world's largest encyclopedia at Wikipedia.org -- ---<>--- --
I have a suggestion:
While our main page has a high pagerank, individual articles probably don't (depending on how well they're linked to). So, it would help to link to Special:Allpages from the main page (the static cached version is fine). That way, it would take at most 3 links from the main page to any article. Further, the second level pages should also be made static and updated once in a while, because crawlers don't request dynamic pages (I think).
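A minimal sketch of the link-depth argument, assuming a toy link graph: the page names (including the hypothetical "Allpages/A-M" range page) and the function click_depths are illustrative only and do not reflect the real site structure. A breadth-first search gives the fewest clicks from the Main Page to an article, with and without a direct link to Special:Allpages.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: fewest link clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical toy graph: the article index is reached through the special pages list.
via_sidebar = {
    "Main Page": ["Special:Specialpages"],
    "Special:Specialpages": ["Special:Allpages"],
    "Special:Allpages": ["Allpages/A-M"],
    "Allpages/A-M": ["Rembrandt"],
}

# Same graph, but the Main Page also links to Special:Allpages directly.
direct_link = dict(via_sidebar)
direct_link["Main Page"] = ["Special:Specialpages", "Special:Allpages"]

print(click_depths(via_sidebar, "Main Page")["Rembrandt"])  # 4
print(click_depths(direct_link, "Main Page")["Rembrandt"])  # 3
```

In this toy graph the direct link saves exactly one click; whether that changes how a real crawler treats the pages is a separate question.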
Arvind
On Mon, Apr 12, 2004 at 03:49:37AM +0200, Erik Zachte wrote:
[...]
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
Arvind Narayanan wrote:
While our main page has a high pagerank, individual articles probably don't (depending on how well they're linked to). So, it would help to link to Special:Allpages from the main page (the static cached version is fine). That way, it would take at most 3 links from the main page to any article.
It already takes at most 4 links from the Main Page to any article (via the "Special Pages" link in the side bar).
On Tue, Apr 13, 2004 at 12:41:30PM +0100, Timwi wrote:
[...]
It already takes at most 4 links from the Main Page to any article (via the "Special Pages" link in the side bar).
I'm looking at things from Google's point of view. The difference between level 3 and level 4 can be huge. Currently, Special:Allpages is not indexed by Google, so it is useless for this purpose. That's why I feel it should be made more prominent.
Arvind
Arvind Narayanan wrote:
[...]
I'm looking at things from Google's point of view. The difference between level 3 and level 4 can be huge. Currently, Special:Allpages is not indexed by Google, so it is useless for this purpose. That's why I feel it should be made more prominent.
Hm. I see that Special:Allpages has a <meta name="robots" content="noindex,follow"> tag. Special:Specialpages, however, has index,nofollow. That means that Google won't reach Allpages through this route.
However, upon closer inspection, I notice that the Main Page *does* have a link to Special:Allpages, entitled "All articles by title".
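The two robots directives mentioned above can be checked with a small parser. This is a sketch using Python's standard html.parser; the class and function names are hypothetical, and the interpretation (a crawler may index a page unless it sees noindex, and may follow its links unless it sees nofollow) is the conventional one, not a statement about Google's internals.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def robots_policy(html):
    """Return whether a well-behaved crawler may index the page and follow its links."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    return {
        "index": "noindex" not in found and "none" not in found,
        "follow": "nofollow" not in found and "none" not in found,
    }


# The two tags quoted in the message above.
allpages = '<meta name="robots" content="noindex,follow">'
specialpages = '<meta name="robots" content="index,nofollow">'

print(robots_policy(allpages))      # {'index': False, 'follow': True}
print(robots_policy(specialpages))  # {'index': True, 'follow': False}
```

Read this way, a crawler arriving at Special:Specialpages would not follow its links, so Special:Allpages would only be discovered through some other link, such as the one on the Main Page.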
Timwi
Erik Zachte wrote:
I know this is not new, but I'm rather pissed off to see Google return several commercial sites carrying all Wikipedia articles for a query that explicitly includes "Wikipedia" as a search term, and rank them above the real thing.
[...]
I know the GFDL is very permissive, but is there nothing we can do about this?
Last year, I played a lot with Google's PageRank and the way new pages find their way into the index.
One of the most important factors in Google's ranking is freshness. If I copy a page from a source web site which is already in the Google index, I have a certain chance of getting ahead of it, for some time.
After a while, the age difference will even out, and other factors (the global PageRank of a site and its update frequency) will catch up.
So the answer is time. Most pages won't stay ahead of wikipedia.org for a long time. If the others comply with the license, they have to link to us, which boosts wikipedia.org.
So in the long term (and that's what Wikipedia is certainly good at: taking a deep breath) we gain from all the parasites (which strikes me as a POV term).
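To illustrate the backlink argument, here is a rough sketch of the textbook PageRank power iteration on a three-node toy graph. The node names, the damping factor of 0.85, and the graph itself are assumptions for illustration; this is not Google's actual ranking, and it does not model freshness at all. It only shows that a mirror linking back to wikipedia.org raises wikipedia.org's score in this simplified model.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over a dict of page -> outgoing links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the same baseline share from the random jump.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: a blog links to both wikipedia.org and a mirror.
without_backlink = {
    "blog": ["wikipedia.org", "mirror"],
    "wikipedia.org": ["blog"],
    "mirror": ["blog"],
}

# Same web, but the mirror links back to wikipedia.org as the license requires.
with_backlink = dict(without_backlink)
with_backlink["mirror"] = ["blog", "wikipedia.org"]

before = pagerank(without_backlink)
after = pagerank(with_backlink)
print(round(before["wikipedia.org"], 3), round(after["wikipedia.org"], 3))
```

Without the backlink the original and the mirror are indistinguishable in this model; with it, wikipedia.org scores higher than both its earlier value and the mirror.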
By the way: no one searches for "wikipedia rembrandt" unless they already know Wikipedia. I use Google as a full-text search engine for Wikipedia as long as we can't provide this ourselves.
Our aim is to get to position 1 for the search term "rembrandt" and all the other lemmata, which is a project for ages. Under these conditions we can surely afford to let other sites provide the Wikipedia content in a GFDL-compliant way.
Mathias
Erik Zachte wrote:
E.g. "rembrandt wikipedia" http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&am... a+rembrandt shows Wikipedia only behind two other sites, one of which uses "Rembrandt - Wikipedia" even in the <title>..</title> tags, which is shown prominently in the Google response.
I definitely think we need to be more aggressive in threatening legal action over trademark violations such as this.
Timwi