On 27/01/07, wikien-l-request@lists.wikimedia.org wikien-l-request@lists.wikimedia.org
Date: Sat, 27 Jan 2007 22:15:01 +0100
From: "Oskar Sigvardsson" oskarsigvardsson@gmail.com
Subject: Re: [WikiEN-l] An obscene example of remote loading
If they follow the GFDL, then no. If they don't (which appears to be the case, since they don't credit us and the history link doesn't work), we could sue, but really, do we even care that much?
If it's remote loading, as in, pulling content from our servers live, then we *can* stop it, and such mirrors should be reported to us on IRC in #wikimedia-tech (on Freenode, as usual) or else reported to our mailing list at wikitech-l@lists.wikimedia.org.
Remote loading is not helpful for us: it creates a small amount of additional, illegitimate load on our cache servers, and it facilitates loading the site into frames and so forth. This is not necessarily illegal, but quite often remote-loading sites don't really care about the GFDL, and then it becomes so.
Legitimate mirrors and people who want to reuse our content are free, and encouraged, to download a database dump and process it for their needs. There is also an OAI live repository service, which I think we charge for (and I don't know how many parties actually use it).
If the licence our content is released under is violated, then individual contributors to that content have a right to sue for copyright infringement; Wikimedia doesn't actually *have* the copyright and so may not be able to do the actual litigation (but I'm not a lawyer, hate lawyers, and am not completely sure of that).
Rob Church
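For readers wondering what "download a database dump and process it" might look like in practice, here is a minimal sketch. It assumes the dump is a MediaWiki XML export with `<page>`, `<title>` and `<text>` elements; the helper names `local_name` and `iter_pages`, the sample document and its namespace URL are made up for illustration and are not taken from this thread.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) pairs from a MediaWiki-style XML export.

    The stream is parsed incrementally so the whole dump never has to
    fit in memory at once.
    """
    title, text = None, None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # drop the finished page's subtree


# Tiny made-up sample standing in for a real dump (assumption, not real data).
SAMPLE = """<mediawiki xmlns="http://example.org/export/">
  <page>
    <title>Example</title>
    <revision><text>Some ''wikitext''.</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE.encode("utf-8"))))
```

For a real, bzip2-compressed dump file, the same generator could presumably be fed from `bz2.open(path, "rb")` in the standard library instead of the in-memory sample; the path is whatever file you downloaded.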
I emailed info@Thagodz.com (which they say is their contact address) and asked how they get the content. --Mets501
It is remote loading. Just check your userpage at Thagodz.com, then modify it in Wikipedia, and do a full refresh at Thagodz.com again. At least when I tried (maybe 20 hours ago, at the same time I reported them at the tech IRC channel) they were doing so.
RB
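The manual check described above (look at the mirrored page, edit the original, refresh the mirror) could be scripted roughly as follows. This is only a sketch: `fetch` and `looks_remote_loaded` are hypothetical helper names, the marker string is an arbitrary unique token you would add to your own user page, and the page bodies in the demonstration are invented so that the example runs without network access.

```python
import urllib.request


def fetch(url):
    """Download a page body as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def looks_remote_loaded(mirror_before, mirror_after, marker):
    """Return True if `marker` appears on the mirror only after the edit.

    A mirror built from a database dump would not pick up a fresh edit,
    so seeing the marker shortly after saving suggests live loading.
    """
    return marker not in mirror_before and marker in mirror_after


# Offline demonstration with made-up page bodies (assumption, not real data).
MARKER = "remote-load-test-12345"
before = "<html>User page: hello</html>"
after = "<html>User page: hello remote-load-test-12345</html>"
result = looks_remote_loaded(before, after, MARKER)
```

In a real check, `before` and `after` would come from calling `fetch` on the mirror's copy of your user page before and after saving the edit on Wikipedia. A caching layer on the mirror could delay the change, so a negative result shortly after the edit is not conclusive.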
OK. Well, I emailed already asking them to switch to a database dump and described how to download one; let's see what they have to say about it. --Mets501
-----Original Message-----
From: wikien-l-bounces@lists.wikimedia.org [mailto:wikien-l-bounces@lists.wikimedia.org] On Behalf Of Roberto Alfonso
Sent: Saturday, January 27, 2007 9:20 PM
To: English Wikipedia
Subject: Re: [WikiEN-l] An obscene example of remote loading
It is remote loading. Just check your userpage at Thagodz.com, then modify it in Wikipedia, and do a full refresh at Thagodz.com again. At least when I tried (maybe 20 hours ago, at the same time I reported them at the tech IRC channel) they were doing so.
RB
_______________________________________________ WikiEN-l mailing list WikiEN-l@lists.wikimedia.org To unsubscribe from this mailing list, visit: http://lists.wikimedia.org/mailman/listinfo/wikien-l
Rob Church wrote:
Legitimate mirrors and people who want to reuse our content are free, and encouraged, to download a database dump and process it for their needs.
Say, are there examples of people who do this well, contributing back to Wikipedia or to the general public? The examples I've seen are all a bit disappointing, but perhaps that's just because the outrageous ones generate more attention.
Thanks,
William
Yes, several sites do this quite well. See for example http://www.answers.com. --Mets501
On 28/01/07, William Pietri william@scissor.com wrote:
Rob Church wrote:
Legitimate mirrors and people who want to reuse our content are free, and encouraged, to download a database dump and process it for their needs.
Say, are there examples of people who do this well, contributing back to Wikipedia or to the general public? The examples I've seen are all a bit disappointing, but perhaps that's just because the outrageous ones generate more attention.
answers.com is perhaps our most high-profile reuser, but we have an agreement with them for a live feed. I'm not offhand aware of a particularly shining example of a database-dump site, partly because we tend to outstrip them quite fast (and because enwiki dumps were iffy for quite a while, meaning most of them are long-stagnant).
There are certainly some decent offline projects using dumps, though, in one form or another.
I believe reference.com and about.com are, along with answers.com, the biggest reusers in profile and size.
RB
On 1/28/07, David Gerard dgerard@gmail.com wrote:
On 28/01/07, Roberto Alfonso rpgrca@gmail.com wrote:
I believe reference.com and about.com are, along with answers.com, the biggest reusers in profile and size.
Does Yahoo still take a live feed?
- d.
Yahoo has its own special file made for it. See http://download.wikimedia.org/enwiki/20070124/ ("Extracted page abstracts for Yahoo", 1.3 GB!).
On 1/28/07, Andrew Gray shimgray@gmail.com wrote:
I'm not offhand aware of a particularly shining example of a database-dump site, partly because we tend to outstrip them quite fast (and because enwiki dumps were iffy for quite a while, meaning most of them are long-stagnant)
I'm unfamiliar with the technical aspects of creating a database dump, but is it the sort of thing that would be made better and faster by throwing more computing resources at it?
I expect the answer to this question is "yes, but the Foundation is $500,000 short, and those hypothetical servers went on the budget chopping block on January 16th." :-( <rhetorical> When's the next fundraiser? </rhetorical>
There are certainly some decent offline projects using dumps, though, in one form or another.
Like, say, Google Earth, which I expect would be ecstatic if it could get more frequent dumps.
On 29/01/07, Michael Noda michael.noda@gmail.com wrote:
I'm unfamiliar with the technical aspects of creating a database dump, but is it the sort of thing that would be made better and faster by throwing more computing resources at it?
I expect the answer to this question is "yes, but the Foundation is $500,000 short, and those hypothetical servers went on the budget chopping block on January 16th." :-( <rhetorical> When's the next fundraiser? </rhetorical>
You'd have to speak to Tim or Brion, but IIRC the problem was simply that the method of generating dumps had worked fine in the past, and just collapsed under the sheer *scale* of enwiki - note that de, fr, etc, all were being done fine. It seems to have been fixed now; there was a dump released late last year, but it was the first one for quite a while.
There are certainly some decent offline projects using dumps, though, in one form or another.
Like, say, Google Earth, which I expect would be ecstatic if it could get more frequent dumps.
GE has dealt directly with the Foundation at some point, and from the reports I've seen seems to update faster than the usual dump schedule - there was a brief flurry of "misplaced" articles reported just after they released it. They may be doing something clever with periodic crawling of selected pages - as long as it's not "live mirroring", and they run a local cache, we're okay with that.
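The distinction drawn here between "live mirroring" and running a local cache can be illustrated with a small sketch. Nothing below describes how Google Earth or anyone else actually does it; `PageCache`, its parameters and the fake fetcher are hypothetical, and the point is only that a cached copy is served until it is older than some chosen maximum age, so the origin servers are hit once per refresh period instead of once per reader.

```python
import time


class PageCache:
    """Serve pages from a local copy, refetching only when it is stale."""

    def __init__(self, fetch, max_age_seconds, clock=time.time):
        self._fetch = fetch            # callable: title -> content
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so it can be tested
        self._store = {}               # title -> (fetched_at, content)

    def get(self, title):
        now = self._clock()
        cached = self._store.get(title)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]           # fresh enough: no request upstream
        content = self._fetch(title)
        self._store[title] = (now, content)
        return content


# Demonstration with a fake fetcher and a fake clock (no network involved).
calls = []


def fake_fetch(title):
    calls.append(title)
    return "content of " + title


now = [0.0]
cache = PageCache(fake_fetch, max_age_seconds=3600, clock=lambda: now[0])
first = cache.get("Paris")
second = cache.get("Paris")    # served from the local copy
now[0] = 7200.0                # two hours later the copy is stale
third = cache.get("Paris")     # triggers exactly one more upstream fetch
```

A live mirror, by contrast, would behave like this cache with a maximum age of zero: every reader request turns into a request against the origin.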
William Pietri wrote:
Say, are there examples of people who do this well, contributing back to Wikipedia or to the general public? The examples I've seen are all a bit disappointing, but perhaps that's just because the outrageous ones generate more attention.
Answers.com is an excellent example. They license Wikipedia content and also license content from many other more traditional sources, and offer it up to people who search on their site. They have always been a strong supporter of Wikimedia and have been traditionally the #1 sponsor of the annual Wikimania conference.
So they benefit from our work, and they give back to the community as well.
--Jimbo
Thanks all for the helpful answers.
Jimmy Wales wrote:
William Pietri wrote:
Say, are there examples of people who do this well, contributing back to Wikipedia or to the general public? [...]
Answers.com is an excellent example. They license Wikipedia content and also license content from many other more traditional sources, and offer it up to people who search on their site. They have always been a strong supporter of Wikimedia and have been traditionally the #1 sponsor of the annual Wikimania conference.
So they benefit from our work, and they give back to the community as well.
Interesting. Are there examples of organizations who give back in other ways?
I ask with some ulterior motive. I and some pals are looking at doing a commercial startup that would involve a substantial amount of open content. That content would be narrower but deeper than Wikipedia, by which I mean it would cover a much smaller set of topics, but would include a fair bit of material that Wikipedia currently deletes for lack of notability.
As a startup, major cash donations are unlikely, at least for a few years. But where our material overlaps with Wikipedia, we wanted to find ways to collaborate. I think the only item currently in our product plan is a tool to compare related articles, so that editors of either site can easily diff and merge parts they like from the other. Ideally, we'd open-source that code so that it could be used to compare and sync between other open-content sites as well.
Do folks here have other ideas that would be mutually beneficial?
Thanks,
William
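As a rough idea of what the core of such a compare-and-merge tool might start from, here is a sketch using Python's standard `difflib` module. The function name `diff_articles`, the site labels and the sample texts are invented for illustration; a real tool would presumably need to work on wikitext structure and not just on lines.

```python
import difflib


def diff_articles(ours, theirs, our_name="site-a", their_name="site-b"):
    """Return a unified diff (as a list of lines) between two article texts."""
    return list(
        difflib.unified_diff(
            ours.splitlines(),
            theirs.splitlines(),
            fromfile=our_name,
            tofile=their_name,
            lineterm="",
        )
    )


# Made-up article texts standing in for the two sites' versions.
ours = "Alpha\nBeta\nGamma"
theirs = "Alpha\nBeta revised\nGamma"
diff_lines = diff_articles(ours, theirs)
```

Lines prefixed with `-` exist only in the first text and lines prefixed with `+` only in the second, which is the information an editor would need to decide what to merge in either direction.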
I first became interested in contributing to Wikipedia as a direct result of looking at answers.com articles.
On 1/29/07, Jimmy Wales jwales@wikia.com wrote:
William Pietri wrote:
Say, are there examples of people who do this well, contributing back to Wikipedia or to the general public? The examples I've seen are all a bit disappointing, but perhaps that's just because the outrageous ones generate more attention.
Answers.com is an excellent example. They license Wikipedia content and also license content from many other more traditional sources, and offer it up to people who search on their site. They have always been a strong supporter of Wikimedia and have been traditionally the #1 sponsor of the annual Wikimania conference.
So they benefit from our work, and they give back to the community as well.
--Jimbo
Or was it some other reuser? My memory is quite spotty these days.
On 1/30/07, Deathphoenix originaldeathphoenix@gmail.com wrote:
I first became interested in contributing to Wikipedia as a direct result of looking at answers.com articles.