Hi everyone,
I would like to know if there is someone who could set up the dumps on my site, www.snoobab.com. I will pay for your time.
Thanks,
Trevor
What do you mean? Giving you the ability to make dumps of your own data, or ''installing'' Wikipedia dumps on your site?
Hi
I would like someone to install the dumps on my site. Basically, I would like to feature an encyclopaedia on my site. I would of course give all the credit to Wikimedia. You are probably asking why I don't just link to them, but this would give my site a nice feature. My site is new, and I am looking for new features to make it worthwhile for companies to advertise on. Any improvements will be paid for via PayPal.
Many thanks,
Trevor
www.snoobab.com
----- Original Message -----
From: "Platonides" Platonides@gmail.com
To: wikitech-l@wikimedia.org
Sent: Saturday, April 22, 2006 7:28 PM
Subject: Re: [Wikitech-l] Dumps
What do you mean? Giving you the ability to make dumps of your own data, or ''installing'' Wikipedia dumps on your site?
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
On Sat, Apr 22, 2006 at 08:23:19PM +0200, trevor wrote:
I would like someone to install the dumps on my site. Basically, I would like to feature an encyclopaedia on my site. I would of course give all the credit to Wikimedia. You are probably asking why I don't just link to them, but this would give my site a nice feature. My site is new, and I am looking for new features to make it worthwhile for companies to advertise on. Any improvements will be paid for via PayPal.
Not being officially associated with WMF at all, I'm probably the best person to tell you that you're out of your fool mind.
Explaining to you why that is is such a large task that I almost despair of attempting it. But, short version: that's like saying "I'd like to put a 900 horse Indy engine in my mini-pickup, because I think I could haul a bigger trailer that way".
Cheers, -- jra
Jay R. Ashworth wrote:
Explaining to you why that is is such a large task that I almost despair of attempting it. But, short version: that's like saying "I'd like to put a 900 horse Indy engine in my mini-pickup, because I think I could haul a bigger trailer that way".
Why settle? Rolls Royce makes a 29,000hp jet engine called the E200.
That's almost a better comparison with the nearly half a million article WP database.
Jay R. Ashworth wrote: [...]
Explaining to you why that is is such a large task that I almost despair of attempting it. But, short version: that's like saying "I'd like to put a 900 horse Indy engine in my mini-pickup, because I think I could haul a bigger trailer that way".
Nonsense. As far as the raw amount of data goes, even the English Wikipedia can be handled by entry-level server hardware. What kills Wikipedia is the volume of requests, and there's no reason to believe that the OP will get even a fraction of Wikipedia's; there's nothing inherently wrong with his idea.
Mark Jaroski wrote:
That's almost a better comparison with the nearly half a million article WP database.
The English Wikipedia alone has close to 1.1 million articles.
On Sat, Apr 22, 2006 at 05:32:19PM -0400, Ivan Krstic wrote:
Jay R. Ashworth wrote: [...]
Explaining to you why that is is such a large task that I almost despair of attempting it. But, short version: that's like saying "I'd like to put a 900 horse Indy engine in my mini-pickup, because I think I could haul a bigger trailer that way".
Nonsense. As far as the raw amount of data goes, even the English Wikipedia can be handled by entry-level server hardware. What kills Wikipedia is the volume of requests, and there's no reason to believe that the OP will get even a fraction of Wikipedia's; there's nothing inherently wrong with his idea.
Yeah, but the *traffic* is part of the reason why the 'pedia is *useful*; it's Metcalfe's Law incarnate. Pinching it off, and particularly to be "just another feature" on some other website...?
Well, why bother?
Mark Jaroski wrote:
That's almost a better comparison with the nearly half a million article WP database.
The English Wikipedia alone has close to 1.1 million articles.
I guess at that scale, it doesn't matter as much if that's only the main namespace.
Cheers, -- jra
Jay R. Ashworth wrote:
Pinching it off, and particularly to be "just another feature" on some other website...? Well, why bother?
You're welcome to ask the Rhein Zeitung[0] or any of the other sites that do this and find it a rather popular feature. Either way, you gave an unnecessary, condescending reply to a very simple inquiry.
[0] http://lexikon.rhein-zeitung.de/
On Sat, Apr 22, 2006 at 06:18:10PM -0400, Ivan Krstic wrote:
Jay R. Ashworth wrote:
Pinching it off, and particularly to be "just another feature" on some other website...? Well, why bother?
You're welcome to ask the Rhein Zeitung[0] or any of the other sites that do this and find it a rather popular feature. Either way, you gave an unnecessary, condescending reply to a very simple inquiry.
I've never been afraid of being elitist, or wrong.
Though I don't think this reply was either.
But I'm stickin with "how does installing an always out of date copy of something already available somewhere else on the web do anyone any good?"
Cheers, -- jra
On Sat, 22 Apr 2006, Jay R. Ashworth wrote:
But I'm stickin with "how does installing an always out of date copy of something already available somewhere else on the web do anyone any good?"
I do not know about the original enquirer, but to answer your question:
In South Africa, we install Wikipedia at schools that do not even have modem access to the 'net.
It gives them a taste of what the Internet is, without most of the cr*p, and they love it. Yes, I said Internet.
I am very excited by current efforts to provide straight HTML dumps of Wikipedia, as I find the extra requirement of MySQL causes downtime.
Cheers, Andy!
On Sun, Apr 23, 2006 at 11:54:15AM +0200, Andy Rabagliati wrote:
On Sat, 22 Apr 2006, Jay R. Ashworth wrote:
But I'm stickin with "how does installing an always out of date copy of something already available somewhere else on the web do anyone any good?"
I do not know about the original enquirer, but to answer your question:
In South Africa, we install Wikipedia at schools that do not even have modem access to the 'net.
And that's a perfectly good justification for an off-line copy.
The keyword is, of course, "off line".
It gives them a taste of what the Internet is, without most of the cr*p, and they love it. Yes, I said Internet.
Well, I'm not sure that generalization really holds; the Internet is exceedingly uneven, and there are lots of things both cooler and crappier than Wikipedia.
I am very excited by current efforts to provide straight HTML dumps of Wikipedia, as I find the extra requirement of MySQL causes downtime.
And by all means, I wasn't denigrating the idea of dumping the site. I was merely questioning the wisdom of the use the OP intended to make of it. (OK, perhaps "questioning the wisdom" is a bit overly polite :-)
Cheers -- jra
You're looking at a few technical challenges in accomplishing this. The first is finding time to do a backup of the MySQL database.
Question: does that already happen? I would hope so. If so, you can produce a dump of the backup at no loss of time to the main server.
Second, there's the issue of getting the dump to the duplicate site. How big is the dump of 1.1 million articles? Probably too large for a normal download. This leaves you with a few options.
1. You can burn it to multiple DVDs once and ship it, and he can live with the increasingly out-of-date version.
2. He can pay for a subscription, and you can pay someone to burn DVDs of the backup on a regular basis.
3. If you wind up with enough demand for it to pay for a separate server, you can set up a BitTorrent feed on a regular (weekly or monthly) basis.
Getting it loaded shouldn't be too much trouble. Essentially it would be no different than restoring a backup after a crash. It would even be fairly easy to automate on a subscription basis.
Seriously, folks, the guy wants to hand someone cash money to get this done. Why are you psychoanalyzing him? I'm certain that the Wikipedia project could put any profit from said cash money to good use.
Well, here's "lexikon.rhein-zeitung.de" speaking ;-)
Jay R. Ashworth wrote:
Yeah, but the *traffic* is part of the reason why the 'pedia is *useful*; it's Metcalfe's Law incarnate. Pinching it off, and particularly to be "just another feature" on some other website...?
Our mirror is (I hope) perfectly linked with the original Wikipedia. Editing an article via a hyperlink into Wikipedia and creating new articles by clicking on a "red" link are both implemented. I think we are part of the *traffic* and therefore no obstruction to Metcalfe's Law.
And that's a perfectly good justification for an off-line copy. The keyword is, of course, "off line".
I don't think so. Offline copies are expensive and slow. We load each recent "pages-articles.xml.bz2" dump from the mirror once or twice a month (475 MB for the German version at the moment). We check our mirrored pictures against the latest images table and download only new(er) images.
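As a rough illustration of the two steps just described (reading a "pages-articles.xml.bz2" dump and picking out only the new or newer images), here is a minimal Python sketch. It is not the Rhein-Zeitung implementation. The function names `iter_pages` and `images_to_fetch` are hypothetical, and the sketch assumes that the dump is MediaWiki export XML with one `<page>` element per article and that image timestamps are sortable strings of the form YYYYMMDDHHMMSS.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(dump_path):
    """Yield (title, text) for each <page> in a bz2-compressed XML dump.

    The file is decompressed and parsed as a stream, so the whole dump
    never has to be held in memory at once.
    """
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            title, text = None, ""
            for child in elem.iter():
                name = local_name(child.tag)
                if name == "title":
                    title = child.text
                elif name == "text":
                    text = child.text or ""
            yield title, text
            elem.clear()  # drop the finished page's children to save memory


def images_to_fetch(local_timestamps, remote_timestamps):
    """Return the names of images that are missing locally or newer upstream.

    Both arguments map image name -> timestamp string. The sortable
    'YYYYMMDDHHMMSS' form is an assumption of this sketch.
    """
    return sorted(
        name
        for name, remote_ts in remote_timestamps.items()
        if local_timestamps.get(name, "") < remote_ts
    )
```

A mirror script along these lines would loop over `iter_pages(...)` to refresh its article store, then download only the names returned by `images_to_fetch(...)`.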
We don't do this because it's nice to have. Our goal was and still is a full, high-performance integration of the encyclopaedia into the news environment and layout of our online magazine. Besides an ordinary search form, our readers can retrieve Wikipedia content by double-clicking any word in any text (this works best for nouns). For performance reasons and perfect layout integration, we need a mirror and not just links to wikipedia.org.
Our Wikipedia mirror is also part of the intranet of some schools in the county where internet access is restricted.
I hope that dumps of the wikis' data will be handled as carefully in the future as in the past - and will be available *online* without snail mail or mounted messengers :-)
Cheers
jo
On Thu, Apr 27, 2006 at 12:44:14AM +0200, Jochen Magnus wrote:
Jay R. Ashworth wrote:
Yeah, but the *traffic* is part of the reason why the 'pedia is *useful*; it's Metcalfe's Law incarnate. Pinching it off, and particularly to be "just another feature" on some other website...?
Our mirror is (I hope) perfectly linked with the original Wikipedia. Editing an article via a hyperlink into Wikipedia and creating new articles by clicking on a "red" link are both implemented. I think we are part of the *traffic* and therefore no obstruction to Metcalfe's Law.
Hmmm...
So, I go to your site, and I see a problem with a page, and I click on the Edit link, and I make the change, and I go back to...
1) not your site and the change is there.
2) your site, and the change is *not* there.
Neither of those seems perfect.
And that's a perfectly good justification for an off-line copy. The keyword is, of course, "off line".
I don't think so. Offline copies are expensive and slow. We load each recent "pages-articles.xml.bz2" dump from the mirror once or twice a month (475 MB for the German version at the moment). We check our mirrored pictures against the latest images table and download only new(er) images.
So, the only reason I *do* like for an offline mirror, you don't like.
Those people completely off the grid shouldn't have Wikipedia at all?
We don't do this because it's nice to have. Our goal was and still is a full, high-performance integration of the encyclopaedia into the news environment and layout of our online magazine.
And my assertion, based on the either/or above, is that you don't actually *have* that.
I hope that dumps of the wikis' data will be handled as carefully in the future as in the past - and will be available *online* without snail mail or mounted messengers :-)
And, again, I wasn't trying to agitate against this, by any means, and I hope that's clear.
Cheers, -- jra
So, I go to your site, and I see a problem with a page, and I click on the Edit link, and I make the change, and I go back to...
1) not your site and the change is there.
2) your site, and the change is *not* there.
Neither of those seems perfect.
I think this is a small cost for the mirror. Our users are informed that their edits will be loaded back to us at the next mirror update. I find this more useful than having no links to the original Wikipedia, and better than having no mirror. We have no need for an encyclopaedia that is up to date daily (or hourly). We find two to four updates per month adequate.
Look, we are presenting the Wikipedia data in our own design. It is also integrated into our "E-Paper", which is an additional offering for readers of our print edition (see http://epaper.rhein-zeitung.de/06/04/23/ , load an article and double-click any noun).
Those people completely off the grid shouldn't have Wikipedia at all?
Surely not. For this purpose offline copies work perfectly. But they are not mutually exclusive (XOR) with online dumps.
jo