One fine day, Brion Vibber said:
Do feel free to ask other free projects and universities if they'd be interested in supporting the project...
I thought it might be a good idea to ask, so I sent out an email to ibiblio:
Greetings - my name is Nick, and I help out with a site called Wikipedia (http://www.wikipedia.org). It is a free, multilingual project to create a complete, accurate, and, more importantly, open-content encyclopedia. All of the content is licensed under the GNU FDL (GFDL), meaning that anybody has the freedom to copy and redistribute it, with or without modifications, either commercially or non-commercially - although they may not put in place technical measures to conceal the content.
The English language Wikipedia has over 117 thousand articles already, and by our calculations, we have about half the content of the Encyclopedia Britannica. We have many other languages, which are also quickly growing in size and diversity.
However, we are currently facing both a budget and capacity crunch. In short, we have neither. Since Wikipedia is a volunteer-based program, it is hard for us to raise the funds to purchase additional hardware. Right now, we are running off of a dual Athlon 1800+ server with 2GB of RAM and 36GB of SCSI storage. Unfortunately, the system is being pushed to its absolute limits, with little relief in sight. We are installing a second system this weekend as a front end, but even with that, we are not sure how long we can hold out. The dual Athlon system runs at a typical load of 15-20 during normal US working hours.
Since our mission seems to be very much in line with ibiblio's mission, I was wondering if there was any way that Wikipedia could be hosted by ibiblio? It would be a great help to our project and the community.
Thanks!
Much to my surprise, they replied:
hi nick,
we would LOVE to host wikipedia.org. our only concern is with the additional load wikipedia might put on our mysql server. BUT... if you could possibly hold off moving the site for two weeks or so john and fred will have us up on our new hardware - we're moving to a web cluster and will have a much more powerful database machine.
if this is acceptable to you, please check out http://www.ibiblio.org/faq/ for more information about our setup, and just drop me a list of what accounts, dbs, unix groups, web directories, etc and we'll go from there. just let me know?
thanks, donald www.ibiblio.org formerly known as SunSITE 919.843.8215 and stoof.
Seems like a good deal to me. We should probably tell them that they should keep our database on a separate MySQL server, as it will absolutely demolish just about anything they make available.
Anyways, if somebody who knows the server requirements, layout, and other what-not wants to let Donald and/or the list know, that would be awesome (assuming that Brion is still interested in having someone else host the site). Having ibiblio make all the outlays seems like a good deal to me (plus, they're a non-profit org, so you could likely make tax-deductible donations to them).
Anyways, that's that. :)
[ibiblio] would LOVE to host wikipedia.org...
I don't think that's an option, but certainly they might be useful as a mirror, or storage for backups, etc.
I'm not the least bit interested in moving wikipedia off my servers.
The claims of budget crisis are wildly exaggerated. I'm donating one and possibly two more servers to the project (one for a www frontend and one for a mailserver and auxiliary front end).
On Thu, May 01, 2003 at 01:01:06AM -0500, Lee Daniel Crocker wrote:
[ibiblio] would LOVE to host wikipedia.org...
I don't think that's an option, but certainly they might be useful as a mirror, or storage for backups, etc.
That's something I was thinking - perhaps, for example, the database goes down. At that point, we make a quick Apache change, and all request URLs are rewritten to point at a read-only copy of Wikipedia @ ibiblio.
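The failover idea can be sketched in a few lines. This is only an illustration of the decision logic, not an Apache configuration: the function name `failover_target` and the mirror address `MIRROR_BASE` are hypothetical, and how the "database is down" flag gets set is left out entirely.

```python
from urllib.parse import urlsplit

# Hypothetical location of the read-only copy; not a real address.
MIRROR_BASE = "http://mirror.example.org/wikipedia"


def failover_target(request_url, database_up):
    """Return None to serve the request locally, or the mirror URL to redirect to.

    The path and query string of the original request are preserved so that
    a reader lands on the same page in the read-only copy.
    """
    if database_up:
        return None
    parts = urlsplit(request_url)
    target = MIRROR_BASE + parts.path
    if parts.query:
        target += "?" + parts.query
    return target
```

A front end would answer with a temporary redirect (HTTP 302) to the returned URL, so that clients come back to the live site once the database is up again.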
However, is there a reason why we couldn't have ibiblio host the entire Wikipedia? It seems that is what Brion was asking for. If they're willing to supply three, four (or more) servers to keep Wikipedia running, why isn't that a good thing?
Nick Reinking wrote:
However, is there a reason why we couldn't have ibiblio host the entire Wikipedia? It seems that is what Brion was asking for. If they're willing to supply three, four (or more) servers to keep Wikipedia running, why isn't that a good thing?
There are a lot of reasons, but the main reason is that we shouldn't make ourselves dependent on anyone else.
--Jimbo
(Nick Reinking nick@twoevils.org):
However, is there a reason why we couldn't have ibiblio host the entire Wikipedia? It seems that is what Brion was asking for. If they're willing to supply three, four (or more) servers to keep Wikipedia running, why isn't that a good thing?
Because they aren't us. We have a specific vision, and goals to accomplish. Even if ibiblio totally supported those same goals now, there's no guarantee they will in the future. Ultimately, physical ownership of the servers is our guarantee that Wikipedia's operation will serve our vision.
Also, my experience with ibiblio is that their servers tend to run close to capacity anyway: try downloading Gentoo Linux, for example.
On Thu, May 01, 2003 at 11:39:19AM -0500, Lee Daniel Crocker wrote:
(Nick Reinking nick@twoevils.org):
However, is there a reason why we couldn't have ibiblio host the entire Wikipedia? It seems that is what Brion was asking for. If they're willing to supply three, four (or more) servers to keep Wikipedia running, why isn't that a good thing?
Because they aren't us. We have a specific vision, and goals to accomplish. Even if ibiblio totally supported those same goals now, there's no guarantee they will in the future. Ultimately, physical ownership of the servers is our guarantee that Wikipedia's operation will serve our vision.
Also, my experience with ibiblio is that their servers tend to run close to capacity anyway: try downloading Gentoo Linux, for example.
Fair enough, was just wondering. :) Still, I think it would be a good idea for us to use them as a read-only mirror for when the database is down (for server maintenance, database maintenance, etc.)
Lee-
Also, my experience with ibiblio is that their servers tend to run close to capacity anyway: try downloading Gentoo Linux, for example.
Correct. Ibiblio does a lot of good, but because they do so much, they're not very fast. I think it's nice to have this option should Jimbo ever decide to shut down Wikipedia because some vandal insulted his mother, but right now we should concentrate on working with the assets we have, and expanding them where necessary. The Nupedia Foundation should help greatly with that.
Regards,
Erik
(Nick Reinking nick@twoevils.org):
Fair enough, was just wondering. :) Still, I think it would be a good idea for us to use them as a read-only mirror for when the database is down (for server maintenance, database maintenance, etc.)
I agree that read-only mirrors are probably a good idea in our future plans. It would require some software: namely, something that ran maybe once or twice a day and generated a set of static HTML pages from the "cur" table. (This could be done incrementally based on timestamps, so it wouldn't be such a burden.) Things like edit links would point to the live Wikipedia or to an intermediate page that explained what was going on and that then pointed to the live site. We could then have other sites rsync those static pages. Ideally, the search links would point to a search function on the other site as well, built by indexing the static pages rather than going to the live database.
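A minimal sketch of the incremental export described above, under stated assumptions: rows are passed in as plain (title, text, timestamp) tuples standing in for the "cur" table rather than being read from MySQL, the page text is dumped as escaped plain text instead of rendered wiki markup, and `export_changed` and `LIVE_SITE` are hypothetical names chosen for illustration.

```python
import html
from pathlib import Path
from urllib.parse import quote

# Assumed URL layout of the live site, for illustration only.
LIVE_SITE = "http://www.wikipedia.org/wiki/"


def export_changed(rows, out_dir, last_run):
    """Write one static HTML page per row changed after last_run.

    rows: iterable of (title, text, timestamp) tuples standing in for
    rows of the "cur" table; timestamps are any comparable values.
    Returns (list of titles written, newest timestamp seen).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    newest = last_run
    for title, text, ts in rows:
        if ts <= last_run:
            continue  # unchanged since the previous run, skip it
        safe = quote(title, safe="")  # percent-encode, including "/"
        page = (
            "<html><head><title>{t}</title></head><body>\n"
            "<h1>{t}</h1>\n<pre>{body}</pre>\n"
            '<p><a href="{live}{name}">Edit this page on the live site</a></p>\n'
            "</body></html>\n"
        ).format(t=html.escape(title), body=html.escape(text),
                 live=LIVE_SITE, name=safe)
        (out / (safe + ".html")).write_text(page, encoding="utf-8")
        written.append(title)
        newest = max(newest, ts)
    return written, newest
```

Each run would store the returned timestamp and pass it back in as `last_run` next time, so only pages edited in between are regenerated; the output directory is then what mirror sites would rsync.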
Felt it would be fair to forward on the most recent message I've received from ibiblio:
Nick,
We ran across the wikitech-l thread about ibiblio hosting and found it quite interesting. I hope you don't read this as an intrusion, but I'd like to offer more details on the systems side of ibiblio. I'm one of the systems engineers here and can give a techie perspective.
Many large, independent sites have similar discussions when considering moving to ibiblio. Some moved, some decided to stay on their own systems. Either way is fine with us as long as everyone's happy. Some sites move their web/db content and retain their own mail servers. A drawback for some is that we manage the systems and don't provide co-lo services.
Concerning performance of the ibiblio site, one person mentioned gentoo downloads. Part of the challenge of having so much content available is the outgoing bandwidth. We use traffic shaping software to keep it under control. As a result, Linux distribution download traffic is given a lower priority than the main web site content. When judging performance, http page load speed for www.ibiblio.org might be a better example. db.etree.org is heavily mysql driven, so you could check that out, too.
You might find this set of pages interesting: http://www.ibiblio.org/systems/ The system status page has a real-time (almost) network traffic graph.
Also, we're close to moving the site to an LVS cluster and upgrading our database systems. That should greatly improve capacity.
But you guys should certainly do what best meets your needs. We're happy to help provide useful and interesting content in any capacity.
Thanks,
-jrr
wikitech-l@lists.wikimedia.org