Wikipedia has a problem
Sorry! This site is experiencing technical difficulties.
Try waiting a few minutes and reloading.
(Can't contact the database server: All servers busy)
---- Haven't seen this one before - doesn't have any link to fundraising or anything either.
Steve
Sorry! This site is experiencing technical difficulties. Try waiting a few minutes and reloading. (Can't contact the database server: All servers busy)
AFAIK, it means exactly what it says. I've been seeing it more and more lately, particularly at around 15:00 UTC on weekdays, and for some reason a whole lot tonight. It may be some strange database problem, or it may just be ever-growing load outstripping server capacity again.
On 12/4/06, Steve Summit scs@eskimo.com wrote:
AFAIK, it means exactly what it says. I've been seeing it more and more lately, particularly at around 15:00 UTC weekdays, and for some reason a whole lot tonight. It may be some strange database problem, or it may just be evergrowing load outstripping server capacity again..
I blame [[List of big-bust models and performers]]. Apparently our most popular list.
Steve
On Mon, Dec 04, 2006 at 01:15:21PM +1100, Steve Bennett wrote:
On 12/4/06, Steve Summit scs@eskimo.com wrote:
AFAIK, it means exactly what it says. I've been seeing it more and more lately, particularly at around 15:00 UTC weekdays, and for some reason a whole lot tonight. It may be some strange database problem, or it may just be evergrowing load outstripping server capacity again..
I blame [[List of big-bust models and performers]]. Apparently our most popular list.
And rightfully so. :-)
Cheers, -- jra
An order has been placed for an additional six database servers. The servers should be delivered Tuesday or Wednesday this week.
On 12/3/06, Steve Summit scs@eskimo.com wrote:
Sorry! This site is experiencing technical difficulties. Try waiting a few minutes and reloading. (Can't contact the database server: All servers busy)
AFAIK, it means exactly what it says. I've been seeing it more and more lately, particularly at around 15:00 UTC weekdays, and for some reason a whole lot tonight. It may be some strange database problem, or it may just be evergrowing load outstripping server capacity again..
Brad Patrick wrote:
An order has been placed for an additional six database servers. The servers should be delivered Tuesday or Wednesday this week.
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have disk channels with 875 megabytes per second of throughput. They use standard off-the-shelf hardware. We could certainly cut WMF a break on pricing if you are interested in the future.
Jeff
On 12/3/06, Steve Summit scs@eskimo.com wrote:
Sorry! This site is experiencing technical difficulties. Try waiting a few minutes and reloading. (Can't contact the database server: All servers busy)
AFAIK, it means exactly what it says. I've been seeing it more and more lately, particularly at around 15:00 UTC weekdays, and for some reason a whole lot tonight. It may be some strange database problem, or it may just be evergrowing load outstripping server capacity again..
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput helps when your problem is sheer load and you're computationally bound.
Ivan Krstić wrote:
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput help when your problem is sheer load, and you're computationally-bound.
I also have proxy servers set up, and I get about 2000 visitors a day and over 300,000 hits per day.
Jeff
On 12/4/06, Jeff V. Merkey jmerkey@wolfmountaingroup.com wrote:
Ivan Krstić wrote:
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput help when your problem is sheer load, and you're computationally-bound.
I also have proxy servers setup and I get about 2000 visitors a day and over 300,000 hits per day.
Are there links for the capacity planning and performance testing that's been done so far on production servers?
George Herbert wrote:
On 12/4/06, Jeff V. Merkey jmerkey@wolfmountaingroup.com wrote:
Ivan Krstić wrote:
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput help when your problem is sheer load, and you're computationally-bound.
I also have proxy servers setup and I get about 2000 visitors a day and over 300,000 hits per day.
Are there links for the capacity planning and performance testing that's been done so far on production servers?
I have no links. Most of my hits are from Google, Inktomi, Answers.com, and other search engines. The server is under constant pummeling from these robots. Users only account for about 23% of my daily hits.
Jeff
Jeff V. Merkey wrote:
George Herbert wrote:
On 12/4/06, Jeff V. Merkey jmerkey@wolfmountaingroup.com wrote:
Ivan Krstić wrote:
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput help when your problem is sheer load, and you're computationally-bound.
I also have proxy servers setup and I get about 2000 visitors a day and over 300,000 hits per day.
Are there links for the capacity planning and performance testing that's been done so far on production servers?
I have no links. Most of my hits are from Google, Inktomi, Answers.com, and other search engines. The server is under constant pummeling from these robots. Users only account for about 23% of my daily hits.
Jeff
These numbers do not take into account the constant DoS attacks and hacking attempts I get from mainland China. I have the servers set up on dynamic IP ranges which switch between several IP subnets every other week to allow folks in China to get access. Most of the traffic from China, unfortunately, is attacks designed to kill my servers. They have been having a hard time of late, since I set up the SSH ports to jump around between ranges other than port 22 after 3 or more failed attempts to log in via SSH.
Jeff
On Mon, Dec 04, 2006 at 12:46:31PM -0700, Jeff V. Merkey wrote:
Ivan Krstić wrote:
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput help when your problem is sheer load, and you're computationally-bound.
I also have proxy servers setup and I get about 2000 visitors a day and over 300,000 hits per day.
Just as a comparison: Wikipedia serves 22,000 requests per second and over 100 million visitors per month.
Regards,
jeluf
On 12/5/06, Jens Frank jf@mormo.org wrote:
On Mon, Dec 04, 2006 at 12:46:31PM -0700, Jeff V. Merkey wrote:
Ivan Krstić wrote:
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput help when your problem is sheer load, and you're computationally-bound.
I also have proxy servers setup and I get about 2000 visitors a day and over 300,000 hits per day.
Just as a comparison: Wikipedia serves 22'000 requests per second and over 100 Million visitors per month.
Is that 22,000 requests per second counting page requests, or total hits per second?
Is there an edits-per-second statistic available? I went wandering but didn't find one...
On Mon, Dec 04, 2006 at 01:35:12PM -0500, Ivan Krstić wrote:
Jeff V. Merkey wrote:
You may wish to consider ordering more powerful platforms. I run 6 instances of the entire Wikipedia on a single Solera Appliance. These servers have throughput of 875 megabyte per second
I run a Wikipedia instance on the 366MHz OLPC XO-1 (http://dev.laptop.org/~krstic/xo); it works because I'm the only user. Neither storage nor disk throughput help when your problem is sheer load, and you're computationally-bound.
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms. :-) That he mentioned throughput is really a red herring here...
Cheers, -- jra
On 04/12/06, Jay R. Ashworth jra@baylink.com wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms. :-) That he mentioned throughput is really a red-herring here...
Let's filch a couple supercomputers.
Rob Church
Rob Church wrote:
On 04/12/06, Jay R. Ashworth jra@baylink.com wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms. :-) That he mentioned throughput is really a red-herring here...
Let's filch a couple supercomputers.
:-)
Jeff
Rob Church
On 12/4/06, Jeff V. Merkey jmerkey@wolfmountaingroup.com wrote:
Rob Church wrote:
On 04/12/06, Jay R. Ashworth jra@baylink.com wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms. :-) That he mentioned throughput is really a red-herring here...
Let's filch a couple supercomputers.
:-)
Jeff
Why filch? Call up and order...
On 05/12/06, George Herbert george.herbert@gmail.com wrote:
Why filch? Call up and order...
I'd like my money tree dispatched via airmail, please.
Rob Church
On 12/4/06, Rob Church robchur@gmail.com wrote:
On 05/12/06, George Herbert george.herbert@gmail.com wrote:
Why filch? Call up and order...
I'd like my money tree dispatched via airmail, please.
I would expect that a little effort would result in Wikipedia finagling something like the educational institution discounts out of most vendors. Do you have any idea how deep those discounts typically are?
Rob Church wrote:
On 04/12/06, Jay R. Ashworth jra@baylink.com wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms. :-) That he mentioned throughput is really a red-herring here...
Let's filch a couple supercomputers.
Who needs supercomputers when you can use distributed computing? Wikipedia@Home anyone?
On 12/4/06, Alphax (Wikipedia email) alphasigmax@gmail.com wrote:
Rob Church wrote:
On 04/12/06, Jay R. Ashworth jra@baylink.com wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms. :-) That he mentioned throughput is really a red-herring here...
Let's filch a couple supercomputers.
Who needs supercomputers when you can use distributed computing? Wikipedia@Home anyone?
<brief response> "No" </brief response>
<computer architect hat> The problem with distributed computing for general computational problems requiring interactive updates between computational nodes is that the amount of traffic between nodes increases both with the number of updates and with the number of nodes.
The various @home-type projects, and batch-job-style grid computing, are efficient network-wise; there is little to no network interaction other than submission and results. Where data has to flow laterally, as in a database system, the scaling increases as described above, rapidly bringing the distributed application to its knees. This is a known problem in the design of MPI-type supercomputer clusters. As the required interconnect communications bandwidth increases, the optimal solution moves from highly distributed, towards larger nodes in a higher-speed network, towards a few nodes in a very high speed network, and finally towards a single large SMP computer. If you are pushing the interconnect very hard, you're having to pay for extremely high performance interconnects, and ultimately there's no reason not to just buy large SMP systems with single system image and more compact footprints. The highest-speed system-to-system interconnects cost as much as the SMP system components do.
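To put rough numbers on that lateral-traffic argument, here is a back-of-the-envelope sketch in Python. The update rates and message sizes are invented purely for illustration; they are not measurements of any real cluster.

# All figures below are illustrative assumptions, not measurements.
def interconnect_load(nodes, updates_per_node_per_sec, bytes_per_update):
    """Total bytes/sec crossing the interconnect when every node's updates
    must be propagated to every other node."""
    return nodes * updates_per_node_per_sec * (nodes - 1) * bytes_per_update

for n in (2, 8, 32, 128):
    load = interconnect_load(n, updates_per_node_per_sec=500, bytes_per_update=4096)
    print("%4d nodes -> %10.1f MB/s of lateral update traffic" % (n, load / 1e6))

Going from 8 to 128 nodes multiplies the lateral traffic by roughly 290x, which is the blow-up the paragraph above describes.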
One current commercial example of this particular problem is the throttled Oracle RAC problem, where you spread a big database out over a large pile of Linux nodes, but the database update rates overwhelm the private network interconnect for the shared cache updates. There are plenty of HPC computer horror stories of equivalent problems out there in the field for other workloads.
There is no single right answer; even the large labs who have large quantities of professional computer architects and code wranglers working on optimized highly multi-processor projects don't agree on how to do most work, which is why there are still a bunch of competing supercomputer/supercluster vendors out there, in many cases individual centers or labs buying from multiple vendors as they have different types of work in progress.
For Wikipedia's database problems, I would have to have about 10x more information than I currently do to offer particularly useful advice or suggestions, but I can guarantee you that WikiDB@home would be a disaster of heroic proportions 8-) </computer architect hat>
Jay R. Ashworth wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms.
Wikipedia has so far followed the approach of scaling out instead of up, and that with commodity hardware. If you believe this approach is mistaken, you'll need to explain what you know about scaling server farms that Google doesn't.
On 12/4/06, Ivan Krstić krstic@solarsail.hcs.harvard.edu wrote:
Jay R. Ashworth wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms.
Wikipedia has so far followed the approach of scaling out instead of up, and that with commodity hardware. If you believe this approach is mistaken, you'll need to explain what you know about scaling server farms that Google doesn't.
You do realize that Google has spent the better part of a half billion dollars on engineering a completely ground-up distributed system software architecture, working with a problem that (unusually among large-scale enterprise data management) can theoretically be efficiently partitioned?
The reason not everyone is using Google-like IT architectures is A) That they can't afford to develop it, and B) That in most cases, that architecture would be worse than what they have now, given that most IT problems don't partition so well...
George Herbert wrote:
You do realize that Google has spent the better part of a half billion dollars on engineering a completely ground-up distributed system software architecture, working with a problem that (unusually among largescale enterprise data management) can theoretically be efficiently partitioned?
I mentioned Google because they're a well-known example, but it certainly isn't the case that one needs to invest an inordinate sum to be able to reap the benefits of scaling out instead of up. Many other sites with nowhere near the engineering talent or financial budget of Google are doing the same thing. In fact, sites with small budgets that choose to scale up and succeed are few and far between, to my knowledge.
If you prefer a non-Google example of out over up, look at LiveJournal, as the evolution of their software and hardware is well-documented and more transparent than the operation of most comparable sites.
Jay R. Ashworth wrote:
My point was merely that he suggested more powerful hardware, and received as a reply, "no, the problem is that we need more powerful hardware".
I really wasn't trying to get involved in a polemic about engineering practices. Jeff seems to believe that buying one powerful appliance is clearly a better approach than buying six commodity servers, and my point was that -- for the Wikipedia use case -- that's at best very unclear, and at worst very wrong.
-- Ivan Krstić krstic@solarsail.hcs.harvard.edu | GPG: 0x147C722D
On Tue, Dec 05, 2006 at 12:52:50AM -0500, Ivan Krstić wrote:
Jay R. Ashworth wrote:
My point was merely that he suggested more powerful hardware, and received as a reply, "no, the problem is that we need more powerful hardware".
I really wasn't trying to get involved in a polemic about engineering practices. Jeff seems to believe that buying one powerful appliance is clearly a better approach than buying six commodity servers, and my point was that -- for the Wikipedia use case -- that's at best very unclear, and at worst very wrong.
For what it's worth, *my* perception of what Jeff said parsed as "buy *six* powerful appliances". :-) Or, more generally: COTS is wonderful, but is it cost-effective now to spec one or two levels higher of COTS than we have been?
Cheers, -- jra
On 12/4/06, Ivan Krstić krstic@solarsail.hcs.harvard.edu wrote:
George Herbert wrote:
You do realize that Google has spent the better part of a half billion dollars on engineering a completely ground-up distributed system software architecture, working with a problem that (unusually among largescale enterprise data management) can theoretically be efficiently partitioned?
I mentioned Google because they're a well-known example, but it certainly isn't the case that one needs to invest an inordinate sum to be able to reap benefits from scaling out instead of up. Many other sites with nowhere the engineering talent or financial budget of Google are doing the same thing. In fact, sites with small budgets that choose to scale up and succeed are few and far between, to my knowledge.
That depends on how you define "scale up"; when database limits start to be the problem, quite a large number of sites scale up to centralized large SMP systems running Oracle or something of that ilk quite successfully. I have done website systems and network architecture work for some very large websites (WebEx and Blockbuster.com among others) and sold hardware to people building others.
For the most part, web applications have an app and web layer that parallelize very effectively, but the database on many of them doesn't scale horizontally as well. It's not unusual around here to see sites buy a clustered pair of big Sun boxes (or, more rarely, IBM or HP) and switch to Oracle as they grow past what MySQL and Linux servers can handle, if they're DB-limited.
All of that said, I really don't numerically understand the loads on the Wikimedia Foundation servers, or the details of the architecture well enough now to give specific advice.
There are large websites where the actual sustained DB load is low enough that a farm of Linux/MySQL servers is an adequate, reliable solution. And despite having worked at a Sun / Oracle VAR I have also deployed several thousand linux boxes in horizontal scaled website farms.
If you prefer a non-Google example of out over up, look at LiveJournal,
as the evolution of their software and hardware is well-documented and more transparent than the operation of most comparable sites.
Or for a counterexample, Friendster. I know the poor guy who was doing site architecture there for a while, screaming at his bosses that they needed to get off MySQL and get a Sun/Oracle box in, and doing unholy things to MySQL to try and keep it going, until he just walked away. Their site performance implosion is near-legendary...
George Herbert wrote:
For the most part, web applications have a highly effectively paralleliseable app and web layer, but the database on many of them doesn't scale horizontally as well.
Wikipedia can partition the database across languages (as it already does with the largest ones), and when individual languages grow too large for a single server to handle, there are other partitioning schemes to look at. So it's a bit simpler here, as it's not one monolithic data store that's growing without bound.
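A minimal sketch in Python of that routing idea follows; the wiki-to-cluster mapping and host names are hypothetical, not the actual Wikimedia configuration.

# Hypothetical mapping: the largest wikis get their own database clusters,
# everything else shares a default cluster.
WIKI_TO_CLUSTER = {
    "enwiki": "db-cluster-2",
    "dewiki": "db-cluster-3",
}
DEFAULT_CLUSTER = "db-cluster-1"

def cluster_for(wiki_db):
    """Return the database cluster responsible for a given wiki."""
    return WIKI_TO_CLUSTER.get(wiki_db, DEFAULT_CLUSTER)

print(cluster_for("enwiki"))   # db-cluster-2
print(cluster_for("frwiki"))   # db-cluster-1 (shared)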
Or for a counterexample, Friendster. I know the poor guy who was doing site architecture there for a while, screaming at his bosses that they needed to get off MySQL and get a Sun/Oracle box in, and doing unholy things to MySQL to try and keep it going, until he just walked away.
And yet an even larger social networking site continues to happily churn along with MySQL. Clearly there are examples each way, but in the case of WMF, there are also principles that factor into the equation.
On 12/5/06, Ivan Krstić krstic@solarsail.hcs.harvard.edu wrote:
George Herbert wrote:
For the most part, web applications have a highly effectively paralleliseable app and web layer, but the database on many of them doesn't scale horizontally as well.
Wikipedia can partition the database across languages (as it does already with the largest ones), and when individual languages grow to be too large for a single server to deal with, there are other partitioning schemes to look at. So it's a bit simpler here, as it's not one monolithic data store that's growing without bound.
Or for a counterexample, Friendster. I know the poor guy who was doing site architecture there for a while, screaming at his bosses that they needed to get off MySQL and get a Sun/Oracle box in, and doing unholy things to MySQL to try and keep it going, until he just walked away.
And yet an even larger social networking site continues to happily churn along with MySQL. Clearly there are examples each way, but in the case of WMF, there are also principles that factor into the equation.
It's not one monolithic data store, but in the current model, the en.wikipedia database is a useful test case. It's a large chunk (I don't know, guessing a half? a third?) of the total data in play.
If we start partitioning the per-wiki database, then quite a large number of potential technologies come into play. The $640,000 question is whether the developer effort to partition the database effectively and efficiently will be more expensive than a single large central server.
I have no problem with people whose principles are to use open source. I am all for open source. I also know, from experience, that there are limits to the scalability of many workloads beyond which large SMP systems are better database server choices.
If the WMF workload is one of those types of workload, then the principle to prefer open source should not be a suicide pact.
If the workload isn't that type, is trivially partitionable, or is partitionable more affordably than the cost of SMP servers, then it should be managed that way anyways.
The devil is in the details.
Separate question:
Has anyone developed a MediaWiki test suite, a standard set of web operations which can be run as a load generator for benchmarking purposes?
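For illustration only, here is a rough Python sketch of what a minimal read-only load generator could look like. The base URL and page titles are placeholders, and a real benchmark suite would also need to exercise edits, history views, searches, and logged-in requests.

import time
import urllib.request

BASE = "http://localhost/wiki/index.php"        # assumed local test wiki
PAGES = ["Main_Page", "Special:RecentChanges"]  # assumed sample titles

def run(total_requests=100):
    latencies = []
    for i in range(total_requests):
        url = "%s?title=%s" % (BASE, PAGES[i % len(PAGES)])
        start = time.time()
        with urllib.request.urlopen(url) as resp:
            resp.read()                          # fetch the whole page body
        latencies.append(time.time() - start)
    latencies.sort()
    print("requests: %d  median: %.3fs  p90: %.3fs" % (
        len(latencies),
        latencies[len(latencies) // 2],
        latencies[int(len(latencies) * 0.9)]))

if __name__ == "__main__":
    run()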
Hi!
That depends on how you define "scale up"; when database limits start to be the problem, quite a large number of sites scale up to centralized large SMP systems running Oracle or something of that ilk quite successfully. I have done website systems and network architecture work for some very large websites (WebEx and Blockbuster.com among others) and sold hardware to people building others.
Indeed, selling big iron is more profitable for vendors than those commodity pizza boxen.
For the most part, web applications have a highly effectively paralleliseable app and web layer, but the database on many of them doesn't scale horizontally as well. It's not unusual around here to see sites buy a clustered pair of big Sun boxes (or more rarely, IBM or HP) and switch to Oracle as they grow past what MySQL and Linux servers can handle, if they're DB limited.
It all depends on data layout, structure, and application design. Any application out of the box may have scaling issues, but sometimes it is possible to do some horizontal magic and you're given lots of fresh air to breathe. Oddly enough, I've heard quite a lot of stories where maintaining an Oracle environment was too expensive and switching to MySQL was the solution.
All of that said, I really don't numerically understand the loads on the Wikimedia Foundation servers, or the details of the architecture well enough now to give specific advice.
There's nothing really interesting: lots of requests are served before they even reach the application layer. What is left is handled by lots of Apaches and a fairly distributed data system. We don't do many edits, so there are no more than a few hundred update queries per second (really active updating happens in lossy data stores, e.g. memcached). Replication works, and we send MySQL read queries to slaves. We do up to 40,000 DB requests per second.
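A minimal sketch of that read/write split; the host names are invented, and the real setup (load balancing, replication-lag checks, query grouping) is considerably more involved.

import random

MASTER = "db-master.example"                     # hypothetical hosts
SLAVES = ["db-slave1.example", "db-slave2.example", "db-slave3.example"]

def pick_server(sql):
    """Send writes to the master, reads to a randomly chosen slave."""
    verb = sql.lstrip().split(None, 1)[0].upper()
    is_write = verb in ("INSERT", "UPDATE", "DELETE", "REPLACE")
    return MASTER if is_write else random.choice(SLAVES)

print(pick_server("SELECT page_title FROM page WHERE page_id = 1"))           # a slave
print(pick_server("UPDATE page SET page_touched = NOW() WHERE page_id = 1"))  # the master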
There are large websites where the actual sustained DB load is low enough that a farm of Linux/MySQL servers is an adequate, reliable solution. And despite having worked at a Sun / Oracle VAR I have also deployed several thousand linux boxes in horizontal scaled website farms.
We do not have enough funding to have thousands of Linux boxen, so we end up doing what we can with current resources.
Or for a counterexample, Friendster. I know the poor guy who was doing site architecture there for a while, screaming at his bosses that they needed to get off MySQL and get a Sun/Oracle box in, and doing unholy things to MySQL to try and keep it going, until he just walked away. Their site performance implosion is near-legendary...
*shrug*, they had performance issues, but IIRC they solved them.
On Mon, Dec 04, 2006 at 11:21:00PM -0500, Ivan Krstić wrote:
Jay R. Ashworth wrote:
Yes, Ivan; that's why Jeff suggested provisioning more powerful platforms.
Wikipedia has so far followed the approach of scaling out instead of up, and that with commodity hardware. If you believe this approach is mistaken, you'll need to explain what you know about scaling server farms that Google doesn't.
My point was merely that he suggested more powerful hardware, and received as a reply, "no, the problem is that we need more powerful hardware". Essentially.
Cheers, -- jra
On 12/5/06, Ivan Krstić krstic@solarsail.hcs.harvard.edu wrote:
Wikipedia has so far followed the approach of scaling out instead of up, and that with commodity hardware. If you believe this approach is mistaken, you'll need to explain what you know about scaling server farms that Google doesn't.
I think you meant "that WikiMedia doesn't". We don't have access to Google's technology. We don't have [[MapReduce]].
Steve
On 05/12/06, Steve Bennett stevagewp@gmail.com wrote:
I think you meant "that WikiMedia doesn't". We don't have access to Google's technology. We don't have [[MapReduce]].
Let's arm Brion so he can storm Mountain View, CA, and demand access to Google's l33t technologies...
Rob Church
On 12/5/06, Rob Church robchur@gmail.com wrote:
On 05/12/06, Steve Bennett stevagewp@gmail.com wrote:
I think you meant "that WikiMedia doesn't". We don't have access to Google's technology. We don't have [[MapReduce]].
Let's arm Brion so he can storm Mountain View, CA, and demand access to Google's l33t technologies...
If there's anything currently in Google which would seriously benefit WMF, other than cash, it is likely that we could approach the right people and get access to it. I know some midlevel people to start asking questions of; Jimmy and others at the board level probably have more appropriate high level contacts.
MapReduce doesn't seem relevant to the large-scale MediaWiki deployment problem. Is there any Google tech which anyone specifically thinks would help? Or, generalizing, is there any commercial or semi-commercial tech in the world which would seriously help?
George Herbert wrote:
If there's anything currently in Google which would seriously benefit WMF, other than cash, it is likely that we could approach the right people and get access to it.
WMF has, up to this point, been committed to building Wikipedia only with free software. While it's exceedingly likely that we can obtain some kind of privileged access to Google's technologies, it's also almost certain those technologies would not be opened up to the public, thus going against the principles at play here.
That said, there definitely are Google technologies that would be useful to Wikipedia. BigTable and GFS are examples.
On 12/5/06, Ivan Krstić krstic@solarsail.hcs.harvard.edu wrote:
George Herbert wrote:
If there's anything currently in Google which would seriously benefit WMF, other than cash, it is likely that we could approach the right people and get access to it.
WMF has, up to this point, been committed to building Wikipedia only with free software. While it's exceedingly likely that we can obtain some kind of privileged access to Google's technologies, it's also almost certain those technologies would not be opened up to the public, thus going against the principles at play here.
That said, there definitely are Google technologies that would be useful to Wikipedia. BigTable and GFS are examples.
Google has emitted quite a bit of IP as freeware; it depends on whether it's a key commercial competitive advantage for them or not.
I hadn't been familiar with BigTable, which is sort of annoying since I'm rather familiar with [[Sybase IQ]], the commercial column-based database, and I've done a bit of evangelizing of column-based databases since I heard of the idea.
I do know of GFS rather well, from the technical papers level on up.
On first glance... GFS doesn't seem relevant to Wikimedia Foundation work. GFS is all about giving common access to a very large, petabyte-scale disk store across a wide sea of systems. As I understand it, one whole database+static content en.wikipedia dump is less than a terabyte, and can fit on the local disks of a single 1-2U rack server. There's no reason for there to be a giant shared filesystem if the dataset fits on one system's local disk. Am I missing something?
A column-based database (generalizing here) seems like a possible match for our needs; in general, as I understand it, the Wikipedia servers are getting 99%-plus reads and many fewer edits, in terms of database access? If that's true, then the database contents are more like a data warehousing job (few updates, predominantly read operations), and column-based databases seem to be around 10x faster for data warehousing work.
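A toy illustration of why a columnar layout suits a read-mostly, scan-heavy workload; purely illustrative, not a claim about any particular engine.

# Row-oriented: scanning one field still drags every row's other fields along.
rows = [
    {"page_id": 1, "title": "Foo", "views": 120, "text_len": 20480},
    {"page_id": 2, "title": "Bar", "views": 75,  "text_len": 10240},
]
total_views_rows = sum(r["views"] for r in rows)

# Column-oriented: the same scan touches only the one column it needs.
columns = {
    "page_id":  [1, 2],
    "title":    ["Foo", "Bar"],
    "views":    [120, 75],
    "text_len": [20480, 10240],
}
total_views_cols = sum(columns["views"])

assert total_views_rows == total_views_cols == 195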
The question I would have is whether the database scale is large enough to justify that. If you can keep the indexes for the key tables in RAM all the time (and given the only-few-hundred-GB database sizes now, I would hope that we could), then an index in RAM beats disk access to a column in terms of performance, and the database's disk layout is therefore a second-order effect.
Can WMF admins confirm that the DB servers are effectively keeping the DB indexes cached in RAM now?
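One rough way to answer that from the database side is to compare total index size per schema against the InnoDB buffer pool. The sketch below assumes MySQL 5.0+ (for information_schema) and the python-mysqldb driver; the host and credentials are placeholders.

import MySQLdb

conn = MySQLdb.connect(host="db-slave1.example", user="report",
                       passwd="secret", db="information_schema")
cur = conn.cursor()

cur.execute("""SELECT table_schema, SUM(index_length), SUM(data_length)
               FROM tables GROUP BY table_schema""")
for schema, index_bytes, data_bytes in cur.fetchall():
    print("%-20s indexes: %9.1f MB  data: %9.1f MB" % (
        schema, float(index_bytes or 0) / 2**20, float(data_bytes or 0) / 2**20))

cur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
name, value = cur.fetchone()
print("innodb_buffer_pool_size: %.1f MB" % (int(value) / 2.0**20))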
George Herbert wrote:
On first glance... GFS doesn't seem relevant to Wikimedia Foundation work.
I brought it up because the BigTable design outsources certain fault-tolerance worries to GFS, if I recall correctly. It's not a necessity by itself, no.
On 12/5/06, Ivan Krstić krstic@solarsail.hcs.harvard.edu wrote:
George Herbert wrote:
On first glance... GFS doesn't seem relevant to Wikimedia Foundation work.
I brought it up because the BigTable design outsources certain fault-tolerance worries to GFS, if I recall correctly. It's not a necessity by itself, no.
Well, obviously BigTable depends on GFS, now that I'm looking at it, but it's not a "we could use GFS for GFS'es sake" situation, unless I am missing something.
And as I mentioned, BigTable seems not useful as long as our indexes fit in RAM, which I hope is the case.
Poking around some more, I found myself in blissful ignorance of [[C-Store]]. I lose track of Stonebraker for a year, and he's off doing what I had been hoping he'd do...
Has anyone looked into C-Store and the large-scale MediaWiki deployment problem?
Hi!
Well, obviously BigTable depends on GFS, now that I'm looking at it, but it's not a "we could use GFS for GFS'es sake" situation, unless I am missing something.
If BigTable (whatever it is) were some magic silver bullet, Google would not use InnoDB or MySQL or whatever.
And as I mentioned, BigTable seems not useful as long as our indexes fit in RAM, which I hope is the case.
We have a few hundred gigabytes of RAM available in the cluster. I hope we fit. :)
On 12/5/06, Domas Mituzas midom.lists@gmail.com wrote:
Hi!
Well, obviously BigTable depends on GFS, now that I'm looking at it, but it's not a "we could use GFS for GFS'es sake" situation, unless I am missing something.
If BigTable (whatever it is) would be some magic silver bullet, Google would not use InnoDB or MySQL or whatever.
And as I mentioned, BigTable seems not useful as long as our indexes fit in RAM, which I hope is the case.
We have few hundred gigabytes of RAM available in cluster. I hope we fit. :)
The DB clusters are systems running slave copies of the same master MySQL databases; what matters is RAM per system (or, per database)....
Hi!!!!!
The DB clusters are systems running slave copies of the same master MySQL databases; what matters is RAM per system (or, per database)....
Well, this is why we end up with different slave copies of different masters. Oh, and wait, we even end up with different load patterns. Different storage patterns. And we still don't buy 32GB machines. \o/
On Tue, Dec 05, 2006 at 12:30:32PM -0800, George Herbert wrote:
And as I mentioned, BigTable seems not useful as long as our indexes fit in RAM, which I hope is the case.
Well, 64-bit architectures relax one limit there, but RAM's still not *that* cheap; I suspect our curve will grow faster than the price drop curve of the memory...
Cheers, -- jra
On 05/12/06, George Herbert george.herbert@gmail.com wrote:
If there's anything currently in Google which would seriously benefit WMF, other than cash, it is likely that we could approach the right people and get access to it. I know some midlevel people to start asking questions of; Jimmy and others at the board level probably have more appropriate high level contacts.
This is where I confess to embedding some sanity in my otherwise ignorable message...
Rob Church
On 04/12/06, Steve Bennett stevagewp@gmail.com wrote:
Haven't seen this one before - doesn't have any link to fundraising or anything either.
You've really never seen that one before?
It's the class of error message MediaWiki throws up when something happens that it can't wrap up in a nice, beautified error. :)
Rob Church
Haven't seen this one before - doesn't have any link to fundraising or anything either.
Having a link to fundraising in this error message wouldn't be such a bad idea...
Bence Damokos
On 12/4/06, Rob Church robchur@gmail.com wrote:
On 04/12/06, Steve Bennett stevagewp@gmail.com wrote:
Haven't seen this one before - doesn't have any link to fundraising or anything either.
You've really never seen that one before?
It's the class of error message MediaWiki throws up when something happens that it can't wrap up in a nice, beautified error. :)
Rob Church
On 04/12/06, Bence Damokos bdamokos@gmail.com wrote:
Having a link to fundraising in this error message wouldn't be such a bad idea...
It wouldn't necessarily come up in circumstances where more cash would be helpful... to be honest, users shouldn't see it as often as the other error messages, which come from the Squids, etc.
Rob Church
On 12/5/06, Rob Church robchur@gmail.com wrote:
On 04/12/06, Steve Bennett stevagewp@gmail.com wrote:
Haven't seen this one before - doesn't have any link to fundraising or anything either.
You've really never seen that one before?
It's the class of error message MediaWiki throws up when something happens that it can't wrap up in a nice, beautified error. :)
Heh. I really ought to make a photo gallery of all the different errors I've seen at Wikipedia. I must have seen half a dozen really different error screens by now.
For my own curiosity, what does "can't" mean in this context - is it a "can't spare the resources to do so", or "doesn't have enough information", or "happens at too low a level"?
Steve
On 12/4/06, Steve Bennett stevagewp@gmail.com wrote:
For my own curiosity, what does "can't" mean in this context - is it a "can't spare the resources to do so", or "doesn't have enough information", or "happens at too low a level"?
Doesn't have enough information to render the site layout because it can't access the database, so it has to just print out text without any interface widgets. Substantial parts of the interface are stored in the database, like the sidebar and so forth, and rendering the interface without them would be a) weird and b) too much work for an error message, which people don't expect to be pretty anyway.
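A sketch of that fallback behaviour, purely illustrative and in Python rather than MediaWiki's actual PHP:

def render_page(load_from_db, title):
    try:
        sidebar = load_from_db("MediaWiki:Sidebar")   # interface text lives in the DB
        body = load_from_db(title)
        return "text/html", "<html>... %s ... %s ...</html>" % (sidebar, body)
    except Exception as e:
        # No database means no skin: fall back to a bare text/plain message.
        return ("text/plain",
                "Sorry! This site is experiencing technical difficulties.\n"
                "Try waiting a few minutes and reloading.\n"
                "(Can't contact the database server: %s)" % e)

def broken_db(title):
    raise RuntimeError("All servers busy")

print(render_page(broken_db, "Main_Page")[1])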
On 12/5/06, Simetrical Simetrical+wikitech@gmail.com wrote:
Doesn't have enough information to render the site layout because it can't access the database, so it has to just print out text without any interface widgets. Substantial parts of the interface are stored in the database, like the sidebar and so forth, and rendering the interface without them would be a) weird and b) too much work for an error message, which people don't expect to be pretty anyway.
Ah, that makes sense, thanks.
Steve
On 05/12/06, Simetrical Simetrical+wikitech@gmail.com wrote:
Doesn't have enough information to render the site layout because it can't access the database, so it has to just print out text without any interface widgets. Substantial parts of the interface are stored in the database, like the sidebar and so forth, and rendering the interface without them would be a) weird and b) too much work for an error message, which people don't expect to be pretty anyway.
To a lesser extent, also, the inability to be certain that the message cache is working... which explains why it's not particularly internationalisable.
Rob Church