Can I ask that developers use LiveJournal - http://www.livejournal.com/community/wikitech/ - to keep people informed about problems on the Wikimedia servers and what's being tried to fix it. People are much happier about something being broken when they know why and what's being done about it. We also need one central low-noise source for information, and I feel the LJ is probably the best bet for that.
Ta,
Dan100
Dan100 wrote in gmane.science.linguistics.wikipedia.technical:
Can I ask that developers use LiveJournal - http://www.livejournal.com/community/wikitech/ - to keep people informed about problems on the Wikimedia servers and what's being tried to fix it.
I was under the impression that we were all incompetent and had no idea how the site works. What possible information could we provide to other people apart from meaningless rambling about our wild guesses and fumbling around in the dark that somehow, by chance create a vague resemblance to an almost-working website?
Or more to the point, why should I waste my time writing up status reports and trying to let you know what's going on when you have nothing better to do than throw it back in my face peppered with utterly uninformed and unhelpful remarks about how worthless I am and how we should all be fired, and without so much as lifting a finger to try to help? You have no right to ask us to do anything at all, or talk about what you "need".
I created the wikitech LJ to try to let people know what we are doing, and maybe even to let them offer some suggestions on how to help. I understand that a broken site is frustrating; we've all been there, WP is hardly the first unreliable service ever created. If I can offer some insight into why it's like that, maybe it helps people to understand. But if all you're going to do is read it and offer snide remarks, why should I bother? I don't recall ever seeing you idle in #mediawiki, let alone try to offer any ideas about what we're doing so wrongly. Maybe we should just let the site rot and see how much better it works.
I don't want sympathy or "oh well, they aren't getting paid for it." Obviously not paying doesn't magically make something good. But when you have no clue at all about what's actually going on to keep the site running, you have no room to lay blame about where the problem is.
Kate.
Kate and all others, Considering that everything is done on such a LOW budget it is sheer magic. I am one of those people that "lurks in the background" offering little or no advice. What people do not realise is that in a "professional" setting solutions are often bought. You buy the latest doda, you hire that magic expert who comes to do his trick again. With the current budget you have a lot on your plate not only PHP but also Apache, not only Apache but also Squid, not only Squid but also MySQL, not only MySQL but also ..
As you all are not waiting for generalisations from me about how to run a datacentre, I keep quiet. I am honestly amazed at how well it works and how well lessons are learned. Live journal is a great initiative and I thank you for it.
What I appreciate is that knowing the problem is 50% of the solution, but many people think that the symptoms ARE the problem. The server running slowly or not at all is a symptom, and its cause is often not clear. With hindsight, symptoms can be explained in terms of understanding a problem. When symptoms are known to a greater (knowledgeable) audience there is a chance that someone has seen them and can make "chocolate" out of it (IT is a dark art). There will also be a trade-off between spending time on reporting and working on a situation. There is also the group of people that see it (again with hindsight) as being SOOO obvious. They are trolls and they deserve a hot place in hell.
When technology is done on a shoestring, you do push the envelope: you max out what is in machines, software and people. This is technology at its best; it is not necessarily always the best performance and availability, but it leads to the best solution given people, software and hardware.
Finally, blaming is a lose-lose situation. You do not get better service and you piss off the people that you want this service from. It is much better to analyse the symptoms of a previous problem and see what can be learned from it. Then again, you only find time and have the inclination to do that when the situation is tranquil, and often at such a time you need a breather. So I wish us all tranquil times and lots of fun.
Thanks, GerardM
Gerard Meijssen wrote:
Kate and all others, Considering that everything is done on such a LOW budget it is sheer magic. I am one of those people that "lurks in the background" offering little or no advice. What people do not realise is that in a "professional" setting solutions are often bought. You buy the latest doda, you hire that magic expert who comes to do his trick again. With the current budget you have a lot on your plate not only PHP but also Apache, not only Apache but also Squid, not only Squid but also MySQL, not only MySQL but also ..
I said on the village pump that I thought we are doing a great job, but I don't think any of the developers I was aiming that comment at read it. Let me take this opportunity to thank everyone and recognise what we have achieved.
We've been working in adverse circumstances. During peak times, the site has been extremely heavily loaded and unstable. Any slight configuration error, or inaction at a particular time when action should have been taken, causes the site to crash. I've lost count of how many problems we've identified and fixed just over the last few weeks.
Many users say "the developers are doing a great job", or "we all know that the developers are very busy", but the fact is that 99% of users don't have a clue what we are doing. They don't know what our achievements have been and they don't know the challenges we face. Gerard's comments are certainly refreshing in this regard, but I think I can add to them. Assume I am speaking on behalf of the users, since I'm sure every user would agree with me if they only knew who to thank.
Big thanks go to JamesDay, who almost single-handedly administers 8 database servers, a task requiring constant monitoring and work on the order of hours per day. James's advice to the MediaWiki developers and other system administrators is invaluable.
Also on the topic of database administration, thanks to Kate for her servmon and WikiServices bots, which have kept the site running when otherwise it would have been choked with long-running queries.
Thanks to Med and Submarine for their work in network and hardware administration of the Paris squids. Well done Mark and Kate for getting PowerDNS up and running and thus getting the Paris squids into service.
Domas's setproctitle() patch is amazing and we all know it. Of course his other system administration and development work is greatly appreciated.
Thanks to Brion, JeLuF and Hashar for their tireless and usually unrecognised work in fixing MediaWiki bugs.
Thanks to Kate for setting up Pen and Perlbal. This is the third time I'm thanking Kate and that's no coincidence - if she left us we'd be left with a dozen pieces of software that no-one else knows how to use.
I know you're all stressed; all we seem to get is complaints despite what we've achieved. I decided after Caroline Ewen's post on wikipedia-l that I can't afford to answer every single question asking "why is the site slow" or every report of "I'm getting backtrace errors!" The fact is that the site has grown to such a size that every time something goes wrong, we can expect a flood of complaints and queries. My advice would be to answer only some of them, and let alert users distribute the information to everyone who asks. Or just ignore them -- remember your time is valuable. Think of what you could have achieved in the time it took you to put a single user out of their ignorance.
When the public forums are too noisy with uninformed speculation, let's organise what we need to make the site better in less visible, more constructive forums, and work with the Board to make it happen.
-- Tim Starling
Just to take this a bit further. I thought I'd compare Wikipedia with one of the "well-run sites" that we are supposed to be competing with. Google is a good direct comparison, because of its dynamic content, with cachable frequent queries.
Looking at the difference in traffic between Google and Wikipedia on Alexa shows that:
* Wikipedia has 300 page views per million
* Google has 16,000 page views per million
Thus, Google serves roughly 53 times the number of page views compared to Wikipedia.
However,
* Wikipedia currently has 39 servers
* Google has an estimated 50,000 - 100,000 servers in its worldwide farm of clusters
Thus, Google has roughly 1250 - 2500 times as many servers as Wikipedia [Source: http://www.tnl.net/blog/entry/How_many_Google_machines for an estimate for April last year, and allowing for more recent expenditure]
Thus, we might regard Wikipedia as being roughly 24 to 48 times more "efficient" in its use of hardware than Google. Given that Google has spent over $250M on hardware, for our developers to be expected to compete with Google on reasonably equal terms at our current traffic, we should have around 1000 - 2000 high-performance servers, at a cost of several million US$.
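The arithmetic above can be checked with a short sketch. It is not part of the original message: the function and variable names are illustrative, and the inputs are simply the figures quoted here (Alexa page views per million and the estimated server counts). It also assumes that servers scale linearly with traffic, which is a simplification.

```python
# Figures as quoted in the message; the names are illustrative only.
WIKIPEDIA_VIEWS_PER_MILLION = 300
GOOGLE_VIEWS_PER_MILLION = 16_000
WIKIPEDIA_SERVERS = 39
GOOGLE_SERVER_ESTIMATES = (50_000, 100_000)  # low and high estimates


def hardware_comparison(google_servers: int) -> dict:
    """Compare traffic and server counts for one estimate of Google's farm.

    Assumes servers scale linearly with page views (a simplification).
    """
    traffic_ratio = GOOGLE_VIEWS_PER_MILLION / WIKIPEDIA_VIEWS_PER_MILLION
    server_ratio = google_servers / WIKIPEDIA_SERVERS
    return {
        # How many times more page views Google serves.
        "traffic_ratio": traffic_ratio,
        # How many times more servers Google has.
        "server_ratio": server_ratio,
        # Servers per unit of traffic, Google relative to Wikipedia.
        "efficiency_ratio": server_ratio / traffic_ratio,
        # Servers Wikipedia would have at Google's servers-per-view level.
        "parity_servers": google_servers / traffic_ratio,
    }


for estimate in GOOGLE_SERVER_ESTIMATES:
    result = hardware_comparison(estimate)
    print(estimate, {key: round(value, 1) for key, value in result.items()})
```

With these inputs the efficiency ratio comes out at about 24 and 48 for the low and high estimates, and the parity figure at roughly 940 to 1,900 servers, in line with the rounded numbers in the text.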
So, a reasonable answer to critics seems to be:
* the developers are already doing very well indeed coping with the combination of extremely high demand and very limited resources
* they already know there are big growth and capacity problems, and are working hard on scalability and reliability
* send money, rather than complaining
-- Neil
I'm sorry that I upset you, Kate. As the performance of the site became worse and worse with time, I and many others became most frustrated in the information vacuum. I put voice to those frustrations. They were unfounded and unwarranted. If I'd known the extent of the 'behind-the-scenes' work attempting to rectify the problems I wouldn't have said it.
I'm just suggesting that to avoid such situations in future, people let us know what's happening.
Dan100
As a neophyte, I found your comparisons very interesting. I think journalists commenting on our speed might find it interesting as well ;-) Thanks for this.
Anthere
On Thu, Jan 20, 2005 at 05:41:15PM +0000, Neil Harris wrote:
[...]
Thus, we might regard Wikipedia as being roughly 24 to 48 times more "efficient" in its use of hardware than Google.
No, we can't. It would be so for a linear system. Which is not the case.
[...]
So, a reasonable answer to critics seems to be:
- the developers are already doing very well indeed coping with the
combination of extremely high demand and very limited resources
True.
- they already know there are big growth and capacity problems, and are
working hard on scalability and reliability
True.
- send money, rather than complaining
Always.
P.S. Would it be possible to restart wprc10 (recent changes bot for #ru.wikipedia)? It has been down for quite some time now.
Tim Starling wrote:
I said on the village pump that I thought we are doing a great job, but I don't think any of the developers I was aiming that comment at read it. Let me take this opportunity to thank everyone and recognise what we have achieved.
Good idea to comment on it here, as I do not think everyone reads the village pump :-)
Well, as one who reports problems from time to time on #mediawiki and is basically little able to understand what your job involves, I must say I appreciate these comments very much, Tim.
One of the ways I would like us to help you (but this is difficult when you are busy and when we are not tech people) is to relieve you of having to inform people one by one, or channel by channel, or public forum by public forum. It takes time and energy.
One thing I learned in my job is that a customer tolerates a delay much better when he is "warned" of it in advance, and understands a "problem" much better when the source of the problem is explained to him. As you say, it avoids speculation, as well as anger or nervousness, which may result in one coming under fire from requests for immediate explanation.
I do not really know how we could achieve this, as precisely when there are problems, you are all very busy; but if one of you could make a report, and some of us could attempt to have it published in many places, so that most people read this report instead of asking you over and over what is going on, I think that might be worth it. As it was, I did not really have the feeling that I saw much feedback from "alert users". Perhaps it should be made clearer that you expect them to distribute the information?
Ant
wikitech-l@lists.wikimedia.org