Wikitech-l December 2003

wikitech-l@lists.wikimedia.org

84 participants
155 discussions

Re: [Wikitech-l] Pliny SCSI issues?
by Brion Vibber 28 Dec '03

It's continuing to get these errors after a reboot. Seems to be the primary controller; I'm not a SCSI guru, I don't know what's what. I'll get all the data backed up somewhere if I can... sigh -- brion vibber (brion @ pobox.com)

1 0

Memory testing
by Brion Vibber 28 Dec '03

memtest86, the quintessential memory tester app, requires physical console access and a reboot. In the meantime, I've run some tests on geoffrin from within Linux with a program called 'memtester' which was recommended to me:
http://www.qcc.ca/~charlesc/software/memtester/

It only seemed willing to lock about 2 gigs at a time, so I ran two memtester processes, which between them covered most of the available memory. I didn't tell it to test _all_ memory as I was afraid Linux might start killing sshd processes for daring to ask for more memory to send me the results. ;)

Well, it's spewing out errors right and left; 19 failures caught between all the tests on the first pass. At this point we can't be sure if these are actual physical RAM defects, or artifacts of kernel bugs or other problems (or even bugs in memtester, which might not have been thoroughly tested on amd64), but it's not a good sign.

I'll post the logs once it's run a while longer.

-- brion vibber (brion @ pobox.com)
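For readers unfamiliar with userspace memory testers, here is a minimal sketch of the general idea: write known bit patterns into a buffer, read them back, and count mismatches. This is not memtester's code. `pattern_test` and the chosen patterns are hypothetical, and unlike a real tester it does not lock pages in RAM, so it only illustrates the write-and-verify loop.

```python
def pattern_test(size_bytes: int, patterns=(0x00, 0xFF, 0xAA, 0x55)) -> int:
    """Fill a buffer with each byte pattern and count bytes that read back wrong.

    Hypothetical helper for illustration only; it does not pin memory
    the way a real tester locks pages.
    """
    buf = bytearray(size_bytes)
    failures = 0
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected  # write the pattern over the whole buffer
        if buf != expected:  # read back and compare
            failures += sum(1 for got, want in zip(buf, expected) if got != want)
    return failures


if __name__ == "__main__":
    # Test 1 MiB; a real run would cover far more memory and many passes.
    print("failures:", pattern_test(1024 * 1024))
```

On healthy hardware the count should be zero. As the message above points out, a nonzero count does not by itself say whether the RAM, the kernel, or the tester is at fault.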

2 2

Database moved back to pliny
by Brion Vibber 28 Dec '03

The database is back on pliny for now. Replication isn't currently set up, mainly because pliny is short on disk space, and the binlogs accumulate at a rate approaching a gig per day.

Once we're satisfied with geoffrin, we can move everything back, yay... :P

We waste a lot of disk space on the old table; gzipping old text should save a fair chunk of space relatively easily at minor cpu cost, without the complications of trying to make consistent diff-based storage. Even if it were only 50% savings, that's several gigabytes. I may try to throw something together for this...

-- brion vibber (brion @ pobox.com)
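A minimal sketch of the compression idea above, assuming old revision text is stored as UTF-8 and compressed per row with zlib (the deflate algorithm gzip uses). The function names and sample text are hypothetical, and this is not the code that was eventually written for MediaWiki.

```python
import zlib


def compress_text(text: str) -> bytes:
    """Compress one revision's text for storage (hypothetical helper)."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_text(blob: bytes) -> str:
    """Recover the original revision text from a compressed blob."""
    return zlib.decompress(blob).decode("utf-8")


def savings(text: str) -> float:
    """Fraction of bytes saved by compressing the text (0.5 means 50%)."""
    raw = text.encode("utf-8")
    return 1 - len(compress_text(text)) / len(raw)


if __name__ == "__main__":
    # Made-up, highly repetitive sample; real article text compresses less well.
    sample = "'''Pliny the Elder''' was a Roman author and naturalist. " * 40
    blob = compress_text(sample)
    assert decompress_text(blob) == sample
    print(f"{len(sample.encode('utf-8'))} bytes -> {len(blob)} bytes")
```

Compressing each row independently keeps every old revision readable on its own, which avoids the consistency problems of diff-based storage. The cost is a decompression step on each read of an old revision.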

1 0

Re: Crashy opteron
by Jimmy Wales 27 Dec '03

Damned computers.

Well, Tim is out of town. Jason is a couple hours away and there's no way I'm asking him to drive down on Christmas. The colo does not have 24x7 staffing, so no one is there to reboot it for us today. Therefore, we're doomed until tomorrow morning.

Jason has a family obligation tomorrow at 10AM, so he _could_ go down tomorrow afternoon, but only if we can't get someone at the colo to do the reboot.

At least the website is serving cached pages. If someone has a chance, the cached-pages message could be updated to explain when we expect editing to resume.

From now on when we think about how to spend money, we should think about this morning, and think about redundancy, which is expensive but probably necessary as we get bigger and bigger.

--Jimbo

Brion Vibber wrote:
> While the new server is wonderfully fast, it's crashing way *way* too
> often. :(
>
> Has the memory been tested? What kind of warranty do we have? Will
> Penguin replace any defective parts? How long would this take?
>
> We *really* need a way to reboot it remotely. Somebody's going to have
> to go in on Christmas day just to push the reset button, and that ain't
> cool.
>
> sigh...
>
> -- brion vibber (brion @ pobox.com)

5 11

Opteron's temperature?
by Nikos-Optim 27 Dec '03

I am wondering why Geoffrin is so unstable. For how many hours was it down?

How about the Opteron's temperatures? I suppose it is installed in an air-conditioned room.

And how about the Opteron motherboard's BIOS? Is it updated to a stable revision? Some Opteron motherboards had problems with the BIOS.

For the remote reset issue, somebody suggested this on http://openfacts.berlios.de/index-en.phtml?title=Wikipedia_Status :

"Type "remote power management" into Google, and you'll get listings for lots of network-controlled power-distribution devices, which will allow you to remotely power-cycle locked-up boxes, but without the cost of a UPS, and more importantly, with the ability to power cycle on a per-power socket basis. (wikipedian, the Anome)"

-- Optim

4 4

Design document for new layouts
by tarquin 27 Dec '03

The servers being down gave me a chance to do some more work on layout design for MediaWiki: http://meta.wikipedia.org/wiki/Layout_design_document

1 0

One domain with several languages
by Daniel Mayer 27 Dec '03

Yann wrote:
> I suppose that we want that the language changes according
> to users' settings with a "user_language" field in the user table.

A user pref to override the default UI language would be nice. Another thing that can be done in addition to that is to have something like &lang=fr in the URL to force an interface change (to French in this example) for anybody who has not set an override language in their preferences.

> I see a problem with the cached pages. We will need to have
> a cache for each language.

Language category tags (via the MediaWiki category system, which is waiting for some more bug fixing before it is implemented) would be a way to solve that. Then when anons (who view cached pages by default) visit a page with a French language category tag in it, their UI would switch to French (&lang=fr could then be added to history and talk page links so the French UI follows the anon).

-- Daniel Mayer (aka mav)
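The precedence described above (a saved user preference wins, then a lang parameter in the URL, then the site default) could be sketched roughly as follows. `resolve_ui_language`, its parameters, and the example URLs are hypothetical and not part of MediaWiki. The category-tag idea for cached pages is not covered here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def resolve_ui_language(url: str,
                        user_language: Optional[str] = None,
                        site_default: str = "en") -> str:
    """Pick the interface language for one request (illustrative only).

    Order of precedence, following the proposal in the message above:
    1. the user's saved override, if any;
    2. a lang=xx parameter in the URL;
    3. the site's default language.
    """
    if user_language:
        return user_language
    query = parse_qs(urlparse(url).query)
    lang_values = query.get("lang")
    if lang_values:
        return lang_values[0]
    return site_default


if __name__ == "__main__":
    url = "http://example.org/wiki.phtml?title=Foo&lang=fr"
    print(resolve_ui_language(url))                      # anon following a French link
    print(resolve_ui_language(url, user_language="de"))  # logged-in user with an override
```

Since the URL parameter only applies when no preference is set, anons can be steered by links while logged-in users keep the language they chose.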

1 0

memcached
by erik_moeller＠gmx.de 27 Dec '03

Jimbo got the machine rebooted and I restored the settings, but I have no idea how to get the memcached stuff working right. I disabled it for now. Regards, Erik

2 1

Geoffrin up
by Jimmy Wales 27 Dec '03

Geoffrin is back up. I am trying to get the website back to normal, but I really don't know what I'm doing, so I'm just poking around looking at things. Probably someone could fix this pretty quickly. I left a voicemail for Brion, but anyone else who knows what to do could probably do it as well.

3 2

Why we matter...
by Jimmy Wales 27 Dec '03

An email I received...

-------

Hi,

I am Sujith, 29 years, Male. I visit your esteemed website on regular basis.I was shoked to know that ur Database server has crashed down. I have one of important exams comming up for which my study material is only wikipedia.org. Pl. pull up ur socks and also help us.

Best wishes for a very happy new year ahead.

Thanks,
SUJITH

-------

So, yeah, I guess I better pull up my socks.

--Jimbo

1 0
