An update regarding Will (one of the new 1Us): because of earlier overheating problems, jeronim shut Will down yesterday. We suspected a disconnected fan or some other airflow problem. Earlier today, Jimbo visited the colo and found that the fans were working very well and that there were no visible airflow issues. After he left, we continued to see temperature problems:
CPU1: Temperature above threshold
CPU0: Temperature above threshold
CPU0: Running in modulated clock mode
CPU1: Running in modulated clock mode
The CPU was running at about 62 degrees Celsius, which, while not horrible, is not a temperature we want a production server running at. I suggested that the machine be powered off until we can look at it further, and jeronim did so at about 7:15 PM EST. There's some consensus that the next most likely problem is an unseated (or improperly seated) heatsink, with a bad CPU being the less likely option.
Jimbo mentioned a person in Florida who might be able to come to the colo and help us out, so that's our plan of action for the moment.
Cheers, Ivan.
wikitech-l@lists.wikimedia.org