2009/7/13 Domas Mituzas <midom.lists(a)gmail.com>:
> 8. ms1 was spending lots of CPU in zfs`metaslab_alloc() tree (full
> tree, if anyone is interested, is at http://flack.defau.lt/ms1.svg -
> use FF3.5, then you can search for metaslab_alloc in it).
This sounds *very* like a ZFS bug in Solaris 10 that we struck at work
a while ago:
Precis: if the file system is very busy (being hammered) *and* it's
over 85% full, the block allocator can get stuck trying to work out
the *very best* allocation rather than one that'll do and let it get
on with other work. To the point where you see CPU go through the
roof, with 80% system CPU and a very unresponsive system. You can't
stop this without rebooting the box.
Sun acknowledged it as a bug and it'll be fixed in a future release;
they gave us a hotpatch. The workaround? Keep the ZFS filesystem in
question under 70% full ...
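
If you want to keep an eye on this, here is a rough Python sketch of a
fill-level check. The 70% and 85% figures are the ones from this mail;
the function names and the "ok"/"warn"/"danger" labels are made up for
illustration. It uses statvfs-style numbers via shutil.disk_usage(),
which on ZFS describe a single dataset; the pool-level capacity from
zpool list is probably the better number to watch, so treat this as a
starting point only.

```python
import shutil

# Thresholds taken from the mail above; the labels are illustrative.
WARN_FRACTION = 0.70    # workaround: stay under 70% full
DANGER_FRACTION = 0.85  # the bug was seen when over 85% full


def fill_fraction(total_bytes, used_bytes):
    """Return the used fraction of a filesystem as a float in [0, 1]."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def classify(fraction):
    """Map a fill fraction to 'ok', 'warn' or 'danger'."""
    if fraction > DANGER_FRACTION:
        return "danger"
    if fraction >= WARN_FRACTION:
        return "warn"
    return "ok"


def check_path(path):
    """Classify the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return classify(fill_fraction(usage.total, usage.used))


if __name__ == "__main__":
    # "/" is only a placeholder; point it at the busy filesystem.
    print(check_path("/"))
```

Run it from cron against the mount point in question and alert on
anything other than "ok".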
This is an obscure bug and isn't reason to avoid ZFS in general - the
bug only gets tickled in particular circumstances, when ZFS is having
the heck beaten out of it. I'd still happily recommend ZFS for almost
anything, because it really is *that cool*.
- d.