2009/7/13 Domas Mituzas <midom.lists@gmail.com>:
> - ms1 was spending lots of CPU in zfs`metaslab_alloc() tree (full
> tree, if anyone is interested, is at http://flack.defau.lt/ms1.svg - use FF3.5, then you can search for metaslab_alloc in it).
This sounds *very* like a ZFS bug in Solaris 10 that we struck at work a while ago:
Precis: if the file system is very busy (being hammered) *and* it's over 85% full, the block allocator can get stuck trying to work out the *very best* allocation rather than one that'll do and let it get on with other work. To the point where you see CPU go through the roof, with 80% system CPU and a very unresponsive system. You can't stop this without rebooting the box.
Sun acknowledged it as a bug and it'll be fixed in a future release; they gave us a hotpatch. The workaround? Keep the ZFS filesystem in question under 70% full ...
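If you want something to nag you before a pool gets that full, here's a rough sketch of a check, not anything Sun gave us. It assumes `zpool list -H -o name,capacity` prints one "name, percentage" pair per line; the 70 is just the workaround figure above, and the function names are made up for illustration.

```python
import subprocess

THRESHOLD = 70  # percent full; the workaround figure, adjust to taste


def pools_over(zpool_output, threshold=THRESHOLD):
    """Return [(pool, percent)] for pools at or above the threshold.

    Expects text shaped like the output of
    `zpool list -H -o name,capacity`, e.g. "tank\t91%".
    """
    over = []
    for line in zpool_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, cap = fields
        cap = cap.rstrip("%")
        if not cap.isdigit():
            continue  # capacity not reported as a number
        pct = int(cap)
        if pct >= threshold:
            over.append((name, pct))
    return over


def check_live_pools(threshold=THRESHOLD):
    """Run zpool on this host and report pools that are too full."""
    out = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,capacity"],
        capture_output=True, text=True, check=True,
    ).stdout
    return pools_over(out, threshold)
```

Call check_live_pools() from cron or your monitoring system and alert on a non-empty result. The parsing is kept separate from the zpool call so it can be tried against canned output on a box without ZFS.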
This is an obscure bug and isn't reason to avoid ZFS in general - the bug only gets tickled in particular circumstances, when ZFS is having the heck beaten out of it. I'd still happily recommend ZFS for almost anything, because it really is *that cool*.
- d.