David Gerard wrote:
And today I happened to be chatting to another Solaris 10 administrator who has seen precisely the same bug manifest itself on a terabyte zpool getting lots of I/O. So that's three cases. I certainly hope Sun get to the bottom of this one sooner rather than later ... and that btrfs gets out of alpha sooner rather than later.
Here's an anonymized extract of what I sent Brion privately back then:
... experts on AFS, IFS, NFS, and Linux (as in, official Linux kernel commit/maintainers).... Keep in mind that I'm an Internet guy, not an application or storage guy; I'm just passing along....
[http://wikitech.wikimedia.org/view/Ms1_troubles]
This is typical for that type of file system. NetApp filers have similar behavior after exceeding a certain relative capacity (maybe 80%). This is the trade-off for the advanced features like snapshots.
If you provision for keeping your primary storage at 60% capacity or less, you should continue to be just fine. You can use more storage capacity on secondary storage (such as backup storage), since lower tiers in the storage hierarchy are generally less latency sensitive.
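The capacity rule of thumb above can be turned into a simple check. This is only a minimal sketch: the 60% and 80% figures come from the advice above (the 80% one was given as "maybe"), the function names and the "ok"/"warn"/"critical" labels are made up for illustration, and shutil.disk_usage reports what the operating system sees for a mounted path, which is not necessarily the same as pool-level usage on ZFS or a filer.

```python
import shutil

# Thresholds taken from the advice above; both are rules of thumb,
# not measured limits for any particular system.
PRIMARY_TARGET = 0.60   # keep primary storage at or below this fraction
DEGRADED_ABOVE = 0.80   # "maybe 80%": where slowdowns reportedly begin


def classify_usage(used_bytes, total_bytes):
    """Return 'ok', 'warn', or 'critical' for a used/total pair (hypothetical labels)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fraction = used_bytes / total_bytes
    if fraction <= PRIMARY_TARGET:
        return "ok"
    if fraction <= DEGRADED_ABOVE:
        return "warn"
    return "critical"


def check_path(path):
    """Classify the filesystem holding `path`, as seen by the OS."""
    usage = shutil.disk_usage(path)
    return classify_usage(usage.used, usage.total)


if __name__ == "__main__":
    # "/" is just an example mount point.
    print(check_path("/"))
```

For secondary (backup) storage, which is less latency sensitive, the same check could be run with looser thresholds.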
[http://wikitech.wikimedia.org/view/User:River/Storage]
Sounds reasonable. Aligns well with what others do.
If acquisition cost is a concern, you might also consider building the hardware from scratch. Solaris 10 runs on most x86 hardware, but you have to be careful to select components (especially network and storage) that are on the Solaris HCL.
...
If the single server is still working well enough, migration to the two-server version is probably your best bet for rolling your own: one server to write and the other to distribute files and snapshots. You can even do it over Gigabit Ethernet without FC at much lower cost. Network is still faster than disk access.
...
If you have to buy now, and are unlikely to upgrade for years, the current gold-plated performance version is Sun ZFS over NetApp filers.
...