[snip]
The Kscape system is designed to expand as new drives are added (up to the server maximum) but is not designed to shrink as drives are removed. The Kscape system expects you to replace failed drives immediately. In a real-world situation, although not impossible, it is highly unlikely that 3 drives will fail at the same moment. So the system is designed to recover from up to 3 drive failures. The 3U Server is designed to continue operating as normal with 1 failed drive. With a second failed drive, it will offer limited operations to reduce the likelihood of additional drive failures. In either situation, you are not expected to use your system in this condition for a long period of time. If you continue to use your system with failed drives, the system will limit even more functions to protect itself.
[snip]
This is not quite accurate.
The Kaleidescape filesystem can only continue to operate with 1 -- and only one -- failed disk at any given time. If another disk fails while there is already a failed or rebuilding disk in the filesystem, that is a so-called "double disk failure" and the filesystem is, for all intents and purposes, lost. A 3U server's filesystem could, in most cases, survive 2 disk failures but not 3, and only if those failures happen one after the other rather than concurrently.
In the following sequence of events, we'll assume that no failed disks are physically removed from the server chassis and replaced.
When disk 1 fails, it is immediately replaced by the hot spare (assuming that the hot spare is large enough) and the missing data from disk 1 is rebuilt onto that replacement disk. If another disk fails while that rebuild is ongoing, that's a double disk failure and the filesystem is lost. Once the disk finishes rebuilding (which can take days), the filesystem is back to a healthy state but it is now operating without a hot spare.
When disk 2 fails, there is no hot spare to replace it and thus the filesystem remains in a degraded state unless the failed disk is removed and replaced with a new drive of equal or greater size. Once again, if another disk fails -- either while disk 2 is failed or while the replacement for disk 2 is rebuilding -- it's a double disk failure and the filesystem is lost.
The general rule of thumb is that there can only be one failed or rebuilding disk in the filesystem at any one time in order for the system to continue operating.
There is one exception to the information above:
If an empty disk in the filesystem fails without ever having had any data written to it, the failed disk is dropped from the filesystem and the filesystem remains in a healthy state, but with less available storage.
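For anyone who finds the sequence easier to follow as a state machine, here is a rough Python sketch of the rules described above. It is only an illustration, not Kaleidescape's actual logic: the class, method, and state names are all made up for this post, it assumes the hot spare is large enough to stand in for the failed disk, and it assumes an empty-disk failure leaves the current state unchanged even when the filesystem is not healthy (the description above only covers the healthy case).

```python
from dataclasses import dataclass


@dataclass
class Filesystem:
    """Toy model of the failure rules described in this post (hypothetical names)."""
    hot_spare: bool = True   # assumption: one hot spare, large enough to be used
    impaired: int = 0        # number of disks currently failed or rebuilding
    state: str = "healthy"   # "healthy", "rebuilding", "degraded", or "lost"

    def fail_disk(self, empty: bool = False) -> str:
        if self.state == "lost":
            return self.state
        if empty:
            # Exception: a disk that never held data is simply dropped.
            # Assumption: the state stays whatever it was before.
            return self.state
        if self.impaired >= 1:
            # A second failure while a disk is failed or rebuilding
            # is a double disk failure.
            self.state = "lost"
            return self.state
        self.impaired = 1
        if self.hot_spare:
            # The hot spare takes over and the rebuild starts immediately.
            self.hot_spare = False
            self.state = "rebuilding"
        else:
            # No spare left: degraded until someone swaps the disk.
            self.state = "degraded"
        return self.state

    def replace_failed_disk(self) -> str:
        # Physically swapping the failed disk starts a rebuild.
        if self.state == "degraded":
            self.state = "rebuilding"
        return self.state

    def finish_rebuild(self) -> str:
        if self.state == "rebuilding":
            self.impaired = 0
            self.state = "healthy"
        return self.state
```

Walking through it: the first `fail_disk()` moves the filesystem to "rebuilding" and uses up the hot spare, `finish_rebuild()` brings it back to "healthy", a second `fail_disk()` leaves it "degraded" until `replace_failed_disk()` is called, and any `fail_disk()` that arrives while the state is "rebuilding" or "degraded" ends in "lost".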