A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a whole stops providing the required service to the user. Systems that anticipate faults and can cope with them are called fault-tolerant or resilient. You should generally prefer tolerating faults over preventing them.

Until recently, redundancy of hardware components was sufficient for most applications. As data volumes increase, more applications use a larger number of machines, which proportionally increases the rate of hardware faults. There is a move towards systems that tolerate the loss of entire machines. A system that tolerates machine failure can be patched one node at a time, without downtime of the entire system (a rolling upgrade). It is unlikely that a large number of hardware components will fail at the same time.
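To make the rolling upgrade idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Node` class, the `rolling_upgrade` function, and the `min_serving` threshold are hypothetical names invented for this example, and "patching" is reduced to changing a version string. A real deployment would also drain connections and run health checks.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical model of one machine in the cluster.
    name: str
    version: str
    serving: bool = True


def rolling_upgrade(nodes, new_version, min_serving):
    """Patch nodes one at a time, never dropping below min_serving live nodes."""
    for node in nodes:
        # Count the nodes that would keep serving while this one is offline.
        others_serving = sum(1 for n in nodes if n.serving and n is not node)
        if others_serving < min_serving:
            raise RuntimeError(f"cannot take {node.name} offline safely")
        node.serving = False        # stop routing requests to this node
        node.version = new_version  # stand-in for patching and rebooting
        node.serving = True         # rejoin before touching the next node
    return nodes


cluster = [Node("a", "1.0"), Node("b", "1.0"), Node("c", "1.0")]
rolling_upgrade(cluster, "1.1", min_serving=2)
```

The point of the sketch is the invariant: at most one node is offline at any moment, so the system as a whole keeps serving users while every machine is patched. This only works if the system already tolerates the loss of a single machine.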