Fault tolerance, or failover, refers to a system's ability to recover from an error without affecting the availability of the service.
As always, a picture is worth a thousand words. The important thing is that, in the end, at least one system is still running.
Setting the scene…
We want to build a fault-tolerant, or failover, system. What do we have to do?
Obviously, to build a fault-tolerant redundant system we need sufficient capital. The more capital we invest, the greater the resistance to failure. It is not possible to have failover without investing capital.
To have a fault-tolerant system we must set up what is commonly called a "redundant system", i.e., two or more systems configured so that, although they are connected and hold the same information, each one can operate independently if the other fails. With this setup, even in the worst case (a hardware failure, a buffer overflow that kills a vital process, or even someone kicking a cable loose), the service can continue to operate.
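As a rough illustration of that idea, here is a minimal Python sketch, assuming two in-memory replicas that each hold their own copy of the same data. The names `Replica`, `read_with_failover`, `node-a` and `node-b` are invented for this example and do not come from any real library; a real setup would involve separate machines and a network.

```python
class Replica:
    """A hypothetical node that holds a full copy of the data."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)  # each replica keeps its own independent copy
        self.alive = True

    def read(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


def read_with_failover(replicas, key):
    """Try each replica in turn and return the first successful answer."""
    for replica in replicas:
        try:
            return replica.read(key)
        except ConnectionError:
            continue  # this replica failed, so try the next one
    raise RuntimeError("all replicas are down")


data = {"greeting": "hello"}
primary = Replica("node-a", data)
secondary = Replica("node-b", data)

primary.alive = False  # simulate someone kicking the cable
result = read_with_failover([primary, secondary], "greeting")
print(result)
```

The service only becomes unavailable when every replica is down, which is the whole point of the redundancy.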
There are many ways to build a redundant system. As we have explained, we can use several complete systems that are all working at all times, which also reduces the load on each one. (If we have one machine loaded at 80% and we replace it with a redundant system of two machines, each machine will carry a load of approximately 40%.) We can also use an "idle standby" system, where one machine waits for the other to fail and activates as soon as it receives the notification.
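The two approaches can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the machines are identical, the load is split evenly, and the failure notification is a plain method call. `load_per_machine` and `StandbyPair` are hypothetical names made up for this example.

```python
def load_per_machine(total_load, active_machines):
    """Percentage of load each machine carries when the load is split evenly."""
    if active_machines < 1:
        raise ValueError("at least one machine must be active")
    return total_load / active_machines


class StandbyPair:
    """Hypothetical idle-standby pair: the standby serves only after a failure."""

    def __init__(self):
        self.serving = "primary"

    def notify_primary_failure(self):
        # The standby activates as soon as the notification arrives.
        self.serving = "standby"


# All machines working: a total load of 80% split across two machines.
shared = load_per_machine(80, 2)
# If one of the two fails, the survivor carries the whole load again.
after_failure = load_per_machine(80, 1)

# Idle standby: nothing changes until the failure is reported.
pair = StandbyPair()
before = pair.serving
pair.notify_primary_failure()
after = pair.serving

print(shared, after_failure, before, after)
```

Note the trade-off the numbers suggest: in the shared setup each machine is lightly loaded in normal operation, but the one that survives a failure must be able to absorb the full load on its own.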