Thanks, what confuses me though is why you must retain n/2 hosts, and not be able to recover, at least in many cases, from a loss down to a single host. I understand that a known majority ensures that a write will always persist/propagate even after a network segmentation and later re-join. But what if the network isn't segmented and the servers are simply down, or there wouldn't be a conflict regardless because the relevant clients couldn't have written to the segmented side anyway? When they rejoin, they would all know they are behind the one that remained online (as one does with version clocks), so they could just fetch all the missing events and be consistent again.
The majority requirement, it seems to me, would reduce reliability compared to being able to fail over down to a single server.
Does anyone know of a list or test suite of all the failures a consistency algorithm is resistant to? Then at least I could understand the reasoning. I get the impression that there are a lot of failure modes that are handled at some cost, but are not relevant to my use cases.
In a leader election, a node won't vote for a candidate that is missing a record the node believes to be committed. Thus, writing to a majority ensures that when the leader dies, the new leader will have all the committed data. Thus, up to n/2 - 1 nodes can be lost with a guarantee that no committed data is lost.
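The key property behind this is that any two majorities of the same cluster must share at least one node, so every vote quorum overlaps every write quorum. A toy check of that overlap, using an illustrative 5-node cluster (not anything from the thread):

```python
from itertools import combinations

# In a 5-node cluster, any write quorum (majority) and any vote quorum
# (majority) share at least one node, so at least one voter has seen
# every committed entry.
n = 5
nodes = set(range(n))
majority = n // 2 + 1  # 3 of 5

overlaps = all(
    set(write) & set(vote)
    for write in combinations(nodes, majority)
    for vote in combinations(nodes, majority)
)
print(overlaps)  # True: every pair of majorities intersects
```

Since 3 + 3 > 5, two disjoint majorities are impossible, and the same counting argument holds for any n.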
If you don't need strong consistency (after a write commits, all future reads will see the write), you can use simpler replication strategies.
Redis, for example, uses async replication, which is not guaranteed to succeed. A Redis master may ack a write, and a subsequent read may see it, but losing the master before replication occurs can result in data loss. The failover is not guaranteed to contain all writes. Sometimes weak consistency is OK. Sometimes it's not, but eventual consistency is OK. Other times strong consistency is needed.
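The loss window described above can be sketched in a few lines. This is a toy model of async replication, not real Redis code: the master acks before the replica has the write, so a crash between the ack and the replication step drops acked data.

```python
# Toy model of async (Redis-style) replication. Purely illustrative;
# names and structure are assumptions, not the Redis implementation.
master, replica = {}, {}

def write(key, value):
    master[key] = value   # master applies the write locally...
    return "OK"           # ...and acks immediately; replication is deferred

write("x", 1)             # the client sees "OK" and may even read x back
# master crashes here, before the deferred replication runs
promoted = replica        # failover promotes the replica
print(promoted.get("x"))  # None: the acked write is gone
```

With synchronous majority replication the ack would only be sent after a quorum had the write, which is exactly the trade the parent comment is asking about.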
Raft is useful when you need strong consistency.
Raft does support membership changes (adding and removing nodes from the cluster), so it can support losing more than n / 2 - 1 nodes, just not simultaneously.
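That sequential-loss point can be made concrete with simple arithmetic. A sketch, assuming each failure is followed by a successful membership change before the next failure (a hypothetical sequence, not a Raft implementation):

```python
# Starting from 5 nodes, lose one node at a time, reconfiguring the
# cluster after each loss so the majority requirement shrinks with it.
size = 5
lost = 0
while size > 1:
    size -= 1                 # one node fails
    lost += 1
    majority = size // 2 + 1  # reconfigured cluster's new majority
    print(f"after loss {lost}: {size} nodes, majority {majority}")
```

Each reconfiguration itself needs a majority of the then-current cluster to commit, which is why the losses must happen one at a time rather than simultaneously.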