Sunday, 15 February 2015

Eventual Consistency

Introduction
Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.[1]
To ensure high availability and scalability, distributed system keeps copies of its data across multiple machines (sitting in different data centers). When a change happened to a data item on one machine, that change has to be propagated to the other replicas. The change propagation will not be happened instantly since there is a network delay. This interval of time allows window of inconsistency during which some of the copies will have the most recent change, but others won't. In other words, the copies will be mutually inconsistent. However, the change will eventually be propagated to all the copies. Hence it is called eventual consistency.
When we talk about eventual consistency, we also need to mention CAP Theorem.
CAP Theorem
CAP Theorem was presented by Eric Brewer in a keynote address to PODC (Principles of Distributed Computing) in 2001. CAP Theorem identifies three important properties of distributed system: Consistency, Availability and Partition Tolerance. Out of these three properties, only two can be achieved at a given time.
Since it is impossible simultaneously to achieve always-on experience (availability) and reading the latest written version of data from a distributed database (consistency) in the presence of partial failure (partitions), distributed system architects sacrifice "strong" consistency to ensure availability and partition tolerance. In other way it can be said that they use weaker models and eventual consistency is the most notable one.
Examples
  • DNS
  • Asynchronous master/slave replication on an RDBMS
  • Caching in front of relational databases
  • NoSQL databases
Variations of Eventual Consistency
  • Causal consistency If process A has communicated to process B that it has updated a data item, a subsequent access by process B will return the updated value, and a write is guaranteed to supersede the earlier write. Access by process C that has no causal relationship to process A is subject to the normal eventual consistency rules. Eventual Consistency does not say anything about the ordering of operations where as causal consistency ensures that operations appear in the order the user intuitively expects. It enforces a partial order over operations.
  • Read-your-writes consistency This is an important model where process A, after it has updated a data item, always accesses the updated value and will never see an older value. This is a special case of the causal consistency model.
  • Session consistency This is a practical version of the previous model, where a process accesses the storage system in the context of a session. As long as the session exists, the system guarantees read-your-writes consistency. If the session terminates because of a certain failure scenario, a new session needs to be created and the guarantees do not overlap the sessions.
  • Monotonic read consistency If a process has seen a particular value for the object, any subsequent accesses will never return any previous values.
  • Monotonic write consistency In this case the system guarantees to serialize the writes by the same process. Systems that do not guarantee this level of consistency are notoriously hard to program.
A big advantage of Eventual Consistency is that it is fairly straightforward to implement. To ensure convergence, replicas must exchange information with one another about which writes they have seen. This information exchange process is often called anti-entropy. There are different ways to achieve this. One simple solution is to use an asynchronous all-to-all broadcast. When a replica receives a write to a data item, it immediately responds to the user, then, in the background, sends the write to all other replicas, which in turn update their locally stored data items. In the event of concurrent writes to a given data item, replicas deterministically choose a "winning" value, often using a simple rule such as "last writer wins" (using a clock value embedded in each write).
Even though Eventual Consistency does not make any safety guarantee, eventual consistent data store are widely deployed. Because it is "good enough", given its latency and availability benefits.
References
  • http://en.wikipedia.org/wiki/Eventual_consistency
  • http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
  • Don't Settle for Eventual Consistency- Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen
  • Eventual Consistency Today: Limitations, Extensions, and Beyond - Peter Bailis and Ali Ghodsi