Consistent Hashing: Algorithmic Tradeoffs

 server := serverList[hash(key) % N]

Hashing the key and taking it modulo the number of servers works fine until N changes: then almost every key maps to a different server. Consistent hashing sets out to do better, with two goals:
  • When adding or removing servers, only 1/nth of the keys should move.
  • Don’t move any keys that don’t need to move.
The classic answer is a ring hash: hash each server onto a circle (several times, as virtual replicas), hash the key onto the same circle, and walk clockwise to the first server you find. In Go, adding nodes and looking up a key looks like this:
func (m *Map) Add(nodes ...string) {
    for _, n := range nodes {
        // Hash each node several times ("virtual nodes") so load spreads
        // more evenly around the ring.
        for i := 0; i < m.replicas; i++ {
            hash := int(m.hash([]byte(strconv.Itoa(i) + " " + n)))
            m.keys = append(m.keys, hash)
            m.hashMap[hash] = n
        }
    }
    sort.Ints(m.keys)
}

func (m *Map) Get(key string) string {
    hash := int(m.hash([]byte(key)))

    // Binary search for the first ring point at or after the key's hash.
    idx := sort.Search(len(m.keys),
        func(i int) bool { return m.keys[i] >= hash },
    )

    // Past the end of the slice means we wrap around the ring.
    if idx == len(m.keys) {
        idx = 0
    }

    return m.hashMap[m.keys[idx]]
}
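
The snippet above leans on a Map type that isn’t shown here: a hash function, a replica count, the sorted ring, and the point-to-server map. Combined with the Add and Get methods above, a minimal supporting sketch might look like this (the field names, the CRC32 hash, and the New constructor are assumptions of mine, not the original code):

package main

import (
    "fmt"
    "hash/crc32"
    "sort"
    "strconv"
)

// Map is one plausible shape for the ring hash used above; this definition
// is an assumption, since the original snippet doesn't include it.
type Map struct {
    hash     func(data []byte) uint32
    replicas int            // virtual nodes per server
    keys     []int          // sorted hashes around the ring
    hashMap  map[int]string // ring point -> server name
}

func New(replicas int) *Map {
    return &Map{
        hash:     crc32.ChecksumIEEE,
        replicas: replicas,
        hashMap:  make(map[int]string),
    }
}

func main() {
    m := New(50)
    m.Add("cacheA", "cacheB", "cacheC")
    fmt.Println(m.Get("user:1234")) // always maps this key to the same server
}

Adding more replicas per server smooths out the load, at the cost of more memory and a larger binary search.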

A Brief Digression on Ketama

Ketama is a memcached client that uses a ring hash to shard keys across server instances. I needed a compatible Go implementation and came across this problem.

The C code in question:

 unsigned int k_limit = floorf(pct * 40.0 * ketama->numbuckets);

with these declarations:

 float floorf(float x);
 unsigned int numbuckets;
 float pct;

Because 40.0 is a double constant, C promotes the whole multiplication to double precision, then narrows the result to float for floorf, floors it, and truncates it to an unsigned int. The Go translation has to mirror each of those steps to stay bit-compatible:

 limit := int(float32(float64(pct) * 40.0 * float64(numbuckets)))

“Are we done?” OR “Why Is This Still a Research Topic?”

Ring hashing presents a solution to our initial problem. Case closed? Not quite. Ring hashing still has a problem: the load across nodes can be quite uneven, and evening it out requires many virtual nodes per server, which drives up memory usage.

Jump Hash

In 2014, Google released the paper “A Fast, Minimal Memory, Consistent Hash Algorithm”, known as “Jump Hash”. The algorithm was actually included in the 2011 release of the Guava libraries, and the source indicates it was ported from the C++ code base.

func Hash(key uint64, numBuckets int) int32 {
    var b int64 = -1
    var j int64

    // Use the key as the seed of a cheap random sequence; keep jumping
    // forward to the next candidate bucket until the jump lands past the
    // number of buckets we actually have.
    for j < int64(numBuckets) {
        b = j
        key = key*2862933555777941757 + 1
        j = int64(float64(b+1) *
            (float64(int64(1)<<31) / float64((key>>33)+1)))
    }

    return int32(b)
}
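
Note that Jump Hash returns a bucket number, not a server, so in practice you hash the key down to a uint64 first and keep your own bucket-to-server mapping. A quick sketch (the FNV hash and the servers slice are placeholders I’m assuming, not part of the original):

package main

import (
    "fmt"
    "hash/fnv"
)

// servers is an assumed bucket -> server mapping; Jump Hash itself only
// knows how many buckets exist, not what they are called.
var servers = []string{"cacheA", "cacheB", "cacheC"}

func serverFor(key string) string {
    h := fnv.New64a()
    h.Write([]byte(key))
    bucket := Hash(h.Sum64(), len(servers)) // Hash is the Jump Hash function above
    return servers[bucket]
}

func main() {
    fmt.Println(serverFor("user:1234"))
}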

“Are we done?” OR “Why Is This Still a Research Topic?” (2)

Ring hashing provides arbitrary bucket addition and removal at the cost of high memory usage to reduce load variance. Jump Hash provides effectively perfect load splitting at the cost of reduced flexibility when changing the shard count: buckets can only be added or removed at the end of the range, so you can’t take an arbitrary node out of rotation.

Multi-Probe Consistent Hashing

Another paper from Google “Multi-Probe Consistent Hashing” (2015) attempts to address this. MPCH provides O(n) space (one entry per node), and O(1) addition and removal of nodes. The catch? Lookups get slower.
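
The rough idea is to hash each node onto the ring just once, and to make up for the lost virtual nodes by probing the key several times at lookup: hash the key with k different seeds and take the node closest to any of the probes (the paper reports that around 21 probes brings the peak-to-average load down to roughly 1.05). Here’s a sketch of that idea, not the paper’s exact construction; the seeded FNV hashing and the “closest clockwise node” rule are simplifications of mine:

package main

import (
    "fmt"
    "hash/fnv"
    "sort"
    "strconv"
)

type MultiProbe struct {
    hashes []uint64          // one ring point per node, sorted
    names  map[uint64]string // ring point -> node name
    k      int               // probes per lookup
}

func hash64(s string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(s))
    return h.Sum64()
}

func NewMultiProbe(k int, nodes ...string) *MultiProbe {
    m := &MultiProbe{k: k, names: make(map[uint64]string)}
    for _, n := range nodes {
        h := hash64(n)
        m.hashes = append(m.hashes, h)
        m.names[h] = n
    }
    sort.Slice(m.hashes, func(i, j int) bool { return m.hashes[i] < m.hashes[j] })
    return m
}

func (m *MultiProbe) Lookup(key string) string {
    if len(m.hashes) == 0 {
        return ""
    }
    var best uint64
    bestDist := ^uint64(0)
    for p := 0; p < m.k; p++ {
        // Probe the key with a different "seed" each time.
        probe := hash64(strconv.Itoa(p) + ":" + key)

        // Find the first node point at or after the probe, wrapping around.
        i := sort.Search(len(m.hashes), func(j int) bool { return m.hashes[j] >= probe })
        if i == len(m.hashes) {
            i = 0
        }

        // Clockwise distance from the probe to that node; unsigned
        // subtraction handles the wraparound case.
        if d := m.hashes[i] - probe; d < bestDist {
            bestDist = d
            best = m.hashes[i]
        }
    }
    return m.names[best]
}

func main() {
    m := NewMultiProbe(21, "cacheA", "cacheB", "cacheC")
    fmt.Println(m.Lookup("user:1234"))
}

Each lookup is k hashes and k binary searches instead of one, which is where the slower lookups come from.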

Rendezvous Hashing

Another early attempt at solving the consistent hashing problem is called rendezvous hashing or “highest random weight hashing”. It was also first published in 1997. The idea is to hash the node and the key together and use whichever node gives the highest hash value.

func (r *Rendezvous) Lookup(k string) string {
    // Score every node by mixing its hash with the key's hash; the node
    // with the highest score wins.
    khash := r.hash(k)

    var midx int
    var mhash = xorshiftMult64(khash ^ r.nhash[0])

    for i, nhash := range r.nhash[1:] {
        if h := xorshiftMult64(khash ^ nhash); h > mhash {
            midx = i + 1
            mhash = h
        }
    }

    return r.nstr[midx]
}
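
The Lookup above references a Rendezvous struct and an xorshiftMult64 mixer that aren’t shown. Here’s an assumed supporting sketch; the constructor and the strhash helper are mine, and xorshiftMult64 is filled in with the standard xorshift64* mixer, which is presumably what’s intended:

package main

import (
    "fmt"
    "hash/fnv"
)

// Assumed shape of the struct used by Lookup above.
type Rendezvous struct {
    nstr  []string              // node names
    nhash []uint64              // hash of each node name, same order as nstr
    hash  func(s string) uint64 // hash used for keys and node names
}

func NewRendezvous(hash func(string) uint64, nodes ...string) *Rendezvous {
    r := &Rendezvous{hash: hash}
    for _, n := range nodes {
        r.nstr = append(r.nstr, n)
        r.nhash = append(r.nhash, hash(n))
    }
    return r
}

// xorshiftMult64 is the standard xorshift64* mixer; it scrambles the
// combined key/node hash so the per-key ordering of nodes looks random.
func xorshiftMult64(x uint64) uint64 {
    x ^= x >> 12
    x ^= x << 25
    x ^= x >> 27
    return x * 2685821657736338717
}

func strhash(s string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(s))
    return h.Sum64()
}

func main() {
    r := NewRendezvous(strhash, "cacheA", "cacheB", "cacheC")
    fmt.Println(r.Lookup("user:1234"))
}

Lookups are O(n) in the number of nodes, which is fine for small clusters but is exactly the cost that table-based schemes like Maglev try to avoid.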

Maglev Hashing

In 2016, Google released “Maglev: A Fast and Reliable Software Network Load Balancer”. One section of the paper described a new consistent hashing algorithm which has come to be known as “maglev hashing”.
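
The scheme builds a fixed-size lookup table whose size is a prime comfortably larger than the number of backends: each backend generates a permutation of table slots from two hashes of its name, and the backends take turns claiming their next free preferred slot until the table is full. A lookup is then a single hash and a table index. Here’s a rough sketch of that population step based on the paper’s description; the FNV-based h1/h2, the tiny table size, and the backend names are assumptions for illustration:

package main

import (
    "fmt"
    "hash/fnv"
)

const M = 13 // table size; must be prime and larger than the number of backends

func h1(s string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(s))
    return h.Sum64()
}

func h2(s string) uint64 {
    h := fnv.New64()
    h.Write([]byte(s))
    return h.Sum64()
}

// populate builds the lookup table: each backend derives (offset, skip) from
// its name, which defines its preference order over slots, and backends take
// turns claiming their next unclaimed preferred slot until the table is full.
func populate(backends []string) []int {
    n := len(backends)
    offset := make([]uint64, n)
    skip := make([]uint64, n)
    next := make([]uint64, n)
    for i, b := range backends {
        offset[i] = h1(b) % M
        skip[i] = h2(b)%(M-1) + 1
    }

    entry := make([]int, M)
    for j := range entry {
        entry[j] = -1
    }

    filled := 0
    for {
        for i := 0; i < n; i++ {
            // Find backend i's next preferred slot that's still empty.
            c := (offset[i] + next[i]*skip[i]) % M
            for entry[c] >= 0 {
                next[i]++
                c = (offset[i] + next[i]*skip[i]) % M
            }
            entry[c] = i
            next[i]++
            filled++
            if filled == M {
                return entry
            }
        }
    }
}

func main() {
    backends := []string{"backendA", "backendB", "backendC"}
    table := populate(backends)

    // Lookup is a single hash and a table index.
    key := "user:1234"
    fmt.Println(backends[table[h1(key)%M]])
}

The tradeoff is that the table has to be rebuilt whenever the backend set changes, and disruption is only close to minimal, which is why the paper uses tables far larger than the number of backends.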

Replication

Replication is using the consistent hash to choose secondary (or more) nodes for a given key. This can be either to protect against node failure, or simply as a second node to query to reduce tail latency. Some strategies use full node replication (i.e., having two full copies of each server), while others replicate keys across the servers.
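
With the ring hash from earlier, one simple way to pick replicas is to keep walking clockwise past the key’s first owner and collect distinct servers. The GetN method below is a hypothetical addition of mine, not part of the original snippet:

// GetN returns up to n distinct servers for a key, starting at the key's
// position on the ring and skipping virtual nodes of servers already chosen.
func (m *Map) GetN(key string, n int) []string {
    if len(m.keys) == 0 || n <= 0 {
        return nil
    }

    hash := int(m.hash([]byte(key)))
    idx := sort.Search(len(m.keys), func(i int) bool { return m.keys[i] >= hash })

    seen := make(map[string]bool)
    var out []string
    for i := 0; len(out) < n && i < len(m.keys); i++ {
        node := m.hashMap[m.keys[(idx+i)%len(m.keys)]]
        if !seen[node] {
            seen[node] = true
            out = append(out, node)
        }
    }
    return out
}

The same trick works for rendezvous hashing: sort the nodes by score and take the top n.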

Weighted Hosts

Consistent hashing algorithms vary in how easy and effective it is to add servers with different weights, that is, to send more (or less) load to one server than to the rest. With a ring hash, you can scale the number of replicas by the desired load (see the sketch below), though this can increase memory usage quite considerably. Jump Hash and Multi-Probe consistent hashing are trickier to weight while maintaining their existing performance guarantees. You can always add a second “shadow” node that refers back to an original node, but this approach fails when the load multiple is not an integer. One approach would be to scale all node counts by some amount, but this increases both memory and lookup time.
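
For the ring hash, weighting is just more (or fewer) virtual nodes per server. AddWeighted below is a hypothetical variant of the Add method from earlier; note that memory grows with the total weight:

// AddWeighted gives each server a number of virtual nodes proportional to
// its weight, so a weight-3 server receives roughly three times the keys of
// a weight-1 server.
func (m *Map) AddWeighted(weights map[string]int) {
    for node, w := range weights {
        for i := 0; i < m.replicas*w; i++ {
            hash := int(m.hash([]byte(strconv.Itoa(i) + " " + node)))
            m.keys = append(m.keys, hash)
            m.hashMap[hash] = node
        }
    }
    sort.Ints(m.keys)
}

For example, m.AddWeighted(map[string]int{"cacheA": 1, "cacheB": 3}) would send roughly three times as many keys to cacheB as to cacheA.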

Load Balancing

Using consistent hashing for load balancing seems like an appealing idea. But depending on the algorithm, it can end up no better than random assignment, which leads to an unbalanced distribution.

Benchmarks

And now what you’ve all been waiting for. Hopefully you didn’t just skip down to the bottom of the article and ignore all the caveats and tradeoffs that each consistent hashing function has.

Benchmarks for a single lookup with different node counts; timings in nanoseconds

Conclusion

As you can see, there is no perfect consistent hashing algorithm. They all have trade-offs. There are many others I haven’t covered here, but like the ones above they’re all trying to balance distribution quality, memory usage, lookup time, and construction time (including the cost of node addition and removal).

Source Code

Here are all the repositories implementing the algorithms I discuss:
