Consistent hashing KV store
An eventually consistent in memory KV store that uses a hash ring to distribute data accross nodes
Introduction
During one of my vacation days, I wanted to experiment with some concepts from the Designing Data Intensive Applications book. I decided to make a consistent hashing KV store. This project is 70% vibe coded. The more complicated logic caused the AI to fail.
Design
To start with I defined a high level architecture. It looks like this:
handler.clj
└── ConsistentHashingKVStore (routes keys via hash ring)
├── HashRing (sorted virtual-node ring, wrap-around lookup)
├── LocalKVStore (atom-backed in-process store)
└── NodeCommunicator (transport protocol)
├── InMemoryCommunicator (direct fn calls - tests)
└── HttpCommunicator (HTTP forwarding - multi-node)
The nodes communicate amongst themselves over the in memory or http communicator. This way I can test in memory, and iterate quickly.
There is no gossip protocol (yet), all the nodes need to be configured in the configuration of each other node. The replication factor and write quorum can be updated to adjust. Replication factor is the number of nodes that store the data. Write quorum is the number of writes required to track the write as a success.
For a set of nodes n, when replication factor and write quorum are n then the kv store will be strongly consistent. When replication factor is greater than write quorum the database is eventually consistent. This is how cassandra DB works under the hood. It allow for high throughput writes that can also be distributed.
Each node has an api like so:
(defn- kv-routes [store nodes]
(routes
(POST "/kv/get" {:keys [body]}
(if-let [k (get-key body)]
(let [{:keys [ok err]} (ls/kv-get store k)]
(if err
{:status 500 :body {:error err}}
(if (nil? ok)
{:status 404 :body {:error "not found"}}
{:status 200 :body {:value ok}})))
{:status 400 :body {:error "key required"}}))
(POST "/kv/set" {:keys [body]}
(let [k (get-key body)
v (:value body)]
(if (nil? k)
{:status 400 :body {:error "key required"}}
(do (ls/kv-set store k v)
{:status 200 :body {:ok true}}))))
(POST "/kv/delete" {:keys [body]}
(if-let [k (get-key body)]
(do (ls/kv-delete store k)
{:status 200 :body {:ok true}})
{:status 400 :body {:error "key required"}}))
(POST "/internal/kv/set" {:keys [body]}
(let [k (get-key body)
v (:value body)]
(if (nil? k)
{:status 400 :body {:error "key required"}}
(do (ls/kv-internal-set store k v)
{:status 200 :body {:ok true}}))))
(POST "/internal/kv/delete" {:keys [body]}
(if-let [k (get-key body)]
(do (ls/kv-internal-delete store k)
{:status 200 :body {:ok true}})
{:status 400 :body {:error "key required"}}))
The public get set delete endpoints are the main external endpoints. Set and delete are internal only endpoints.
Method Details
KV Get
KV get receives a single key argument and returns the value. First it checks if the current node has that key. If it doesn’t then it checks the hash ring to see who has that key, and sends the requests to those values. It returns the first result from the peers.
KV Set
- identifies all nodes that match
- send internal set on each node
- if it has the data it sets its own data
- Ensures that all quorum writes succeeded returns an error (this should rollback but it doesn’t right now)
- if one doesn’t succeed but quorum writes succeed its still a pass
KV Delete
- same as kv set but removes the value
Conclusion
This was an interesting experiment with consistent hashing. There’s a lot of detail that I ignored, for the sake of having a functional prototype.