A horizontally scaling object store based on the CRUSH placement algorithm.

  • By null
  • Last update: Dec 19, 2022
  • Comments: 0


A horizontally scaling object store based on the CRUSH placement algorithm.

How it works

All clients and nodes in the storage cluster have a copy of the cluster configuration, when a request for an object arrives, we are able to locate the subset of servers it should be stored on without network communication - conceptually it is similar to a distributed hash table lookup.

Periodically (and after configuration changes), all storage nodes will 'scrub' their data directory looking for corrupt objects and ensuring the storage placement requirements are met, replicating data to other nodes if they are not.

If storage nodes disagree on the current cluster configuration, replication requests will be rejected and retried until the configurations are in agreement again. It is the responsibility of the administrator to keep the cluster configurations consistent in a timely way (This can be done with NFS/rsync+cron/git + cron/watching a consul key/...).

Use cases and limitations

CrushStore is designed to be an operationally simple object store that you use as part of a larger storage system. Generally you would store keys in some other database and use this to lookup items.

Like a hash table, it supports the following operations:

  • Save an object associated with a key.
  • Get an object associated with a given key.
  • Delete an object associated with a key.
  • Unordered listing of keys.

Unlike a hash table, CrushStore has an additional constraint:

Writes to a key are eventually consistent - upload conflicts are resolved by the create timestamp and reading from a different server you just wrote to may return a different value if the correct value has not been replicated.

Getting started


$ git clone https://github.com/andrewchambers/crushstore
$ cd crushstore
$ go build ./cmd/...
$ mkdir bin
$ cp $(find ./cmd -type f -executable) ./bin
$ ls ./bin

Running a simple cluster

Create a test config - crushstore-cluster.conf:

storage-schema: host
    - select host 2
    - 100 healthy
    - 100 healthy
    - 100 healthy

Create and run three instances of crushstore:

$ mkdir data0
$ ./crushstore -listen-address -data-dir ./data0
$ mkdir data1
$ ./crushstore -listen-address -data-dir ./data1
$ mkdir data2
$ ./crushstore -listen-address -data-dir ./data2

Upload objects:

$ echo hello |  curl -L -F [email protected] ''
echo hello | ./bin/crushstore-put testkey2 -

Download objects:

$ curl -L '' -o -
$ ./bin/crushstore-get testkey2

List objects:

$ ./bin/crushstore-list

Delete objects:

$ curl -L -X POST  ''
$ ./bin/crushstore-delete testkey2

Experiment with data rebalancing

Create some test keys:

for i in $(seq 10)
	echo data | ./bin/crushstore-put $(uuidgen) -

Change a 'healthy' line in the config to 'defunct' and change the relative weight of one of the storage nodes, then wait for crushstore to reload the config:

    - 50 healthy
    - 100 defunct
    - 100 healthy

The crushstore scrubber job will rebalance data to maintain the desired placement rules and weighting.