Multi-node Cassandra Cluster Made Easy with Kubernetes

Cassandra cluster in terminal window

I’ve just shared an easy way to launch a Cassandra cluster on Kubernetes.

The Kubernetes project has a Cassandra example that uses a custom seed provider for seed discovery. The example makes use of a Cassandra Docker image from gcr.io/google_containers.

However, I wanted a solution based on the official Cassandra Docker image. This is what I came up with.

First, I created a headless Kubernetes service that provides the IP addresses of Cassandra peers via DNS A records. The peer service definition looks like this:

<br />apiVersion: v1
kind: Service
metadata:
  labels:
    name: cassandra-peers
  name: cassandra-peers
spec:
  clusterIP: None
  ports:
    - port: 7000
      name: intra-node-communication
    - port: 7001
      name: tls-intra-node-communication
  selector:
    name: cassandra

Then I extended the official Cassandra image with the addition of dnsutils (for the dig command) and a custom entrypoint that configures seed nodes for the container. The new entrypoint script is pretty straight forward:

my_ip=$(hostname --ip-address)

CASSANDRA_SEEDS=$(dig $PEER_DISCOVERY_DOMAIN +short | 
    grep -v $my_ip | 
    sort | 
    head -2 | xargs | 
    sed -e 's/ /,/g')

export CASSANDRA_SEEDS

/docker-entrypoint.sh "$@"

Whenever a new Cassandra pod is created, it automatically discovers seed nodes through DNS.

comments powered by Disqus