Multi-node Cassandra Cluster Made Easy with Kubernetes
I’ve just shared an easy way to launch a Cassandra cluster on Kubernetes.
The Kubernetes project has a Cassandra example that uses a custom seed provider for seed discovery. The example makes use of a Cassandra Docker image from gcr.io/google_containers.
However, I wanted a solution based on the official Cassandra Docker image. This is what I came up with.
First, I created a headless Kubernetes service that provides the IP addresses of Cassandra peers via DNS A records. The peer service definition looks like this:
apiVersion: v1
kind: Service
metadata:
  labels:
    name: cassandra-peers
  name: cassandra-peers
spec:
  clusterIP: None
  ports:
    - port: 7000
      name: intra-node-communication
    - port: 7001
      name: tls-intra-node-communication
  selector:
    name: cassandra
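Because the service is headless (clusterIP: None), a DNS lookup of its name returns one A record per matching pod instead of a single virtual IP. As a rough sketch of how the lookup name could be assembled for the PEER_DISCOVERY_DOMAIN variable used later: the helper name peer_domain, the "default" namespace, and the "cluster.local" cluster domain are all assumptions for illustration, so adjust them to your cluster.

```shell
# Hypothetical helper: build the DNS name of a headless service.
# The namespace "default" and cluster domain "cluster.local" are assumed
# defaults, not values taken from the original setup.
peer_domain() {
  local service="$1"
  local namespace="${2:-default}"
  local cluster_domain="${3:-cluster.local}"
  printf '%s.%s.svc.%s\n' "$service" "$namespace" "$cluster_domain"
}

PEER_DISCOVERY_DOMAIN=$(peer_domain cassandra-peers)
echo "$PEER_DISCOVERY_DOMAIN"

# From inside a pod, this should list one IP per matching Cassandra pod:
#   dig "$PEER_DISCOVERY_DOMAIN" +short
```

Within the same namespace the short name cassandra-peers usually resolves as well, but dig does not apply the resolver search list unless asked to (+search), so the fully qualified name is the safer choice here.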
Then I extended the official Cassandra image by adding dnsutils (for the dig command) and a custom entrypoint that configures the seed nodes for the container. The new entrypoint script is pretty straightforward:
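#!/bin/bash
# Look up peer IPs through the headless service, drop our own address,
# and use up to two of the remaining peers as seeds.
my_ip=$(hostname --ip-address)
CASSANDRA_SEEDS=$(dig "$PEER_DISCOVERY_DOMAIN" +short |
  grep -vFx "$my_ip" |
  sort |
  head -2 | xargs |
  sed -e 's/ /,/g')
export CASSANDRA_SEEDS
# Hand off to the official image's entrypoint; exec lets it receive signals.
exec /docker-entrypoint.sh "$@"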
Whenever a new Cassandra pod is created, it automatically discovers seed nodes through DNS.
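The seed-selection pipeline from the entrypoint can be tried without a cluster by feeding it a list of addresses in place of the dig output. The following is only a sketch: the function name build_seeds and the sample IPs are made up for illustration, and grep is called with -F -x so that the pod's own address is matched as a literal whole line.

```shell
# Hypothetical wrapper around the entrypoint's pipeline.
# Reads one peer IP per line on stdin, takes the pod's own IP as $1,
# and prints up to two other peers as a comma-separated list.
build_seeds() {
  local my_ip="$1"
  grep -vFx -- "$my_ip" |   # drop our own address
    sort |                  # stable order, so all pods pick the same seeds
    head -2 |               # keep at most two seeds
    xargs |                 # join lines with spaces
    sed -e 's/ /,/g'        # spaces to commas
}

# Sample addresses (invented); the second argument-less printf stands in for dig.
printf '10.244.1.5\n10.244.0.3\n10.244.2.7\n10.244.0.9\n' |
  build_seeds 10.244.1.5
# prints: 10.244.0.3,10.244.0.9
```

Sorting before taking the first two entries means every pod that sees the same DNS answer converges on the same seeds, apart from a pod that is itself one of them. For the very first pod the list is empty, so CASSANDRA_SEEDS ends up empty and the choice of seed is left to the official entrypoint.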