In this post, we discuss stateful applications in Kubernetes and the StatefulSet, a Kubernetes resource that controls and manages one or more Pods. You might be wondering why we would not simply use a Deployment. To answer that, it helps to first understand the difference between stateful and stateless architectures.
Difference between StatefulSet and Deployment in Kubernetes
To understand when to use a StatefulSet instead of a Deployment, we first need to grasp the basic differences between stateless and stateful applications.
A stateless application keeps no state of its own: it does not care which network identity or instance it runs on and does not require permanent storage, so any data it holds can be ephemeral and live inside its Pod. Examples include Tomcat, Apache, or Nginx. This is the most common type of workload in Kubernetes.
On the other hand, stateful applications keep state and care about the network identity or the instance they are running on. They usually correspond to databases and similar data systems. Some examples are PostgreSQL, MongoDB, Elasticsearch, and Kafka.
Let’s try to explain a Stateful application with an example of PostgreSQL running in high availability (HA). Imagine that we have a single master node (for writing) with one or more replicas (for reading). Since each node we deploy needs to be identified to determine the master and replica and to know their state, we need them to be deployed in a stateful manner. Therefore, we will use a StatefulSet for the deployment.
In general, we choose a StatefulSet over a Deployment when each Pod needs a stable, unique identity and its own persistent storage. A Deployment treats its Pods as interchangeable, whereas a StatefulSet keeps each Pod’s name and storage across restarts and rescheduling.
How does a StatefulSet work in Kubernetes?
When we decide to use the StatefulSet object type in Kubernetes, it is generally because the Pods we deploy need to be unique and identifiable. When we create this type of object, it automatically generates Pods with unique identities.
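As a small illustration of those identities: by default a StatefulSet names its Pods after itself plus an ordinal that starts at 0. The sketch below only reproduces that naming rule locally; the function name statefulset_pod_names is our own and it does not talk to a cluster.

```shell
# Print the Pod names a StatefulSet creates by default:
# <statefulset-name>-<ordinal>, with ordinals 0 .. replicas-1.
statefulset_pod_names() {
  name="$1"
  replicas="$2"
  ordinal=0
  while [ "$ordinal" -lt "$replicas" ]; do
    printf '%s-%s\n' "$name" "$ordinal"
    ordinal=$((ordinal + 1))
  done
}

# A StatefulSet called "cassandra" with 2 replicas.
statefulset_pod_names cassandra 2
```

For a StatefulSet named cassandra with two replicas, this prints cassandra-0 and cassandra-1, one per line.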
When working with Kubernetes, it is common to create a Service, which we can describe as an abstraction layer over our Pods. Clients make requests to the Service, and it forwards them to one of the Pods behind it. However, when working with a StatefulSet, which particular Pod receives the call becomes important. This implies the use of a Headless Service. This kind of Service has no cluster IP of its own; instead, it exposes multiple endpoints with DNS records, each pointing to a different Pod.
So, behind a Kubernetes StatefulSet object, the associated service layer will have a set of endpoints to directly target the required Pod.
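To make the per-Pod DNS records concrete, here is a minimal sketch of how such a name is composed: Pod name, Headless Service name, namespace, then the cluster domain. The helper pod_fqdn is our own illustration, and it assumes the default cluster domain cluster.local unless another one is passed.

```shell
# Build the DNS name of one StatefulSet Pod behind a Headless Service.
# $1 = Pod name, $2 = Headless Service name,
# $3 = namespace (defaults to "default"),
# $4 = cluster domain (defaults to "cluster.local").
pod_fqdn() {
  printf '%s.%s.%s.svc.%s\n' "$1" "$2" "${3:-default}" "${4:-cluster.local}"
}

pod_fqdn cassandra-0 cassandra default
```

The call above prints cassandra-0.cassandra.default.svc.cluster.local, which is the same address used for CASSANDRA_SEEDS in the StatefulSet definition later in this post.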
How to deploy a Stateful application in Kubernetes using a StatefulSet object?
Next, we are going to define a StatefulSet object. To do this, we will use an example with Cassandra images.
Deploying Cassandra in HA on Kubernetes
The first thing we are going to do is define a Headless Service, which, as we mentioned before, will not have an associated cluster IP:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
Once the file is created, we apply it as follows:
kubectl apply -f cassandra-service.yml
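To confirm that the Service we just applied is really headless, we can read back its spec.clusterIP field, which is the literal string None for a Headless Service. The following is a minimal sketch: the helper name is_headless_service is our own, not part of kubectl, and it assumes kubectl is pointed at the cluster and namespace where the Service was created.

```shell
# Succeeds only if the named Service is headless,
# that is, its spec.clusterIP is the string "None".
is_headless_service() {
  cluster_ip=$(kubectl get service "$1" -o jsonpath='{.spec.clusterIP}')
  [ "$cluster_ip" = "None" ]
}
```

For example, running is_headless_service cassandra && echo "cassandra is headless" prints the message only when the check passes.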
Next, we define the StatefulSet object. Since replicas is set to 2, it will create two Cassandra Pods.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 2
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_SEEDS
          value: "cassandra-0.cassandra.default.svc.cluster.local"
        - name: CASSANDRA_CLUSTER_NAME
          value: "K8Demo"
        - name: CASSANDRA_DC
          value: "DC1-K8Demo"
        - name: CASSANDRA_RACK
          value: "Rack1-K8Demo"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast
      resources:
        requests:
          storage: 1Gi
Next, we are going to create the StorageClass named ‘fast’, which the volumeClaimTemplates section of the StatefulSet references through storageClassName. Until this StorageClass exists, the PersistentVolumeClaims of the Pods cannot be provisioned. A StorageClass specifies the type of storage we need, which is provided by our cloud or cluster. In our case, since we are using Minikube, the provisioner is ‘k8s.io/minikube-hostpath’, and for the ‘type’ parameter we set ‘pd-ssd’, although it can also be ‘standard’.
If you want to check which StorageClasses your cluster has, you can run ‘kubectl get sc’.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: k8s.io/minikube-hostpath
parameters:
  type: pd-ssd
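We have not shown the commands for applying the StorageClass and the StatefulSet, so here is one possible sequence as a sketch. The helper deploy_cassandra is our own, and the two file names passed to it are hypothetical: use whatever names you saved the manifests under. The StorageClass goes first so that the volume claims can be provisioned as soon as the Pods are created.

```shell
# Apply the StorageClass, then the StatefulSet, then wait for the rollout.
# $1 = path to the StorageClass manifest (hypothetical file name)
# $2 = path to the StatefulSet manifest (hypothetical file name)
deploy_cassandra() {
  storage_class_file="$1"
  statefulset_file="$2"
  # StorageClass first, so the PersistentVolumeClaims can be provisioned.
  kubectl apply -f "$storage_class_file" || return 1
  kubectl apply -f "$statefulset_file" || return 1
  # Block until the StatefulSet reports all replicas as rolled out.
  kubectl rollout status statefulset/cassandra
}
```

A call could look like deploy_cassandra cassandra-storageclass.yml cassandra-statefulset.yml, where both file names are placeholders.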
Next, we will see how the Pods have been numerically assigned using the ‘describe’ command:
We can see how the pods have been assigned as cassandra-0 and cassandra-1.
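A sketch of commands that display this information is given below. The helper name inspect_cassandra is our own; it relies on the app: cassandra label and the StatefulSet name cassandra from the manifests above, and it assumes kubectl is using the namespace where they were applied.

```shell
# Show the Pods created by the StatefulSet and the StatefulSet's own details.
inspect_cassandra() {
  # One line per Pod carrying the app=cassandra label, in the form pod/<name>.
  kubectl get pods -l app=cassandra -o name
  # Replica counts, Pod template, volume claims and recent events.
  kubectl describe statefulset cassandra
}
```

Running inspect_cassandra after the rollout completes lists the Pods first and then the description of the StatefulSet.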
Once finished, we can perform tests with Cassandra using a Spring Boot application, as explained in a previous article.
Conclusion
In this blog post “Stateful Applications in Kubernetes – StatefulSet” we have seen how and why to create a StatefulSet application in Kubernetes.
Let’s summarize what we’ve learned in the article:
- A StatefulSet object is a type of resource within Kubernetes designed for managing Pods, specifically tailored for stateful applications.
- Stateful applications in Kubernetes require Pods with unique identities.
- To ensure the proper functioning of our stateful applications in Kubernetes, we need a Headless Service, which exposes multiple endpoints.
- A StatefulSet typically needs persistent storage, which it requests through volumeClaimTemplates backed by a Kubernetes StorageClass.