We often come across situations where an app deployment to production fails due to a breaking change. At that point, we have two options: either revert or fix it. Generally, we go for the revert. To revert, we have to revert the code, wait for CI to complete, and create the deployment again. This entire process usually takes a while, disrupting app functioning and potentially causing significant monetary loss.
To mitigate the risks associated with deployments, we need well-defined strategies. For example:
- We need to make sure the new version is available to users as early as possible.
- In case of failure, we should be able to roll back the application to the previous version in no time.
There are mainly two strategies for deploying apps into production with zero downtime:
- Blue/Green Deployment: This reduces downtime and risk by running two identical production environments, called Blue and Green. Instead of updating the current production (blue) environment with the new application version, a new production (green) environment is created. When it's time to release the new application version, traffic is routed from the blue environment to the green environment. If there are any problems, the deployment can easily be rolled back.
- Rolling Updates: This strategy rolls new code into the existing deployment incrementally. The deployment temporarily becomes a heterogeneous pool of the old version and the new version, with the end goal of slowly replacing all old instances with new ones.
Kubernetes accommodates both of the above-mentioned deployment strategies. We will look into Rolling Updates because they guarantee a safe rollout while keeping the ability to revert if necessary. Rolling updates also have first-class support in Kubernetes, which allows us to phase in a new version gradually.
Rolling Updates
In Kubernetes, a rolling update is the default strategy for updating the running version of our app. Kubernetes runs our app as a set of Pods scheduled across the cluster's nodes, and a rolling update incrementally cycles the previous Pods out and brings the newer Pods in.
This is how rolling updates work.
This is our Kubernetes deployment file, demo-app.yml, which specifies 3 replicas for demo-app, with the container image pointing to AWS ECR.
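A minimal sketch of what demo-app.yml might look like (the labels, container port, and the container name demo-app are assumptions; the registry URL matches the one used below):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app  # container name referenced by kubectl set image below
          image: 73570586743739.dkr.ecr.us-west-2.amazonaws.com/demo-app:v1.0
          ports:
            - containerPort: 80  # assumed app port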
Now, when we run:
kubectl create -f demo-app.yml
This creates the deployment with 3 Pods, and we can see their status as Running:
kubectl get pods
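The output is similar to the following (Pod name suffixes and ages will vary):

NAME                        READY   STATUS    RESTARTS   AGE
demo-app-5c689d88bb-8v2xk   1/1     Running   0          30s
demo-app-5c689d88bb-kq4tp   1/1     Running   0          30s
demo-app-5c689d88bb-zl9wn   1/1     Running   0          30s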
Now, if we need to update the deployment or push a new version out, assuming CI has already pushed the new image to ECR, we can simply point the Deployment at the new image URL.
kubectl set image deployment.apps/demo-app demo-app=73570586743739.dkr.ecr.us-west-2.amazonaws.com/demo-app:v2.0
The output is similar to:
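deployment.apps/demo-app image updated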
We can also add a description to the update, so we know what has changed:
kubectl annotate deployment.apps/demo-app kubernetes.io/change-cause="demo version changed from 1.0 to 2.0"
We can check the status of the rolling update at any time:
kubectl rollout status deployment.apps/demo-app
The output is similar to:
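Waiting for deployment "demo-app" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "demo-app" rollout to finish: 2 old replicas are pending termination...
deployment "demo-app" successfully rolled out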
We can see here that 2 out of 3 new Pods have been created while 2 old Pods are being decommissioned. Once all the Pods have been replaced, it shows a success message.
And finally, running get pods should now show only the new Pods:
kubectl get pods
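The output is similar to this (note the new ReplicaSet hash in the Pod names; suffixes and ages will vary):

NAME                        READY   STATUS    RESTARTS   AGE
demo-app-7d9f6c77c4-2hqmx   1/1     Running   0          65s
demo-app-7d9f6c77c4-jw5rd   1/1     Running   0          52s
demo-app-7d9f6c77c4-xn8bl   1/1     Running   0          40s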
A rolling update offers a way to gradually deploy the new version of our application across the cluster. It replaces the pods during several phases. For example, we may replace 25% of the pods during the first phase, then another 25% during the next, and so on, until all are upgraded. Since the pods are not replaced all at once, this means that both versions will be live, at least for a short time, during the rollout.
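The pace of the rollout is configurable per Deployment. As a sketch, the Kubernetes defaults (25% surge, 25% unavailable) can be stated explicitly in the Deployment spec:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # extra Pods allowed above the desired replica count
      maxUnavailable: 25%  # Pods allowed to be unavailable during the update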
So, we can achieve zero-downtime deployments using Kubernetes; we can deploy as many times as we want, and our users will not notice the difference.
However, even if we use rolling updates, there is still a risk that our application will not work the way we expect at the end of the deployment, and in such a case, we need a rollback.
Rolling back a deployment
Sometimes, due to a breaking change, we may want to roll back a Deployment. By default, Kubernetes maintains a Deployment's rollout history so that we can roll back anytime we want.
Suppose we push a breaking change to production and try deploying it:
kubectl set image deployment.apps/demo-app demo-app=73570586743739.dkr.ecr.us-west-2.amazonaws.com/demo-app:v3.0
kubectl annotate deployment.apps/demo-app kubernetes.io/change-cause="demo version changed from 2.0 to 3.0"
We can verify the rollout status:
kubectl rollout status deployment.apps/demo-app
The output is similar to this:
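Waiting for deployment "demo-app" rollout to finish: 1 out of 3 new replicas have been updated...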
The rollout never completes. Looking at the Pods created, we can see that the new Pod is stuck.
kubectl get pods
The output is similar to this (here we assume the broken v3.0 image leaves the new Pod in ImagePullBackOff; a crashing container would show CrashLoopBackOff instead):
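NAME                        READY   STATUS             RESTARTS   AGE
demo-app-7d9f6c77c4-2hqmx   1/1     Running            0          10m
demo-app-7d9f6c77c4-jw5rd   1/1     Running            0          10m
demo-app-7d9f6c77c4-xn8bl   1/1     Running            0          10m
demo-app-6b54f87d9d-pv6sq   0/1     ImagePullBackOff   0          2m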
As we can see, the new Pod shows 0/1 ready, while the old Pods keep serving traffic.
In this case, we need to roll back the deployment to a stable version. To roll back, we first check the rollout history:
kubectl rollout history deployment.apps/demo-app
The output is similar to this:
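deployment.apps/demo-app
REVISION  CHANGE-CAUSE
1         <none>
2         demo version changed from 1.0 to 2.0
3         demo version changed from 2.0 to 3.0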
It shows the revisions, along with the change cause we added after each update; in our case, revision 2 was the last stable one.
We can roll back to a specific version by specifying it with --to-revision:
kubectl rollout undo deployment.apps/demo-app --to-revision=2
The output is similar to this:
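deployment.apps/demo-app rolled back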
To check whether the rollback was successful and the Deployment is running as expected, run:
kubectl get deployment demo-app
The output is similar to this:
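NAME       READY   UP-TO-DATE   AVAILABLE   AGE
demo-app   3/3     3            3           30m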
We can check the status of the Pods as well:
kubectl get pods
The output is similar to this:
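NAME                        READY   STATUS    RESTARTS   AGE
demo-app-7d9f6c77c4-2hqmx   1/1     Running   0          22m
demo-app-7d9f6c77c4-jw5rd   1/1     Running   0          22m
demo-app-7d9f6c77c4-xn8bl   1/1     Running   0          22m

All three Pods are from the stable revision again, and the broken Pod is gone.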