Automated Deployment Rollbacks

Harness Service Reliability Management (SRM) allows you to configure Service Level Objectives (SLOs) to automatically roll back deployments when predefined conditions are breached (for example, when the error budget is exhausted). This feature helps maintain service reliability by quickly reverting problematic deployments before they significantly impact your users.

With SLO-driven automated rollbacks, you can:

Automatically roll back deployments when predefined conditions are breached
Prevent service degradation by quickly reverting to a stable version
Maintain error budgets by minimizing the impact of failed deployments
Implement progressive delivery practices with safety mechanisms

This topic explains how to configure automated deployment rollbacks based on SLO violations in your Harness pipelines.

Prerequisites

Before configuring automated deployment rollbacks with SLOs, ensure you have:

Configured your CD service. Set up your CD service and provide the necessary deployment configuration.

Here is a sample NGINX Kubernetes Deployment:

NGINX Kubernetes Deployment

NGINX in Service definition

Set up your CD Environment and Infrastructure. Create the required environment and configure the underlying infrastructure for that environment.
Set up your Deployment Pipeline. Use the CD service and environment you created to configure your deployment pipeline.

Configure automated rollbacks with SLOs

To set up automated rollbacks with SLOs, follow these steps:

Step 1: Set up a monitored service

Create a monitored service using the service and environment pair you created in the prerequisites. For more information, see Create a monitored service.

Setup monitored service

Step 2: Configure health source

Define a health source for your monitored service. Here's an example of how to configure a Prometheus query:

Configure health source

Then assign SLIs to the health source by mapping its metrics to service level indicators.
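
To make the metric-to-SLI mapping more concrete, the following Python sketch shows how a ratio-based SLI could be derived from two Prometheus queries. It is only an illustration: the metric name http_requests_total, its status label, and the helper sli_percentage are assumptions and are not taken from Harness or from this setup.

```python
# Hypothetical PromQL expressions for a ratio-based SLI. The metric name
# http_requests_total and its "status" label are assumptions for illustration.
GOOD_REQUESTS_QUERY = 'sum(rate(http_requests_total{status!~"5.."}[5m]))'
TOTAL_REQUESTS_QUERY = 'sum(rate(http_requests_total[5m]))'


def sli_percentage(good: float, total: float) -> float:
    """Return the share of good events as a percentage between 0 and 100."""
    if total <= 0:
        # Design choice in this sketch: an interval without traffic counts as
        # fully good, which also avoids dividing by zero.
        return 100.0
    return 100.0 * good / total


if __name__ == "__main__":
    # Example values standing in for the results of the two queries above.
    print(sli_percentage(good=995, total=1000))
```

In this sketch, the "good" query counts every request that did not return a 5xx status, and the SLI is the ratio of good requests to all requests.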

Step 3: Set up SLO

Configure your SLO using the monitored service and the Prometheus metric you've set up.

Set up the rollback policy

As part of your error budget policy, you can add a rollback policy.
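
As background for the error budget policy, the arithmetic behind an error budget can be sketched in a few lines of Python. This is a simplified, request-based model written for illustration; the function name and the 25% threshold are hypothetical, and Harness may compute and evaluate the budget differently.

```python
def error_budget_remaining(slo_target_pct: float, total_events: int, bad_events: int) -> float:
    """Return the remaining error budget as a percentage of the total budget.

    A negative result means the budget has been overspent.
    """
    # Number of bad events the SLO tolerates over the evaluated period.
    allowed_bad = total_events * (100.0 - slo_target_pct) / 100.0
    if allowed_bad <= 0:
        # A 100% target (or no traffic) leaves no budget to spend.
        return 0.0 if bad_events > 0 else 100.0
    return 100.0 * (allowed_bad - bad_events) / allowed_bad


if __name__ == "__main__":
    remaining = error_budget_remaining(slo_target_pct=99.0, total_events=10_000, bad_events=25)
    # Hypothetical policy condition: act when less than 25% of the budget remains.
    print(remaining, remaining < 25.0)
```

With a 99% target over 10,000 requests, the budget is 100 failed requests, so 25 failures leave 75% of the budget.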

To set up the rollback policy:

Create a new notification rule.
Select Rollback Deployment as the Notification Method.
Specify the Rollback Window. This defines the valid time frame between a recent deployment and an SLO condition breach during which an automatic rollback will be triggered.
Provide the Environment and Infrastructure details.

note
Providing the correct Environment and Infrastructure is crucial for a successful rollback.
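
The role of the Rollback Window can be illustrated with a small Python sketch. It reflects one reading of the description above, namely that a rollback is triggered only when the SLO condition is breached within the window after a deployment. The function name and the inclusive boundaries are assumptions, not Harness's actual implementation.

```python
from datetime import datetime, timedelta


def should_roll_back(deployed_at: datetime, breached_at: datetime, rollback_window: timedelta) -> bool:
    """Return True if the breach occurred within the rollback window after the deployment."""
    elapsed = breached_at - deployed_at
    # A breach before the deployment, or long after it, is not attributed to it.
    return timedelta(0) <= elapsed <= rollback_window


if __name__ == "__main__":
    deployed = datetime(2024, 1, 1, 12, 0)
    window = timedelta(minutes=30)
    print(should_roll_back(deployed, deployed + timedelta(minutes=20), window))
    print(should_roll_back(deployed, deployed + timedelta(hours=2), window))
```

Under this reading, a short window limits rollbacks to breaches that closely follow a deployment, while a longer window also covers breaches that develop slowly.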

Verifying rollback

If everything is configured correctly and your SLO notification conditions are met, you will observe that the deployment has been automatically rolled back.

Verify Rollback

For more information about SLOs and verification, see Create and manage SLOs and Verification overview.
