DevOps quality metrics

As a DevOps engineer or manager, you already know the tremendous potential of DevOps and Agile methodologies. They streamline the software development process and make it more efficient, allowing your team to deliver high-quality software quickly. However, to truly unlock the power of DevOps, tracking the right metrics and continuously monitoring your team's and product's performance is crucial.

In this blog post, we discuss two sets of DevOps metrics, informed mainly by the book "Accelerate: The Science of Lean Software and DevOps" and the DORA program, with references to other reputable sources where necessary. These metrics will help you monitor various aspects of your DevOps process, including cost, code quality, and technical debt, enabling you to make data-driven decisions and improve your team's performance.

Accelerate and DORA-inspired Metrics

The book "Accelerate" and the DORA program inspired the first set of metrics. These metrics focus on the core aspects of the DevOps process, such as deployment frequency, lead time, and incident management.

Deployment Frequency: Monitoring deployment frequency allows your team to identify bottlenecks in the workflow and optimise the delivery process. Frequent deployments can result in faster feedback loops and reduced lead time.
Lead Time: Measuring lead time helps you evaluate the efficiency of your delivery process from code commit to production. Shorter lead times indicate faster response to user feedback and reduced time-to-market.
Mean Time to Recovery (MTTR): Tracking MTTR lets you measure your team's ability to quickly recover from incidents or failures. A shorter MTTR indicates better incident management and a more resilient system.
Change Failure Rate: Monitoring the change failure rate helps you assess the stability of your application. A low change failure rate suggests that frequent deployments are not compromising application stability.
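
As a rough illustration, the four metrics above could be computed from a list of deployment records along these lines. This is a minimal sketch, not a reference implementation: the `Deployment` record, its field names, and the `dora_metrics` helper are hypothetical, and it assumes that a failure starts at the moment of deployment, so recovery time is measured from `deployed_at`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    recovered_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four metrics over a reporting period of `period_days` days."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    # Assumption: the outage begins when the failing change is deployed.
    recoveries = [d.recovered_at - d.deployed_at
                  for d in failures if d.recovered_at is not None]

    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (sum(recoveries, timedelta()) / len(recoveries)
                                  if recoveries else None),
    }


# Illustrative data only: four deployments in one week, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=10), failed=True,
               recovered_at=t0 + timedelta(hours=11)),
    Deployment(t0, t0 + timedelta(hours=6)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
metrics = dora_metrics(sample, period_days=7)
```

The sketch reports the median lead time rather than the mean so that a single slow change does not dominate the figure. Whether you use the median or the mean is a choice, as long as you stay consistent between reporting periods.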

Additional DevOps KPIs

The second set of KPIs offers a broader perspective on your DevOps process, covering aspects like code quality, infrastructure efficiency, and user experience.

Code Churn: Tracking code churn helps identify planning, code quality, and team stability issues. High code churn may indicate a need for better code review processes or more effective communication within the team.
Code Coverage: Monitoring code coverage shows how much of your code is exercised by automated tests. High coverage reduces the chance that defects go unnoticed, although it does not by itself guarantee that the tests are meaningful.
Infrastructure Utilisation: Measuring infrastructure utilisation helps you optimise your infrastructure costs by identifying over-provisioned or under-utilised resources.
System Usability Scale (SUS): Tracking the SUS score provides insights into the perceived usability of your software, which can directly impact user satisfaction and adoption.
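
Of these KPIs, the SUS score is the one with the least obvious arithmetic, so here is a small sketch of the standard scoring: ten answers on a 1 to 5 scale, where odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5 to give a value between 0 and 100. The function names `sus_score` and `average_sus` are hypothetical, and the sample answers are made up for illustration.

```python
def sus_score(responses: list[int]) -> float:
    """Score one respondent's ten SUS answers (each 1 to 5) on the 0-100 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly ten answers, each between 1 and 5")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item_number % 2 == 1 else (5 - answer)
    return total * 2.5


def average_sus(all_responses: list[list[int]]) -> float:
    """Average the per-respondent scores across a survey."""
    if not all_responses:
        raise ValueError("need at least one respondent")
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Illustrative answers only: one enthusiastic and one neutral respondent.
enthusiastic = [5, 1, 5, 1, 5, 1, 5, 1, 5, 1]
neutral = [3] * 10
survey_average = average_sus([enthusiastic, neutral])
```

Each respondent is scored individually before averaging, which keeps a malformed questionnaire from silently skewing the team-level number.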

Core metrics

Metric	Category	Description	Estimation
Deployment Frequency	Deployment & Change	Measures how often changes are deployed to production.	\(\text{DF} = \frac{\text{Number of Deployments}}{\text{Time Period}}\)
Lead Time for Changes	Deployment & Change	Measures the time it takes for a change to go from code committed to code successfully running in production.	Time(production) - Time(commit)
Change Failure Rate	Deployment & Change	Refers to the extent to which releases lead to unexpected outages or other unplanned failures.	Failed changes/total changes
Mean Time to Recovery (MTTR)	Detection & Recovery	Measures the time it takes to address the problem and get back on track once a failed deployment or change is detected.	Sum(recovery times)/number of incidents
Defect Volume	Quality & Defects	Focuses on the actual volume of defects.	Total number of defects
Availability	Availability & Compliance	Highlights the extent of downtime for a given application, measured as complete (read/write) or partial (read-only) availability.	Uptime/(Uptime + Downtime)
Service-Level Agreement (SLA) Compliance	Availability & Compliance	Measures compliance with service-level agreements (SLAs) between providers and clients.	SLA goals met/total SLA goals
Unplanned Work Rate (UWR)	Work Management	Measures the share of total work time spent on unexpected efforts rather than planned work.	(Unplanned Work Time/Total Work Time)*100
Rework Rate (RWR)	Work Management	Measures the share of tasks that had to be reworked to address issues brought up in tickets.	Reworked tasks/total tasks
Customer Satisfaction	Customer Focus	Measures customers' satisfaction level with the software product or service.	Average customer satisfaction score
Employee Satisfaction/Engagement	Culture & People	Measures the level of satisfaction and engagement of the employees involved in the development, deployment, and maintenance processes.	Average employee satisfaction score
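
Several of the estimates in this table are simple ratios, and a short sketch can make the units explicit. The helper names below (`ratio`, `availability`, `sla_compliance`, `unplanned_work_rate`, `rework_rate`) and the sample figures are assumptions made for illustration, not part of any particular tool.

```python
def ratio(part: float, whole: float) -> float:
    """Return part / whole, rejecting an empty or negative denominator."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return part / whole


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Uptime / (Uptime + Downtime), as a fraction between 0 and 1."""
    return ratio(uptime_hours, uptime_hours + downtime_hours)


def sla_compliance(goals_met: int, total_goals: int) -> float:
    """SLA goals met / total SLA goals, as a fraction between 0 and 1."""
    return ratio(goals_met, total_goals)


def unplanned_work_rate(unplanned_hours: float, total_hours: float) -> float:
    """(Unplanned work time / total work time) * 100, as a percentage."""
    return ratio(unplanned_hours, total_hours) * 100


def rework_rate(reworked_tasks: int, total_tasks: int) -> float:
    """Reworked tasks / total tasks, as a fraction between 0 and 1."""
    return ratio(reworked_tasks, total_tasks)


# Illustrative numbers only: a 30-day month (720 hours) with 3 hours of downtime,
# and a 40-hour week in which 10 hours went to unplanned work.
monthly_availability = availability(uptime_hours=717, downtime_hours=3)
weekly_uwr = unplanned_work_rate(unplanned_hours=10, total_hours=40)
```

Note that some estimates are fractions and others are percentages, as in the table. Whichever convention you pick for a dashboard, label it so that 0.25 and 25 are not confused.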

KPIs that can be useful for a DevOps team to measure

The table below lists additional DevOps metrics and KPIs (Key Performance Indicators) that a team can measure, for example as inputs to its OKRs (Objectives and Key Results):

Metric	Category	Description	Estimation
Work in Progress (WIP)	Work Management	Measures the number of tasks or items currently progressing in development. Helps identify bottlenecks and optimise workflow.	Number of tasks in progress
Code Churn	Code Quality	Measures the amount of code that is added, modified, or deleted over a specific period. High code churn may indicate planning, code quality, or team stability issues.	(Lines added + Lines modified + Lines deleted)/time period
Code Coverage	Code Quality	Measures the percentage of code that is covered by automated tests. Higher coverage means more of the code is exercised by tests, and coverage reports help identify areas that need more testing.	(Lines covered by tests/total lines of code) * 100
Test Execution Time	Testing Efficiency	Measures the total time taken to execute a complete set of tests. It helps identify slow tests and areas where test optimisation is needed.	Total time for test execution
Automated Test Pass Rate	Testing Efficiency	Measures the percentage of automated tests that pass during a test run. A high pass rate indicates higher test reliability and quality.	(Number of tests passed/total tests) * 100
Mean Time Between Failures (MTBF)	Reliability	Measures the average time between system or application failures. A higher MTBF indicates greater system reliability.	Sum(time between failures)/number of failures
Incident Response Time	Incident Management	Measures the time it takes for the team to respond to an incident or issue reported. A shorter response time indicates better incident management.	Time(response) - Time(incident reported)
Incident Resolution Time	Incident Management	Measures the time it takes for the team to resolve an incident or issue after it has been reported. A shorter resolution time indicates better incident management.	Time(resolution) - Time(incident reported)
Deployment Rollback Rate	Deployment & Change	Measures the percentage of deployments that need to be rolled back due to issues or failures. A lower rollback rate indicates better deployment quality.	(Number of rollbacks/total deployments) * 100
System Usability Scale (SUS)	User Experience	Measures the perceived usability of the software or application. A higher SUS score indicates a better user experience.	Average SUS score
Infrastructure Utilisation	Infrastructure Efficiency	Measures the utilisation of infrastructure resources, such as CPU, memory, and storage. Helps identify over-provisioned or under-utilised resources and optimise infrastructure costs.	(Resource usage/total available resources) * 100
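
The time-based metrics in this table (incident response time, incident resolution time, and MTBF) can be derived from the same incident log. The sketch below is one way to do that under stated assumptions: the `Incident` record and its field names are hypothetical, and the time between failures is taken as the gap from one incident's resolution to the next incident's report, averaged over those gaps. Other definitions of MTBF (for example, total operating time divided by the number of failures) are equally common.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident log entry."""
    reported_at: datetime
    responded_at: datetime
    resolved_at: datetime


def mean_duration(durations: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    if not durations:
        raise ValueError("need at least one duration")
    return sum(durations, timedelta()) / len(durations)


def incident_summary(incidents: list[Incident]) -> dict:
    """Mean response time, mean resolution time, and MTBF for an incident log."""
    ordered = sorted(incidents, key=lambda i: i.reported_at)
    response = [i.responded_at - i.reported_at for i in ordered]
    resolution = [i.resolved_at - i.reported_at for i in ordered]
    # Assumption: time between failures runs from one resolution to the next report.
    gaps = [nxt.reported_at - prev.resolved_at
            for prev, nxt in zip(ordered, ordered[1:])]
    return {
        "mean_response_time": mean_duration(response),
        "mean_resolution_time": mean_duration(resolution),
        "mtbf": mean_duration(gaps) if gaps else None,  # needs two or more incidents
    }


# Illustrative log only: three incidents over ten days.
log = [
    Incident(datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 15),
             datetime(2024, 1, 1, 12, 0)),
    Incident(datetime(2024, 1, 4, 12, 0), datetime(2024, 1, 4, 12, 5),
             datetime(2024, 1, 4, 13, 0)),
    Incident(datetime(2024, 1, 10, 13, 0), datetime(2024, 1, 10, 13, 10),
             datetime(2024, 1, 10, 15, 0)),
]
summary = incident_summary(log)
```

Keeping all three metrics on one record type makes it harder for response and resolution times to drift apart because they were pulled from different sources.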

These additional metrics can provide valuable insights into the performance of a DevOps team and help identify areas for improvement. Keep in mind that the relevance of each metric may vary depending on your organisation's specific goals and context. It's important to focus on the metrics that are most relevant to your team and use them to drive continuous improvement in your DevOps processes.

Conclusion

Tracking these DevOps metrics will give you valuable insights into your team's performance, allowing you to make data-driven decisions and continuously improve your processes. By monitoring aspects like cost, code quality, and technical debt, you can ensure that your team delivers high-quality software rapidly, leading to increased customer satisfaction and a competitive edge in the market.

So, go ahead and start measuring the metrics that fit your environment today to supercharge your DevOps performance. Unlock your team's and product's full potential!