FinOps metrics to start
Most cloud providers give you detailed usage reports and charge you based on them. Most also offer cost forecasts and suggestions for reducing spend. However, you usually need to put in extra effort to build custom dashboards, reports, and insights, and to analyse your environment fully, before you understand the Total Cost of Ownership of your products and where and how you can optimise cost across your value streams and funding streams. That is rarely easy or straightforward. A simple example: terminating all the instances you keep for disaster recovery will definitely reduce cost, but you sacrifice your business continuity plan if a disaster hits your main ecosystem. Therefore, before you start a cost optimisation effort, you need to monitor and understand your cloud ecosystem. Below are some of the most common metrics to look at. There is no one-size-fits-all set, so pick the ones that suit you or create your own based on your needs.
Metric | Description | Formula/Calculation Method |
---|---|---|
Service Usage and Costs | The amount and cost of each cloud service used | From the cloud service provider's dashboard (AWS Cost Explorer, Google Cloud Platform Console, Azure portal) |
Unused Resources | Resources that are provisioned but not used | Unused Resources/Total Resources |
Idle Instances | Instances that are running but not performing useful work | Idle Instances/Total Instances |
Spend Reduction | The reduction in spending achieved through cost optimisation | (Original Spend - Current Spend)/Original Spend |
Right-Sizing Instances | Ensuring that instances are the correct size for the workload they are running | Current Instance Size/Recommended Instance Size |
Reserved Instance Utilization | The percentage of reserved instances that are used | Used Reserved Instances/Total Reserved Instances |
Gross Margin | The percentage of total sales revenue that the company retains after direct costs | (Total Revenue - Cost of Goods Sold)/Total Revenue |
Revenue Growth Rate | The rate at which a company's revenue increases | (Current Revenue - Previous Revenue)/Previous Revenue |
Load Time | The amount of time it takes for a web page or application to load | Measure via Performance Monitoring Tools |
Total Cost of Ownership (TCO) | A financial estimate that helps consumers and enterprise managers determine the direct and indirect costs of a product or system | Direct Costs + Indirect Costs + Opportunity Cost |
Return on Investment (ROI) | A measure of the profitability of an investment | (Net Profit/Cost of Investment) * 100% |
Cost per Acquisition (CPA) | The cost of acquiring a new customer | Total Marketing Spend/Number of New Customers Acquired |
Changes in Spending Habits | A measure of how a company's spending habits have changed over time | (Current Spending - Previous Spending)/Previous Spending |
Cost per Department/Project/Client | The amount of cloud costs attributed to a specific department, project, or client | Total Cloud Costs allocated to Department/Project/Client |
Deploy Frequency | How frequently new code is deployed | Number of Deploys/Time Period |
Lead Time for Changes | The amount of time it takes from when a change is proposed to when it is successfully deployed | Total Time for Changes/Number of Changes |
Mean Time to Recovery (MTTR) | The average time it takes to recover from a failure | Total Downtime/Number of Incidents |
Change Failure Rate | The percentage of changes that result in a failure | Failed Changes/Total Changes |
Note that some of these metrics (like CPA or ROI) aren't specific to cloud cost optimisation but are important financial metrics to understand when looking at the broader impact of cost optimisation efforts. Similarly, metrics like Load Time or MTTR aren't directly related to costs but can be improved by good cloud management and can indirectly impact costs.
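As a minimal sketch, a few of the ratio-style formulas from the table can be expressed as small Python functions. The function names and the sample figures are assumptions made for illustration; in practice the inputs would come from your provider's billing and utilisation reports.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def spend_reduction(original_spend: float, current_spend: float) -> float:
    """(Original Spend - Current Spend) / Original Spend."""
    return ratio(original_spend - current_spend, original_spend)


def reserved_instance_utilization(used: int, total: int) -> float:
    """Used Reserved Instances / Total Reserved Instances."""
    return ratio(used, total)


def idle_instance_ratio(idle: int, total: int) -> float:
    """Idle Instances / Total Instances."""
    return ratio(idle, total)


def roi_percent(net_profit: float, cost_of_investment: float) -> float:
    """(Net Profit / Cost of Investment) * 100."""
    return ratio(net_profit, cost_of_investment) * 100


# Hypothetical figures, for illustration only.
print(f"Spend reduction: {spend_reduction(12_000, 9_000):.1%}")
print(f"RI utilisation:  {reserved_instance_utilization(45, 60):.1%}")
print(f"Idle instances:  {idle_instance_ratio(8, 40):.1%}")
print(f"ROI:             {roi_percent(3_000, 12_000):.1f}%")
```

Guarding against a zero denominator keeps the helpers safe to run against accounts that have, for example, no reserved instances at all.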
Metric | Questions To Ask / Questions It Can Answer | Calculation Method |
---|---|---|
Cost per Feature | What will it cost to build this feature? How much incremental revenue will this feature bring in? Is it worth building now? How much are legacy decisions costing us? Which tech debt projects offer the most return? Which parts of the application disproportionately affect costs, and do they justify themselves via revenue? | Calculation depends on specific project scope, projected resources and tools needed, time for development, etc. |
Cost per Customer and Segment | How much does a new customer cost in terms of cloud storage, computing, etc.? How much is each new customer worth in terms of revenue? Are we making or losing money on certain types or sizes of customers? Is there a better way to price offerings to account for costs? Do segments like enterprise customers have higher or lower margins than other segments? | Total cloud costs attributable to a customer or customer segment/number of customers in that segment |
Cost per App (or Platform) | How much does it cost to run our app/platform daily, monthly, or yearly? How does that cost change with customer acquisition? Are we making money on the app? What are our margins? | Total cloud costs attributable to the app or platform/time period |
Cost per Team | How much is each team spending in the cloud? Are there opportunities for cost-sharing or economies of scale? | Total cloud costs attributable to the team |
Revenue | How much is the company making on the platform/app(s)? What is the projected annual revenue? | Sales figures, financial reports |
Cloud Cost | What is the annual cost of running our entire cloud? How is that expected to grow or shrink over time? | Total expenditure on cloud services over a given period |
Cost per Unit | What is our cost per unit? What is our revenue per unit? What are the margins? Are there opportunities to reduce cost per unit? Is our business model mapped well to our cost per unit? | Total cost associated with producing a unit/number of units produced |
Time to Market | How long does it take to get products/features to market? How much does that cost in terms of man-hours? Are there ways to decrease time to market and thus decrease the overall cost of delivery? | Total development time for a product or feature, including man-hours and resources used |
Cost per Cloud Service | How much are we spending on storage, computing, databases, and other cloud services? When cost spikes arise, which services are affected? | Total expenditure on a specific cloud service over a given period |
Cost of R&D | How much are we spending on R&D in terms of man-hours and technology costs? What is the potential value of outcomes (products, services, etc.)? What types of R&D efforts are worth our time? | Total expenditure on R&D, including personnel time and resources used |
Cost Deviations | When a cost spike occurs, what is the cause? Does increased revenue balance out the increased cost? If not, how can we put a stop to the spike? | Compare expenditure across different time periods, identify deviations and analyse corresponding circumstances |
Note that the exact method to calculate these metrics may vary based on your organisation's specific context and details. Some of these metrics may require significant analysis and tracking to calculate accurately.
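To make two of these calculations concrete, here is a hedged sketch of cost per customer by segment and of a simple period-over-period cost deviation check. The record layout, the customer names, the figures, and the 20% threshold are all hypothetical; the hard part in a real setup is attributing cloud costs to customers in the first place.

```python
from collections import defaultdict

# Hypothetical monthly figures: cloud cost already allocated per customer.
customers = [
    {"name": "acme", "segment": "enterprise", "cloud_cost": 4000.0, "revenue": 10000.0},
    {"name": "globex", "segment": "enterprise", "cloud_cost": 6000.0, "revenue": 15000.0},
    {"name": "initech", "segment": "smb", "cloud_cost": 500.0, "revenue": 800.0},
    {"name": "hooli", "segment": "smb", "cloud_cost": 700.0, "revenue": 700.0},
]


def segment_economics(rows):
    """Return cost per customer and margin for each customer segment."""
    totals = defaultdict(lambda: {"cost": 0.0, "revenue": 0.0, "customers": 0})
    for row in rows:
        bucket = totals[row["segment"]]
        bucket["cost"] += row["cloud_cost"]
        bucket["revenue"] += row["revenue"]
        bucket["customers"] += 1
    result = {}
    for segment, t in totals.items():
        margin = (t["revenue"] - t["cost"]) / t["revenue"] if t["revenue"] else 0.0
        result[segment] = {
            "cost_per_customer": t["cost"] / t["customers"],
            "margin": margin,
        }
    return result


def cost_deviations(previous, current, threshold=0.2):
    """Return services whose spend moved by more than `threshold` (as a fraction)."""
    flagged = {}
    for service, cost in current.items():
        baseline = previous.get(service, 0.0)
        if baseline == 0:
            continue  # new service: no baseline to compare against
        change = (cost - baseline) / baseline
        if abs(change) > threshold:
            flagged[service] = change
    return flagged


for segment, stats in segment_economics(customers).items():
    print(segment, stats)

last_month = {"compute": 1000.0, "storage": 200.0, "database": 500.0}
this_month = {"compute": 1500.0, "storage": 210.0, "database": 500.0}
print(cost_deviations(last_month, this_month))
```

A flagged deviation only tells you where to look; deciding whether the spike is justified still means comparing it with the revenue or usage change over the same period.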
Metric | Description | Formula/Calculation Method |
---|---|---|
Cost per Region | The amount of money that an organisation spends in each region. | Total cloud costs attributable to the region (for an average: total cloud costs/number of regions) |
Cost per Datacenter | The amount of money that an organisation spends in each data centre. | Total cloud costs attributable to the data centre (for an average: total cloud costs/number of data centres) |
Cost per Environment | The amount of money that an organisation spends in each environment, such as development, staging, production, etc. | Total cloud costs attributable to the environment (for an average: total cloud costs/number of environments) |
Cost per SLA | The amount of money that an organisation spends to meet a specific service level agreement (SLA). | Total cloud costs attributable to the SLA (for an average: total cloud costs/number of SLAs) |
Cost per Risk | The amount of money that an organisation spends to mitigate a specific risk. | Total cloud costs attributable to the risk (for an average: total cloud costs/number of risks) |
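The per-dimension figures are sums of the costs attributed to each group, while dividing the overall total by the number of groups gives an average. The sketch below shows both, assuming billing line items that already carry `region` and `environment` labels; the field names and the amounts are hypothetical and do not correspond to any provider's export format.

```python
from collections import defaultdict

# Hypothetical billing line items, already tagged by region and environment.
line_items = [
    {"region": "eu-west-1", "environment": "production", "cost": 1200.0},
    {"region": "eu-west-1", "environment": "staging", "cost": 300.0},
    {"region": "us-east-1", "environment": "production", "cost": 900.0},
    {"region": "us-east-1", "environment": "development", "cost": 100.0},
]


def cost_by(dimension, items):
    """Sum cost per value of `dimension`; items missing the tag go to 'untagged'."""
    totals = defaultdict(float)
    for item in items:
        totals[item.get(dimension, "untagged")] += item["cost"]
    return dict(totals)


def average_cost(totals):
    """Average cost across the groups in `totals` (total cost / number of groups)."""
    return sum(totals.values()) / len(totals) if totals else 0.0


per_region = cost_by("region", line_items)
per_environment = cost_by("environment", line_items)
print(per_region)
print(per_environment)
print(average_cost(per_region))
```

Collecting untagged items in their own bucket makes gaps in the tagging strategy visible instead of silently dropping that spend.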
In addition to the metrics listed above, organisations should also consider tracking the following metrics.
Metric | Description |
---|---|
Cost of Compliance | The amount of money an organisation spends to comply with regulatory requirements. That could be aligned |
Cost of Security | The amount of money an organisation spends to protect its cloud environment from security threats. |
Cost of Disaster Recovery | The amount of money an organisation spends to protect its data and applications from disasters. |
Below you can find more information about Total Cost of Ownership (TCO), because I think it is one of the metrics most of us need to monitor.
Total Cost of Ownership (TCO) is a financial estimate that helps consumers and enterprise managers determine direct and indirect costs of a product or system. It is a management accounting concept that can be used to assess the overall investment in a product or service over its lifespan.
In the context of cloud computing, TCO includes the cost of migration, implementation, operation, service fees, and the ongoing costs of managing and maintaining the system.
Here are the key components:
Direct Costs
**Hardware:** Servers, storage devices, network equipment, etc.
**Software:** The costs of acquiring software licenses, upgrades, and maintenance.
**IT Operations:** Costs associated with running the IT department, including salaries, training, and benefits.
**Facilities:** The cost of housing the servers and storage, including electricity, cooling, and rent (or depreciation if the facility is owned). This also applies to cloud providers, whose running costs differ per region.
Indirect Costs
**Downtime:** Includes both planned (system maintenance, upgrades, etc.) and unplanned (system failures, network outages, etc.) downtime.
**End-User Operations:** The costs incurred due to time spent by non-IT staff on IT-related issues.
**Scalability:** The cost of upgrading or scaling the system to accommodate growth.
**Security:** The cost of ensuring the data is secure, including software, hardware, and time spent.
**Compliance:** Costs associated with meeting legal or industry-specific regulations.
**Depreciation:** The reduction in the value of the hardware and software over time.
Opportunity Cost
These are costs associated with the next best alternative foregone. For example, if resources are spent on maintaining on-premise servers, they cannot be spent on a potentially more profitable activity like product development.
When calculating the TCO of cloud services, you typically compare it with the TCO of running the equivalent system on-premise or with a different cloud provider. The TCO of cloud services is generally more favourable because many direct and indirect costs are significantly lower, eliminated, or included in direct costs, so you no longer need to think about them separately. For example, hardware investment is typically unnecessary, less is needed for IT operations, facilities, and associated costs, and part of the operating system security cost is eliminated because it comes at no extra cost with the images the cloud provider supplies for spinning up your instances. However, each organisation has a unique cost structure and should calculate its own TCO, even when migrating to the cloud or switching between cloud providers.
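The comparison described above can be sketched with the formula TCO = Direct Costs + Indirect Costs + Opportunity Cost. The `TcoEstimate` class, the cost categories chosen, and every figure below are hypothetical placeholders, not benchmarks; the sketch only shows how the components add up so that two options can be compared on the same basis.

```python
from dataclasses import dataclass, field


@dataclass
class TcoEstimate:
    """Yearly TCO estimate for one hosting option (illustrative structure)."""
    name: str
    direct: dict[str, float] = field(default_factory=dict)
    indirect: dict[str, float] = field(default_factory=dict)
    opportunity_cost: float = 0.0

    def total(self) -> float:
        # TCO = Direct Costs + Indirect Costs + Opportunity Cost
        return (
            sum(self.direct.values())
            + sum(self.indirect.values())
            + self.opportunity_cost
        )


# Hypothetical yearly figures, for illustration only.
on_premise = TcoEstimate(
    name="on-premise",
    direct={"hardware": 50_000, "software": 20_000,
            "it_operations": 90_000, "facilities": 15_000},
    indirect={"downtime": 10_000, "security": 8_000, "compliance": 5_000},
    opportunity_cost=30_000,
)
cloud = TcoEstimate(
    name="cloud",
    direct={"service_fees": 110_000, "migration": 25_000, "it_operations": 40_000},
    indirect={"downtime": 5_000, "compliance": 5_000},
    opportunity_cost=5_000,
)

for option in (on_premise, cloud):
    print(f"{option.name}: {option.total():,.0f}")
print(f"difference: {on_premise.total() - cloud.total():,.0f}")
```

Keeping the categories as named entries rather than a single number makes it easier to see which costs moved, disappeared, or were absorbed into service fees when comparing options.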