Cloud FinOps - Part 2: Tag Allocation Strategy

In the first post of this four-part series, we explored the Cloud FinOps principles and the key milestones for getting started with FinOps:

  • Tag Allocation Strategy
  • Cost Report
  • Usage Report

In this post, we’ll dig deeper into a crucial step in the Inform phase of FinOps culture: creating a tag allocation strategy.

Motivation

Some time ago, when we had a spike in the billing account, tracing the resources back to a team/workload was a very tedious task. It required a lot of manual work because there was no consistent tag allocation in place. One of the first steps in the Inform phase of FinOps culture is to create a good tag allocation strategy. This means establishing some mandatory tags and the tools to prevent unexpected, untagged resources.

Strategy Basics

Based on Cloud FinOps theory, building a tag and/or hierarchy-based allocation strategy within an organisation requires three key pillars:

Communicate the plan

  • The goal is to create a strategy that’s best for the whole company, not just for one or two teams. Everyone is involved.
  • Team, Service, Business Unit/Cost Centre and Organisation are some patterns recommended for financial splits. These divisions are used to group costs and resources to answer questions at each layer: team vs team, service vs service, and so on. Consistency is mandatory for these core divisions. Less important or more team-specific tags can be recommended but not required.

Keep it simple

  • Start with three to five obvious areas whose costs you want to understand.

Formulate questions

  • Which business unit within the organisation should this cost be charged to?
  • Which cost centres are driving your costs up?
  • How much does it cost to operate a product that a certain team is responsible for?
  • Are you able to establish which costs are non-production and safe to turn off?
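Once tags are in place, questions like these become simple aggregations over billing data. Here is a minimal sketch, assuming billing line items are available as dictionaries with a cost and a set of tags. The tag keys (CostCentre, Environment), the values and the amounts are hypothetical and only there for illustration.

```python
from collections import defaultdict

# Hypothetical billing line items; tag keys, values and amounts are illustrative only.
line_items = [
    {"cost": 120.0, "tags": {"CostCentre": "cc-100", "Environment": "production"}},
    {"cost": 30.0, "tags": {"CostCentre": "cc-100", "Environment": "development"}},
    {"cost": 75.5, "tags": {"CostCentre": "cc-200", "Environment": "production"}},
    {"cost": 10.0, "tags": {}},  # untagged resource
]


def cost_by_tag(items, tag_key):
    """Sum the cost per value of tag_key; items without the tag go to 'untagged'."""
    totals = defaultdict(float)
    for item in items:
        totals[item["tags"].get(tag_key, "untagged")] += item["cost"]
    return dict(totals)


# Which cost centres are driving the costs up?
print(cost_by_tag(line_items, "CostCentre"))
# Which costs are non-production?
print(cost_by_tag(line_items, "Environment"))
```

The 'untagged' bucket is worth watching: the larger it is, the less the other numbers can be trusted.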

Some tags that companies with successful FinOps practices implement:

  • A cost centre/business unit tag that clearly defines where the costs of the resource should be allocated within the organisation.
  • A service/workload name tag that identifies which business service the resource belongs to.
  • A resource owner tag to help identify the individual/team responsible for the resource.
  • A name tag to help identify the resource using a friendlier identifier than the one given to you by your cloud services provider.
  • An environment tag to help determine cost differences among development, test, staging, production, etc.

Some optional tags:

  • A tier tag to identify the part of the service layer the resource belongs to (like frontend, backend, or web).
  • A data classification tag to help identify the type of data contained on the resource.
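To give an idea of how such a list can be checked automatically, here is a small sketch that reports which mandatory tags are missing from a resource. The tag keys below are assumptions derived from the list above, not the exact keys we use. The real set is whatever the organisation agrees on.

```python
# Hypothetical tag keys based on the list above; adjust to the agreed strategy.
MANDATORY_TAGS = {"CostCentre", "Service", "Owner", "Name", "Environment"}


def missing_mandatory_tags(tags):
    """Return the sorted mandatory tag keys that are absent or blank in tags."""
    return sorted(
        key for key in MANDATORY_TAGS if not str(tags.get(key, "")).strip()
    )


resource_tags = {"CostCentre": "cc-100", "Service": "checkout", "Owner": "", "Name": "api"}
print(missing_mandatory_tags(resource_tags))
```

A blank value is treated the same as a missing tag here, since an empty owner is no more useful than no owner at all.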

Mandatory tags for cloud resources

Upper camel case (initial uppercase letter) validation.

Optional tags for cloud resources

Upper camel case (initial uppercase letter) validation.
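The upper camel case rule can be expressed as a regular expression. The sketch below assumes the rule applies to tag keys and that every word starts with one uppercase letter followed by lowercase letters or digits. Under that assumption an all-caps key such as 'ID' is rejected, which may or may not match your own convention.

```python
import re

# One or more words, each an uppercase letter followed by lowercase letters or digits.
UPPER_CAMEL_CASE = re.compile(r"(?:[A-Z][a-z0-9]+)+")


def is_upper_camel_case(key):
    """Return True if key is written in upper camel case, e.g. 'CostCentre'."""
    return UPPER_CAMEL_CASE.fullmatch(key) is not None


for key in ("CostCentre", "Environment", "costCentre", "cost_centre"):
    print(key, is_upper_camel_case(key))
```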

Mandatory labels for Kubernetes resources

Kubernetes resources should be tagged properly too. In this case, Kubernetes labels will be used instead of tags.

Valid label values:

  • must be 63 characters or fewer (can be empty),
  • unless empty, must begin and end with a lowercase alphanumeric character ([a-z0-9]),
  • may contain dashes (-), underscores (_), dots (.), and alphanumerics in between.
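These rules translate directly into a short check. The following is a sketch of the rules as listed in this post, not the validation code Kubernetes itself runs. Note that Kubernetes also accepts uppercase letters in label values, so the lowercase restriction here is this post's stricter convention.

```python
import re

# Empty, or: a lowercase alphanumeric, optionally followed by any mix of
# lowercase alphanumerics, dots, underscores and dashes that ends in a
# lowercase alphanumeric.
LABEL_VALUE = re.compile(r"(?:[a-z0-9](?:[a-z0-9._-]*[a-z0-9])?)?")
MAX_LABEL_VALUE_LENGTH = 63


def is_valid_label_value(value):
    """Check a label value against the length and character rules above."""
    return (
        len(value) <= MAX_LABEL_VALUE_LENGTH
        and LABEL_VALUE.fullmatch(value) is not None
    )


for value in ("payments-api", "v1.2_beta", "", "-api", "Prod"):
    print(repr(value), is_valid_label_value(value))
```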

Tag Hygiene

By maintaining tag hygiene, we eliminate the risk of working with inaccurate data, which would throw off the accuracy of the decisions. For instance, if a team uses the tag value ‘prod’ and another uses ‘production’, these tags would be grouped differently.
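One way to avoid the 'prod' vs 'production' split is to map every accepted spelling to a single canonical value and reject everything else. The sketch below illustrates the idea. The alias table is hypothetical, and in practice a check like this would run in the pipeline before resources are created.

```python
# Hypothetical alias table: every accepted spelling maps to one canonical value.
ENVIRONMENT_ALIASES = {
    "prod": "production",
    "production": "production",
    "dev": "development",
    "development": "development",
    "stage": "staging",
    "staging": "staging",
    "test": "test",
}


def normalise_environment(value):
    """Map a raw environment tag value to its canonical form or raise ValueError."""
    key = value.strip().lower()
    if key not in ENVIRONMENT_ALIASES:
        raise ValueError(f"unknown environment tag value: {value!r}")
    return ENVIRONMENT_ALIASES[key]


print(normalise_environment("prod"))
print(normalise_environment(" Production "))
```

Raising an error on unknown values, rather than passing them through, keeps new spellings from silently creating another cost group.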

Tags should be defined in code to ensure consistency and avoid human error. Besides, tagging correctly can avoid the usual scenario of ‘This resource doesn’t belong to me’ (Cloud Custodian, which we implemented some months ago to maintain tag hygiene, can help here).

In the next diagram, we can review the flow to maintain a strong tag allocation strategy using a GitOps approach.

[Flowchart: Local Pipeline and Generic Pipeline]

To sum up

Providing a good tag allocation strategy is fundamental for allocating the usage of resources and billing. It also allows us to identify possible undesired bills and provide the feedback necessary to reallocate the bill to workloads/resources.

A tag allocation strategy by itself is not enough; some other actions are essential to enforce and maintain tag hygiene. With these actions in place, we have a good starting point to move on to the next topic, Cost Report.

Future

  • Mature tagging in next iterations
  • Enforce and maintain hygiene in Kubernetes workloads

As with cloud resources, containers without any identifiers make it hard to determine from the outside what is what. In the container world, the containers should be labelled. These labels form the identification that we can use to allocate costs. Namespaces for teams, or for specific services, allow us to create a coarse-grained cost allocation group.

Stay tuned! In the third post in this four-part series, the focus will be the Cost Report, from a cloud and a Kubernetes perspective.

P.S.: This post was also part of HashiTalks Spain 2021. You can watch the video here.

References