Tags for operations and support - Best Practices for Tagging AWS Resources

Tags for operations and support

An AWS environment will have multiple accounts, resources, and workloads with differing operational requirements. Tags can provide context and guidance that help operations and support teams manage your services. Tags can also make operational governance of the managed resources more transparent.

Some of the main factors driving consistent definition of operational tags are:

  • To filter resources during automated infrastructure activities, for example when deploying, updating, or deleting resources. Another example is scaling resources for cost optimization and reducing usage outside business hours. See the AWS Instance Scheduler solution for a working example.

  • Identifying isolated or deprecated resources. Resources that have exceeded their defined lifespan or have been flagged for isolation by internal mechanisms should be tagged accordingly to assist support personnel in their investigation. Resources being deprecated should be tagged before isolation, archival, and deletion.

  • Support requirements for a group of resources. Resources often have different support requirements. For example, these requirements could be negotiated between teams or set as part of an application's criticality. Further guidance on operating models can be found in the Operational Excellence Pillar.

  • Enhance the incident management process. Tagging resources with information that adds transparency to the incident management process helps support teams, engineers, and Major Incident Management (MIM) teams manage events more effectively.

  • Backups. Tags can also be used to identify how frequently your resources need to be backed up, and where the backup copies need to go or where to restore the backups. See the prescriptive guidance on Backup and recovery approaches on AWS.

  • Patching. Patching mutable instances running in AWS is crucial both to your overarching patching strategy and to the patching of zero-day vulnerabilities. Deeper guidance on the wider patching strategy can be found in the prescriptive guidance. Patching of zero-day vulnerabilities is discussed in this blog.

  • Operational observability. Having an operational KPI strategy translated to resource tags helps operations teams track whether targets are being met in support of business requirements. Developing a KPI strategy is a separate topic, but it tends to focus either on a business operating in a steady state or on measuring the impact and outcomes of change. The KPI Dashboards (AWS Well-Architected labs) and the Operations KPI Workshop (an AWS Enterprise Support proactive service) both address measuring performance in a steady state. The AWS enterprise strategy blog article Measuring the Success of Your Transformation explores KPI measurement for a transformation program, such as IT modernization or migrating from on premises to AWS.
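
Two of the factors above, filtering resources for automated activities and identifying resources that have exceeded their lifespan, can be illustrated with a minimal sketch. It uses plain Python over an in-memory inventory and makes no AWS API calls. The tag keys Schedule and DecommissionDate, the tag values, and the resource IDs are hypothetical and stand in for whatever your own tagging schema defines. The only thing borrowed from AWS is the list-of-Key/Value shape in which services such as Amazon EC2 return tags.

```python
from datetime import date

# Hypothetical tag keys; substitute the keys from your own tagging schema.
SCHEDULE_TAG = "Schedule"
DECOMMISSION_TAG = "DecommissionDate"


def tags_to_dict(resource):
    """Flatten an AWS-style [{'Key': k, 'Value': v}, ...] tag list into a dict."""
    return {t["Key"]: t["Value"] for t in resource.get("Tags", [])}


def filter_by_tag(resources, key, value):
    """Return the resources whose tag `key` equals `value`."""
    return [r for r in resources if tags_to_dict(r).get(key) == value]


def past_lifespan(resources, today):
    """Return resources whose decommission-date tag (ISO 8601) is before `today`."""
    expired = []
    for r in resources:
        raw = tags_to_dict(r).get(DECOMMISSION_TAG)
        if raw is None:
            continue  # untagged resources are not treated as expired
        try:
            if date.fromisoformat(raw) < today:
                expired.append(r)
        except ValueError:
            continue  # skip malformed dates rather than guessing
    return expired


# Made-up inventory for illustration only.
inventory = [
    {"Id": "i-aaa", "Tags": [{"Key": "Schedule", "Value": "office-hours"}]},
    {"Id": "i-bbb", "Tags": [{"Key": "Schedule", "Value": "always-on"},
                              {"Key": "DecommissionDate", "Value": "2023-01-31"}]},
    {"Id": "i-ccc", "Tags": []},
]

# Candidates for an out-of-hours stop, and resources past their tagged lifespan.
to_stop = filter_by_tag(inventory, SCHEDULE_TAG, "office-hours")
expired = past_lifespan(inventory, today=date(2024, 1, 1))
print([r["Id"] for r in to_stop])   # ['i-aaa']
print([r["Id"] for r in expired])   # ['i-bbb']
```

In this sketch, untagged resources and malformed dates are skipped rather than acted on. That is one possible design choice: automation that stops or isolates resources is usually safer when it acts only on explicit, well-formed tags.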