Automate deletion of AWS resources by using aws-nuke - AWS Prescriptive Guidance

Automate deletion of AWS resources by using aws-nuke

Created by Sreenivas Ganesan (AWS)

Code repository: aws-nuke-account-cleanser-example

Environment: PoC or pilot

Technologies: Management & governance; Cloud-native; Cost management; DevOps; Serverless; Software development & testing

AWS services: AWS CloudFormation; AWS CodeBuild; Amazon SNS; AWS Step Functions; Amazon EventBridge

Summary

Warning: aws-nuke is an open-source tool that deletes nearly all resources in the target Amazon Web Services (AWS) account and AWS Regions. Make sure that you fully understand the impact the tool will have in the target environment before using it to delete resources. This solution is not intended for use in a production environment. We recommend implementing this solution only in sandbox or development environments. Perform a dry run to confirm that the solution doesn’t delete any resources that are still required. For more information, see the Caution section of the aws-nuke README (GitHub).

It’s quite common to accumulate unused resources in sandbox or development AWS accounts. Developers create and experiment with various services and resources as part of the normal development cycle, and then they don’t delete those resources when they’re no longer needed. Unused resources can incur unnecessary, and sometimes high, costs for the organization. Deleting these resources can reduce the costs of operating these environments.

This pattern provides an automated solution to periodically delete obsolete resources from development or sandbox accounts by using aws-nuke, AWS Step Functions, Amazon EventBridge, and AWS CodeBuild. In the target Regions, it restores the account to essentially a “Day 1” state, where it contains only the default resources and resources that AWS manages. First, you run this solution in dry-run (default) mode and confirm you want to delete all of the identified resources. Then, you turn off dry-run mode and run this solution to delete those resources.

You use an EventBridge rule to configure this automated solution to run on a scheduled basis. The EventBridge rule starts a Step Functions workflow. For scalability across Regions, the workflow invokes a separate CodeBuild project in each Region. The CodeBuild projects run in parallel and use aws-nuke to delete the resources in that Region. This solution is designed to reduce costs, provide scalability, reduce the time required to manage resources, and improve monitoring efficiency. To help you deploy this solution, you create and configure all of the required resources as a stack by using an AWS CloudFormation template that is included in the code repository for this pattern.

This solution provides the following features:

  • In EventBridge, you can customize your own schedule for running this automated solution. Typically, it is best to run this solution at off-peak hours, when most of the activities for the day are complete.

  • Orchestration through a Step Functions workflow provides scalability across all Regions in the account and reduces the overall time to delete the resources.

  • The Step Functions workflow waits for success in each Region. If a CodeBuild project has an error or doesn’t complete within the configured time period, the workflow retries that project. This helps make sure that the resources are deleted on schedule and without manual intervention.

  • Configuration of a separate CodeBuild project in each Region enables aws-nuke to run in parallel across Regions.

  • Attributes in the aws-nuke configuration file, such as the regions attribute, are dynamically updated by using a custom Python filtering class inside the CodeBuild project. This provides flexibility to handle the resource filters and Region constraints based on the override parameters you provide.

  • The access and authorization approach in this pattern automatically refreshes and provides up to 8 hours for the aws-nuke binary to assume the role from within the CodeBuild project and finish running. After 8 hours, the session times out. This is longer than the standard session limit of 1 hour for AWS Identity and Access Management (IAM) role chaining. If there are many resources to delete in the Region, this additional time can help prevent a timeout before the process is complete.

  • When the workflow is complete, it sends a detailed report to an Amazon Simple Notification Service (Amazon SNS) topic with an active email address subscribed. You receive a separate report for each AWS Region. The report includes a list of deleted resources and the completion state of the CodeBuild project. This report eliminates the need to traverse and parse the complex logs generated by aws-nuke. Also, at the end of the Step Functions state machine workflow, you receive a summarized email report with the completion status for each Region.

Prerequisites and limitations

Prerequisites

  • An active sandbox or development AWS account in which you want to delete all resources.

    Important: Do not deploy this solution in a production account. We recommend enabling the dry run option to verify the results before using this solution to delete resources.

  • Permissions to do the following in the AWS account:

    • Create the CloudFormation stack and the resources defined in the CloudFormation template.

    • Create and update IAM roles.

  • AWS Command Line Interface (AWS CLI), installed and configured. For instructions, see Installing the latest version of the AWS CLI in the AWS CLI documentation.

  • Experience with Python.

  • For the target account, an AWS account alias set up in the IAM console. For more information, see Caution in the aws-nuke GitHub repo. For instructions, see Create account alias in the IAM documentation.

  • An active email address where you want to receive the reports when the solution runs. You subscribe this email address to an Amazon SNS topic that you deploy through the CloudFormation template provided with this pattern.

Limitations

  • This pattern does not cover scenarios where there are dependency violation errors. aws-nuke retries deleting all resources until they are deleted or only resources with errors remain. For more information, see Usage in the aws-nuke GitHub repo.

  • This solution is designed for sandbox and development environments. Do not use this solution in production environments.

  • This solution is deployed in a single account and deletes resources in only that account. For information about extending this solution to delete resources in multiple accounts, see Automation and scale in the Architecture section of this pattern.

  • This solution doesn’t provide an automated deployment pipeline that is linked to a code repository. You can customize this solution to host the source code in AWS CodeCommit and create a deployment pipeline in AWS CodePipeline.

Product versions

  • aws-nuke version 2.21.2 or later. When you update the aws-nuke version, make sure that you review the release notes to confirm that the new version of aws-nuke doesn’t delete any new types of resources that you don’t want to remove from your account.

Architecture

Target technology stack

  • EventBridge rule

  • Step Functions workflow

  • Amazon Simple Notification Service (Amazon SNS) topic

  • CodeBuild project

  • Amazon Simple Storage Service (Amazon S3) bucket

  • IAM roles in the target accounts

Target architecture

Architecture of a Step Functions workflow that deletes AWS resources by using aws-nuke.

The diagram shows the following process:

  1. The EventBridge rule invokes the Step Functions workflow on the configured schedule.

  2. The Step Functions state machine ingests the parameters, and from the map state, Step Functions invokes the CodeBuild project.

  3. The CodeBuild project uses the passed parameters to pull the nuke_generic_config.yaml file from the S3 bucket. CodeBuild then uses the nuke_config_update.py script to replace the placeholder attributes in the config file with the values for the target Region.

  4. The CodeBuild project assumes the nuke-auto-account-cleanser IAM role and starts aws-nuke in each target Region.

  5. If aws-nuke is in dry run mode (the default), it scans and identifies the resources to be deleted in the target Region.

    If aws-nuke is not in dry run mode, it scans and deletes the resources in the target Region.

  6. The Step Functions state machine loops and polls the CodeBuild job until it receives a success or failure status. If the job fails, Step Functions retries a configured number of times.

  7. After the CodeBuild project completes in all Regions, the Step Functions workflow uses Amazon SNS to email a detailed summary report that includes information about the build status in each Region. You also receive a separate email for each Region, and it lists the identified or deleted resources in that Region.
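The poll-and-retry behavior described in step 6 can be sketched in Python as follows. This is an illustration only, not the Step Functions definition from the repository. The `start_build` and `get_status` callables are hypothetical stand-ins for the calls that start a CodeBuild job and read its status, and the default attempt count is an assumption.

```python
import time
from typing import Callable


def run_with_retries(
    start_build: Callable[[], str],
    get_status: Callable[[str], str],
    max_attempts: int = 2,
    poll_seconds: float = 0.0,
) -> str:
    """Start a build, poll until it finishes, and retry if it fails.

    Returns the final status, either "SUCCEEDED" or the last failure status.
    """
    status = "FAILED"
    for _attempt in range(max_attempts):
        build_id = start_build()
        status = get_status(build_id)
        # Keep polling while the build is still running.
        while status == "IN_PROGRESS":
            time.sleep(poll_seconds)
            status = get_status(build_id)
        if status == "SUCCEEDED":
            return status
    # Every attempt failed; report the last status seen.
    return status
```

Passing the two callables in as arguments keeps the loop independent of any AWS SDK, so the retry logic can be exercised locally with fake status sequences.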

Automation and scale

Currently, this pattern runs the aws-nuke binary in an automated and scalable fashion across multiple AWS Regions in a single account. By using the map state in Step Functions, aws-nuke runs in parallel in each Region. This concurrent solution provides enough time to handle a potentially large number of resources and to independently handle failures and retry workflows.

To modify this solution to delete resources across multiple accounts, you would use a hub-and-spoke topology. You would define and use a CloudFormation template to configure cross-account IAM roles in your target spoke accounts. You would also modify the nuke-cfn-stack.yaml CloudFormation template to update the Step Functions definition to accept a list of accounts to iterate in the map state. You would deploy the Step Functions workflow in the central hub account. aws-nuke would run from the CodeBuild project in the hub account and assume the cross-account IAM roles in the target spoke accounts in order to delete resources.
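As a rough sketch of the multi-account extension, the list that the map state iterates over could contain one item per account and Region pair. The following Python example shows one way to build such a list. The key names `account_id` and `region` are hypothetical and are not taken from the repository.

```python
from itertools import product


def build_map_items(accounts, regions):
    """Return one work item per (account, Region) pair for a map state to iterate."""
    return [
        {"account_id": account, "region": region}
        for account, region in product(accounts, regions)
    ]


# Placeholder account IDs and Regions, for illustration only.
items = build_map_items(
    ["111111111111", "222222222222"],
    ["us-east-1", "us-west-2"],
)
for item in items:
    print(item)
```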

Tools

AWS services

  • AWS CloudFormation helps you set up AWS resources, provision them quickly and consistently, and manage them throughout their lifecycle across AWS accounts and Regions.

  • AWS CodeBuild is a fully managed build service that helps you compile source code, run unit tests, and produce artifacts that are ready to deploy.

  • Amazon EventBridge is a serverless event bus service that helps you connect your applications with real-time data from a variety of sources.

  • Amazon Simple Notification Service (Amazon SNS) helps you coordinate and manage the exchange of messages between publishers and clients, including web servers and email addresses.

  • Amazon Simple Storage Service (Amazon S3) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.

  • AWS Step Functions is a serverless orchestration service that helps you combine AWS Lambda functions and other AWS services to build business-critical applications.

Other tools

  • aws-nuke is an open-source tool that helps you delete resources in target AWS accounts and Regions. It deletes all resources that are not defaults or managed by AWS.

  • Python is a general-purpose computer programming language.

Code repository

The code for this pattern is available in the GitHub AWS account cleanser framework using aws-nuke repository. It includes the following:

  • nuke_generic_config.yaml – This is the configuration file required by the aws-nuke binary in order to scan and delete the resources across the target Regions. This file contains some placeholders that are dynamically updated at runtime by using a custom Python filtering class inside the CodeBuild project.

  • nuke-cfn-stack.yaml – This CloudFormation template defines all of the sample resources that are necessary to operate this solution. When you deploy this as a CloudFormation stack, it creates the following resources in the target account:

    • An EventBridge rule

    • A Step Functions state machine

    • A sample CodeBuild project, in the target Region

    • An S3 bucket with a randomly generated name and bucket policy, in the target Region

    • An Amazon SNS topic with an active email address subscribed for receiving email notifications

    • IAM roles and policies to support the solution

  • nuke_config_update.py – Also called the Python Config Parser, this Python script parses and dynamically updates the nuke_generic_config.yaml file for each Region at runtime based on the input parameters you define in the Step Functions workflow. The script includes custom filtering logic that is based on universal tags, which adds an additional layer of protection for filtering and handling any global and IAM exclusion lists. This script also verifies stack names based on critical tags and other metadata to prevent those resources from deletion. You can customize this file depending on your requirements for each Region.
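To illustrate the idea of updating placeholders at runtime, the following minimal Python sketch substitutes the TARGET_REGION and ACCOUNT placeholders in a configuration template. It is not the repository's nuke_config_update.py script. The `render_config` function and the sample template are hypothetical, and the real script also applies the tag-based filtering logic described above.

```python
def render_config(template: str, account_id: str, region: str) -> str:
    """Replace the TARGET_REGION and ACCOUNT placeholders in a config template."""
    return template.replace("TARGET_REGION", region).replace("ACCOUNT", account_id)


# Hypothetical excerpt in the style of an aws-nuke configuration file.
SAMPLE_TEMPLATE = """\
regions:
  - TARGET_REGION
accounts:
  "ACCOUNT":
    filters: {}
"""

rendered = render_config(SAMPLE_TEMPLATE, "000000000000", "us-west-1")
print(rendered)
```

Plain string replacement keeps the sketch free of third-party dependencies. A script that needs to add or remove filters would more likely parse the YAML into a data structure first.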

Epics

Task | Description | Skills required

Clone the repository.

Clone the GitHub AWS account cleanser framework using aws-nuke repo by running the following command.

git clone https://github.com/aws-samples/aws-nuke-account-cleanser-example.git
DevOps engineer

Create the stack in the target account.

  1. Identify the target account and Region where you want to deploy this solution.

  2. In a text editor, open the nuke-cfn-stack.yaml CloudFormation template. In the EventBridgeNukeSchedule section, customize the schedule for running the AWS Step Functions workflow. For more information, see Cron expressions in the EventBridge documentation. Save and close the template.

  3. Create a CloudFormation stack by using the nuke-cfn-stack.yaml CloudFormation template. Enter the following command.

    aws cloudformation create-stack \
      --stack-name NukeCleanser \
      --template-body file://nuke-cfn-stack.yaml \
      --region <Region> \
      --capabilities CAPABILITY_NAMED_IAM
AWS DevOps, Cloud administrator, AWS administrator
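When you customize the schedule, it can help to generate the cron expression rather than type it by hand. The following Python sketch builds an EventBridge cron expression for a daily run at a given UTC time. The `daily_schedule` helper is hypothetical and is not part of the repository.

```python
def daily_schedule(hour_utc: int, minute: int = 0) -> str:
    """Build an EventBridge cron expression that fires once a day at the given UTC time."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Field order: minutes hours day-of-month month day-of-week year.
    # EventBridge requires "?" in either the day-of-month or day-of-week field.
    return f"cron({minute} {hour_utc} * * ? *)"


print(daily_schedule(7))  # cron(0 7 * * ? *)
```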

Modify the configuration file.

In the cloned repository, in the aws-nuke-account-cleanser-example folder, edit the nuke_generic_config.yaml file to customize it for your use case. For more information about how you can customize the aws-nuke configuration file, see Usage in the aws-nuke GitHub repo.

Important: Do not change the TARGET_REGION and ACCOUNT placeholder values. These are dynamically updated at runtime.

DevOps engineer, Cloud administrator

Prepare the S3 bucket.

  1. Upload the modified nuke_generic_config.yaml and nuke_config_update.py files to the S3 bucket that was created when you deployed the CloudFormation stack. Enter the following commands and replace the <Region> placeholder with your target Region.

    aws s3 cp \
      config/nuke_generic_config.yaml --region <Region> \
      s3://nuke-account-cleanser-config-{AWS::AccountId}-{AWS::Region}-{random-id-generated}

    aws s3 cp \
      config/nuke_config_update.py --region <Region> \
      s3://nuke-account-cleanser-config-{AWS::AccountId}-{AWS::Region}-{random-id-generated}
  2. Because CodeBuild downloads the aws-nuke binary from GitHub, make sure that you have sufficient network connectivity from the virtual private cloud (VPC) where you’re running this solution. If you’re running in a restricted environment or have insufficient bandwidth, upload the aws-nuke binary to the S3 bucket or source it from an internal repository.

Cloud administrator, DevOps engineer
Task | Description | Skills required

Manually start the Step Functions workflow.

This solution is configured to run automatically on the schedule that you configured in the EventBridge rule in the nuke-cfn-stack.yaml file. To run the solution manually, start an execution of the Step Functions state machine with the following input payload, and replace the <Region> placeholders with the target Regions where you want to run the solution.

{
  "InputPayLoad": {
    "nuke_dry_run": "true",
    "nuke_version": "2.21.2",
    "region_list": [
      "<Region A>",
      "<Region B>",
      "global"
    ]
  }
}

After you start the execution, the Step Functions workflow starts a separate CodeBuild project in each Region.
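One way to start the workflow from Python is sketched below. The `build_input` function reproduces the input payload shape shown in this task, and `start_workflow` passes it to the Step Functions `start_execution` API through boto3. Both function names are hypothetical, the state machine ARN is a value you must supply, and the sketch assumes that boto3 is installed and that credentials are configured.

```python
import json


def build_input(regions, dry_run=True, nuke_version="2.21.2"):
    """Build the JSON input payload for the state machine."""
    return json.dumps(
        {
            "InputPayLoad": {
                "nuke_dry_run": "true" if dry_run else "false",
                "nuke_version": nuke_version,
                "region_list": list(regions),
            }
        }
    )


def start_workflow(state_machine_arn, regions, dry_run=True):
    """Start one execution of the state machine and return its execution ARN."""
    import boto3  # imported here so that build_input works without boto3 installed

    client = boto3.client("stepfunctions")
    response = client.start_execution(
        stateMachineArn=state_machine_arn,
        input=build_input(regions, dry_run),
    )
    return response["executionArn"]


# Dry run is the default, which matches the default mode of this solution.
print(build_input(["us-east-1", "global"]))
```

Keeping dry run as the default argument means that a deletion run requires an explicit `dry_run=False`.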

AWS DevOps, DevOps engineer

(Optional) Monitor the progress.

You can monitor the progress by querying the log events in Amazon CloudWatch Logs. For sample queries, see the Additional information section of this pattern.

AWS DevOps, AWS systems administrator, DevOps engineer

Verify the results.

  1. Allow aws-nuke to finish scanning the target Regions. When the CodeBuild project completes in a Region, the workflow emails a detailed report of the results for that Region through the SNS topic you configured. You receive a separate report for each Region. When all Regions are complete and the Step Functions workflow is successful, the workflow sends another email that summarizes the completion status of each CodeBuild job. For a sample of this report, see the Additional information section.

  2. Review the results in the report.

AWS DevOps, DevOps engineer
Task | Description | Skills required

Exclude any resources you don’t want to delete.

  1. Review the output results of the test, where you ran the solution in dry-run mode.

  2. If you identify any resources that you want to keep, modify the nuke_generic_config.yaml file to exclude these resources.

    Note: This file is already configured to exclude the resources that are deployed by this solution.

  3. Upload the modified configuration file to the S3 bucket by entering the following command.

    aws s3 cp \
      config/nuke_generic_config.yaml --region <Region> \
      s3://nuke-account-cleanser-config-{AWS::AccountId}-{AWS::Region}-{random-id-generated}
AWS systems administrator, DevOps engineer, AWS administrator

Change the run mode.

After you have confirmed that you are ready to delete the identified resources and you have excluded any resources you want to retain, you can run the solution in production mode. Production mode deletes all resources that are not defaults, managed by AWS, or excluded in the nuke_generic_config.yaml file. To change to production mode and disable dry run, you must change the AWSNukeDryRunFlag parameter to false. Modify the stack according to the instructions in Modifying a stack template in the CloudFormation documentation. This changes the input payload that is passed from the EventBridge rule to the Step Functions state machine target.

AWS administrator, AWS systems administrator, DevOps engineer

Manually start the Step Functions workflow.

To run the solution manually, start an execution of the Step Functions state machine with the following input payload, and replace the <Region> placeholders with the target Regions where you want to run the solution.

{
  "InputPayLoad": {
    "nuke_dry_run": "false",
    "nuke_version": "2.21.2",
    "region_list": [
      "<Region A>",
      "<Region B>"
    ]
  }
}
AWS DevOps, DevOps engineer

(Optional) Monitor the progress.

You can monitor the progress by querying the log events in Amazon CloudWatch Logs. For sample queries, see the Additional information section of this pattern.

AWS administrator, AWS systems administrator, DevOps engineer

Verify the results.

Wait for the workflow to complete. When you receive the reports, verify the results and confirm that the resources were successfully deleted. The solution will now run automatically on the schedule that you configured in the EventBridge rule.

DevOps engineer, AWS DevOps

Additional information

Monitoring queries

You can monitor the progress of aws-nuke by querying the log events in Amazon CloudWatch Logs.

The following is a sample query for the AWS CLI. For more information, see filter-log-events in the AWS CLI Command Reference.

aws logs filter-log-events \
  --log-group-name AccountNuker-nuke-auto-account-cleanser \
  --start-time <value> \
  --end-time <value> \
  --log-stream-names <value> \
  --filter-pattern removed \
  --no-interleaved \
  --output text \
  --limit 5
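The --start-time and --end-time options expect timestamps in milliseconds since the Unix epoch (January 1, 1970, UTC). The following Python sketch computes those values for a recent time window. The helper names are hypothetical, and the two-hour window is only an example.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def last_hours_window(hours: int, now: datetime):
    """Return (start, end) in epoch milliseconds covering the last `hours` hours."""
    return to_epoch_millis(now - timedelta(hours=hours)), to_epoch_millis(now)


start_ms, end_ms = last_hours_window(2, datetime.now(timezone.utc))
print(f"--start-time {start_ms} --end-time {end_ms}")
```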

The following is a sample query command for CloudWatch Logs Insights. For more information, see Analyzing log data with CloudWatch Logs Insights in the CloudWatch documentation.

fields @timestamp, @message
| filter userIdentity.sessionContext.sessionIssuer.userName = "nuke-auto-account-cleanser" and ispresent(errorCode)
| sort @timestamp desc
| limit 500

fields @timestamp, @message
| filter ispresent(errorCode) and userIdentity.sessionContext.sessionIssuer.userName = "nuke-auto-account-cleanser" and errorCode != "AccessDenied" and eventName like "Delete"
| sort @timestamp desc
| limit 500

fields @timestamp, @message
| filter ispresent(errorCode) and userIdentity.sessionContext.sessionIssuer.userName = "nuke-auto-account-cleanser" and errorCode = "AccessDenied" and eventName like "Delete"
| sort @timestamp desc
| limit 500

Email reporting

The Step Functions map state retries the CodeBuild job once for each Region. If any errors or retries occur, you receive separate emails for each job. The email contents and outputs are configured within the CodeBuild project’s buildSpec section. It uses AWS CLI commands and basic Linux scripting to extract relevant information from the log file that the aws-nuke binary generates. You can customize the email reporting template as needed.

Sample output

The following is a sample of the notification and report sent when the Step Functions workflow completes successfully in production mode.

Account Cleansing Process Completed;
------------------------------------------------------------------
Summary of the process:
------------------------------------------------------------------
DryRunMode : false
Account ID : 000000000000
Target Region : us-west-1
Build State : JOB SUCCEEDED
Build ID : AccountNuker-NukeCleanser:a0761233-578e-4f23-8a2d-c123215a1bef
CodeBuild Project Name : AccountNuker-NukeCleanser
Process Start Time : Tue Mar 28 18:20:13 UTC 2023
Process End Time : Tue Mar 28 18:20:48 UTC 2023
Log Stream Path : AccountNuker-NukeCleanser/a0761233-578e-4f23-8a2d-c123215a1bef
------------------------------------------------------------------
################### Nuke Cleanser Logs ####################
Number of Resources that is filtered by config: 2
------------------------------------------
FAILED RESOURCES
-------------------------------
SUCCESSFULLY NUKED RESOURCES
-------------------------------
us-west-1 - S3Bucket - s3://samples3bucket-nuke - [CreationDate: "2023-03-27 21:24:59 +0000 UTC", Name: "samples3bucket-nuke"] - removed
us-west-1 - S3Bucket - s3://test-nuke-s3-bucket - [CreationDate: "2023-03-28 14:27:06 +0000 UTC", Name: "test-nuke-s3-bucket"] - removed
us-west-1 - SQSQueue - https://sqs.us-west-1.amazonaws.com/000000000000/sample-test-nuke-queue - removed
us-west-1 - S3Bucket - s3://samples3bucket-nuke - [CreationDate: "2023-03-27 21:24:59 +0000 UTC", Name: "samples3bucket-nuke"] - removed
us-west-1 - S3Bucket - s3://test-nuke-s3-bucket - [CreationDate: "2023-03-28 14:27:06 +0000 UTC", Name: "test-nuke-s3-bucket"] - removed
us-west-1 - SQSQueue - https://sqs.us-west-1.amazonaws.com/000000000000/sample-test-nuke-queue - removed

The following is a sample of the notification and report sent when the Step Functions workflow completes successfully in dry run mode.

Account Cleansing Process Completed;
------------------------------------------------------------------
Summary of the process:
------------------------------------------------------------------
DryRunMode : true
Account ID : 000000000000
Target Region : us-west-1
Build State : JOB SUCCEEDED
Build ID : AccountNuker-NukeCleanser:69e0d2de-5f48-46cf-98f3-2df22d11991e
CodeBuild Project Name : AccountNuker-NukeCleanser
Process Start Time : Mon Mar 27 19:42:49 UTC 2023
Process End Time : Mon Mar 27 19:43:30 UTC 2023
Log Stream Path : AccountNuker-NukeCleanser/69e0d2de-5f48-46cf-98f3-2df22d11991e
------------------------------------------------------------------
################### Nuke Cleanser Logs ####################
Number of Resources that is filtered by config: 1
------------------------------------------
RESOURCES THAT WOULD BE REMOVED:
-----------------------------------------
3
us-west-1 - SQSQueue - https://sqs.us-east-1.amazonaws.com/000000000000/test-nuke-queue - would remove
us-west-1 - SNSTopic - TopicARN: arn:aws:sns:us-east-1: 000000000000:test-nuke-topic - [TopicARN: "arn:aws:sns:us-east-1: 000000000000:test-topic"] - would remove
us-west-1 - S3Bucket - s3://test-nuke-bucket-us-west-1 - [CreationDate: "2023-01-25 11:13:14 +0000 UTC", Name: "test-nuke-bucket-us-west-1"] - would remove
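If you want to post-process these reports, resource lines in the format shown in the sample output can be summarized with a short script. The following Python sketch counts resources by type and action. It is an illustrative alternative to the shell scripting in the buildSpec, not the code that the solution uses, and the `LINE_PATTERN` expression and `summarize` function are hypothetical names that assume each resource line ends in "removed" or "would remove".

```python
import re
from collections import Counter

# Matches lines such as:
#   us-west-1 - S3Bucket - s3://example - [...] - removed
LINE_PATTERN = re.compile(
    r"^(?P<region>\S+) - (?P<type>\w+) - .* - (?P<action>removed|would remove)$"
)


def summarize(log_lines):
    """Count resources per (resource type, action) from aws-nuke style log lines."""
    counts = Counter()
    for line in log_lines:
        match = LINE_PATTERN.match(line.strip())
        if match:
            counts[(match["type"], match["action"])] += 1
    return counts


# Lines copied from the sample reports in this section.
sample = [
    'us-west-1 - S3Bucket - s3://samples3bucket-nuke - [CreationDate: "2023-03-27 21:24:59 +0000 UTC", Name: "samples3bucket-nuke"] - removed',
    "us-west-1 - SQSQueue - https://sqs.us-west-1.amazonaws.com/000000000000/sample-test-nuke-queue - removed",
    "us-west-1 - SQSQueue - https://sqs.us-east-1.amazonaws.com/000000000000/test-nuke-queue - would remove",
    "Summary of the process:",
]

for (resource_type, action), count in sorted(summarize(sample).items()):
    print(f"{resource_type}: {count} {action}")
```

Lines that do not match the pattern, such as the summary header, are ignored rather than treated as errors.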