AWSPremiumSupport-TroubleshootEKSCluster - AWS Systems Manager Automation runbook reference

Description

The AWSPremiumSupport-TroubleshootEKSCluster runbook diagnoses common issues with an Amazon Elastic Kubernetes Service (Amazon EKS) cluster and its underlying infrastructure, and provides recommended remediation steps.

Important

Access to AWSPremiumSupport-* runbooks requires either an Enterprise or Business Support Subscription. For more information, see Compare AWS Support Plans.

If you specify a value for the S3BucketName parameter, the automation evaluates the policy status of the Amazon Simple Storage Service (Amazon S3) bucket you specify. To help with the security of the logs gathered from your EC2 instance, the logs are not uploaded if the policy status isPublic is set to true, or if the access control list (ACL) grants READ or WRITE permissions to the All Users Amazon S3 predefined group. For more information about Amazon S3 predefined groups, see Amazon S3 predefined groups in the Amazon Simple Storage Service User Guide.
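The bucket check described above can be approximated as follows. This is a minimal sketch, not the runbook's actual code: the function name bucket_is_safe_for_upload is hypothetical, and the inputs are assumed to have the shape of the Amazon S3 GetBucketPolicyStatus and GetBucketAcl API responses.

```python
# URI that identifies the "All Users" Amazon S3 predefined group in ACL grants.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"

# The text above names only READ and WRITE, so this sketch checks only those.
BLOCKING_PERMISSIONS = {"READ", "WRITE"}


def bucket_is_safe_for_upload(policy_status, acl):
    """Return False if the bucket is public by policy or by ACL grant.

    policy_status: dict shaped like a GetBucketPolicyStatus response.
    acl:           dict shaped like a GetBucketAcl response.
    (Hypothetical helper; the runbook's own implementation is not published here.)
    """
    if policy_status.get("PolicyStatus", {}).get("IsPublic", False):
        return False
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if (grantee.get("URI") == ALL_USERS_URI
                and grant.get("Permission") in BLOCKING_PERMISSIONS):
            return False
    return True
```

With boto3, the two inputs would typically come from get_bucket_policy_status and get_bucket_acl on an S3 client, which is why the role needs s3:GetBucketPolicyStatus and s3:GetBucketAcl.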

Document type

Automation

Owner

Amazon

Platforms

Linux, macOS, Windows

Parameters

  • AutomationAssumeRole

    Type: String

    Description: (Optional) The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role that allows Systems Manager Automation to perform the actions on your behalf. If no role is specified, Systems Manager Automation uses the permissions of the user that starts this runbook.

  • ClusterName

    Type: String

    Description: (Required) The name of the Amazon EKS cluster that you want to troubleshoot.

  • S3BucketName

    Type: String

    Description: (Optional) The name of the private Amazon S3 bucket where the report generated by the runbook should be uploaded.
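As an illustration of how these parameters map onto an API call, the following sketch builds the request for the Systems Manager StartAutomationExecution operation. The helper names build_start_request and start_troubleshooting are hypothetical, and the second function assumes that the boto3 package is installed and that credentials are configured.

```python
DOCUMENT_NAME = "AWSPremiumSupport-TroubleshootEKSCluster"


def build_start_request(cluster_name, s3_bucket_name=None, assume_role_arn=None):
    """Build keyword arguments for StartAutomationExecution (hypothetical helper).

    Automation parameter values are passed as lists of strings.
    Optional parameters are omitted when no value is given.
    """
    parameters = {"ClusterName": [cluster_name]}
    if s3_bucket_name:
        parameters["S3BucketName"] = [s3_bucket_name]
    if assume_role_arn:
        parameters["AutomationAssumeRole"] = [assume_role_arn]
    return {"DocumentName": DOCUMENT_NAME, "Parameters": parameters}


def start_troubleshooting(cluster_name, s3_bucket_name=None, assume_role_arn=None):
    """Start the runbook and return the execution ID (requires boto3)."""
    import boto3  # third-party; imported lazily so the builder works without it

    ssm = boto3.client("ssm")
    response = ssm.start_automation_execution(
        **build_start_request(cluster_name, s3_bucket_name, assume_role_arn)
    )
    return response["AutomationExecutionId"]
```

The returned execution ID can then be passed to get_automation_execution to follow the progress of the steps.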

Required IAM permissions

The role specified in the AutomationAssumeRole parameter must allow the following actions to successfully use the runbook.

  • ssm:StartAutomationExecution

  • ssm:GetAutomationExecution

  • ec2:DescribeInstances

  • ec2:DescribeInstanceTypes

  • ec2:DescribeSubnets

  • ec2:DescribeSecurityGroups

  • ec2:DescribeRouteTables

  • ec2:DescribeNatGateways

  • ec2:DescribeVpcs

  • ec2:DescribeNetworkAcls

  • iam:GetInstanceProfile

  • iam:ListInstanceProfiles

  • iam:ListAttachedRolePolicies

  • eks:DescribeCluster

  • eks:ListNodegroups

  • eks:DescribeNodegroup

  • autoscaling:DescribeAutoScalingGroups

In addition, the AWS Identity and Access Management (IAM) policy attached to the IAM user or role that starts the automation must allow the ssm:GetParameter operation on the following public AWS Systems Manager parameters to get the latest recommended Amazon EKS Amazon Machine Image (AMI) for the worker nodes.

  • arn:aws:ssm:::parameter/aws/service/eks/optimized-ami/*/amazon-linux-2/recommended/image_id

  • arn:aws:ssm:::parameter/aws/service/ami-windows-latest/Windows_Server-2019-English-Core-EKS_Optimized-*/image_id

  • arn:aws:ssm:::parameter/aws/service/ami-windows-latest/Windows_Server-2019-English-Full-EKS_Optimized-*/image_id

  • arn:aws:ssm:::parameter/aws/service/ami-windows-latest/Windows_Server-1909-English-Core-EKS_Optimized-*/image_id

  • arn:aws:ssm:::parameter/aws/service/eks/optimized-ami/*/amazon-linux-2-gpu/recommended/image_id

To upload the report generated by the runbook to an Amazon S3 bucket, the following permissions are required for the bucket you specify.

  • s3:GetBucketPolicyStatus

  • s3:GetBucketAcl

  • s3:PutObject
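The permissions listed in this section could be assembled into IAM policy documents along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, the "Resource": "*" scope for the role actions is an assumption you might narrow, and attaching the Amazon S3 statements to the same role is also an assumption, because the text above does not name the principal that performs the upload.

```python
import json

ROLE_ACTIONS = [
    "ssm:StartAutomationExecution", "ssm:GetAutomationExecution",
    "ec2:DescribeInstances", "ec2:DescribeInstanceTypes", "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups", "ec2:DescribeRouteTables",
    "ec2:DescribeNatGateways", "ec2:DescribeVpcs", "ec2:DescribeNetworkAcls",
    "iam:GetInstanceProfile", "iam:ListInstanceProfiles",
    "iam:ListAttachedRolePolicies",
    "eks:DescribeCluster", "eks:ListNodegroups", "eks:DescribeNodegroup",
    "autoscaling:DescribeAutoScalingGroups",
]

# Parameter ARNs copied as they appear in the list above.
_PREFIX = "arn:aws:ssm:::parameter/aws/service/"
AMI_PARAMETER_ARNS = [
    _PREFIX + "eks/optimized-ami/*/amazon-linux-2/recommended/image_id",
    _PREFIX + "ami-windows-latest/Windows_Server-2019-English-Core-EKS_Optimized-*/image_id",
    _PREFIX + "ami-windows-latest/Windows_Server-2019-English-Full-EKS_Optimized-*/image_id",
    _PREFIX + "ami-windows-latest/Windows_Server-1909-English-Core-EKS_Optimized-*/image_id",
    _PREFIX + "eks/optimized-ami/*/amazon-linux-2-gpu/recommended/image_id",
]


def build_role_policy(bucket_name=None):
    """Policy for the AutomationAssumeRole role (hypothetical helper)."""
    # Assumption: "*" is used as the resource scope for simplicity.
    statements = [{"Effect": "Allow", "Action": ROLE_ACTIONS, "Resource": "*"}]
    if bucket_name:
        bucket_arn = "arn:aws:s3:::" + bucket_name
        # Bucket-level actions apply to the bucket ARN itself.
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:GetBucketPolicyStatus", "s3:GetBucketAcl"],
            "Resource": bucket_arn,
        })
        # PutObject applies to objects inside the bucket.
        statements.append({
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": bucket_arn + "/*",
        })
    return {"Version": "2012-10-17", "Statement": statements}


def build_starter_policy():
    """Policy for the user or role that starts the automation."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "ssm:GetParameter",
            "Resource": AMI_PARAMETER_ARNS,
        }],
    }


if __name__ == "__main__":
    print(json.dumps(build_role_policy("amzn-s3-demo-bucket"), indent=2))
```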

Document Steps

  • aws:executeAwsApi - Gathers details for the specified Amazon EKS cluster.

  • aws:executeScript - Gathers details of the Amazon Elastic Compute Cloud (Amazon EC2) instances, Auto Scaling groups, AMIs, and Amazon EC2 GPU graphic instance types.

  • aws:executeScript - Gathers details of the virtual private cloud (VPC), subnets, network address translation (NAT) gateways, subnet routes, security groups and network access control lists (ACLs) of the Amazon EKS cluster.

  • aws:executeScript - Gathers details of attached IAM instance profiles and role policies.

  • aws:executeScript - Gathers details of the Amazon S3 bucket you specify in the S3BucketName parameter.

  • aws:executeScript - Classifies the Amazon VPC subnets as public or private.

  • aws:executeScript - Checks the Amazon VPC subnets for tags that are required as part of an Amazon EKS cluster.

  • aws:executeScript - Checks the Amazon VPC subnets for the tags that are required for Elastic Load Balancing subnets.

  • aws:executeScript - Checks if the worker node Amazon EC2 instances use the latest Amazon EKS optimized AMIs.

  • aws:executeScript - Checks the Amazon VPC security groups attached to worker nodes for the tags that are required.

  • aws:executeScript - Checks the Amazon EKS cluster and worker node Amazon VPC security group rules for the recommended ingress rules to the Amazon EKS cluster.

  • aws:executeScript - Checks the Amazon EKS cluster and worker node Amazon VPC security group rules for the recommended egress rules from the Amazon EKS cluster.

  • aws:executeScript - Checks the network ACL configuration of the Amazon VPC subnets.

  • aws:executeScript - Checks if the worker node Amazon EC2 instances have the required managed policies.

  • aws:executeScript - Checks if the Auto Scaling groups have the necessary tags for cluster autoscaling.

  • aws:executeScript - Checks if the worker node Amazon EC2 instances are connected to the internet.

  • aws:executeScript - Generates a report based on the outputs from the previous steps. If a value is specified for the S3BucketName parameter, the generated report is uploaded to the Amazon S3 bucket.
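The scripts behind these steps are not reproduced in this reference. As a rough illustration of two of the checks (classifying subnets as public or private, and checking for Elastic Load Balancing subnet tags), the following sketch applies common conventions to data shaped like the Amazon EC2 DescribeRouteTables response. All function names are hypothetical. It assumes that a subnet is public when its route table has a route to an internet gateway, that all route tables belong to a single VPC, and that the expected tags are kubernetes.io/role/elb and kubernetes.io/role/internal-elb with the value 1. The runbook's own logic may differ.

```python
def has_internet_gateway_route(route_table):
    """True if any route in the table targets an internet gateway (igw-...)."""
    return any(
        (route.get("GatewayId") or "").startswith("igw-")
        for route in route_table.get("Routes", [])
    )


def classify_subnets(subnet_ids, route_tables):
    """Map each subnet ID to "public" or "private".

    Subnets without an explicit association fall back to the VPC's main
    route table. Assumes all route tables belong to one VPC.
    """
    explicit = {}
    main_table = None
    for table in route_tables:
        for association in table.get("Associations", []):
            if association.get("Main"):
                main_table = table
            elif "SubnetId" in association:
                explicit[association["SubnetId"]] = table
    result = {}
    for subnet_id in subnet_ids:
        table = explicit.get(subnet_id, main_table)
        is_public = table is not None and has_internet_gateway_route(table)
        result[subnet_id] = "public" if is_public else "private"
    return result


def missing_elb_tag(subnet_kind, tags):
    """Return the load balancer role tag the subnet lacks, or None.

    tags is a plain dict of tag key to tag value.
    """
    if subnet_kind == "public":
        required = "kubernetes.io/role/elb"
    else:
        required = "kubernetes.io/role/internal-elb"
    return None if tags.get(required) == "1" else required
```

For example, a subnet that is associated only with the main route table is reported as public when that table contains a route whose GatewayId starts with igw-.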