Troubleshooting
Known issue resolution provides instructions to mitigate known errors. If these instructions don’t address your issue, Contact AWS Support provides instructions for opening an AWS Support case for this solution.
Known issue resolution
Issue: You are using an existing VPC and your tests fail with a status of Failed, resulting in the following error message:
Test might have failed to run.
Resolution:
Ensure that the subnets exist in the specified VPC and that they have a route to the internet through either an internet gateway or a NAT gateway. AWS Fargate needs internet access to pull the container image from the public repository before it can run tests.
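The route check described above can be sketched in a few lines. This is a minimal illustration, not part of the solution: the helper name `has_internet_route` is hypothetical, and it assumes the input is one route table in the shape returned by the EC2 `DescribeRouteTables` API (for example, through boto3).

```python
from typing import Any, Mapping


def has_internet_route(route_table: Mapping[str, Any]) -> bool:
    """Return True if the route table has a usable default IPv4 route
    that targets an internet gateway or a NAT gateway.

    Hypothetical helper: expects one entry of the "RouteTables" list
    from a DescribeRouteTables response.
    """
    for route in route_table.get("Routes", []):
        # Only the default IPv4 route gives general internet access.
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        # A blackhole route points at a target that no longer exists.
        if route.get("State") == "blackhole":
            continue
        via_igw = str(route.get("GatewayId", "")).startswith("igw-")
        via_nat = bool(route.get("NatGatewayId"))
        if via_igw or via_nat:
            return True
    return False
```

With boto3 installed and credentials configured, the route table for a subnet could be fetched with `ec2.describe_route_tables(Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}])`. A subnet with no explicit association uses the main route table of the VPC, so an empty result there does not by itself mean the route is missing.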
Issue: Tests are taking too long to run or are stuck running indefinitely
Resolution:
Cancel the test and check AWS Fargate to ensure that all tasks have stopped. If any tasks are still running, stop them manually. Check the on-demand Fargate task limits on your account to confirm that you can launch the desired number of tasks. For more insight into failures when launching Fargate tasks, check the CloudWatch logs for the task-runner Lambda function. For details about what is happening inside the running Fargate containers, check the CloudWatch ECS logs.
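Stopping leftover tasks manually can also be scripted. The following is a rough sketch under stated assumptions: the function name `stop_all_running_tasks` and the default reason string are hypothetical, and `ecs_client` is assumed to be a boto3 ECS client (or any object with the same `get_paginator` and `stop_task` methods). It stops every running task in the given cluster, so it should only be pointed at the cluster dedicated to the load tests.

```python
from typing import Any, List


def stop_all_running_tasks(
    ecs_client: Any,
    cluster: str,
    reason: str = "Load test cancelled",  # hypothetical default text
) -> List[str]:
    """Stop every task in `cluster` whose desired status is RUNNING.

    Returns the ARNs of the tasks for which a stop was requested.
    """
    stopped: List[str] = []
    # list_tasks is paginated, so walk every page of task ARNs.
    paginator = ecs_client.get_paginator("list_tasks")
    for page in paginator.paginate(cluster=cluster, desiredStatus="RUNNING"):
        for task_arn in page.get("taskArns", []):
            ecs_client.stop_task(cluster=cluster, task=task_arn, reason=reason)
            stopped.append(task_arn)
    return stopped
```

A typical call would be `stop_all_running_tasks(boto3.client("ecs"), cluster_name)`, where `cluster_name` is the ECS cluster created by your deployment of the solution.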
Issue: Tests are starting but failing to complete or the state of the ECS tasks is unknown
Resolution:
If you chose to provide an existing VPC in the account where the solution is deployed, ensure that the VPC used by the ECS tasks has enough free IP addresses to start the number of tasks specified in the test input. The ECS task definition also references the solution’s ECR image, so the subnets need an internet gateway or another route to the internet for the ECS service to download the image from aws-solutions/distributed-load-testing-on-aws-load-tester and provision the tasks.
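The free-address check can be approximated with a small calculation. This sketch assumes each Fargate task needs one private IP address (Fargate tasks run in `awsvpc` network mode with their own network interface) and that the input is the `Subnets` list from an EC2 `DescribeSubnets` response. The helper name `ip_address_shortfall` is hypothetical.

```python
from typing import Any, Iterable, Mapping


def ip_address_shortfall(
    subnets: Iterable[Mapping[str, Any]], task_count: int
) -> int:
    """Return how many more free IP addresses are needed to start
    `task_count` tasks, or 0 if the subnets already have enough.

    Assumes one IP address per task and reads the
    AvailableIpAddressCount field reported for each subnet.
    """
    available = sum(s.get("AvailableIpAddressCount", 0) for s in subnets)
    return max(0, task_count - available)
```

For example, `ip_address_shortfall(ec2.describe_subnets(SubnetIds=subnet_ids)["Subnets"], 200)` returns a positive number when the test asks for more tasks than the subnets can hold. Other resources in the same subnets also consume addresses, so treat a result of 0 as a necessary condition rather than a guarantee.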
Issue: Tests need to use an endpoint that is private or not reachable through the internet gateway
Resolution:
When testing private API endpoints that aren’t accessible through the internet gateway, consider the following approaches:
- Network Configuration: Ensure the subnet route tables used by the ECS tasks are updated with a route to the IP address range of the private endpoint being tested. This allows the test traffic to reach the private endpoint within your VPC.
- DNS Resolution: For custom domains, configure the DNS settings in your VPC to resolve the private endpoint’s domain name. Refer to VPC DNS documentation for detailed instructions.
- VPC Endpoints: If testing AWS services, consider using VPC endpoints (AWS PrivateLink) to establish private connectivity. For example, to test a private API Gateway, you can create a VPC endpoint for API Gateway. See Private API Gateway documentation.
- VPC Peering: If the private endpoint is in a different VPC, establish VPC peering between the VPC where the solution is deployed and the VPC containing the private endpoint. Configure appropriate route tables in both VPCs. See VPC Peering documentation.
- Transit Gateway: For more complex networking scenarios involving multiple VPCs, consider using AWS Transit Gateway to route traffic between the solution’s VPC and the VPC containing the private endpoint. See Transit Gateway documentation.
- Security Groups: Ensure that the security groups associated with your ECS tasks allow outbound traffic to the private endpoint, and the security groups of the private endpoint allow inbound traffic from the ECS tasks.
For testing internal Application Load Balancers or EC2 instances, ensure that the VPC CIDR ranges don’t overlap and that the necessary routes are configured in the route tables.
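Overlapping CIDR ranges can be detected before setting up peering or Transit Gateway routes. The sketch below uses only the Python standard library `ipaddress` module; the function name `find_overlapping_cidrs` and the sample ranges are illustrative assumptions, not values from the solution.

```python
import ipaddress
from itertools import combinations
from typing import Iterable, List, Tuple


def find_overlapping_cidrs(cidrs: Iterable[str]) -> List[Tuple[str, str]]:
    """Return every pair of CIDR blocks in `cidrs` that overlap.

    Raises ValueError if a block is not a valid network address
    (for example, if host bits are set).
    """
    networks = [ipaddress.ip_network(c) for c in cidrs]
    return [
        (str(a), str(b))
        for a, b in combinations(networks, 2)
        if a.overlaps(b)
    ]


if __name__ == "__main__":
    # Illustrative ranges: the solution VPC and two target VPCs.
    print(find_overlapping_cidrs(["10.0.0.0/16", "10.0.128.0/17", "172.31.0.0/16"]))
```

An empty list means none of the supplied ranges overlap, which is a precondition for routing traffic between the VPCs.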
Issue: Tests are completing but the results are not available in the UI
Resolution:
If the test has completed but the results are not available in the UI, the result files should still be available in the S3 bucket, where the ECS tasks that ran the test wrote them. This is a known limitation of the solution. In the current architecture, a result-parsing Lambda function summarizes the results from multiple ECS tasks and stores the summary as a single item in the DynamoDB table. DynamoDB limits each item to a maximum size of 400 KB, and the summary can exceed that limit depending on the complexity of the test script, the concurrency, and the number of tasks used. The error does not mean that the test failed; it indicates that the process that summarizes the results and stores them in the DynamoDB table for CRUD operations has failed. The results remain available in the S3 bucket for the test scenario.
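To locate the raw result files when the UI shows nothing, you can list the objects in the bucket directly. This is a minimal sketch: the function name `list_result_keys` is hypothetical, `s3_client` is assumed to be a boto3 S3 client, and the bucket name and key prefix for a test run are assumptions you need to look up in your own deployment.

```python
from typing import Any, List


def list_result_keys(s3_client: Any, bucket: str, prefix: str) -> List[str]:
    """Return the keys of all objects in `bucket` that start with `prefix`.

    `prefix` is a placeholder for wherever your deployment stores the
    results of a given test run.
    """
    keys: List[str] = []
    # list_objects_v2 returns at most 1,000 keys per call, so paginate.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

Each returned key can then be downloaded with `s3_client.download_file(bucket, key, local_path)` and inspected locally.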