Managing App Runner automatic scaling

AWS App Runner automatically scales compute resources (instances) up or down for your App Runner application. Automatic scaling provides adequate request handling when incoming traffic is high, and reduces your cost when traffic slows down. You can configure a few parameters to adjust auto scaling behavior for your service.

App Runner maintains auto scaling settings in a sharable resource called AutoScalingConfiguration. You can provide an auto scaling configuration resource when you create or update a service. The App Runner console creates one for you when you create a new App Runner service. Providing an auto scaling configuration is optional. If you don't provide one, App Runner provides a default auto scaling configuration with recommended values.

An auto scaling configuration has a name and a numeric revision. Multiple revisions of a configuration have the same name and different revision numbers. You can use different configuration names for different auto scaling scenarios, such as high availability or low cost. For each name, you can add multiple revisions to fine-tune the settings for a specific scenario.
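The name-plus-revision model can be illustrated with a short Python sketch. It assumes each configuration is represented as a dictionary with the keys AutoScalingConfigurationName and AutoScalingConfigurationRevision, which mirrors the shape of the summary objects in the App Runner API. The helper name latest_revisions and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Mapping


def latest_revisions(summaries: Iterable[Mapping]) -> Dict[str, int]:
    """Return the highest revision number seen for each configuration name."""
    latest: Dict[str, int] = {}
    for item in summaries:
        name = item["AutoScalingConfigurationName"]
        revision = item["AutoScalingConfigurationRevision"]
        # Revisions of one configuration share a name and differ only by number.
        if name not in latest or revision > latest[name]:
            latest[name] = revision
    return latest


# Hypothetical data: two scenarios, one of which has been fine-tuned once.
sample = [
    {"AutoScalingConfigurationName": "high-availability", "AutoScalingConfigurationRevision": 1},
    {"AutoScalingConfigurationName": "high-availability", "AutoScalingConfigurationRevision": 2},
    {"AutoScalingConfigurationName": "low-cost", "AutoScalingConfigurationRevision": 1},
]
print(latest_revisions(sample))
```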

You can share a single auto scaling configuration across multiple App Runner services to ensure they have the same auto scaling behavior. For more information, see Configuring service settings using sharable resources.

You can configure the following auto scaling settings:

  • Max concurrency – The maximum number of concurrent requests that an instance processes. When the number of concurrent requests exceeds this limit, App Runner scales up the service.

  • Max size – The maximum number of instances that your service scales up to. No more than this number of instances actively serve traffic for your service at any time.

  • Min size – The minimum number of instances that App Runner provisions for your service. The service always has at least this number of provisioned instances. Some of them actively serve traffic. The rest of them (provisioned and inactive instances) stand by as a cost-effective compute capacity reserve, which is ready to be quickly activated. You pay for the memory usage of all provisioned instances. You pay for the CPU usage of only the active subset.

    App Runner temporarily doubles the number of provisioned instances during deployments, to maintain the same capacity for both old and new code.
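The way these three settings interact can be sketched with a simplified model. This is an illustration under stated assumptions, not App Runner's actual scaling algorithm: it assumes requests spread evenly across instances and that scaling is instantaneous. The names ScalingSettings, Estimate, and estimate_instances are hypothetical.

```python
from typing import NamedTuple


class ScalingSettings(NamedTuple):
    max_concurrency: int  # concurrent requests one instance processes
    min_size: int         # instances that are always provisioned
    max_size: int         # upper bound on active instances


class Estimate(NamedTuple):
    active: int       # instances serving traffic
    provisioned: int  # active instances plus any standby reserve


def estimate_instances(concurrent_requests: int, settings: ScalingSettings,
                       deploying: bool = False) -> Estimate:
    """Estimate instance counts under a simplified, even-distribution model."""
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must be non-negative")
    # Integer ceiling division: one more instance is needed as soon as the
    # load exceeds what the current instances can handle.
    needed = (concurrent_requests + settings.max_concurrency - 1) // settings.max_concurrency
    active = min(needed, settings.max_size)
    provisioned = max(active, settings.min_size)
    if deploying:
        # Provisioned capacity doubles temporarily during a deployment.
        provisioned *= 2
    return Estimate(active, provisioned)


settings = ScalingSettings(max_concurrency=100, min_size=2, max_size=5)
print(estimate_instances(50, settings))    # low traffic: standby reserve remains
print(estimate_instances(1000, settings))  # high traffic: capped at max_size
```

In this model, memory would be billed for the provisioned count and CPU only for the active count, which is why a Min size above the typical active count acts as a warm reserve.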

Manage auto scaling

Manage auto scaling for your App Runner services using one of the following methods:

App Runner console

When you create a service using the App Runner console, or when you update its configuration later, you can choose an auto scaling configuration for your service. Look for the Auto scaling configuration section on the console page. You can use the default auto scaling configuration or a custom configuration. To use a custom configuration, either choose an existing configuration or provide a new name and settings. If it's a new configuration, App Runner creates a new auto scaling configuration resource for you, and then associates it with your new service.


[Figure: App Runner console configuration page showing auto scaling options]

App Runner API or AWS CLI

When you call the CreateService or UpdateService App Runner API actions, you can use the AutoScalingConfigurationArn parameter to specify an auto scaling configuration resource for your service.
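As a minimal sketch, the following Python code attaches one auto scaling configuration to several existing services through UpdateService. It assumes the client argument is a boto3 App Runner client (boto3.client("apprunner")), whose update_service method accepts ServiceArn and AutoScalingConfigurationArn. The helper names and all ARNs are hypothetical placeholders.

```python
from typing import Any, Dict, Iterable, List


def build_update_request(service_arn: str, configuration_arn: str) -> Dict[str, str]:
    """Build UpdateService parameters that change only the auto scaling configuration."""
    return {
        "ServiceArn": service_arn,
        "AutoScalingConfigurationArn": configuration_arn,
    }


def attach_auto_scaling_configuration(client: Any, service_arns: Iterable[str],
                                      configuration_arn: str) -> List[dict]:
    """Point every listed service at the same auto scaling configuration."""
    responses = []
    for service_arn in service_arns:
        request = build_update_request(service_arn, configuration_arn)
        responses.append(client.update_service(**request))
    return responses


# Example wiring (requires boto3 and AWS credentials; ARNs are placeholders):
#   import boto3
#   client = boto3.client("apprunner")
#   attach_auto_scaling_configuration(
#       client,
#       ["<service-arn-1>", "<service-arn-2>"],
#       "<auto-scaling-configuration-arn>",
#   )
```

Passing the client in as an argument keeps the helpers easy to exercise with a stub before running them against a real account.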

Use the following App Runner API actions to manage your auto scaling configuration resources.

  • CreateAutoScalingConfiguration – Creates a new auto scaling configuration or a revision to an existing one.

  • ListAutoScalingConfigurations – Returns a list of the auto scaling configurations that are associated with your AWS account, with summary information.

  • DescribeAutoScalingConfiguration – Returns a full description of an auto scaling configuration.

  • DeleteAutoScalingConfiguration – Deletes an auto scaling configuration. You can delete a specific revision or the latest active revision. You might need to delete unnecessary auto scaling configurations if you reach the auto scaling configuration quota for your AWS account.
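The create, list, and delete actions above might be scripted as in the following Python sketch. It assumes the client argument is a boto3 App Runner client, whose snake_case methods correspond to these API actions. The helper names are hypothetical, and the sanity checks in create_revision are illustrative additions, not the service's own validation rules.

```python
from typing import Any, List, Optional


def create_revision(client: Any, name: str, max_concurrency: int,
                    min_size: int, max_size: int) -> str:
    """Create a configuration, or a new revision if the name already exists."""
    # Illustrative sanity checks only; the service enforces its own limits.
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    if not 1 <= min_size <= max_size:
        raise ValueError("expected 1 <= min_size <= max_size")
    response = client.create_auto_scaling_configuration(
        AutoScalingConfigurationName=name,
        MaxConcurrency=max_concurrency,
        MinSize=min_size,
        MaxSize=max_size,
    )
    return response["AutoScalingConfiguration"]["AutoScalingConfigurationArn"]


def list_configurations(client: Any, name: Optional[str] = None,
                        latest_only: bool = False) -> List[dict]:
    """Collect configuration summaries, following NextToken pagination."""
    kwargs = {"LatestOnly": latest_only}
    if name is not None:
        kwargs["AutoScalingConfigurationName"] = name
    summaries: List[dict] = []
    while True:
        page = client.list_auto_scaling_configurations(**kwargs)
        summaries.extend(page["AutoScalingConfigurationSummaryList"])
        token = page.get("NextToken")
        if not token:
            return summaries
        kwargs["NextToken"] = token


def delete_revision(client: Any, configuration_arn: str) -> None:
    """Delete the configuration revision identified by the given ARN."""
    client.delete_auto_scaling_configuration(
        AutoScalingConfigurationArn=configuration_arn
    )
```

A typical cleanup pass would call list_configurations, pick out the revisions that no service uses, and pass each ARN to delete_revision to stay under the account quota.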