Amazon Elastic Compute Cloud
User Guide (API Version 2013-02-01)

Auto Scaling and Load Balancing Your Instances

If you expect your application to have significant variability in usage, you might want to use Auto Scaling and Elastic Load Balancing, two features of Amazon EC2 that help manage variability.

Auto Scaling

Auto Scaling enables you to increase or decrease the number of running instances based on parameters that you specify, such as traffic or CPU load.

Auto Scaling also monitors the health of each Amazon EC2 instance that it launches. If any instance terminates unexpectedly, Auto Scaling detects the termination and launches a replacement instance.

For a high degree of flexibility, you can organize Amazon EC2 instances into Auto Scaling groups, which enable you to scale different server classes (for example, web servers, back-end servers) at different rates. For each group, you specify the minimum number of instances, the maximum number of instances, and the parameters to increase and decrease the number of running instances.
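The per-group settings described above (a minimum, a maximum, and parameters for adding or removing instances) can be illustrated with a small sketch. This is not the Auto Scaling API: the ScalingGroup class, the desired_capacity function, the CPU thresholds, and the one-instance step size are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Hypothetical model of a scaling group; not the Auto Scaling API."""
    min_size: int         # never run fewer instances than this
    max_size: int         # never run more instances than this
    scale_out_cpu: float  # assumed rule: add an instance above this average CPU %
    scale_in_cpu: float   # assumed rule: remove an instance below this average CPU %


def desired_capacity(group: ScalingGroup, running: int, avg_cpu: float) -> int:
    """Return the instance count the group should move to."""
    target = running
    if avg_cpu > group.scale_out_cpu:
        target = running + 1
    elif avg_cpu < group.scale_in_cpu:
        target = running - 1
    # Clamp the result to the group's configured bounds.
    return max(group.min_size, min(group.max_size, target))


# Two server classes that scale at different rates (illustrative values).
web = ScalingGroup(min_size=2, max_size=10, scale_out_cpu=70.0, scale_in_cpu=30.0)
backend = ScalingGroup(min_size=1, max_size=4, scale_out_cpu=85.0, scale_in_cpu=20.0)
```

With these assumed values, the web group adds an instance once average CPU passes 70 percent, while the back-end group waits until 85 percent. Neither group ever leaves its configured minimum and maximum.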

For information about setting up Auto Scaling, see the Auto Scaling Developer Guide.

Load Balancing

Elastic Load Balancing enables you to automatically distribute the incoming traffic (or load) among all instances that you are running. The service also makes it easy for you to add new instances when you need to increase the capacity of your application.

Customers reach your web site through your web URL, such as www.mywebsite.com. This single address might actually represent several instances of your running web application. To always have an available web site, you need to run multiple instances. Otherwise, your customers might see delays when accessing your site, or worse, might not be able to access your site at all.

Elastic Load Balancing manages incoming requests by routing traffic across your instances so that no single instance is overwhelmed. You can quickly add more instances to applications that are experiencing an upsurge in traffic, or remove capacity when traffic is slow.
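To make the idea of spreading requests across instances concrete, here is a minimal sketch that uses simple round-robin rotation. This guide does not describe the routing algorithm that Elastic Load Balancing actually uses, so round-robin is an assumption made for illustration. The RoundRobinBalancer class, its methods, and the instance IDs are hypothetical.

```python
class RoundRobinBalancer:
    """Hypothetical balancer that hands requests to instances in turn."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._index = 0  # position of the next instance to receive a request

    def add_instance(self, instance_id):
        """Add capacity, for example during an upsurge in traffic."""
        self._instances.append(instance_id)

    def remove_instance(self, instance_id):
        """Remove capacity, for example when traffic is slow."""
        self._instances.remove(instance_id)
        # Keep the index valid for the shorter list.
        self._index = self._index % len(self._instances) if self._instances else 0

    def route(self):
        """Return the instance that should handle the next request."""
        if not self._instances:
            raise RuntimeError("no instances registered")
        instance = self._instances[self._index]
        self._index = (self._index + 1) % len(self._instances)
        return instance


balancer = RoundRobinBalancer(["i-aaaa", "i-bbbb"])  # made-up instance IDs
first_four = [balancer.route() for _ in range(4)]
```

In this sketch each registered instance receives every second request. After add_instance is called, the new instance joins the rotation on the next pass.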

For information about setting up Elastic Load Balancing, see the Elastic Load Balancing Developer Guide.