With Auto Scaling, you can automatically increase the number of EC2 instances you're using when user demand goes up and decrease it when demand goes down. As Auto Scaling dynamically adds and removes EC2 instances, you need to ensure that the traffic coming to your web application is distributed across all of your running EC2 instances. AWS provides the Elastic Load Balancing service to distribute the incoming web traffic (called the load) automatically among all the EC2 instances that you are running. Elastic Load Balancing routes incoming requests so that no single instance is overwhelmed. Using Elastic Load Balancing with your auto-scaled web application makes it easy to route traffic across a dynamically changing fleet of EC2 instances. For more information about Elastic Load Balancing, see What Is Elastic Load Balancing? in the Elastic Load Balancing Developer Guide.
Elastic Load Balancing uses load balancers to monitor traffic and handle requests that come from the Internet. Your load balancer acts as a single point of contact for all incoming traffic to the instances in your Auto Scaling group. To use a load balancer with your Auto Scaling group, create the load balancer and then associate it with your Auto Scaling group. To associate your load balancer with a new Auto Scaling group at the time you create the group, see Tutorial: Set Up a Scaled and Load-Balanced Application. To associate your load balancer with an existing Auto Scaling group, see Attach a Load Balancer to Your Auto Scaling Group.
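As a minimal sketch of associating a load balancer with an existing Auto Scaling group, the helper below wraps the `attach_load_balancers` call of the boto3 Auto Scaling client. It assumes a Classic Load Balancer identified by name. The function name `attach_load_balancer` and the names `my-asg` and `my-lb` are hypothetical, chosen for illustration only.

```python
def attach_load_balancer(autoscaling_client, group_name, load_balancer_name):
    """Attach one load balancer (by name) to an existing Auto Scaling group.

    `autoscaling_client` is expected to behave like boto3.client("autoscaling").
    """
    return autoscaling_client.attach_load_balancers(
        AutoScalingGroupName=group_name,
        LoadBalancerNames=[load_balancer_name],
    )


# Usage (requires boto3 and configured AWS credentials; names are placeholders):
#   import boto3
#   attach_load_balancer(boto3.client("autoscaling"), "my-asg", "my-lb")
```

Passing the client in as a parameter keeps the sketch independent of any particular region or credential setup.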
Elastic Load Balancing sends data about your load balancers and EC2 instances to Amazon CloudWatch. CloudWatch collects data about the performance of your resources and presents it as metrics. After attaching one or more load balancers to your Auto Scaling group, you can configure your Auto Scaling group to use Elastic Load Balancing metrics (such as request latency or request count) to scale your application automatically. For more information about Elastic Load Balancing metrics, see Monitor Your Load Balancer Using Amazon CloudWatch in the Elastic Load Balancing Developer Guide. For information about using CloudWatch metrics to scale automatically, see Dynamic Scaling.
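One way to scale on a load balancer metric is a CloudWatch alarm on request latency that triggers a scaling policy. The sketch below only builds the parameters for such an alarm; it assumes a Classic Load Balancer (metrics in the `AWS/ELB` namespace) and an existing scale-out policy. The function name, alarm name, default threshold, period, and evaluation count are illustrative assumptions, not recommended values.

```python
def latency_alarm_params(load_balancer_name, scale_out_policy_arn,
                         threshold_seconds=0.5):
    """Build keyword arguments for a CloudWatch alarm on average ELB latency."""
    return {
        "AlarmName": f"{load_balancer_name}-high-latency",  # hypothetical name
        "Namespace": "AWS/ELB",
        "MetricName": "Latency",
        "Dimensions": [
            {"Name": "LoadBalancerName", "Value": load_balancer_name}
        ],
        "Statistic": "Average",
        "Period": 60,                # seconds per data point (assumed)
        "EvaluationPeriods": 3,      # consecutive breaching periods (assumed)
        "Threshold": threshold_seconds,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [scale_out_policy_arn],  # policy to run on alarm
    }


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   params = latency_alarm_params("my-lb", "<scaling-policy-arn>")
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```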
By default, the Auto Scaling group determines the health state of each instance by periodically checking the results of EC2 instance status checks. Elastic Load Balancing also performs health checks on the EC2 instances that are registered with the load balancer. After you've attached a load balancer to your Auto Scaling group, you can choose to use the results of the Elastic Load Balancing health check in addition to the EC2 instance status checks to determine the health of the EC2 instances in your Auto Scaling group. For more information, see Add an Elastic Load Balancing Health Check to Your Auto Scaling Group.
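The decision rule described above can be sketched as a small function. This is a simplified model for illustration, not the service's implementation: the function name and boolean inputs are hypothetical, and the strings `"EC2"` and `"ELB"` mirror the health check types of an Auto Scaling group.

```python
def is_instance_healthy(health_check_type, ec2_status_ok, elb_check_ok):
    """Return True if the instance counts as healthy under the given check type.

    "EC2": only the EC2 instance status checks are considered (the default).
    "ELB": the instance must pass both the EC2 status checks and the
           load balancer health check.
    """
    if health_check_type == "EC2":
        return ec2_status_ok
    if health_check_type == "ELB":
        return ec2_status_ok and elb_check_ok
    raise ValueError(f"unknown health check type: {health_check_type!r}")
```

For example, an instance that passes its EC2 status checks but fails the load balancer health check is treated as healthy under `"EC2"` and as unhealthy under `"ELB"`.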
If connection draining is enabled for your load balancer, Auto Scaling waits for the in-flight requests to complete or for the maximum timeout to expire, whichever comes first, before terminating instances due to a scaling event or health check replacement. For more information, see Connection Draining in the Elastic Load Balancing Developer Guide.
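The "whichever comes first" rule can be illustrated with a short calculation. This is a simplified model under stated assumptions: the function name is hypothetical, the remaining time of each in-flight request is taken as known in advance, and the 300-second timeout in the usage comments is an example value.

```python
def seconds_until_termination(in_flight_remaining, draining_timeout):
    """Return how long termination is delayed while connections drain.

    in_flight_remaining: seconds each in-flight request still needs.
    draining_timeout: maximum connection draining timeout, in seconds.
    """
    # Time for the slowest in-flight request to finish (0 if none are open).
    longest_request = max(in_flight_remaining, default=0)
    # Termination proceeds at whichever comes first.
    return min(longest_request, draining_timeout)


# Requests finish before the timeout: wait only for the slowest request.
#   seconds_until_termination([2, 10], 300)  -> 10
# A request outlasts the timeout: wait only until the timeout expires.
#   seconds_until_termination([500], 300)    -> 300
```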
You can take advantage of the safety and reliability of geographic redundancy by spanning your Auto Scaling groups across multiple Availability Zones within a region and then setting up load balancers to distribute incoming traffic across those Availability Zones. For more information, see Expand Your Scaled and Load-Balanced Application to an Additional Availability Zone.
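As a rough sketch of expanding to an additional Availability Zone, the helper below updates the Auto Scaling group's zone list and then enables the new zone on the load balancer. It assumes a Classic Load Balancer and a group configured by Availability Zone rather than by VPC subnet; a subnet-based setup would need different calls. The function name and all resource names are hypothetical.

```python
def add_availability_zone(autoscaling_client, elb_client, group_name,
                          load_balancer_name, current_zones, new_zone):
    """Span the group and its load balancer across one more Availability Zone.

    The clients are expected to behave like boto3.client("autoscaling")
    and boto3.client("elb"). Returns the resulting list of zones.
    """
    zones = list(current_zones)
    if new_zone not in zones:
        zones.append(new_zone)

    # Let the Auto Scaling group launch instances in the new zone.
    autoscaling_client.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        AvailabilityZones=zones,
    )
    # Let the load balancer route traffic to instances in the new zone.
    elb_client.enable_availability_zones_for_load_balancer(
        LoadBalancerName=load_balancer_name,
        AvailabilityZones=[new_zone],
    )
    return zones


# Usage (requires boto3 and configured AWS credentials; names are placeholders):
#   import boto3
#   add_availability_zone(boto3.client("autoscaling"), boto3.client("elb"),
#                         "my-asg", "my-lb", ["us-west-2a"], "us-west-2b")
```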