Auto Scaling is an AWS service that allows you to increase or decrease the number of EC2 instances within your application's architecture. With Auto Scaling, you create collections of EC2 instances, called Auto Scaling groups. You can create these groups from scratch, or from existing EC2 instances that are already in production.
You can create as many Auto Scaling groups as you need. For example, if an application consists of a web tier and an application tier, you can create two Auto Scaling groups, one for each tier. Each Auto Scaling group can contain one or more scaling policies. These policies define when Auto Scaling launches or terminates EC2 instances within the group.
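To make the relationship between groups and scaling policies concrete, the following Python sketch models two groups, one per tier, each with a simple threshold policy. It is a toy model under stated assumptions, not the Auto Scaling API: the class names, the utilization thresholds, the group sizes, and the one-instance-per-evaluation step are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical threshold policy (illustration only, not an AWS type)."""
    scale_out_above: float  # average utilization (%) above which one instance is added
    scale_in_below: float   # average utilization (%) below which one instance is removed


@dataclass
class AutoScalingGroupModel:
    """Toy stand-in for an Auto Scaling group with size limits and one policy."""
    name: str
    min_size: int
    max_size: int
    desired: int
    policy: ScalingPolicy

    def evaluate(self, avg_utilization: float) -> int:
        """Apply the policy once and return the resulting instance count."""
        if avg_utilization > self.policy.scale_out_above:
            # Launch one instance, but never exceed the group's maximum size.
            self.desired = min(self.desired + 1, self.max_size)
        elif avg_utilization < self.policy.scale_in_below:
            # Terminate one instance, but never drop below the minimum size.
            self.desired = max(self.desired - 1, self.min_size)
        return self.desired


# One group per tier, each with its own (assumed) limits and thresholds.
web_tier = AutoScalingGroupModel("web-tier", 2, 6, 2, ScalingPolicy(70.0, 30.0))
app_tier = AutoScalingGroupModel("app-tier", 1, 4, 1, ScalingPolicy(60.0, 20.0))

for utilization in (85.0, 90.0, 50.0, 10.0):
    print(utilization, web_tier.evaluate(utilization), app_tier.evaluate(utilization))
```

Each tier reacts to the same load according to its own policy and size limits, which is the reason for giving each tier a separate group.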
Adding Auto Scaling to your network architecture is one way to maximize the benefits of the AWS cloud. With Auto Scaling, you can make your applications:
More fault tolerant. Auto Scaling can detect when an instance is unhealthy, terminate it, and launch a new instance to replace it.
More highly available. You can configure Auto Scaling to use multiple subnets or Availability Zones. If one subnet or Availability Zone becomes unavailable, Auto Scaling can launch instances in another one to compensate.
Able to increase and decrease capacity only when needed. Unlike on-premises solutions, Auto Scaling lets your network scale dynamically. You also don't pay for Auto Scaling itself. Instead, you pay only for the EC2 instances launched, and only for as long as you use them.
To better demonstrate some of the benefits of Auto Scaling, consider a basic web application running on AWS. This application allows employees to search for conference rooms that they might want to use for meetings. During the beginning and end of the week, usage of this application is minimal. During the middle of the week, more employees are scheduling meetings, so demand on the application increases significantly.
The following graph shows how much of the application's capacity is used over the course of a week.
Traditionally, there are two ways to plan for these changes in capacity. The first option is to add enough servers so that the application always has enough capacity to meet demand. The downside of this option, however, is that there are days in which the application doesn't need this much capacity. The extra capacity remains unused and, in essence, raises the cost of keeping the application running.
The second option is to have enough capacity to handle the average demand on the application. This option is less expensive, because you aren't purchasing equipment that you'll only use occasionally. However, you risk creating a poor customer experience when demand on the application exceeds its capacity.
By adding Auto Scaling to this application, you have a third option available. You can add new instances to the application only when necessary, and terminate them when they're no longer needed. And because Auto Scaling uses EC2 instances, you only have to pay for the instances you use, when you use them. You now have a cost-effective architecture that provides the best customer experience while minimizing expenses.
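The trade-off among the three options can be worked through with a short calculation. The Python sketch below compares them for one week of demand. Every number in it (the daily demand, the per-instance capacity, and the hourly price) is an assumption made up for illustration; none comes from AWS pricing or from the application described above.

```python
import math

# Hypothetical figures for illustration only.
REQUESTS_PER_INSTANCE = 100        # requests/hour one instance can serve (assumed)
COST_PER_INSTANCE_HOUR_CENTS = 10  # price of one instance-hour in cents (assumed)
HOURS_PER_DAY = 24

# Assumed peak demand (requests/hour), Monday through Sunday:
# low at the start and end of the week, high in the middle.
daily_demand = [100, 200, 500, 600, 500, 200, 100]


def instances_needed(demand):
    """Smallest whole number of instances that can serve the given demand."""
    return max(1, math.ceil(demand / REQUESTS_PER_INSTANCE))


def summarize(label, instances_per_day):
    """Return the weekly cost and the number of days with too little capacity."""
    cost_cents = sum(
        n * HOURS_PER_DAY * COST_PER_INSTANCE_HOUR_CENTS for n in instances_per_day
    )
    days_short = sum(
        1
        for n, demand in zip(instances_per_day, daily_demand)
        if n * REQUESTS_PER_INSTANCE < demand
    )
    return {"label": label, "cost_cents": cost_cents, "days_short": days_short}


days = len(daily_demand)

# Option 1: always run enough instances for the busiest day.
peak = summarize("provision for peak", [instances_needed(max(daily_demand))] * days)

# Option 2: always run enough instances for the average day.
average_demand = sum(daily_demand) / days
average = summarize("provision for average", [instances_needed(average_demand)] * days)

# Option 3: run only what each day requires (what dynamic scaling approximates).
scaled = summarize("scale with demand", [instances_needed(d) for d in daily_demand])

for result in (peak, average, scaled):
    print(result)
```

With these assumed numbers, provisioning for the peak costs the most and never falls short, provisioning for the average costs less but falls short on the three busiest days, and scaling with demand costs the least while never falling short.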
The remaining topics in this section provide a more detailed look at how Auto Scaling works. If you're new to Auto Scaling, we recommend that you review the sections How Auto Scaling Works and Auto Scaling Group Lifecycle.
There are no additional fees with Auto Scaling, so it's easy to try and see how it can benefit your AWS architecture. To begin, review our Getting Started with Auto Scaling section to create a standalone Auto Scaling group and see how it responds when an instance in that group terminates. If you already have instances running in AWS, you can create an Auto Scaling group using an existing EC2 instance, and remove your instance from the group at any time. After you are familiar with how Auto Scaling works, review the topic Planning your Auto Scaling Group to learn how to make the most of Auto Scaling in your applications.