Before you get started with Elastic Load Balancing, you should understand the following concepts.
A load balancer is the destination to which all requests intended for your load balanced application should be directed. Each load balancer can distribute requests to multiple EC2 instances. A load balancer is represented by a DNS name and a set of ports. Load balancers can span multiple Availability Zones within an EC2 Region, but they cannot span multiple regions.
To create or work with a load balancer in a specific region, use the corresponding regional service endpoint. For information about regions and endpoints supported by Elastic Load Balancing, go to Regions and Endpoints.
Elastic Load Balancing automatically generates a DNS name for each load balancer that you create. Typically, the DNS name includes the name of the AWS region in which the load balancer is created. For example, if you create a load balancer named myLB in the us-east-1a Availability Zone, its DNS name typically includes the region name us-east-1. Use the DNS name generated by Elastic Load Balancing to connect to your load balancer. To connect, paste the DNS name of your load balancer into the address field of an Internet-connected web browser.
If you'd rather use a user-friendly domain name for your load balancer, such as
www.example.com, instead of the load balancer DNS name, you can
create a custom domain name and then associate the custom domain name with the load
balancer DNS name. When a request is placed to your load balancer using the custom
domain name that you created, it resolves to the load balancer DNS name.
For more information on creating and using a custom domain name for your load balancer, see Configure Custom Domain Name for Your Load Balancer.
When you create your load balancer, you must configure it to accept incoming traffic by specifying the configurations for your load balancer listeners. A listener is a process that listens for connections from incoming requests. It is configured with a protocol and a port number for front-end (load balancer) and back-end (back-end instance) connections. For more information on the ports and protocols supported by Elastic Load Balancing, see Listener Configurations for Elastic Load Balancing.
You can set up Elastic Load Balancing to distribute incoming requests across EC2 instances in a single Availability Zone or in multiple Availability Zones within a region. Your load balancer does not distribute traffic across regions.
For critical applications, we recommend that you distribute incoming traffic across more than one Availability Zone. To distribute traffic across multiple Availability Zones, launch your Amazon EC2 instances in all the Availability Zones you plan to use and then register the instances with your load balancer.
When you register your EC2 instances, Elastic Load Balancing provisions load balancer nodes in all the Availability Zones that have registered instances. Each load balancer node continuously monitors the health of all the registered instances and routes traffic to the healthy instances. If a load balancer node detects unhealthy or de-registered instances, it stops routing traffic to those instances. Instead, it sends requests to the remaining healthy instances.
You can always expand or shrink the availability of your instances after your initial setup. To expand the availability of your application, launch instances in an additional Availability Zone, register the new instances with your load balancer, and then add the new Availability Zone. After you've added the new Availability Zone, the load balancer begins to route traffic equally among all the enabled Availability Zones. To shrink the availability of your instances, remove an Availability Zone that was enabled for your load balancer. After you've removed the Availability Zone, the load balancer stops routing traffic to the disabled Availability Zone and continues to route traffic to the registered and healthy instances in the enabled Availability Zones.
For more information, see Add or Remove Availability Zones for Your Load Balanced Application.
Before a client sends a request to your load balancer, it first resolves the load balancer's domain name with the Domain Name System (DNS) servers. The DNS server resolves the load balancer's domain name by returning one or more IP addresses to the client. The client then uses DNS round robin to determine which load balancer node in a specific Availability Zone will receive the request.
The selected load balancer node then sends the request to instances within the same Availability Zone. To determine the instances, the load balancer node uses either the round robin (for TCP connections) or the least outstanding request (for HTTP/HTTPS connections) routing algorithm. The least outstanding request routing algorithm favors back-end instances with the fewest outstanding requests.
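The two routing algorithms named above can be illustrated with a minimal Python sketch. It is not the load balancer's actual implementation; the instance IDs and the tie-breaking rule (lowest ID wins) are assumptions made for the example.

```python
from itertools import cycle


def round_robin(instances):
    """Return an iterator that hands out instances in a fixed rotation."""
    return cycle(instances)


def least_outstanding(outstanding):
    """Return the instance with the fewest outstanding requests.

    `outstanding` maps a hypothetical instance ID to its in-flight request
    count. Ties go to the lowest instance ID (an assumption of this sketch).
    """
    return min(sorted(outstanding), key=lambda instance: outstanding[instance])


# Round robin, as described for TCP connections.
rotation = round_robin(["i-a", "i-b", "i-c"])
first_four = [next(rotation) for _ in range(4)]

# Least outstanding requests, as described for HTTP/HTTPS connections.
pick = least_outstanding({"i-a": 3, "i-b": 1, "i-c": 2})
```

Here the rotation wraps back to the first instance on the fourth request, while the least outstanding request rule picks the instance with a single in-flight request.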
By default, the load balancer node routes traffic to back-end instances within the same Availability Zone. To ensure that your back-end instances are able to handle the request load in each Availability Zone, it is important to have approximately equivalent numbers of instances in each zone. For example, if you have ten instances in Availability Zone us-east-1a and two instances in us-east-1b, the traffic will still be equally distributed between the two Availability Zones. As a result, the two instances in us-east-1b will have to serve the same amount of traffic as the ten instances in us-east-1a. As a best practice, we recommend that you keep an equivalent or nearly equivalent number of instances in each of your Availability Zones. So in the example, rather than having ten instances in us-east-1a and two in us-east-1b, you could distribute your instances so that you have six instances in each Availability Zone.
If you want the request traffic to be routed evenly across all back-end instances, regardless of the Availability Zone that they may be in, enable cross-zone load balancing on your load balancer. Cross-zone load balancing allows each load balancer node to route requests across multiple Availability Zones, ensuring that all zones receive an equal amount of request traffic. Cross-zone load balancing reduces the need to maintain equivalent numbers of back-end instances in each zone, and improves the application's ability to handle the loss of one or more back-end instances. However, we still recommend that you maintain approximately equivalent numbers of instances in each Availability Zone for higher fault tolerance. The traffic between Elastic Load Balancing and your EC2 instances in another Availability Zone will not incur any EC2 data transfer charges.
For environments where clients cache DNS lookups, incoming requests may prefer one of the Availability Zones. Using cross-zone load balancing, this imbalance in request load will be spread across all available back-end instances in the region, reducing the impact of misbehaving clients on the application.
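The effect of uneven instance counts can be worked through with a short Python sketch. It assumes that every enabled Availability Zone receives an equal share of client requests and that all instances are healthy; the function name is hypothetical.

```python
from fractions import Fraction


def per_instance_share(instances_per_zone, cross_zone=False):
    """Return the fraction of total traffic that each instance in a zone serves."""
    zones = {zone: n for zone, n in instances_per_zone.items() if n > 0}
    total_instances = sum(zones.values())
    shares = {}
    for zone, count in zones.items():
        if cross_zone:
            # Every instance receives an equal share, regardless of zone.
            shares[zone] = Fraction(1, total_instances)
        else:
            # Each zone receives an equal share, split among its own instances.
            shares[zone] = Fraction(1, len(zones)) / count
    return shares


layout = {"us-east-1a": 10, "us-east-1b": 2}
default_shares = per_instance_share(layout)
cross_zone_shares = per_instance_share(layout, cross_zone=True)
```

Under these assumptions, each instance in us-east-1b serves 1/4 of all traffic by default while each instance in us-east-1a serves 1/20. With cross-zone load balancing enabled, every instance serves 1/12.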
Routing traffic to EC2 instances in EC2-Classic
To allow communication between Elastic Load Balancing and your back-end instances launched in EC2-Classic, create a security group ingress rule that applies to all of your back-end instances. The security group rule can either allow ingress traffic from all IP addresses (the 0.0.0.0/0 CIDR range) or allow ingress traffic only from Elastic Load Balancing. To ensure that your back-end EC2 instances can receive traffic only from Elastic Load Balancing, enable network ingress for the Elastic Load Balancing security group on all of your back-end EC2 instances. For more information about configuring security groups for EC2 instances launched in EC2-Classic, see Manage Security Groups in Amazon EC2-Classic.
Routing traffic to EC2 instances in Amazon VPC
If you are planning on deploying your load balancer within Amazon Virtual Private Cloud (Amazon VPC), be sure to configure the security group rules and network ACLs to allow traffic to be routed between the subnets in your VPC. If your rules are not configured correctly, instances in other subnets may not be reachable by load balancer nodes in a different subnet. For more information on deploying load balancer within Amazon VPC, see Load Balancers in a VPC. For information on configuring security groups for your load balancer deployed in VPC, see Manage Security Groups in a VPC.
For a procedure on enabling or disabling cross-zone load balancing for your load balancer, see Enable or Disable Cross-Zone Load Balancing for Your Load Balancer.
After you've created your load balancer, you have to register your EC2 instances with the load balancer. Your EC2 instances can be within a single Availability Zone or span multiple Availability Zones within a region. Elastic Load Balancing routinely performs health checks on all the registered EC2 instances and automatically distributes all incoming requests to the DNS name of your load balancer across your registered, healthy EC2 instances. For more information on the health check of your EC2 instances, see Health Check.
Make sure to install a web server, such as Apache or Internet Information Services (IIS), on all the EC2 instances you plan to register with your load balancer.
Stop and Start EC2 Instances
The instances are registered with the load balancer using the IP addresses associated with the instances. When an instance is stopped and then started, the IP address associated with your instance changes. This prevents the load balancer from routing traffic to your restarted instance. When you stop and then start your registered EC2 instances, we recommend that you de-register your stopped instance from your load balancer, and then register the restarted instance. Failure to do so may prevent the load balancer from performing health checks and routing the traffic to the restarted instance. For procedures associated with de-registering and then registering your instances with your load balancer, see Deregister and Register EC2 Instances.
Setting Keepalive On Your EC2 Instances
For HTTP and HTTPS listeners, we recommend that you enable the keep-alive option on your EC2 instances. You can do this in your web server settings, in the kernel settings for your back-end instance, or in both. The keep-alive option allows the load balancer to re-use connections to your back-end instance for multiple client requests. This reduces the load on your web server and improves the throughput of the load balancer. The keep-alive timeout should be at least 60 seconds to ensure that the load balancer is responsible for closing the connection to your instance.
Path MTU Discovery
Elastic Load Balancing supports Path Maximum Transmission Unit (MTU) Discovery. To ensure that Path MTU Discovery can function correctly, you must adjust your instance's security group rules. For more information, see Security Group Rules for Path MTU Discovery in the Amazon EC2 User Guide for Linux Instances.
To discover the availability of your EC2 instances, the load balancer periodically sends pings, attempts connections, or sends requests to test the EC2 instances. These tests are called health checks. Instances that are healthy at the time of the health check are marked as "InService" and the instances that are unhealthy at the time of the health check are marked as "OutOfService". The load balancer performs health checks on all registered instances, regardless of whether the instance is in a healthy or unhealthy state.
The load balancer routes traffic only to the healthy instances. When the load balancer determines that an instance is unhealthy, it stops routing traffic to that instance. The load balancer resumes routing traffic to the instance when it has been restored to a healthy state.
The load balancer checks the health of the registered instances using either the default health check configuration provided by Elastic Load Balancing (ELB) or a health check configuration that you specify.
Health Check Configuration
A health check configuration contains the information that a load balancer uses to determine the health state of the registered instances. The health check configuration fields are as follows:
Ping Protocol: The protocol to use to connect with the instance. The protocol can be TCP, HTTP, HTTPS, or SSL.
Ping Port: The port to use to connect with the instance. The ping port can be within the range of 1 through 65535. If you specify the TCP or SSL protocol, you only have to include the protocol and the port. The load balancer attempts to open a TCP connection to the instance on the specified port. If the load balancer fails to connect with the instance at the specified port within the configured response timeout period, the instance is considered unhealthy.
Ping Path: The destination for sending the HTTP/HTTPS request. If you specify the HTTP or HTTPS protocol, you have to include a ping port and a ping path.
Response Timeout: Time to wait when receiving a response from the health check (2 seconds to 60 seconds).
Health Check Interval: Amount of time between health checks (5 seconds to 300 seconds).
Unhealthy Threshold: Number of consecutive health check failures before declaring an EC2 instance unhealthy.
Healthy Threshold: Number of consecutive health check successes before declaring an EC2 instance healthy.
The load balancer interprets the health check configuration as follows: Send a request to each registered instance at the ping port and ping path every 30 seconds. Allow the configured response timeout period for the instance to respond. If the load balancer gets the configured number of consecutive failures, take the instance out of service. If the load balancer gets 5 consecutive successful responses, put the instance back in service.
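The threshold logic described above can be sketched as a small state tracker in Python. This is an illustration only: the class name is hypothetical, the healthy threshold of 5 follows the text, and the unhealthy threshold of 2 is an assumed value chosen for the example.

```python
class HealthTracker:
    """Track one instance's state from a stream of health check results."""

    def __init__(self, unhealthy_threshold, healthy_threshold, state="InService"):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.state = state
        self.failures = 0   # consecutive failed health checks
        self.successes = 0  # consecutive successful health checks

    def record(self, passed):
        """Record one health check result and return the resulting state."""
        if passed:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.healthy_threshold:
                self.state = "InService"
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.unhealthy_threshold:
                self.state = "OutOfService"
        return self.state


# Assumed thresholds: 2 consecutive failures, 5 consecutive successes.
tracker = HealthTracker(unhealthy_threshold=2, healthy_threshold=5)
results = [False, False, True, True, True, True, True]
states = [tracker.record(ok) for ok in results]
```

With these values the instance is marked OutOfService after the second failure and returns to InService only after the fifth consecutive success.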
Health Checks and Auto Scaling Groups
If you have associated your Auto Scaling group with a load balancer, you can use the load balancer health check to determine the health state of instances in your Auto Scaling group. By default, an Auto Scaling group periodically makes calls to the Amazon Elastic Compute Cloud (EC2) DescribeInstanceStatus action to determine the health state of each instance. If the Auto Scaling group is registered with a load balancer and associated with the load balancer health check, Auto Scaling will make calls to both the Amazon EC2 DescribeInstanceStatus action and the ELB DescribeInstanceHealth action to determine the health state of the instances. For more information, see Add an Elastic Load Balancing Health Check to your Auto Scaling Group in the Auto Scaling Developer Guide.
Troubleshooting Health Check
Your registered instances can fail the load balancer health check for several reasons. The most common reasons for failing a health check are where EC2 instances close connections to your load balancer or where the response from the EC2 instances times out. For information on potential causes and steps you can take to resolve failed health check issues, see Troubleshooting Elastic Load Balancing: Health Check Configuration.
Connection draining causes the ELB load balancer to stop sending new requests to a deregistering instance or an unhealthy instance, while keeping the existing connections open. This allows the load balancer to complete in-flight requests made to the deregistering or unhealthy instances.
Connection draining is a load balancer attribute and applies to all listeners for the load balancer. You can enable or disable connection draining for your load balancer at any time. You can check if connection draining is enabled for your load balancer by using the DescribeLoadBalancerAttributes action, the elb-describe-lb-attributes command, or by using the AWS Management Console and clicking the Instances tab in the bottom pane of the selected load balancer.
When you enable connection draining for your load balancer, you can set a maximum time for the load balancer to continue serving in-flight requests to the deregistering instance before the load balancer closes the connection. The load balancer forcibly closes connections to the deregistering instance when the maximum time limit is reached.
While the in-flight requests are being served, the load balancer reports the instance state of the deregistering instance as InService: Instance deregistration currently in progress. The load balancer reports the instance state as OutOfService: Instance is not currently registered with the LoadBalancer when the deregistering instance has completed serving all in-flight requests or when the maximum timeout limit is reached, whichever comes first.
When an instance becomes unhealthy the load balancer reports the instance state as
OutOfService. If there are in-flight requests made to the unhealthy
instance, they get completed. The maximum timeout limit does not apply for the
connections to the unhealthy instance.
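The state reporting for a deregistering instance can be summarized in a few lines of Python. This is a simplified sketch with a hypothetical function name; the two state strings are the ones quoted above.

```python
def deregistering_state(in_flight_requests, elapsed_seconds, max_timeout_seconds):
    """Return the state reported for a deregistering instance."""
    # Draining continues only while requests remain and the time limit has not been reached.
    draining = in_flight_requests > 0 and elapsed_seconds < max_timeout_seconds
    if draining:
        return "InService: Instance deregistration currently in progress"
    return "OutOfService: Instance is not currently registered with the LoadBalancer"
```

For example, an instance with three in-flight requests ten seconds into a 300-second limit is still reported as InService, and it is reported as OutOfService as soon as the requests finish or the limit is reached.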
You can check the instance state of a deregistering instance or an unhealthy instance by using the DescribeInstanceHealth action or the equivalent command line interface command.
If your instances are part of an Auto Scaling group and if connection draining is enabled for your load balancer, Auto Scaling will wait for the in-flight requests to complete or for the maximum timeout to expire, whichever comes first, before terminating instances due to a scaling event or health check replacement. For information about using Elastic Load Balancing with Auto Scaling, see Use Elastic Load Balancing to Load Balance Your Auto Scaling Group.
For tutorials on how to enable or disable the connection draining attribute for your load balancer, see Enable or Disable Connection Draining for Your Load Balancer.
For each request a client makes through a load balancer, the load balancer maintains two connections. One connection is with the client and the other connection is to the back-end instance. For each connection, the load balancer manages an idle timeout that is triggered when no data is sent over the connection for a specified time period. After this time period has elapsed, if no data has been sent or received, the load balancer closes the connection.
By default, Elastic Load Balancing maintains a 60-second idle timeout setting for the connections to the client and the back-end instance. You can change the idle timeout setting for your load balancer at any time.
If you use HTTP and HTTPS listeners, we recommend that you enable the keep-alive option for your EC2 instances. You can enable keep-alive in your web server settings and/or in the kernel settings for your EC2 instance. Keep-alive, when enabled, allows the load balancer to re-use connections to your back-end instance, which reduces the CPU utilization on your instances. To ensure that the load balancer is responsible for closing the connections to your back-end instance, make sure that the value set on your instance for the keep-alive time is greater than the idle timeout setting on your load balancer.
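The keep-alive recommendation reduces to a simple comparison, sketched here in Python as a simplified model. The function name is hypothetical, and treating equal values as unsafe is an assumption of the sketch, since either side could then close the connection first.

```python
DEFAULT_IDLE_TIMEOUT = 60  # seconds; the default idle timeout stated above


def first_to_close(keep_alive_timeout, idle_timeout=DEFAULT_IDLE_TIMEOUT):
    """Return which side closes an idle back-end connection first."""
    if keep_alive_timeout > idle_timeout:
        # The load balancer's idle timeout expires before the instance gives up.
        return "load balancer"
    # Equal or shorter keep-alive: the instance may close the connection first.
    return "instance"
```

A keep-alive time of 120 seconds against the default idle timeout leaves the load balancer in charge of closing connections, whereas a keep-alive time of 30 seconds does not.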
For a tutorial on configuring the idle timeout settings for your load balancer, see Configure Idle Connection Timeout.
By default, a load balancer routes each request independently to the application instance with the smallest load. However, you can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user's session to a specific application instance. This ensures that all requests coming from the user during the session will be sent to the same application instance.
The key to managing the sticky session is determining how long your load balancer should consistently route the user's request to the same application instance. If your application has its own session cookie, then you can set Elastic Load Balancing to create the session cookie to follow the duration specified by the application's session cookie. If your application does not have its own session cookie, then you can set Elastic Load Balancing to create a session cookie by specifying your own stickiness duration. You can associate a stickiness duration only with HTTP/HTTPS load balancer listeners.
An application instance must always receive and send two cookies: a cookie that defines the stickiness duration and a special Elastic Load Balancing cookie named AWSELB that has the mapping to the application instance.
The HTTP method (also called the verb) specifies the action to be performed on the resource receiving an HTTP request. The standard methods for HTTP requests are defined in RFC 2616, Hypertext Transfer Protocol-HTTP/1.1. Standard methods include GET, POST, PUT, HEAD, and OPTIONS. Some web applications require (and sometimes also introduce) new methods that are extensions of HTTP/1.1 methods. These HTTP extensions can be non-standard. Some common examples of HTTP extended methods include (but are not limited to) PATCH, REPORT, MKCOL, PROPFIND, MOVE, and LOCK. Elastic Load Balancing accepts all standard and non-standard HTTP methods.
When a load balancer receives an HTTP request, it performs checks for malformed requests and for the length of the method. The total method length in an HTTP request to a load balancer must not exceed 127 characters. If the HTTP request passes both the checks, the load balancer sends the request to the back-end EC2 instance. If the method field in the request is malformed, the load balancer responds with a HTTP 400: BAD_REQUEST error message. If the length of the method in the request exceeds 127 characters, the load balancer responds with a HTTP 405: METHOD_NOT_ALLOWED error message.
The back-end EC2 instance processes a valid request by implementing the method contained in the request and then sending a response back to the client. Your back-end instance must be configured to handle both supported and unsupported methods.
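The two checks on the method field can be sketched in Python as follows. The text does not define what counts as malformed, so this sketch assumes the token grammar from RFC 2616; the function name is hypothetical and the error strings are the ones quoted above.

```python
import re

MAX_METHOD_LENGTH = 127

# Assumption: a well-formed method is an RFC 2616 "token", that is, one or more
# characters other than control characters, spaces, and separators.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def check_method(method):
    """Return the error the load balancer responds with, or None to forward the request."""
    if not TOKEN.fullmatch(method):
        return "HTTP 400: BAD_REQUEST"
    if len(method) > MAX_METHOD_LENGTH:
        return "HTTP 405: METHOD_NOT_ALLOWED"
    return None
```

Under this assumption, standard and extended methods such as GET and PROPFIND pass both checks, a method containing a space is rejected as malformed, and a 128-character method is rejected for its length.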
HTTPS Support is a feature that allows you to use the SSL/TLS protocol for encrypted connections (also known as SSL offload). This feature enables traffic encryption between the clients that initiate HTTPS sessions with your load balancer and also for connections between the load balancer and your back-end instances.
To enable HTTPS support for your load balancer, you'll have to install an SSL server certificate on your load balancer. The load balancer uses the certificate to terminate and then decrypt requests before sending them to the back-end instances.
For more information, see Using HTTP/HTTPS Protocol with Elastic Load Balancing. For information about creating a load balancer that uses HTTPS, see Create a HTTPS/SSL Load Balancer.
The Proxy Protocol header helps you identify the IP address of a client when you use a load balancer configured for TCP/SSL connections. Because load balancers intercept traffic between clients and your back-end instances, the access logs from your back-end instance contain the IP address of the load balancer instead of the originating client. When Proxy Protocol is enabled, the load balancer adds a human-readable format header that contains the connection information, such as the source IP address, destination IP address, and port numbers of the client. The header is then sent to the back-end instance as a part of the request. You can parse the first line of the request to retrieve your client's IP address and the port number.
The Proxy Protocol line is a single line that ends with a carriage return and line feed ("\r\n"). It takes the following form:
PROXY_STRING + single space + INET_PROTOCOL + single space + CLIENT_IP + single space + PROXY_IP + single space + CLIENT_PORT + single space + PROXY_PORT + "\r\n"
The following is an example of the IPv4 Proxy Protocol.
PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\r\n
The Proxy Protocol line for IPv6 takes an identical form, except it begins with
TCP6 and the address is in IPv6 format.
The following is an example of the IPv6 Proxy Protocol.
PROXY TCP6 2001:DB8::21f:5bff:febf:ce22:8a2e 2001:DB8::12f:8baa:eafc:ce29:6b2e 35646 80\r\n
If the client connects with IPv6, the address of the proxy in the header is the public IPv6 address of your load balancer. This IPv6 address matches the IP address that is resolved from your load balancer's DNS name when that name carries a prefix such as dualstack. If the client connects with IPv4, the address of the proxy in the header is the private IPv4 address of the load balancer, which is not resolvable through a DNS lookup outside the EC2-Classic network.
For information about enabling the Proxy Protocol header, see Enable or Disable Proxy Protocol Support.
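A back-end application could parse the Proxy Protocol line shown above along these lines. This is a minimal Python sketch rather than a complete implementation; the ProxyHeader type and parse_proxy_line function are hypothetical names, and the sketch assumes PROXY_STRING is the literal PROXY seen in the examples.

```python
from typing import NamedTuple


class ProxyHeader(NamedTuple):
    inet_protocol: str
    client_ip: str
    proxy_ip: str
    client_port: int
    proxy_port: int


def parse_proxy_line(line):
    """Parse one Proxy Protocol line into its five fields."""
    if not line.endswith("\r\n"):
        raise ValueError("Proxy Protocol line must end with CRLF")
    fields = line[:-2].split(" ")
    if len(fields) != 6 or fields[0] != "PROXY":
        raise ValueError("not a Proxy Protocol line")
    _, inet_protocol, client_ip, proxy_ip, client_port, proxy_port = fields
    if inet_protocol not in ("TCP4", "TCP6"):
        raise ValueError("unsupported protocol: " + inet_protocol)
    return ProxyHeader(inet_protocol, client_ip, proxy_ip, int(client_port), int(proxy_port))


# The IPv4 example line from the text.
header = parse_proxy_line("PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\r\n")
```

After parsing, header.client_ip and header.client_port hold the address and port of the originating client, which is the information that the access logs would otherwise lack.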
HTTP requests and HTTP responses use header fields to send information about the HTTP message. Header fields are colon-separated name-value pairs that are separated by a carriage return (CR) and a line feed (LF). A standard set of HTTP header fields is defined in RFC 2616, Message Headers. There are also non-standard HTTP headers that are widely used by applications. Some of the non-standard HTTP headers have an X-Forwarded prefix. Elastic Load Balancing supports the following X-Forwarded headers.
The X-Forwarded-For request header helps you identify the IP address of a client when you use an HTTP/HTTPS load balancer. Because load balancers intercept traffic between clients and servers, your server access logs contain only the IP address of the load balancer. To see the IP address of the client, use the X-Forwarded-For request header. Elastic Load Balancing stores the IP address of the client in the X-Forwarded-For request header and passes the header along to your server.
The X-Forwarded-For request header carries the IP address of the client, referred to here as clientIPAddress, as its value. The client IP address can be either an IPv4 address or an IPv6 address.
If the request goes through multiple proxies, then the clientIPAddress in the X-Forwarded-For request header is followed by the IP addresses of each successive proxy that passes along the request before the request reaches your load balancer. Thus, the right-most value is the IP address of the most recent proxy (for your load balancer) and the left-most value is the IP address of the originating client. In such cases, the X-Forwarded-For request header contains a comma-separated list of IP addresses.
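On the server side, the ordering described above means the originating client is the left-most entry in the header value. The following Python sketch shows one way to read it; the function names and the sample addresses are illustrative assumptions, and the sketch assumes the entries are separated by commas.

```python
def forwarded_chain(header_value):
    """Split an X-Forwarded-For value into a list of IP address strings."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def originating_client(header_value):
    """Return the left-most address, or None if the header value is empty."""
    chain = forwarded_chain(header_value)
    return chain[0] if chain else None


# Hypothetical value: a client address followed by one intermediate proxy.
chain = forwarded_chain("203.0.113.7, 198.51.100.22")
client = originating_client("203.0.113.7, 198.51.100.22")
```

With a single address in the header, the chain has one entry and the originating client is that entry.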
The X-Forwarded-Proto request header helps you identify the protocol (HTTP or HTTPS) that a client used to connect to your server. Your server access logs contain only the protocol used between the server and the load balancer; they contain no information about the protocol used between the client and the load balancer. To determine the protocol used between the client and the load balancer, use the X-Forwarded-Proto request header. Elastic Load Balancing stores the protocol used between the client and the load balancer in the X-Forwarded-Proto request header and passes the header along to your server.
Your application or website can use the protocol stored in the
X-Forwarded-Proto request header to render a response that
redirects to the appropriate URL.
The X-Forwarded-Proto request header carries the originating protocol as its value. For a request that originated from the client as an HTTPS request, the value is https.
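An application might use the X-Forwarded-Proto value to redirect plain HTTP clients to HTTPS, roughly as in this Python sketch. The function name is hypothetical, the headers are assumed to arrive as a plain dictionary keyed by the exact header name, and redirecting when the header is absent is a design choice of the sketch rather than documented behavior.

```python
def redirect_target(headers, host, path):
    """Return an https URL to redirect to, or None if the client already used HTTPS."""
    proto = headers.get("X-Forwarded-Proto", "").lower()
    if proto == "https":
        return None
    # Any other value (or a missing header) is treated as a non-HTTPS request.
    return "https://" + host + path


target = redirect_target({"X-Forwarded-Proto": "http"}, "www.example.com", "/cart")
```

A request that reached the load balancer over HTTPS yields None, so the application can serve the response directly.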
The X-Forwarded-Port request header helps you identify the port that an HTTP/HTTPS load balancer uses to connect to the client.