Elastic Load Balancing
Developer Guide (API Version 2012-06-01)

Elastic Load Balancing Concepts

This topic introduces you to Elastic Load Balancing basics you need to understand before you create your load balancer.

Load Balancer

A load balancer is the destination to which all requests intended for your load balanced application should be directed. Each load balancer can distribute requests to multiple EC2 instances. A load balancer is represented by a DNS name and a set of ports. Load balancers can span multiple Availability Zones within an EC2 Region, but they cannot span multiple regions.

To create or work with a load balancer in a specific region, use the corresponding regional service endpoint. For information about regions and endpoints supported by Elastic Load Balancing, go to Regions and Endpoints.

Elastic Load Balancing automatically generates a DNS name for each load balancer you create. Typically, the DNS name includes the name of the AWS region in which the load balancer is created. For example, if you create a load balancer named myLB in the us-east-1 region, your load balancer might have a DNS name such as myLB-1234567890.us-east-1.elb.amazonaws.com. For information on what happens when you request a connection to your load balancer, see Architectural Overview of Elastic Load Balancing.

If you'd rather use a user-friendly domain name, such as www.example.com, instead of the load balancer DNS name, you can create a custom domain name and then associate the custom domain name with the load balancer DNS name. When a request is placed to your load balancer using the custom domain name that you created, it resolves to the load balancer DNS name.

For more information on creating and using a custom domain name for your load balancer, see Configure Custom Domain Name for Your Load Balancer.

When you create your load balancer, you must configure it to accept incoming traffic by specifying the configurations for your load balancer listeners. A listener is a process that listens for connections from incoming requests. It is configured with a protocol and a port number for front-end (load balancer) and back-end (back-end instance) connections. For more information on the ports and protocols supported by Elastic Load Balancing, see Listener Configurations for Elastic Load Balancing.

Availability Zones and Regions

You can set up Elastic Load Balancing to distribute incoming requests across EC2 instances in a single Availability Zone or multiple Availability Zones within a region. Your load balancer does not distribute traffic across regions.

For critical applications, we recommend that you distribute incoming traffic across more than one Availability Zone. To distribute traffic across multiple Availability Zones, launch your Amazon EC2 instances in all the Availability Zones you plan to use and then register the instances with your load balancer.

When you register your EC2 instances, Elastic Load Balancing provisions load balancer nodes in all the Availability Zones that have registered instances. Each load balancer node continuously monitors the health of all the registered instances and routes traffic to the healthy instances. If a load balancer node detects unhealthy or de-registered instances, it stops routing traffic to those instances. Instead, it sends requests to the remaining healthy instances.

You can always expand or shrink the availability of your instances after your initial setup. To expand the availability of your application, launch instances in an additional Availability Zone, register the new instances with your load balancer, and then add the new Availability Zone. After you've added the new Availability Zone, the load balancer begins to route traffic equally among all the enabled Availability Zones. To shrink the availability of your instances, remove an Availability Zone that was enabled for your load balancer. After you've removed the Availability Zone, the load balancer stops routing traffic to the disabled Availability Zone and continues to route traffic to the registered and healthy instances in the enabled Availability Zones.

For more information, see Add or Remove Availability Zones for Your Load Balanced Application.

Request Routing

Before a client sends a request to your load balancer, it first resolves the load balancer's domain name with the Domain Name System (DNS) servers. The DNS server uses DNS round robin to determine which load balancer node in a specific Availability Zone will receive the request.

The selected load balancer node then sends the request to healthy instances within the same Availability Zone. To choose among the healthy instances, the load balancer node uses either the round robin (for TCP connections) or the leastconns (for HTTP/HTTPS connections) routing algorithm. The leastconns routing algorithm favors back-end instances with the fewest connections or outstanding requests.
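
The two routing algorithms named above can be illustrated with a minimal sketch. This is not the load balancer's actual implementation; the helper names and the instance IDs (i-a, i-b, i-c) are hypothetical and exist only for illustration.

```python
from itertools import cycle


def round_robin(instances):
    """Return an iterator that hands out instances in a fixed rotating order."""
    return cycle(instances)


def least_connections(open_connections):
    """Return the instance ID with the fewest open connections.

    open_connections maps an instance ID to its current connection count.
    Ties go to the instance that appears first in the mapping.
    """
    return min(open_connections, key=open_connections.get)


# Round robin: each new TCP connection goes to the next instance in turn.
rotation = round_robin(["i-a", "i-b", "i-c"])
first_four = [next(rotation) for _ in range(4)]

# Least connections: the next HTTP request goes to the least busy instance.
busy = {"i-a": 5, "i-b": 2, "i-c": 7}
chosen = least_connections(busy)
```

Here first_four wraps around to the first instance after one full rotation, and chosen is the instance with only two open connections.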

By default, the load balancer node routes traffic to back-end instances within the same Availability Zone. To ensure that your back-end instances are able to handle the request load in each Availability Zone, it is important to have approximately equivalent numbers of instances in each zone. For example, if you have ten instances in Availability Zone us-east-1a and two instances in us-east-1b, the traffic will still be equally distributed between the two Availability Zones. As a result, the two instances in us-east-1b will have to serve the same amount of traffic as the ten instances in us-east-1a. As a best practice, we recommend that you keep an equivalent or nearly equivalent number of instances in each of your Availability Zones. So in the example, rather than having ten instances in us-east-1a and two in us-east-1b, you could distribute your instances so that you have six instances in each Availability Zone.
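
The effect of an uneven instance count can be worked out with a few lines of arithmetic. The following sketch assumes, as described above, that traffic is split equally between the enabled Availability Zones and then equally among the instances inside each zone; the function name is hypothetical.

```python
def per_instance_share(instances_per_zone):
    """Return, per zone, the fraction of total traffic each instance serves.

    Assumes traffic is divided equally between zones, then equally among
    the instances within each zone.
    """
    zone_share = 1 / len(instances_per_zone)
    return {zone: zone_share / count for zone, count in instances_per_zone.items()}


# The uneven layout from the example: ten instances versus two.
uneven = per_instance_share({"us-east-1a": 10, "us-east-1b": 2})

# The recommended layout: six instances in each zone.
even = per_instance_share({"us-east-1a": 6, "us-east-1b": 6})
```

In the uneven layout, each instance in us-east-1b serves 25 percent of all traffic while each instance in us-east-1a serves 5 percent, a fivefold difference. In the even layout, every instance serves the same share.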

If you want the request traffic to be routed evenly across all back-end instances, regardless of the Availability Zone that they may be in, enable cross-zone load balancing on your load balancer. Cross-zone load balancing allows each load balancer node to route requests across multiple Availability Zones, ensuring that all zones receive an equal amount of request traffic. Cross-zone load balancing reduces the need to maintain equivalent numbers of back-end instances in each zone, and improves the application's ability to handle the loss of one or more back-end instances. However, we still recommend that you maintain approximately equivalent numbers of instances in each Availability Zone for higher fault tolerance. The traffic between Elastic Load Balancing and your EC2 instances in another Availability Zone will not incur any EC2 data transfer charges.

For environments where clients cache DNS lookups, incoming requests may prefer one of the Availability Zones. Using cross-zone load balancing, this imbalance in request load will be spread across all available back-end instances in the region, reducing the impact of misbehaving clients on the application.

Routing traffic to EC2 instances in EC2-Classic

To allow communication between Elastic Load Balancing and your back-end instances launched in EC2-Classic, create a security group ingress rule that applies to all of your back-end instances. The security group rule can either allow ingress traffic from all IP addresses (the 0.0.0.0/0 CIDR range) or allow ingress traffic only from Elastic Load Balancing. To ensure that your back-end EC2 instances can receive traffic only from Elastic Load Balancing, enable network ingress for the Elastic Load Balancing security group on all of your back-end EC2 instances. For more information about configuring security groups for EC2 instances launched in EC2-Classic, see Manage Security Groups in Amazon EC2-Classic.

Routing traffic to EC2 instances in Amazon VPC

If you are planning on deploying your load balancer within Amazon Virtual Private Cloud (Amazon VPC), be sure to configure the security group rules and network ACLs to allow traffic to be routed between the subnets in your VPC. If your rules are not configured correctly, instances in other subnets may not be reachable by load balancer nodes in a different subnet. For more information on deploying a load balancer within Amazon VPC, see Elastic Load Balancing in Amazon VPC. For information on configuring security groups for your load balancer deployed in VPC, see Manage Security Groups in Amazon VPC.

For a procedure on enabling or disabling cross-zone load balancing for your load balancer, see Enable or Disable Cross-Zone Load Balancing for Your Load Balancer.

Configuring EC2 Instances for Load Balancing

After you've created your load balancer, you have to register your EC2 instances with the load balancer. Your EC2 instances can be within a single Availability Zone or span multiple Availability Zones within a region. Elastic Load Balancing routinely performs health checks on all the registered EC2 instances and automatically distributes all incoming requests to the DNS name of your load balancer across your registered, healthy EC2 instances. For more information on the health check of your EC2 instances, see Health Check.

Make sure to install a web server, such as Apache or Internet Information Services (IIS), on all the EC2 instances you plan to register with your load balancer.

Stop and Start EC2 Instances

The instances are registered with the load balancer using the IP addresses associated with the instances. When an instance is stopped and then started, the IP address associated with your instance changes. This prevents the load balancer from routing traffic to your restarted instance. When you stop and then start your registered EC2 instances, we recommend that you de-register your stopped instance from your load balancer, and then register the restarted instance. Failure to do so may prevent the load balancer from performing health checks and routing the traffic to the restarted instance. For procedures associated with de-registering and then registering your instances with your load balancer, see Deregister and Register Amazon EC2 Instances.

Setting Keepalive On Your EC2 Instances

For HTTP and HTTPS listeners, we recommend that you enable the keep-alive option in your EC2 instances. You can do this in your web server settings, in the kernel settings for your back-end instance, or both. The keep-alive option allows the load balancer to re-use connections to your back-end instance for multiple client requests, which reduces the load on your web server and improves the throughput of the load balancer. The keep-alive timeout should be at least 60 seconds to ensure that the load balancer is responsible for closing the connection to your instance.

Health Check

To discover the availability of your EC2 instances, the load balancer periodically sends pings, attempts connections, or sends requests to test the EC2 instances. These tests are called health checks. Each registered EC2 instance must respond to the target of the health check with an HTTP status code 200 to be considered healthy by your load balancer. If the response includes a body, then your application must either set the Content-Length header to a value greater than or equal to zero, or specify Transfer-Encoding with a value set to 'chunked'.

Your load balancer ensures that traffic is routed only to the healthy instances. When the load balancer receives any HTTP status code other than 200, it stops routing traffic to that instance. It resumes routing traffic when the instance has been restored to a healthy state.

Elastic Load Balancing performs health checks on all your registered instances using the configuration that you provide, regardless of whether the instance is in a healthy or unhealthy state.

Your load balancer performs health checks on your instances using the protocol, port, URL, timeout, and interval specified when you configured your load balancer. For example, you can configure a health check for your instances as follows: the load balancer sends a request to http://node IP address:80/index.html every 5 seconds and allows 3 seconds for the web server to respond. If the load balancer does not get a response after 2 attempts, it takes the node out of service. If the load balancer gets 2 successful responses, it puts the node back in service. Instances that are in service at the time of the health check are marked healthy, and instances that are out of service at the time of the health check are marked unhealthy.
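
The threshold behavior in this example can be sketched as a small state machine. This is an illustration only, not the service's implementation. It assumes that the 2 failed attempts and the 2 successful responses must be consecutive, and that an instance starts in service; the class and attribute names are hypothetical.

```python
class HealthTracker:
    """Track one instance's state from a stream of health check results."""

    def __init__(self, unhealthy_threshold=2, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.in_service = True  # assumption: instances start in service
        self.streak = 0  # consecutive results that contradict the current state

    def record(self, success):
        """Record one check result; return True if the instance is in service."""
        if success == self.in_service:
            # The result agrees with the current state, so reset the streak.
            self.streak = 0
            return self.in_service
        self.streak += 1
        threshold = self.healthy_threshold if success else self.unhealthy_threshold
        if self.streak >= threshold:
            # Enough contradicting results in a row: flip the state.
            self.in_service = success
            self.streak = 0
        return self.in_service


tracker = HealthTracker()
# One success, two failures, then two successes.
states = [tracker.record(result) for result in (True, False, False, True, True)]
```

With these inputs the instance stays in service after the first failure, goes out of service after the second, and returns to service only after the second consecutive success.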

For information on configuring a health check for the EC2 instances registered with your load balancer, see Configure Health Check for Your Amazon EC2 Instances.

Your registered instances can fail the health check for several reasons. The most common reasons for failing a health check are where EC2 instances close connections to your load balancer or where the response from the EC2 instances times out. For information on potential causes and steps you can take to resolve failed health check issues, see Troubleshooting Elastic Load Balancing: Health Check Configuration.

Connection Draining

Connection draining causes the ELB load balancer to stop sending new requests to a deregistering instance or an unhealthy instance, while keeping the existing connections open. This allows the load balancer to complete in-flight requests made to the deregistering or unhealthy instances.

Connection draining is a load balancer attribute and applies to all listeners for the load balancer. You can enable or disable connection draining for your load balancer at any time. You can check if connection draining is enabled for your load balancer by using the DescribeLoadBalancerAttributes action, the elb-describe-lb-attributes command, or by using the AWS Management Console and clicking the Instances tab in the bottom pane of the selected load balancer.

When you enable connection draining for your load balancer, you can set a maximum time for the load balancer to continue serving in-flight requests to the deregistering instance before the load balancer closes the connection. The load balancer forcibly closes connections to the deregistering instance when the maximum time limit is reached.

While the in-flight requests are being served, the load balancer reports the instance state of the deregistering instance as InService: Instance deregistration currently in progress. The load balancer reports the instance state as OutOfService: Instance is not currently registered with the LoadBalancer when the deregistering instance has completed serving all in-flight requests or when the maximum timeout limit is reached, whichever comes first.

When an instance becomes unhealthy, the load balancer reports the instance state as OutOfService. If there are in-flight requests made to the unhealthy instance, they are completed. The maximum timeout limit does not apply to the connections to the unhealthy instance.

You can check the instance state of a deregistering instance or an unhealthy instance by using the DescribeInstanceHealth action, or the elb-describe-instance-health command.

If your instances are part of an Auto Scaling group and if connection draining is enabled for your load balancer, Auto Scaling will wait for the in-flight requests to complete or for the maximum timeout to expire, whichever comes first, before terminating instances due to a scaling event or health check replacement. For information about using Elastic Load Balancing with Auto Scaling, see Use Elastic Load Balancing to Load Balance Your Auto Scaling Group.

For tutorials on how to enable or disable the connection draining attribute for your load balancer, see Enable or Disable Connection Draining for Your Load Balancer.

Idle Connection Timeout

For each request a client makes through a load balancer, the load balancer maintains two connections. One connection is with the client and the other connection is to the back-end instance. For each connection, the load balancer manages an idle timeout that is triggered when no data is sent over the connection for a specified time period. After this time period has elapsed, if no data has been sent or received, the load balancer closes the connection.

By default, Elastic Load Balancing maintains a 60-second idle timeout setting for the connections to the client and the back-end instance. You can change the idle timeout setting for your load balancer at any time.

If you use HTTP and HTTPS listeners, we recommend that you enable the keep-alive option for your EC2 instances. You can enable keep-alive in your web server settings and/or in the kernel settings for your EC2 instance. Keep-alive, when enabled, allows the load balancer to re-use connections to your back-end instance, which reduces the CPU utilization on your instances. To ensure that the load balancer is responsible for closing the connections to your back-end instance, make sure that the value set on your instance for the keep-alive time is greater than the idle timeout setting on your load balancer.
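
The relationship between the two timeouts can be expressed as a simple check. This is a minimal sketch with a hypothetical function name; it does not read any real server or load balancer configuration, and the 60-second default mirrors the default idle timeout stated above.

```python
def keep_alive_is_safe(instance_keep_alive_s, lb_idle_timeout_s=60):
    """Return True if the instance keeps connections open longer than the
    load balancer's idle timeout, so the load balancer closes them first."""
    return instance_keep_alive_s > lb_idle_timeout_s


ok = keep_alive_is_safe(75)         # instance waits longer than the load balancer
too_short = keep_alive_is_safe(30)  # instance would close the connection first
```

If you raise the idle timeout on the load balancer, pass the new value as lb_idle_timeout_s and raise the keep-alive time on the instance accordingly.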

For a tutorial on configuring the idle timeout settings for your load balancer, see Configure Idle Connection Timeout.

Sticky Sessions

By default, a load balancer routes each request independently to the application instance with the smallest load. However, you can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user's session to a specific application instance. This ensures that all requests coming from the user during the session will be sent to the same application instance.

The key to managing the sticky session is determining how long your load balancer should consistently route the user's request to the same application instance. If your application has its own session cookie, then you can set Elastic Load Balancing to create the session cookie to follow the duration specified by the application's session cookie. If your application does not have its own session cookie, then you can set Elastic Load Balancing to create a session cookie by specifying your own stickiness duration. You can associate a stickiness duration only with HTTP/HTTPS load balancer listeners.

An application instance must always receive and send two cookies: a cookie that defines the stickiness duration and a special Elastic Load Balancing cookie named AWSELB that has the mapping to the application instance.

HTTP Methods

The HTTP method (also called the verb) specifies the action to be performed on the resource receiving an HTTP request. The standard methods for HTTP requests are defined in RFC 2616, Hypertext Transfer Protocol-HTTP/1.1. Standard methods include GET, POST, PUT, HEAD, and OPTIONS. Some web applications require (and sometimes also introduce) new methods that are extensions of HTTP/1.1 methods. These HTTP extensions can be non-standard. Some common examples of HTTP extended methods include (but are not limited to) PATCH, REPORT, MKCOL, PROPFIND, MOVE, and LOCK. Elastic Load Balancing accepts all standard and non-standard HTTP methods.

When a load balancer receives an HTTP request, it performs checks for malformed requests and for the length of the method. The total method length in an HTTP request to a load balancer must not exceed 127 characters. If the HTTP request passes both checks, the load balancer sends the request to the back-end EC2 instance. If the method field in the request is malformed, the load balancer responds with an HTTP 400: BAD_REQUEST error message. If the length of the method in the request exceeds 127 characters, the load balancer responds with an HTTP 405: METHOD_NOT_ALLOWED error message.
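
The two checks can be sketched as follows. This is an approximation, not the load balancer's actual logic. It assumes that a well-formed method is an RFC 2616 token and that the malformed check runs before the length check; neither detail is stated in this guide. The function name is hypothetical.

```python
import re

MAX_METHOD_LENGTH = 127

# Assumption: "well-formed" means an RFC 2616 token, that is, one or more
# characters that are neither control characters nor separators.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def check_method(method):
    """Return the HTTP error status for a bad method, or None if it passes."""
    if not TOKEN.fullmatch(method):
        return 400  # BAD_REQUEST: the method field is malformed
    if len(method) > MAX_METHOD_LENGTH:
        return 405  # METHOD_NOT_ALLOWED: the method is too long
    return None  # passes both checks; forward to the back-end instance


standard = check_method("GET")
extended = check_method("PROPFIND")
malformed = check_method("GE T")
too_long = check_method("X" * 128)
```

Standard and extended methods both pass, because the checks look only at the form and length of the method, not at its name.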

The back-end EC2 instance processes a valid request by implementing the method contained in the request and then sending a response back to the client. Your back-end instance must be configured to handle both supported and unsupported methods.

HTTPS Support

HTTPS Support is a feature that allows you to use the SSL/TLS protocol for encrypted connections (also known as SSL offload). This feature enables traffic encryption between the clients that initiate HTTPS sessions with your load balancer and also for connections between the load balancer and your back-end instances.

To enable HTTPS support for your load balancer, you'll have to install an SSL server certificate on your load balancer. The load balancer uses the certificate to terminate and then decrypt requests before sending them to the back-end instances.

For more information, see Using HTTP/HTTPS Protocol with Elastic Load Balancing. For information about creating a load balancer that uses HTTPS, see Create a HTTPS/SSL Load Balancer.

Proxy Protocol

The Proxy Protocol header helps you identify the IP address of a client when you use a load balancer configured for TCP/SSL connections. Because load balancers intercept traffic between clients and your back-end instances, the access logs from your back-end instance contain the IP address of the load balancer instead of the originating client. When Proxy Protocol is enabled, the load balancer adds a human-readable format header that contains the connection information, such as the source IP address, destination IP address, and port numbers of the client. The header is then sent to the back-end instance as a part of the request. You can parse the first line of the request to retrieve your client's IP address and the port number.

The Proxy Protocol line is a single line that ends with a carriage return and line feed ("\r\n"). It takes the following form:

PROXY_STRING + single space + INET_PROTOCOL + single space + CLIENT_IP + single space + PROXY_IP + single space + CLIENT_PORT + single space + PROXY_PORT + "\r\n"

The following is an example of the IPv4 Proxy Protocol.

PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\r\n

The Proxy Protocol line for IPv6 takes an identical form, except that the protocol field is TCP6 and the addresses are in IPv6 format.

The following is an example of the IPv6 Proxy Protocol.

PROXY TCP6 2001:DB8::21f:5bff:febf:ce22:8a2e 2001:DB8::12f:8baa:eafc:ce29:6b2e 35646 80\r\n

If the client connects with IPv6, the address of the proxy in the header will be the public IPv6 address of your load balancer. This IPv6 address matches the IP address that is resolved from your load balancer's DNS name that is prefixed with either ipv6 or dualstack. If the client connects with IPv4, the address of the proxy in the header will be the private IPv4 address of the load balancer and will therefore not be resolvable through a DNS lookup outside the Amazon Elastic Compute Cloud (Amazon EC2) network.
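
A back-end application might parse the Proxy Protocol line along the following lines. This is a minimal sketch based only on the single-space-separated form described above; the ProxyInfo type and the parse_proxy_line function are hypothetical names, and the sketch does not validate the addresses themselves.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    protocol: str
    client_ip: str
    proxy_ip: str
    client_port: int
    proxy_port: int


def parse_proxy_line(line):
    """Parse one Proxy Protocol line, including its trailing CRLF."""
    if not line.endswith("\r\n"):
        raise ValueError("Proxy Protocol line must end with CRLF")
    fields = line[:-2].split(" ")
    if len(fields) != 6 or fields[0] != "PROXY":
        raise ValueError("unexpected Proxy Protocol line")
    _, protocol, client_ip, proxy_ip, client_port, proxy_port = fields
    if protocol not in ("TCP4", "TCP6"):
        raise ValueError("unexpected protocol field: " + protocol)
    return ProxyInfo(protocol, client_ip, proxy_ip, int(client_port), int(proxy_port))


info = parse_proxy_line("PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\r\n")
```

After parsing, info.client_ip holds the originating client's address and info.client_port holds its port, which you can write to your access logs in place of the load balancer's address.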

For information about enabling the Proxy Protocol header, see Enable or Disable Proxy Protocol Support.

X-Forwarded Headers

HTTP requests and HTTP responses use header fields to send information about the HTTP message. Header fields are colon-separated name-value pairs that are separated by a carriage return (CR) and a line feed (LF). A standard set of HTTP header fields is defined in RFC 2616, Message Headers. There are also non-standard HTTP headers that are widely used by applications. Some of the non-standard HTTP headers have an X-Forwarded prefix. Elastic Load Balancing supports the following X-Forwarded headers:

X-Forwarded-For

The X-Forwarded-For request header helps you identify the IP address of a client when you use an HTTP/HTTPS load balancer. Because load balancers intercept traffic between clients and servers, your server access logs contain only the IP address of the load balancer. To see the IP address of the client, use the X-Forwarded-For request header. Elastic Load Balancing stores the IP address of the client in the X-Forwarded-For request header and passes the header along to your server.

The X-Forwarded-For request header takes the following form:

X-Forwarded-For: clientIPAddress

The following example is an X-Forwarded-For request header for a client with an IP address of 203.0.113.7.

X-Forwarded-For: 203.0.113.7

The following example is an X-Forwarded-For request header for a client with an IPv6 address of 2001:DB8::21f:5bff:febf:ce22:8a2e.

X-Forwarded-For: 2001:DB8::21f:5bff:febf:ce22:8a2e

If the request goes through multiple proxies, then the clientIPAddress in the X-Forwarded-For request header is followed by IP addresses of each successive proxy that passes along the request before the request reaches your load balancer. Thus, the right-most value is the IP address of the most recent proxy (for your load balancer) and the left-most value is the IP address of the originating client. In such cases, the X-Forwarded-For request header takes the following form:

X-Forwarded-For: OriginatingClientIPAddress, proxy1-IPAddress, proxy2-IPAddress
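
A server might split the header value into its individual addresses as sketched below. The function name and the sample addresses are hypothetical. Keep in mind that the left-most value is supplied by whoever sent the request, so treat it as informational rather than as a verified identity.

```python
def parse_x_forwarded_for(header_value):
    """Split an X-Forwarded-For value into a list of address strings,
    ordered from the originating client to the most recent proxy."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


chain = parse_x_forwarded_for("203.0.113.7, 198.51.100.22, 198.51.100.23")
client_ip = chain[0]    # left-most value: the originating client
last_proxy = chain[-1]  # right-most value: the most recent proxy
```

For a request that passed through no other proxies, the list contains a single entry and client_ip and last_proxy are the same value.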

X-Forwarded-Proto

The X-Forwarded-Proto request header helps you identify the protocol (HTTP or HTTPS) that a client used to connect to your server. Your server access logs contain only the protocol used between the server and the load balancer; they contain no information about the protocol used between the client and the load balancer. To determine the protocol used between the client and the load balancer, use the X-Forwarded-Proto request header. Elastic Load Balancing stores the protocol used between the client and the load balancer in the X-Forwarded-Proto request header and passes the header along to your server.

Your application or website can use the protocol stored in the X-Forwarded-Proto request header to render a response that redirects to the appropriate URL.

The X-Forwarded-Proto request header takes the following form:

X-Forwarded-Proto: originatingProtocol

The following example contains an X-Forwarded-Proto request header for a request that originated from the client as an HTTPS request:

X-Forwarded-Proto: https
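
One common use of this header is redirecting plain HTTP requests to HTTPS. The following sketch shows the decision only. It assumes the request headers are available as a plain dictionary keyed by the exact header name, which depends on your web framework; the function name and the host and path values are hypothetical.

```python
def https_redirect_target(headers, host, path):
    """Return the HTTPS URL to redirect to, or None if no redirect is needed."""
    proto = headers.get("X-Forwarded-Proto", "").lower()
    if proto == "http":
        # The client reached the load balancer over plain HTTP.
        return "https://" + host + path
    return None


redirect = https_redirect_target({"X-Forwarded-Proto": "http"}, "www.example.com", "/login")
no_redirect = https_redirect_target({"X-Forwarded-Proto": "https"}, "www.example.com", "/login")
```

When the header is missing, the sketch returns None and leaves the request alone, because it cannot tell which protocol the client used.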

X-Forwarded-Port

The X-Forwarded-Port request header helps you identify the port an HTTP/HTTPS load balancer uses to connect to the client.