AWS Elastic Beanstalk
Developer Guide (API Version 2010-12-01)

Worker Environments

When you launch an Elastic Beanstalk environment, you choose an environment tier, platform, and environment type. The environment tier that you choose determines whether Elastic Beanstalk provisions resources to support a web application that handles HTTP(S) requests or a web application that handles background-processing tasks. An environment whose web application processes web requests is known as a web server environment. An environment whose web application runs background jobs is known as a worker environment.

You can deploy a worker environment on its own to perform background-processing tasks for any AWS service that can write to an Amazon Simple Queue Service queue (for example, Amazon EC2 or AWS OpsWorks). Or you can deploy it alongside an Elastic Beanstalk web server tier. You can use a worker environment to execute long-running tasks or tasks that can be performed asynchronously. By offloading background-processing tasks to a worker environment, you free up the web application in your web server environment to handle web requests.

Worker environments run a daemon process provided by Elastic Beanstalk. This daemon is updated regularly to add features and fix bugs. To get the latest version of the daemon, update to the latest platform version.

Feature                      Release Date         Description
Enhanced Health Reporting    August 11, 2015      Monitor environment health with more detail and accuracy.
Periodic Tasks               February 17, 2015    Run cron jobs that you configure in a cron.yaml file in your application source code.
Dead Letter Queues           May 27, 2014         Send failed jobs to a dead letter queue for troubleshooting. Changed the default Visibility timeout from 30 seconds to 300 seconds.

How a Worker Environment Works

Elastic Beanstalk installs a daemon on each Amazon EC2 instance in the Auto Scaling group to process Amazon SQS messages in the worker environment. The daemon pulls data off the Amazon SQS queue, inserts it into the message body of an HTTP POST request, and sends it to a user-configurable URL path on the local host. The content type for the message body within an HTTP POST request is application/json by default.

Important

We strongly recommend that you familiarize yourself with how Amazon SQS works if you plan to deploy a worker environment. In particular, the properties of Amazon SQS queues (message order, at-least-once delivery, and message sampling) can affect how you design a web application for a worker environment. For more information, see Properties of Distributed Queues in the Amazon Simple Queue Service Developer Guide.
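
Because at-least-once delivery means the same message can reach your application more than once, one common design is to make job handling idempotent. The following is a minimal sketch of that idea, not part of Elastic Beanstalk itself. The IdempotentProcessor class and its in-memory set are hypothetical; a real application would likely record processed IDs in durable storage shared by all instances.

```python
class IdempotentProcessor:
    """Runs a job at most once per message ID (in-memory sketch only)."""

    def __init__(self, job):
        self._job = job
        self._seen = set()  # IDs already processed; assumption: not persisted

    def handle(self, message_id, body):
        """Return True if the job ran, False if the message was a duplicate."""
        if message_id in self._seen:
            return False
        self._job(body)
        self._seen.add(message_id)  # record the ID only after the job succeeds
        return True


processed = []
processor = IdempotentProcessor(processed.append)
processor.handle("msg-1", "resize image 42")
processor.handle("msg-1", "resize image 42")  # redelivered copy is ignored
```

The message ID used here would come from the X-Aws-Sqsd-Msgid header that the daemon sets on each request.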

The following diagram illustrates an example of the worker environment processing an Amazon SQS message.

[Diagram: Elastic Beanstalk worker environment Amazon SQS message processing]

The daemon sets the following HTTP headers:

Note

HTTP header names are not case-sensitive. For more information, see 4.2 Message Headers in the Hypertext Transfer Protocol -- HTTP/1.1 specification.

HTTP Headers

Name                                     Value
User-Agent                               aws-sqsd, aws-sqsd/1.1
X-Aws-Sqsd-Msgid                         SQS message ID, used to detect message storms
X-Aws-Sqsd-Queue                         Name of the SQS queue
X-Aws-Sqsd-First-Received-At             Time stamp showing when the message was first received (in the UTC time zone). The time stamp is conveyed using the ISO 8601 time format. For more information, go to http://www.w3.org/TR/NOTE-datetime.
X-Aws-Sqsd-Receive-Count                 SQS message receive count
Content-Type                             MIME type configuration; by default, application/json
X-Aws-Sqsd-Taskname                      Name of the periodic task
X-Aws-Sqsd-Attr-message-attribute-name   Custom message attributes assigned to the message being processed. The message-attribute-name is the actual message attribute name. All string and number message attributes are added to the header. Binary attributes are discarded and not included in the header.
X-Aws-Sqsd-Scheduled-At                  Time at which the periodic task was scheduled
X-Aws-Sqsd-Sender-Id                     AWS account number of the sender of the message
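
To show how an application might read these headers, the following sketch collects them into a small record. It is an illustration only: the SqsdMetadata class and parse_sqsd_headers function are hypothetical names, and the input is assumed to be a plain mapping of header names to values, as most web frameworks can provide.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional

_ATTR_PREFIX = "x-aws-sqsd-attr-"


@dataclass
class SqsdMetadata:
    message_id: Optional[str]
    queue: Optional[str]
    receive_count: Optional[int]
    task_name: Optional[str]  # present only for periodic tasks
    attributes: Dict[str, str] = field(default_factory=dict)


def parse_sqsd_headers(headers: Mapping[str, str]) -> SqsdMetadata:
    """Extract daemon-provided metadata from request headers."""
    # Header names are not case-sensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    count = lowered.get("x-aws-sqsd-receive-count")
    return SqsdMetadata(
        message_id=lowered.get("x-aws-sqsd-msgid"),
        queue=lowered.get("x-aws-sqsd-queue"),
        receive_count=int(count) if count is not None else None,
        task_name=lowered.get("x-aws-sqsd-taskname"),
        # Custom attributes keep the part after the prefix (lowercased here).
        attributes={
            name[len(_ATTR_PREFIX):]: value
            for name, value in lowered.items()
            if name.startswith(_ATTR_PREFIX)
        },
    )
```

Note that this sketch lowercases custom attribute names as a side effect of the case-insensitive lookup; keep the original casing if your application depends on it.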

The daemon sends the requests to the HTTP path that you configure. To the web application in the worker environment, each request appears to originate from the daemon. In this way, the daemon serves a role similar to that of a load balancer in a web server environment.

The daemon, after reading messages from the queue, forwards them over the local loopback to a web application at a URL that you designate. This URL is accessible only from the local host. Because the URL can be reached only from the same EC2 instance, no authentication is needed to validate the messages that are delivered to it.

A web application in a worker environment should only listen on the local host. When the web application in the worker environment returns a 200 OK response to acknowledge that it has received and successfully processed the request, the daemon sends a DeleteMessage call to the SQS queue so that the message will be deleted from the queue. (SQS automatically deletes messages that have been in a queue for longer than the configured RetentionPeriod.) If the application returns any response other than 200 OK, then Elastic Beanstalk waits to put the message back in the queue after the configured VisibilityTimeout period. If there is no response, then Elastic Beanstalk waits to put the message back in the queue after the InactivityTimeout period so that the message is available for another attempt at processing.
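
The request and response cycle described above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the way any particular Elastic Beanstalk platform runs your code: the process_job function, its "task" key, and port 8080 are all hypothetical, and a production application would normally use a proper web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def process_job(payload):
    """Hypothetical job logic: reject payloads without a 'task' key."""
    if "task" not in payload:
        raise ValueError("missing task")


class WorkerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The daemon places the SQS message in the POST body.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            process_job(json.loads(body))
            status = 200  # success: the daemon deletes the message
        except Exception:
            status = 500  # any non-200 response leads to a retry
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=8080):
    # Listen on the loopback address only (port is an assumption).
    return HTTPServer(("127.0.0.1", port), WorkerHandler)
```

To try the sketch locally, call make_server().serve_forever() and send a POST request with a JSON body to http://127.0.0.1:8080/.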

Dead Letter Queues

Elastic Beanstalk worker environments support Amazon Simple Queue Service (SQS) dead letter queues. A dead letter queue is a queue where other (source) queues can send messages that for some reason could not be successfully processed. A primary benefit of using a dead letter queue is the ability to sideline and isolate the unsuccessfully processed messages. You can then analyze any messages sent to the dead letter queue to try to determine why they were not successfully processed.

A dead letter queue is enabled by default for a worker environment if you specify an autogenerated Amazon SQS queue at the time you create your worker environment tier. If you choose an existing SQS queue for your worker environment, you must use SQS to configure a dead letter queue independently. For information about how to use SQS to configure a dead letter queue, see Using Amazon SQS Dead Letter Queues.

You cannot disable dead letter queues. Messages that cannot be delivered will always eventually be sent to a dead letter queue. You can, however, effectively disable this feature by setting the MaxRetries option to the maximum valid value of 1000.

Note

The Elastic Beanstalk MaxRetries option is equivalent to the SQS MaxReceiveCount option. If your worker environment does not use an autogenerated SQS queue, use the MaxReceiveCount option in SQS to effectively disable your dead letter queue. For more information, see Using Amazon SQS Dead Letter Queues.
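
For an existing queue, the dead letter queue is configured in Amazon SQS through the queue's RedrivePolicy attribute. The following sketch builds that attribute and, optionally, applies it with the AWS SDK for Python (boto3). The function names and the example ARN are hypothetical, and the boto3 call assumes that the SDK is installed and credentials are configured.

```python
import json


def build_redrive_attributes(dead_letter_queue_arn, max_receive_count):
    """Return SQS queue attributes that route failed messages to a DLQ."""
    if not 1 <= max_receive_count <= 1000:
        raise ValueError("max_receive_count must be between 1 and 1000")
    policy = {
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy as a JSON string, not a nested mapping.
    return {"RedrivePolicy": json.dumps(policy)}


def apply_redrive_policy(queue_url, dead_letter_queue_arn, max_receive_count):
    """Apply the policy to an existing queue (requires boto3)."""
    import boto3  # third-party SDK, imported lazily so the sketch loads without it

    sqs = boto3.client("sqs")
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=build_redrive_attributes(dead_letter_queue_arn, max_receive_count),
    )
```

Passing 1000 as max_receive_count corresponds to the "effectively disabled" setting described above.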

For more information about the lifecycle of an SQS message, go to Message Lifecycle.

Periodic Tasks

Elastic Beanstalk worker environments can perform periodic tasks in addition to processing messages from the Amazon SQS queue. However, Amazon SQS processes messages in the order that it receives them. As a result, a periodic task can be delayed when the Amazon SQS queue has many messages to process. We recommend that you design separate applications when punctuality for periodic tasks is critical.

To invoke periodic tasks, your application source bundle must include a cron.yaml file at the root level. The file must contain information about the periodic tasks you want to schedule. Specify this information using standard crontab syntax. For more information, see CRON expression.

The following snippet contains example file contents for the cron.yaml file. This file instructs a specified EC2 instance to run the backup-job job every 12 hours and the audit job every day at 11pm in UTC. (11pm is represented as hour 23 of the day.) Each job name must be unique within the file. The url is appended to the application URL.

version: 1
cron:
 - name: "backup-job"          # required - unique across all entries in this file
   url: "/backup"              # required - does not need to be unique
   schedule: "0 */12 * * *"    # required - does not need to be unique
 - name: "audit"
   url: "/audit"
   schedule: "0 23 * * *"
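
Before deploying, it can help to check the rules stated above (required keys and unique names). The following sketch does so for a cron.yaml file that has already been parsed into a Python mapping, for example with a YAML library. The validate_cron_config function is hypothetical, and the five-field check assumes standard crontab syntax.

```python
def validate_cron_config(config):
    """Return a list of problems found in a parsed cron.yaml mapping."""
    problems = []
    seen_names = set()
    for index, entry in enumerate(config.get("cron", [])):
        for key in ("name", "url", "schedule"):
            if not entry.get(key):
                problems.append(f"entry {index}: missing required key '{key}'")
        name = entry.get("name")
        if name in seen_names:
            problems.append(f"entry {index}: duplicate name '{name}'")
        elif name:
            seen_names.add(name)
        schedule = entry.get("schedule", "")
        # Assumption: standard crontab syntax has exactly five fields.
        if schedule and len(schedule.split()) != 5:
            problems.append(f"entry {index}: schedule should have five fields")
    return problems


example = {
    "version": 1,
    "cron": [
        {"name": "backup-job", "url": "/backup", "schedule": "0 */12 * * *"},
        {"name": "audit", "url": "/audit", "schedule": "0 23 * * *"},
    ],
}
problems = validate_cron_config(example)  # empty list for the file shown above
```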

Use Amazon CloudWatch for Auto Scaling in Worker Environment Tiers

Together, Auto Scaling and CloudWatch monitor the CPU utilization of the running instances in the worker environment. How you configure the Auto Scaling limit for CPU capacity determines how many instances the Auto Scaling group runs to appropriately manage the throughput of messages in the SQS queue. Each EC2 instance publishes its CPU utilization metrics to CloudWatch. Auto Scaling retrieves from CloudWatch the average CPU usage across all instances in the worker environment. You configure the upper and lower thresholds as well as how many instances to add or terminate according to CPU capacity. When Auto Scaling detects that you have reached the specified upper threshold on CPU capacity, Elastic Beanstalk creates new instances in the worker environment. The instances are terminated when the CPU load drops back below the threshold.

Note

Messages that have not been processed at the time an instance is terminated are once again made visible on the queue where they can be processed by another daemon on an instance that is still running.
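
The threshold behavior described in this section can be pictured as a simple decision function. This is only a conceptual sketch: Auto Scaling makes the actual decision from CloudWatch alarms, and the scaling_adjustment function, its parameters, and the example thresholds are hypothetical.

```python
def scaling_adjustment(average_cpu, lower_threshold, upper_threshold, step=1):
    """Return the change in instance count for a given average CPU percentage."""
    if lower_threshold >= upper_threshold:
        raise ValueError("lower_threshold must be below upper_threshold")
    if average_cpu > upper_threshold:
        return step   # add instances when CPU exceeds the upper threshold
    if average_cpu < lower_threshold:
        return -step  # terminate instances when CPU falls below the lower one
    return 0          # within the band: leave the group unchanged


change = scaling_adjustment(average_cpu=85.0, lower_threshold=20.0, upper_threshold=70.0)
```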

You can also set other CloudWatch alarms, as needed, by using the AWS Management Console, CLI, or the options file. For more information, go to Using Elastic Beanstalk with Amazon CloudWatch and Use Auto Scaling Policies and Amazon CloudWatch Alarms for Dynamic Scaling.

Creating a Worker Environment

When you create an Elastic Beanstalk environment or update an existing environment, whether through the AWS Management Console, CreateEnvironment API, UpdateEnvironment API, the EB CLI, or the AWS command line, you specify whether you want a Web Server or Worker environment. You cannot have one environment that is both a web server environment and a worker environment because Elastic Beanstalk supports only one Auto Scaling group per environment. By default, Elastic Beanstalk launches a web server environment. You cannot change the environment tier after you launch an environment. If your web application needs a different kind of environment tier, you must launch a new environment.

Note

The CreateEnvironment and UpdateEnvironment APIs have an attribute called tier. (The DescribeEnvironments API has a tier parameter as part of its response and will omit some parameters from its response if the tier it describes is a worker environment. The DescribeEnvironmentResources API has an attribute called EnvironmentResources.)
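
As an illustration of the tier attribute, the following sketch builds CreateEnvironment parameters for a worker environment and shows how they might be passed to the AWS SDK for Python (boto3). The function names and the argument values you pass are hypothetical; the boto3 call assumes that the SDK is installed and credentials are configured.

```python
def worker_environment_request(application_name, environment_name, solution_stack_name):
    """Build CreateEnvironment parameters that select the worker tier."""
    return {
        "ApplicationName": application_name,
        "EnvironmentName": environment_name,
        "SolutionStackName": solution_stack_name,
        # The tier cannot be changed after the environment is launched.
        "Tier": {"Name": "Worker", "Type": "SQS/HTTP"},
    }


def create_worker_environment(application_name, environment_name, solution_stack_name):
    """Send the request (requires boto3)."""
    import boto3  # third-party SDK, imported lazily so the sketch loads without it

    client = boto3.client("elasticbeanstalk")
    return client.create_environment(
        **worker_environment_request(application_name, environment_name, solution_stack_name)
    )
```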

If you use an existing Amazon SQS queue, the settings that you configure when you create a worker environment can conflict with settings you configured directly in Amazon SQS. For example, if you configure a worker environment with a RetentionPeriod value that is higher than the MessageRetentionPeriod value you set in Amazon SQS, then Amazon SQS will delete the message when it exceeds the MessageRetentionPeriod. Conversely, if the RetentionPeriod value you configure in the worker environment settings is lower than the MessageRetentionPeriod value you set in Amazon SQS, then the daemon will delete the message before Amazon SQS can. For VisibilityTimeout, the value that you configure for the daemon in the worker environment settings overrides the Amazon SQS VisibilityTimeout setting. Ensure that messages are deleted appropriately by comparing your Elastic Beanstalk settings to your Amazon SQS settings.
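
The retention rule above amounts to "whichever period is shorter wins." A small sketch of that comparison follows; the effective_retention function is hypothetical, and how a tie is attributed is an arbitrary choice here because the text above does not cover that case.

```python
def effective_retention(worker_retention_seconds, sqs_retention_seconds):
    """Return (seconds, deleter) for the retention setting that takes effect first."""
    if worker_retention_seconds < sqs_retention_seconds:
        # The daemon's RetentionPeriod expires first.
        return worker_retention_seconds, "daemon"
    # The queue's MessageRetentionPeriod expires first (ties attributed to SQS).
    return sqs_retention_seconds, "Amazon SQS"


seconds, deleter = effective_retention(worker_retention_seconds=345600,
                                       sqs_retention_seconds=86400)
```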

If you don't specify an existing Amazon SQS queue when you configure a worker environment tier, Elastic Beanstalk will create one for you. You can get the URL by calling DescribeEnvironmentResources.
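
The following sketch shows one way to read the queue URL from a DescribeEnvironmentResources response using the AWS SDK for Python (boto3). The function names are hypothetical, and the queue names that appear in a real response depend on your environment; the boto3 call assumes that the SDK is installed and credentials are configured.

```python
def queue_urls(response):
    """Map queue names to URLs from a DescribeEnvironmentResources response."""
    queues = response.get("EnvironmentResources", {}).get("Queues", [])
    return {queue["Name"]: queue["URL"] for queue in queues}


def fetch_queue_urls(environment_name):
    """Call the API and extract the queue URLs (requires boto3)."""
    import boto3  # third-party SDK, imported lazily so the sketch loads without it

    client = boto3.client("elasticbeanstalk")
    response = client.describe_environment_resources(EnvironmentName=environment_name)
    return queue_urls(response)
```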

For procedures to launch an environment, go to Creating an AWS Elastic Beanstalk Environment.

Configuring Worker Environments with Elastic Beanstalk

As noted earlier, Elastic Beanstalk installs a daemon on each Amazon EC2 instance in the Auto Scaling group. After the worker environment is created, you can control how that daemon processes Amazon SQS messages. For example, you can configure additional settings such as the retention period during which a message is valid or the visibility timeout period during which a message is not visible in the Amazon SQS queue because it is locked for processing.

AWS Management Console

You can manage a worker environment's configuration by editing Worker Configuration on the Configuration page in the environment management console.

[Screenshot: Elastic Beanstalk Worker Details configuration window]

The Worker Details page has the following options:

  • Worker queue – Specify the Amazon SQS queue from which the daemon reads. You can choose an existing queue, if you have one. If you choose Autogenerated queue, Elastic Beanstalk creates a new Amazon SQS queue and a corresponding Worker queue URL.

  • Worker queue URL – If you choose an existing Worker queue, then this setting displays the URL associated with that Amazon SQS queue.

  • HTTP path – Specify the relative path to the application that will receive the data from the Amazon SQS queue. The data is inserted into the message body of an HTTP POST message. The default value is /.

  • MIME type – Indicate the MIME type that the HTTP POST message uses. The default value is application/json. However, any value is valid because you can create and then specify your own MIME type.

  • Max retries – Specify the maximum number of times Elastic Beanstalk attempts to send the message to the Amazon SQS queue before moving the message to the dead letter queue. The default value is 10. You can specify a value between 1 and 1000.

  • HTTP connections – Specify the maximum number of concurrent connections that the daemon can make to any application(s) within an Amazon EC2 instance. The default is 50. You can specify a value between 1 and 100.

  • Connection timeout – Indicate the amount of time, in seconds, to wait for successful connections to an application. The default value is 5. You can specify a value between 1 and 60 seconds.

  • Inactivity timeout – Indicate the amount of time, in seconds, to wait for a response on an existing connection to an application. The default value is 180. You can specify a value between 1 and 1800 seconds.

  • Visibility timeout – Indicate the amount of time, in seconds, an incoming message from the Amazon SQS queue is locked for processing. After the configured amount of time has passed, the message is again made visible in the queue for another daemon to read. Choose a value that is longer than you expect your application requires to process messages, up to 43200 seconds.

  • Error visibility timeout – Indicate the amount of time, in seconds, that elapses before Elastic Beanstalk returns a message to the Amazon SQS queue after an attempt to process it fails with an explicit error. You can specify a value between 0 and 43200.

  • Retention period – Indicate the amount of time, in seconds, a message is valid and will be actively processed. The default value is 345600. You can specify a value between 60 and 1209600.
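
The same settings can also be supplied programmatically as option settings in the aws:elasticbeanstalk:sqsd namespace. The following sketch validates values against the ranges listed above before building the settings list. The build_option_settings function is hypothetical, and the lower bound of 0 for VisibilityTimeout is an assumption because the list above gives only its maximum.

```python
SQSD_NAMESPACE = "aws:elasticbeanstalk:sqsd"

# (minimum, maximum) for each numeric option, taken from the list above.
RANGES = {
    "MaxRetries": (1, 1000),
    "HttpConnections": (1, 100),
    "ConnectTimeout": (1, 60),
    "InactivityTimeout": (1, 1800),
    "VisibilityTimeout": (0, 43200),  # assumption: minimum not stated above
    "ErrorVisibilityTimeout": (0, 43200),
    "RetentionPeriod": (60, 1209600),
}


def build_option_settings(**options):
    """Validate numeric daemon options and return an OptionSettings list."""
    settings = []
    for name, value in options.items():
        if name not in RANGES:
            raise KeyError(f"unknown option: {name}")
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        settings.append(
            {"Namespace": SQSD_NAMESPACE, "OptionName": name, "Value": str(value)}
        )
    return settings


settings = build_option_settings(MaxRetries=10, InactivityTimeout=180)
```

The resulting list has the shape expected by the OptionSettings parameter of the CreateEnvironment and UpdateEnvironment APIs.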