AWS CloudTrail
User Guide (Version 1.0)

Using the CloudTrail Processing Library

The CloudTrail Processing Library is a Java library for processing AWS CloudTrail logs in a fault-tolerant, scalable, and flexible way. You provide configuration details about your CloudTrail SQS queue and write code to process events. The library does the rest: it polls your Amazon SQS queue, reads and parses queue messages, downloads CloudTrail log files, parses the events in those files, and passes them to your code as Java objects. It processes log files in parallel, so you can process as many logs as necessary, and it handles failures caused by network timeouts and inaccessible resources.

This chapter provides information about how to use the CloudTrail Processing Library to process CloudTrail logs in your Java projects. The library is provided as an Apache-licensed open-source project, available on GitHub:

The library source includes sample code that you can use as a base for your own projects.

Minimum requirements

To use the CloudTrail Processing Library, you must have the following:

Processing CloudTrail Logs with the CloudTrail Processing Library

To use the CloudTrail Processing Library to process CloudTrail logs in your Java application:

Add the CloudTrail Processing Library to your Project

To use the CloudTrail Processing Library you must add it to your Java project's classpath.

Adding the Library to an Apache Ant Project

To add the CloudTrail Processing Library to an Ant project

  1. Download or clone the CloudTrail Processing Library source code from GitHub at:

  2. Build the .jar file from source as described in the README:

    mvn clean install -Dgpg.skip=true
  3. Copy the resulting .jar file into your project and add it to your project's build.xml file. For example:

    <classpath>
      <pathelement path="${classpath}"/>
      <pathelement location="lib/aws-cloudtrail-processing-library-1.0.1.jar"/>
    </classpath>

Adding the Library to an Apache Maven Project

The CloudTrail Processing Library is available for Apache Maven, so adding it to your project is as easy as writing a single dependency in your project's pom.xml file.

To add the CloudTrail Processing Library to a Maven project

  • Using your favorite text editor, open your Maven project's pom.xml file and add the following dependency:

    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-cloudtrail-processing-library</artifactId>
      <version>1.0.1</version>
    </dependency>

Adding the CloudTrail Processing Library to an Eclipse Project

To add the CloudTrail Processing Library to an Eclipse project

  1. Download or clone the CloudTrail Processing Library source code from GitHub at:

  2. Build the .jar file from source as described in the README:

    mvn clean install -Dgpg.skip=true
  3. Copy the built aws-cloudtrail-processing-library-1.0.1.jar to a directory in your project (typically lib).

  4. Right-click your project's name in the Eclipse Project Explorer, and select Build Path > Configure

  5. In the Java Build Path window, click the Libraries tab.

  6. Click Add JARs... and navigate to the path where you copied aws-cloudtrail-processing-library-1.0.1.jar.

  7. Click OK to complete adding the .jar to your project.

Configure the CloudTrail Processing Library

You can configure the CloudTrail Processing Library by creating a classpath properties file that is loaded at runtime, or by creating a ClientConfiguration object and setting options manually.

Providing a Properties File

You can write a classpath properties file that provides configuration options to your application. Here is an example file that demonstrates the options you can set:

    # AWS access key. (Required)
    accessKey = your_access_key

    # AWS secret key. (Required)
    secretKey = your_secret_key

    # The SQS URL used to pull CloudTrail notifications from. (Required)
    sqsUrl = your_sqs_queue_url

    # The SQS endpoint specific to a region.
    sqsRegion = us-east-1

    # A period of time during which Amazon SQS prevents other consuming components
    # from receiving and processing that message.
    visibilityTimeout = 60

    # The S3 region to use.
    s3Region = us-east-1

    # Number of threads used to download S3 files in parallel. Callbacks can be
    # invoked from any thread.
    threadCount = 1

    # The time allowed, in seconds, for threads to shut down after
    # AWSCloudTrailProcessingExecutor.stop() is called. If they are still
    # running beyond this time, they will be forcibly terminated.
    threadTerminationDelaySeconds = 60

    # The maximum number of events sent to a single invocation of process().
    maxEventsPerEmit = 10

    # Whether to include raw event information in CloudTrailDeliveryInfo.
    enableRawEventInfo = false

The required parameters are sqsUrl, accessKey and secretKey. The sqsUrl parameter provides the URL to pull your CloudTrail notifications from. If you don't provide this value, then an IllegalStateException will be thrown by the AWSCloudTrailProcessingExecutor. The accessKey and secretKey parameters provide your AWS credentials to the library, allowing it to access AWS on your behalf.

The other parameters have reasonable defaults that are set by the library. For information about the default values for each option, see the AWS CloudTrail Processing Library Reference.
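Because a missing required value only surfaces as an exception from the executor, it can be useful to check a properties file before handing it to the library. The following is a minimal sketch under stated assumptions, not part of the CloudTrail Processing Library: the class RequiredPropertiesCheck and its methods are hypothetical names, and the code uses only java.util.Properties to report which of the three required keys (sqsUrl, accessKey, secretKey) are absent or blank.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the CloudTrail Processing Library.
final class RequiredPropertiesCheck {

    // The three keys that the text above lists as required.
    static final String[] REQUIRED_KEYS = {"sqsUrl", "accessKey", "secretKey"};

    // Parses properties text. A real application would more likely read
    // the file from the classpath instead of from a string.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the required keys that are absent or blank, in declaration order.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = parse("sqsUrl = your_sqs_queue_url\nthreadCount = 4\n");
        List<String> missing = missingKeys(props);
        if (!missing.isEmpty()) {
            System.err.println("Missing required properties: " + missing);
        }
    }
}
```

Running a check like this at startup gives a message that names the missing keys, rather than relying on the exception raised later by the executor.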

Creating a ClientConfiguration

Instead of setting options in the classpath properties, you can provide options to the AWSCloudTrailProcessingExecutor by initializing and setting options on a ClientConfiguration object.

For example:

    ClientConfiguration basicConfig = new ClientConfiguration(
        "http://sqs.us-east-1.amazonaws.com/123456789012/queue2",
        new DefaultAWSCredentialsProviderChain());

    basicConfig.setEnableRawEventInfo(true);
    basicConfig.setThreadCount(4);
    basicConfig.setMaxEventsPerEmit(20);

Implement the Events Processor

To process CloudTrail logs, you must implement an EventsProcessor that receives the CloudTrail log data. Here is an example implementation:

    public class SampleEventsProcessor implements EventsProcessor {

        public void process(List<CloudTrailEvent> events) {
            int i = 0;
            for (CloudTrailEvent event : events) {
                System.out.println(String.format("Process event %d : %s", i++, event.getEventData()));
            }
        }
    }

When implementing an EventsProcessor, you implement the process() callback that the AWSCloudTrailProcessingExecutor uses to send you CloudTrail events. Events are provided as a list of CloudTrailEvent objects, matching the process() signature in the example.

Each CloudTrailEvent gives you access to the event itself (through getEventData(), as in the example) and to a CloudTrailEventMetadata object, which you can use to read the CloudTrail event and delivery information.

This simple example just prints the event information for each event passed to SampleEventsProcessor. In your own implementation, you can process logs as you see fit. The AWSCloudTrailProcessingExecutor will continue to send events to your EventsProcessor as long as it has events to send and is still running.
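The sample properties file notes that callbacks can be invoked from any thread, so any state that a processor keeps across process() calls should be safe for concurrent use when threadCount is greater than 1. The sketch below illustrates one way to do that, under stated assumptions: EventNameCounter is a hypothetical class, and it takes plain event-name strings as a stand-in for the library's event objects so that the example stays self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an events processor. Event names are plain
// strings here rather than the library's event objects.
final class EventNameCounter {

    // ConcurrentHashMap lets several threads update the counts safely.
    private final Map<String, Long> counts = new ConcurrentHashMap<>();

    // Mirrors the shape of a process() callback: one call per batch of events.
    void process(List<String> eventNames) {
        for (String name : eventNames) {
            // merge() is atomic per key on ConcurrentHashMap, so concurrent
            // batches cannot lose increments.
            counts.merge(name, 1L, Long::sum);
        }
    }

    // Returns how many times the given event name has been seen so far.
    long countFor(String eventName) {
        return counts.getOrDefault(eventName, 0L);
    }
}
```

In a real processor, the same pattern would apply to whatever field you extract from each event; the thread-safe container is what matters, not the key.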

Instantiate and Run the Processing Executor

Once you have written an EventsProcessor and have set configuration values for the CloudTrail Processing Library (either in a properties file or by using the ClientConfiguration class), you can use these elements to initialize and use an AWSCloudTrailProcessingExecutor.

To use AWSCloudTrailProcessingExecutor to process CloudTrail events

  1. Instantiate an AWSCloudTrailProcessingExecutor.Builder object. Builder's constructor takes an EventsProcessor object and a classpath properties file name.

  2. Call the Builder's build() factory method to configure and obtain an AWSCloudTrailProcessingExecutor object.

  3. Use the AWSCloudTrailProcessingExecutor's start() and stop() methods to begin and end CloudTrail event processing.

    public class SampleApp {
        public static void main(String[] args) throws InterruptedException {
            AWSCloudTrailProcessingExecutor executor = new AWSCloudTrailProcessingExecutor.Builder(
                new SampleEventsProcessor(),
                "/myproject/cloudtrailprocessing.properties").build();

            executor.start();
            Thread.sleep(24 * 60 * 60 * 1000); // let it run for a while (optional)
            executor.stop(); // optional
        }
    }

Advanced Topics

Filtering the Events to Process

By default, all of the logs in your Amazon SQS queue's S3 bucket and all of the events that they contain will be sent to your EventsProcessor. The CloudTrail Processing Library provides optional interfaces that you can implement to filter the sources used to obtain CloudTrail logs and to filter the events that you are interested in processing.

SourceFilter

You can implement the SourceFilter interface to choose whether or not you want to process logs from a provided source. SourceFilter declares a single callback method, filterSource(), that receives a CloudTrailSource object. To keep events from a source from being processed, return false from filterSource().

The filterSource() method is called by the CloudTrail Processing Library after the library has polled for logs on the Amazon SQS queue, but before event filtering or processing has been done for those logs.

Here is an example implementation:

    public class SampleSourceFilter implements SourceFilter {
        private static final int MAX_RECEIVED_COUNT = 3;

        // Accounts whose logs should be processed.
        private static List<String> accountIDs;
        static {
            accountIDs = new ArrayList<>();
            accountIDs.add("123456789012");
            accountIDs.add("234567890123");
        }

        @Override
        public boolean filterSource(CloudTrailSource source) throws CallbackException {
            source = (SQSBasedSource) source;
            Map<String, String> sourceAttributes = source.getSourceAttributes();

            String accountId = sourceAttributes.get(
                SourceAttributeKeys.ACCOUNT_ID.getAttributeKey());

            String receivedCount = sourceAttributes.get(
                SourceAttributeKeys.APPROXIMATE_RECEIVE_COUNT.getAttributeKey());

            int approximateReceivedCount = Integer.parseInt(receivedCount);

            // Accept the source only if it has not been received too many
            // times and it belongs to one of the listed accounts.
            return approximateReceivedCount <= MAX_RECEIVED_COUNT && accountIDs.contains(accountId);
        }
    }

If you don't provide your own SourceFilter, then DefaultSourceFilter will be used, which allows all sources to be processed (it always returns true).

EventFilter

You can implement the EventFilter interface to choose whether a CloudTrail event will be sent to your EventsProcessor or not. EventFilter declares a single callback method, filterEvent(), that receives a CloudTrailEvent object. To keep the event from being processed, return false from filterEvent().

The filterEvent() method is called by the CloudTrail Processing Library after the library has polled for logs on the Amazon SQS queue and after source filtering, but before event processing has been done for those logs.

Here is an example implementation:

    public class SampleEventFilter implements EventFilter {

        private static final String EC2_EVENTS = "ec2.amazonaws.com";

        @Override
        public boolean filterEvent(CloudTrailClientEvent clientEvent) throws CallbackException {
            CloudTrailEvent event = clientEvent.getEvent();

            String eventSource = event.getEventSource();
            String eventName = event.getEventName();

            return eventSource.equals(EC2_EVENTS) && eventName.startsWith("Delete");
        }
    }

If you don't provide your own EventFilter, then DefaultEventFilter will be used, which allows all events to be processed (it always returns true).
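The order in which the two filters run can be pictured with a small self-contained sketch. This is an illustration of the sequence described in this section, not the library's implementation: FilterPipelineSketch, LogFile, and run() are hypothetical names, and standard java.util.function interfaces stand in for SourceFilter, EventFilter, and EventsProcessor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of the filtering order; not the library's code.
final class FilterPipelineSketch {

    // Stand-in for one log source: an account ID plus the event names it contains.
    static final class LogFile {
        final String accountId;
        final List<String> eventNames;

        LogFile(String accountId, List<String> eventNames) {
            this.accountId = accountId;
            this.eventNames = eventNames;
        }
    }

    static void run(List<LogFile> logs,
                    Predicate<LogFile> sourceFilter,
                    Predicate<String> eventFilter,
                    Consumer<List<String>> processor) {
        for (LogFile log : logs) {
            // Source filtering comes first: a rejected source is skipped
            // entirely, so its events never reach the event filter.
            if (!sourceFilter.test(log)) {
                continue;
            }
            // Event filtering comes next, before any processing.
            List<String> accepted = new ArrayList<>();
            for (String name : log.eventNames) {
                if (eventFilter.test(name)) {
                    accepted.add(name);
                }
            }
            // Only events that passed both filters reach the processor.
            if (!accepted.isEmpty()) {
                processor.accept(accepted);
            }
        }
    }
}
```

A source filter is therefore the cheaper place to discard whole accounts or repeatedly redelivered messages, while an event filter is the place for per-event criteria such as the event source or name.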

Reporting Progress

The ProgressReporter interface can be implemented to customize the reporting of CloudTrail Processing Library progress. ProgressReporter declares two methods: reportStart() and reportEnd(), which are called at the beginning and end of the following operations:

  • polling messages from Amazon SQS.

  • parsing messages from Amazon SQS.

  • processing an Amazon SQS source for CloudTrail logs.

  • deleting messages from Amazon SQS.

  • downloading a CloudTrail log file.

  • processing a CloudTrail log file.

Both methods receive a ProgressStatus object that contains information about the operation being performed (in the progressState member, which holds a member of the ProgressState enumeration that identifies the current operation) and can contain additional information (in the progressInfo member). Additionally, any object that you return from reportStart() will be passed to reportEnd(), so you can provide contextual information such as what time it was when the event began processing.

Here is an example implementation that provides information about how long an operation took to complete:

    public class SampleProgressReporter implements ProgressReporter {
        private static final Log logger = LogFactory.getLog(SampleProgressReporter.class);

        @Override
        public Object reportStart(ProgressStatus status) {
            // The returned object is passed back to reportEnd() for the same operation.
            return new Date();
        }

        @Override
        public void reportEnd(ProgressStatus status, Object startDate) {
            System.out.println(status.getProgressState().toString() + " is "
                + status.getProgressInfo().isSuccess() + " , and latency is "
                + Math.abs(((Date) startDate).getTime() - new Date().getTime())
                + " milliseconds.");
        }
    }

If you don't implement your own ProgressReporter, then DefaultProgressReporter, which prints the name of the state being run, will be used instead.

Handling Errors

The ExceptionHandler interface allows you to provide special handling when an exception occurs during log processing. ExceptionHandler declares a single callback method, handleException(), which receives a ProcessingLibraryException object with context about the exception that occurred.

You can use the passed-in ProcessingLibraryException's getStatus() method to find out what operation was being executed when the exception occurred and to get additional information about the status of the operation. ProcessingLibraryException is derived from Java's standard Exception class, so you can also retrieve information about the exception by invoking any of the Exception methods.

Here is an example implementation:

    public class SampleExceptionHandler implements ExceptionHandler {
        private static final Log logger = LogFactory.getLog(SampleExceptionHandler.class);

        @Override
        public void handleException(ProcessingLibraryException exception) {
            // The status describes the operation that was running when the exception occurred.
            ProgressStatus status = exception.getStatus();
            ProgressState state = status.getProgressState();
            ProgressInfo info = status.getProgressInfo();

            System.err.println(String.format(
                "Exception. Progress State: %s. Progress Information: %s.", state, info));
        }
    }

If you don't provide your own ExceptionHandler, then DefaultExceptionHandler, which simply prints a standard error message, will be used instead.

Additional Resources

For more information about the CloudTrail Processing Library, see the following additional resources: