Getting data from AWS X-Ray - AWS X-Ray

Getting data from AWS X-Ray

AWS X-Ray processes the trace data that you send to it to generate full traces, trace summaries, and service graphs in JSON. You can retrieve the generated data directly from the API with the AWS CLI.

Retrieving the service graph

You can use the GetServiceGraph API to retrieve the JSON service graph. The API requires a start time and end time, which you can calculate from a Linux terminal with the date command.

$ date +%s 1499394617

date +%s prints a date in seconds. Use this number as an end time and subtract time from it to get a start time.

Example Script to retrieve a service graph for the last 10 minutes
EPOCH=$(date +%s) aws xray get-service-graph --start-time $(($EPOCH-600)) --end-time $EPOCH

The following example shows a service graph with 4 nodes, including a client node, an EC2 instance, a DynamoDB table, and an Amazon SNS topic.

Example GetServiceGraph output
{ "Services": [ { "ReferenceId": 0, "Name": "xray-sample.elasticbeanstalk.com", "Names": [ "xray-sample.elasticbeanstalk.com" ], "Type": "client", "State": "unknown", "StartTime": 1528317567.0, "EndTime": 1528317589.0, "Edges": [ { "ReferenceId": 2, "StartTime": 1528317567.0, "EndTime": 1528317589.0, "SummaryStatistics": { "OkCount": 3, "ErrorStatistics": { "ThrottleCount": 0, "OtherCount": 1, "TotalCount": 1 }, "FaultStatistics": { "OtherCount": 0, "TotalCount": 0 }, "TotalCount": 4, "TotalResponseTime": 0.273 }, "ResponseTimeHistogram": [ { "Value": 0.005, "Count": 1 }, { "Value": 0.015, "Count": 1 }, { "Value": 0.157, "Count": 1 }, { "Value": 0.096, "Count": 1 } ], "Aliases": [] } ] }, { "ReferenceId": 1, "Name": "awseb-e-dixzws4s9p-stack-StartupSignupsTable-4IMSMHAYX2BA", "Names": [ "awseb-e-dixzws4s9p-stack-StartupSignupsTable-4IMSMHAYX2BA" ], "Type": "AWS::DynamoDB::Table", "State": "unknown", "StartTime": 1528317583.0, "EndTime": 1528317589.0, "Edges": [], "SummaryStatistics": { "OkCount": 2, "ErrorStatistics": { "ThrottleCount": 0, "OtherCount": 0, "TotalCount": 0 }, "FaultStatistics": { "OtherCount": 0, "TotalCount": 0 }, "TotalCount": 2, "TotalResponseTime": 0.12 }, "DurationHistogram": [ { "Value": 0.076, "Count": 1 }, { "Value": 0.044, "Count": 1 } ], "ResponseTimeHistogram": [ { "Value": 0.076, "Count": 1 }, { "Value": 0.044, "Count": 1 } ] }, { "ReferenceId": 2, "Name": "xray-sample.elasticbeanstalk.com", "Names": [ "xray-sample.elasticbeanstalk.com" ], "Root": true, "Type": "AWS::EC2::Instance", "State": "active", "StartTime": 1528317567.0, "EndTime": 1528317589.0, "Edges": [ { "ReferenceId": 1, "StartTime": 1528317567.0, "EndTime": 1528317589.0, "SummaryStatistics": { "OkCount": 2, "ErrorStatistics": { "ThrottleCount": 0, "OtherCount": 0, "TotalCount": 0 }, "FaultStatistics": { "OtherCount": 0, "TotalCount": 0 }, "TotalCount": 2, "TotalResponseTime": 0.12 }, "ResponseTimeHistogram": [ { "Value": 0.076, "Count": 1 }, { "Value": 0.044, "Count": 1 } ], "Aliases": [] }, { "ReferenceId": 3, "StartTime": 1528317567.0, "EndTime": 1528317589.0, "SummaryStatistics": { "OkCount": 2, "ErrorStatistics": { "ThrottleCount": 0, "OtherCount": 0, "TotalCount": 0 }, "FaultStatistics": { "OtherCount": 0, "TotalCount": 0 }, "TotalCount": 2, "TotalResponseTime": 0.125 }, "ResponseTimeHistogram": [ { "Value": 0.049, "Count": 1 }, { "Value": 0.076, "Count": 1 } ], "Aliases": [] } ], "SummaryStatistics": { "OkCount": 3, "ErrorStatistics": { "ThrottleCount": 0, "OtherCount": 1, "TotalCount": 1 }, "FaultStatistics": { "OtherCount": 0, "TotalCount": 0 }, "TotalCount": 4, "TotalResponseTime": 0.273 }, "DurationHistogram": [ { "Value": 0.005, "Count": 1 }, { "Value": 0.015, "Count": 1 }, { "Value": 0.157, "Count": 1 }, { "Value": 0.096, "Count": 1 } ], "ResponseTimeHistogram": [ { "Value": 0.005, "Count": 1 }, { "Value": 0.015, "Count": 1 }, { "Value": 0.157, "Count": 1 }, { "Value": 0.096, "Count": 1 } ] }, { "ReferenceId": 3, "Name": "SNS", "Names": [ "SNS" ], "Type": "AWS::SNS", "State": "unknown", "StartTime": 1528317583.0, "EndTime": 1528317589.0, "Edges": [], "SummaryStatistics": { "OkCount": 2, "ErrorStatistics": { "ThrottleCount": 0, "OtherCount": 0, "TotalCount": 0 }, "FaultStatistics": { "OtherCount": 0, "TotalCount": 0 }, "TotalCount": 2, "TotalResponseTime": 0.125 }, "DurationHistogram": [ { "Value": 0.049, "Count": 1 }, { "Value": 0.076, "Count": 1 } ], "ResponseTimeHistogram": [ { "Value": 0.049, "Count": 1 }, { "Value": 0.076, "Count": 1 } ] } ] }

Retrieving the service graph by group

To call for a service graph based on the contents of a group, include a groupName or groupARN. The following example shows a service graph call to a group named Example1.

Example Script to retrieve a service graph by name for group Example1
aws xray get-service-graph --group-name "Example1"

Retrieving traces

You can use the GetTraceSummaries API to get a list of trace summaries. Trace summaries include information that you can use to identify traces that you want to download in full, including annotations, request and response information, and IDs.

There are two TimeRangeType flags available when calling aws xray get-trace-summaries:

  • TraceId – The default GetTraceSummaries search uses TraceID time and returns traces started within the computed [start_time, end_time) range. This range of timestamps is calculated based on the encoding of the timestamp within the TraceId, or can be defined manually.

  • Event time – To search for events as they happen over the time, AWS X-Ray allows searching for traces using event timestamps. Event time returns traces active during the [start_time, end_time) range, regardless of when the trace began.

Use the aws xray get-trace-summaries command to get a list of trace summaries. The following commands get a list of trace summaries from between 1 and 2 minutes in the past using the default TraceId time.

Example Script to get trace summaries
EPOCH=$(date +%s) aws xray get-trace-summaries --start-time $(($EPOCH-120)) --end-time $(($EPOCH-60))
Example GetTraceSummaries output
{ "TraceSummaries": [ { "HasError": false, "Http": { "HttpStatus": 200, "ClientIp": "205.255.255.183", "HttpURL": "http://scorekeep.elasticbeanstalk.com/api/session", "UserAgent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36", "HttpMethod": "POST" }, "Users": [], "HasFault": false, "Annotations": {}, "ResponseTime": 0.084, "Duration": 0.084, "Id": "1-59602606-a43a1ac52fc7ee0eea12a82c", "HasThrottle": false }, { "HasError": false, "Http": { "HttpStatus": 200, "ClientIp": "205.255.255.183", "HttpURL": "http://scorekeep.elasticbeanstalk.com/api/user", "UserAgent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36", "HttpMethod": "POST" }, "Users": [ { "UserName": "5M388M1E" } ], "HasFault": false, "Annotations": { "UserID": [ { "AnnotationValue": { "StringValue": "5M388M1E" } } ], "Name": [ { "AnnotationValue": { "StringValue": "Ola" } } ] }, "ResponseTime": 3.232, "Duration": 3.232, "Id": "1-59602603-23fc5b688855d396af79b496", "HasThrottle": false } ], "ApproximateTime": 1499473304.0, "TracesProcessedCount": 2 }

Use the trace ID from the output to retrieve a full trace with the BatchGetTraces API.

Example BatchGetTraces command
$ aws xray batch-get-traces --trace-ids 1-596025b4-7170afe49f7aa708b1dd4a6b
Example BatchGetTraces output
{ "Traces": [ { "Duration": 3.232, "Segments": [ { "Document": "{\"id\":\"1fb07842d944e714\",\"name\":\"random-name\",\"start_time\":1.499473411677E9,\"end_time\":1.499473414572E9,\"parent_id\":\"0c544c1b1bbff948\",\"http\":{\"response\":{\"status\":200}},\"aws\":{\"request_id\":\"ac086670-6373-11e7-a174-f31b3397f190\"},\"trace_id\":\"1-59602603-23fc5b688855d396af79b496\",\"origin\":\"AWS::Lambda\",\"resource_arn\":\"arn:aws:lambda:us-west-2:123456789012:function:random-name\"}", "Id": "1fb07842d944e714" }, { "Document": "{\"id\":\"194fcc8747581230\",\"name\":\"Scorekeep\",\"start_time\":1.499473411562E9,\"end_time\":1.499473414794E9,\"http\":{\"request\":{\"url\":\"http://scorekeep.elasticbeanstalk.com/api/user\",\"method\":\"POST\",\"user_agent\":\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36\",\"client_ip\":\"205.251.233.183\"},\"response\":{\"status\":200}},\"aws\":{\"elastic_beanstalk\":{\"version_label\":\"app-abb9-170708_002045\",\"deployment_id\":406,\"environment_name\":\"scorekeep-dev\"},\"ec2\":{\"availability_zone\":\"us-west-2c\",\"instance_id\":\"i-0cd9e448944061b4a\"},\"xray\":{\"sdk_version\":\"1.1.2\",\"sdk\":\"X-Ray for Java\"}},\"service\":{},\"trace_id\":\"1-59602603-23fc5b688855d396af79b496\",\"user\":\"5M388M1E\",\"origin\":\"AWS::ElasticBeanstalk::Environment\",\"subsegments\":[{\"id\":\"0c544c1b1bbff948\",\"name\":\"Lambda\",\"start_time\":1.499473411629E9,\"end_time\":1.499473414572E9,\"http\":{\"response\":{\"status\":200,\"content_length\":14}},\"aws\":{\"log_type\":\"None\",\"status_code\":200,\"function_name\":\"random-name\",\"invocation_type\":\"RequestResponse\",\"operation\":\"Invoke\",\"request_id\":\"ac086670-6373-11e7-a174-f31b3397f190\",\"resource_names\":[\"random-name\"]},\"namespace\":\"aws\"},{\"id\":\"071684f2e555e571\",\"name\":\"## UserModel.saveUser\",\"start_time\":1.499473414581E9,\"end_time\":1.499473414769E9,\"metadata\":{\"debug\":{\"test\":\"Metadata string from UserModel.saveUser\"}},\"subsegments\":[{\"id\":\"4cd3f10b76c624b4\",\"name\":\"DynamoDB\",\"start_time\":1.49947341469E9,\"end_time\":1.499473414769E9,\"http\":{\"response\":{\"status\":200,\"content_length\":57}},\"aws\":{\"table_name\":\"scorekeep-user\",\"operation\":\"UpdateItem\",\"request_id\":\"MFQ8CGJ3JTDDVVVASUAAJGQ6NJ82F738BOB4KQNSO5AEMVJF66Q9\",\"resource_names\":[\"scorekeep-user\"]},\"namespace\":\"aws\"}]}]}", "Id": "194fcc8747581230" }, { "Document": "{\"id\":\"00f91aa01f4984fd\",\"name\":\"random-name\",\"start_time\":1.49947341283E9,\"end_time\":1.49947341457E9,\"parent_id\":\"1fb07842d944e714\",\"aws\":{\"function_arn\":\"arn:aws:lambda:us-west-2:123456789012:function:random-name\",\"resource_names\":[\"random-name\"],\"account_id\":\"123456789012\"},\"trace_id\":\"1-59602603-23fc5b688855d396af79b496\",\"origin\":\"AWS::Lambda::Function\",\"subsegments\":[{\"id\":\"e6d2fe619f827804\",\"name\":\"annotations\",\"start_time\":1.499473413012E9,\"end_time\":1.499473413069E9,\"annotations\":{\"UserID\":\"5M388M1E\",\"Name\":\"Ola\"}},{\"id\":\"b29b548af4d54a0f\",\"name\":\"SNS\",\"start_time\":1.499473413112E9,\"end_time\":1.499473414071E9,\"http\":{\"response\":{\"status\":200}},\"aws\":{\"operation\":\"Publish\",\"region\":\"us-west-2\",\"request_id\":\"a2137970-f6fc-5029-83e8-28aadeb99198\",\"retries\":0,\"topic_arn\":\"arn:aws:sns:us-west-2:123456789012:awseb-e-ruag3jyweb-stack-NotificationTopic-6B829NT9V5O9\"},\"namespace\":\"aws\"},{\"id\":\"2279c0030c955e52\",\"name\":\"Initialization\",\"start_time\":1.499473412064E9,\"end_time\":1.499473412819E9,\"aws\":{\"function_arn\":\"arn:aws:lambda:us-west-2:123456789012:function:random-name\"}}]}", "Id": "00f91aa01f4984fd" }, { "Document": "{\"id\":\"17ba309b32c7fbaf\",\"name\":\"DynamoDB\",\"start_time\":1.49947341469E9,\"end_time\":1.499473414769E9,\"parent_id\":\"4cd3f10b76c624b4\",\"inferred\":true,\"http\":{\"response\":{\"status\":200,\"content_length\":57}},\"aws\":{\"table_name\":\"scorekeep-user\",\"operation\":\"UpdateItem\",\"request_id\":\"MFQ8CGJ3JTDDVVVASUAAJGQ6NJ82F738BOB4KQNSO5AEMVJF66Q9\",\"resource_names\":[\"scorekeep-user\"]},\"trace_id\":\"1-59602603-23fc5b688855d396af79b496\",\"origin\":\"AWS::DynamoDB::Table\"}", "Id": "17ba309b32c7fbaf" }, { "Document": "{\"id\":\"1ee3c4a523f89ca5\",\"name\":\"SNS\",\"start_time\":1.499473413112E9,\"end_time\":1.499473414071E9,\"parent_id\":\"b29b548af4d54a0f\",\"inferred\":true,\"http\":{\"response\":{\"status\":200}},\"aws\":{\"operation\":\"Publish\",\"region\":\"us-west-2\",\"request_id\":\"a2137970-f6fc-5029-83e8-28aadeb99198\",\"retries\":0,\"topic_arn\":\"arn:aws:sns:us-west-2:123456789012:awseb-e-ruag3jyweb-stack-NotificationTopic-6B829NT9V5O9\"},\"trace_id\":\"1-59602603-23fc5b688855d396af79b496\",\"origin\":\"AWS::SNS\"}", "Id": "1ee3c4a523f89ca5" } ], "Id": "1-59602603-23fc5b688855d396af79b496" } ], "UnprocessedTraceIds": [] }

The full trace includes a document for each segment, compiled from all of the segment documents received with the same trace ID. These documents don't represent the data as it was sent to X-Ray by your application. Instead, they represent the processed documents generated by the X-Ray service. X-Ray creates the full trace document by compiling segment documents sent by your application, and removing data that doesn't comply with the segment document schema.

X-Ray also creates inferred segments for downstream calls to services that don't send segments themselves. For example, when you call DynamoDB with an instrumented client, the X-Ray SDK records a subsegment with details about the call from its point of view. However, DynamoDB doesn't send a corresponding segment. X-Ray uses the information in the subsegment to create an inferred segment to represent the DynamoDB resource in the trace map, and adds it to the trace document.

To get multiple traces from the API, you need a list of trace IDs, which you can extract from the output of get-trace-summaries with an AWS CLI query. Redirect the list to the input of batch-get-traces to get full traces for a specific time period.

Example Script to get full traces for a one minute period
EPOCH=$(date +%s) TRACEIDS=$(aws xray get-trace-summaries --start-time $(($EPOCH-120)) --end-time $(($EPOCH-60)) --query 'TraceSummaries[*].Id' --output text) aws xray batch-get-traces --trace-ids $TRACEIDS --query 'Traces[*]'

Retrieving and refining root cause analytics

Upon generating a trace summary with the GetTraceSummaries API , partial trace summaries can be reused in their JSON format to create a refined filter expression based upon root causes. See the examples below for a walkthrough of the refinement steps.

Example GetTraceSummaries output - response time root cause section
{ "Services": [ { "Name": "GetWeatherData", "Names": ["GetWeatherData"], "AccountId": 123456789012, "Type": null, "Inferred": false, "EntityPath": [ { "Name": "GetWeatherData", "Coverage": 1.0, 'Remote": false }, { "Name": "get_temperature", "Coverage": 0.8, "Remote": false } ] }, { "Name": "GetTemperature", "Names": ["GetTemperature"], "AccountId": 123456789012, "Type": null, "Inferred": false, "EntityPath": [ { "Name": "GetTemperature", "Coverage": 0.7, "Remote": false } ] } ] }

By editing and making omissions to the above output, this JSON can become a filter for matched root cause entities. For every field present in the JSON, any candidate match must be exact, or the trace will not be returned. Removed fields become wildcard values, a format which is compatible with the filter expression query structure.

Example Reformatted response time root cause
{ "Services": [ { "Name": "GetWeatherData", "EntityPath": [ { "Name": "GetWeatherData" }, { "Name": "get_temperature" } ] }, { "Name": "GetTemperature", "EntityPath": [ { "Name": "GetTemperature" } ] } ] }

This JSON is then used as part of a filter expression through a call to rootcause.json = #[{}]. Refer to the Filter Expressions chapter for more details about querying with filter expressions.

Example JSON filter
rootcause.json = #[{ "Services": [ { "Name": "GetWeatherData", "EntityPath": [{ "Name": "GetWeatherData" }, { "Name": "get_temperature" } ] }, { "Name": "GetTemperature", "EntityPath": [ { "Name": "GetTemperature" } ] } ] }]