Building the data ingestion pipeline for your Amazon selling partner data
This section provides a strategy for ingesting Amazon vendor and seller data from the Amazon Selling Partner API (SP-API) into a data lake in your AWS account. This data pipeline architecture is designed for agility. After the data is available in your account, you can implement analytics and generative AI capabilities to obtain advanced business insights from it. This data helps you understand your business, inventory details, and analytics at scale across all marketplaces.
The following architecture diagram shows how you use AWS Lambda functions in an AWS Step Functions workflow to ingest data from the SP-API into a data lake in your AWS account. Report data is stored in Amazon Simple Storage Service (Amazon S3). Configuration data, such as regional endpoints, marketplace IDs, and report configurations, is stored in Parameter Store, which is a capability of AWS Systems Manager.

The architecture diagram includes the following components:
- Step Functions is used as a serverless orchestration service to centrally manage the workflow for integrating with the SP-API.
- The Selling Partner API for Reports (Reports API) supports notifications to automate the report workflows. For this, you use an SP-API notification Lambda function to subscribe the application to the REPORT_PROCESSING_FINISHED notification type.
- To make calls to the SP-API, you use an Authentication Lambda function to obtain a Login with Amazon (LWA) access token.
- The LWA access token from the authentication function is passed to a Report creator Lambda function. This function makes a createReport call to the SP-API by using the LWA access token and the regional endpoints, marketplace IDs, and report configuration data that are stored in Parameter Store.
- The SP-API generates the report. When report processing finishes with a status of CANCELLED, DONE, or FATAL, a REPORT_PROCESSING_FINISHED notification event is sent to an Amazon Simple Queue Service (Amazon SQS) queue. This triggers a Notification processing Lambda function to process the event. If the notification event has a status of DONE, a reportDocumentId is included.
- The notification event is passed to a Data processing Lambda function in the Step Functions workflow. This function uses the reportDocumentId to make a getReportDocument call to the SP-API. The SP-API returns a pre-signed URL for the location of the report document and, if the report document contents have been compressed, the compression algorithm used.
- This response is passed to a Storage Lambda function, which downloads the report document, decompresses it (if applicable), and stores the report document in Amazon S3.
- AWS Key Management Service (AWS KMS) is used to centrally manage encryption keys, which can be used to encrypt the secrets in AWS Secrets Manager. Data is stored in Amazon S3 and Parameter Store.
- The SP-API rate-limits requests by using the token bucket algorithm. Therefore, an API client that handles rate limiting is recommended.
- AWS CloudTrail and Amazon CloudWatch are used for monitoring and logging across the AWS services. These logs provide traceability.
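
To make the authentication and report-creation components above more concrete, the following Python sketch builds (but does not send) the two HTTP requests involved. It is an illustration under assumptions, not the implementation used in this architecture: the function names are hypothetical, the LWA token endpoint and the Reports API path (version 2021-06-30) are taken from the public SP-API documentation as understood at the time of writing, and in a real function the endpoint, marketplace IDs, and report type would be read from Parameter Store rather than passed in directly.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: public LWA token endpoint; verify against current SP-API documentation.
LWA_TOKEN_URL = "https://api.amazon.com/auth/o2/token"


def build_lwa_token_request(client_id, client_secret, refresh_token):
    """Build the form-encoded request that exchanges a refresh token for an access token."""
    form = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return Request(
        LWA_TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_create_report_request(access_token, endpoint, report_type, marketplace_ids):
    """Build a createReport request for the given regional endpoint.

    `endpoint`, `report_type`, and `marketplace_ids` stand in for the
    configuration values that the architecture keeps in Parameter Store.
    """
    body = json.dumps({
        "reportType": report_type,
        "marketplaceIds": list(marketplace_ids),
    }).encode("utf-8")
    return Request(
        f"{endpoint.rstrip('/')}/reports/2021-06-30/reports",
        data=body,
        headers={
            "x-amz-access-token": access_token,  # LWA access token from the previous step
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Each function returns a `urllib.request.Request` object, so the caller decides how and when to send it (for example, with `urllib.request.urlopen`) and can inspect the request in tests without network access.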
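
The notification-processing component above can be sketched as a small parser for the SQS event that a Lambda function receives. This is a minimal example under assumptions: the function name is hypothetical, and the payload layout (a `reportProcessingFinishedNotification` object containing `processingStatus` and `reportDocumentId`) reflects the published notification schema as understood at the time of writing, so it should be checked against the current SP-API documentation.

```python
import json

# Statuses that a REPORT_PROCESSING_FINISHED notification can report.
FINISHED_STATUSES = {"CANCELLED", "DONE", "FATAL"}


def extract_report_results(sqs_event):
    """Return one summary dict per REPORT_PROCESSING_FINISHED notification.

    `sqs_event` is the event that Lambda passes to a function triggered by
    an SQS queue: a dict with a "Records" list whose items carry the
    message text in "body".
    """
    results = []
    for record in sqs_event.get("Records", []):
        notification = json.loads(record["body"])
        if notification.get("notificationType") != "REPORT_PROCESSING_FINISHED":
            continue  # ignore other notification types on the same queue
        # Assumption: payload layout per the public notification schema.
        detail = notification["payload"]["reportProcessingFinishedNotification"]
        status = detail["processingStatus"]
        if status not in FINISHED_STATUSES:
            raise ValueError(f"Unexpected processing status: {status}")
        results.append({
            "reportId": detail.get("reportId"),
            "reportType": detail.get("reportType"),
            "processingStatus": status,
            # A document ID is only expected when processing finished with DONE.
            "reportDocumentId": detail.get("reportDocumentId") if status == "DONE" else None,
        })
    return results
```

Only results with a status of DONE carry a `reportDocumentId`, so a workflow built on this sketch would pass those to the data-processing step and route CANCELLED or FATAL results to error handling.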
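
The download-and-decompress behavior of the storage component above might look like the following sketch. The function names are hypothetical. The sketch assumes that the getReportDocument response is a dictionary with a `url` field and an optional `compressionAlgorithm` field, and that GZIP is the only compression value to handle. Writing the result to Amazon S3 is left out to keep the example self-contained.

```python
import gzip
from urllib.request import urlopen


def decode_report_document(raw_bytes, compression_algorithm=None):
    """Return the report contents, decompressing only when an algorithm is given."""
    if compression_algorithm is None:
        return raw_bytes
    if compression_algorithm == "GZIP":
        return gzip.decompress(raw_bytes)
    raise ValueError(f"Unsupported compression algorithm: {compression_algorithm}")


def fetch_report_document(document_response, timeout=30):
    """Download the report from its pre-signed URL and decompress it if needed.

    `document_response` is assumed to be the parsed getReportDocument
    response. Pre-signed URLs expire, so this should run soon after the
    getReportDocument call.
    """
    with urlopen(document_response["url"], timeout=timeout) as resp:
        raw = resp.read()
    return decode_report_document(raw, document_response.get("compressionAlgorithm"))
```

Keeping the decompression logic in its own function makes it possible to test it without network access, and an unrecognized algorithm raises an error instead of storing unreadable bytes.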
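
Because SP-API requests are limited by a token bucket, a client can mirror the same algorithm locally to avoid sending requests that would be throttled. The following is a minimal, single-threaded sketch of a token bucket. The class name and the `rate` and `burst` values are illustrative assumptions; the actual limits differ for each SP-API operation and should be taken from the SP-API documentation or response headers.

```python
import time


class TokenBucket:
    """Client-side token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum number of stored tokens
        self._clock = clock          # injectable clock, useful for testing
        self._tokens = float(burst)  # start with a full bucket
        self._last = clock()

    def _refill(self):
        """Add the tokens earned since the last check, capped at the burst size."""
        now = self._clock()
        self._tokens = min(self.burst, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self):
        """Consume one token if available; return True when a request may be sent."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False

    def seconds_until_available(self):
        """Return how long to wait before the next token is available."""
        self._refill()
        return max(0.0, (1.0 - self._tokens) / self.rate)
```

A caller would check `try_acquire()` before each SP-API request and, when it returns `False`, sleep for `seconds_until_available()` before retrying. This sketch is not thread-safe; a Lambda function that issues requests concurrently would need a lock or a shared limiter.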