Data connectors guide - Automated Data Analytics on AWS

Data connectors guide

You can ingest data from a range of data sources into Automated Data Analytics on AWS (ADA) through a guided wizard. The solution currently supports the following data connectors:

Note

AWS will periodically build and release connectors for additional data sources and integrate connectors built by the open source community through GitHub.

Connector Description Preview* Source Query**
AWS CloudTrail Source data from AWS CloudTrail. It supports filtering by CloudTrail event type on import, and importing from a different account. When set to scheduled mode, the connector also supports incremental daily import. Because CloudTrail data volumes can be very large, automatic PII detection is currently unavailable for the CloudTrail connector. X X
Amazon CloudWatch Source data from Amazon CloudWatch Logs. The connector queries CloudWatch Logs log groups with the specified CloudWatch query and then imports the resulting logs into Automated Data Analytics on AWS. It also supports scheduled incremental import. X X
Amazon Kinesis Data Stream Source data from an existing Kinesis data stream in the same account and the same Region. If the data stream is encrypted with a customer managed key, the Automated Data Analytics on AWS application requires decrypt access to that key.
Amazon Redshift Source data from an existing Amazon Redshift Cluster or Amazon Redshift Serverless table in an AWS account.
Amazon S3 Source data from existing Amazon S3 objects. It supports the same data file types and formats as AWS Glue. The provided object path must be readable by the Automated Data Analytics on AWS application. X X
DynamoDB Source data from Amazon DynamoDB tables. It supports importing data from DynamoDB tables in a different account. The data is transferred into an S3 bucket that is managed by Automated Data Analytics on AWS. X X
File Upload Source data from files uploaded through the UI. It supports the .csv, .json, .parquet, and .gz file formats. Uploaded files are stored in a shared Amazon S3 bucket used for all uploads and are accessible only through the solution. Once a file is uploaded as source data, it is treated the same as Amazon S3 sourced data. X X
Google Analytics Source data from Google Analytics. It supports importing analytics dimensions and metrics, as either a full import or an incremental import. Authentication requires a provisioned service account to connect to the API for continuous import. For details on creating a service account for Google Analytics, refer to the Create a client ID for Google Analytics API topic. X
Google BigQuery Source data from Google BigQuery. It supports any query that can run in BigQuery. Authentication through Automated Data Analytics on AWS requires a service account with read permission to the BigQuery API. To provision a service account in Google BigQuery, refer to the Managing service accounts page. X
Google Storage Source data from Google Storage. It supports folder-level import. The connector uses the RSync API to synchronize the source bucket in Google Storage with the destination in a shared S3 bucket managed by Automated Data Analytics on AWS. Data removed from the source bucket is also removed from the destination.
Microsoft SQL Server Source data from a Microsoft SQL Server database. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS. X
MongoDB Source data from MongoDB or Amazon DocumentDB. The connector supports server TLS and client certificates. It also offers a bookmark field to support incremental import. X X
MySQL5 Source data from MySQL Server 5. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS. X
Oracle Source data from Oracle databases. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS.
PostgreSQL Source data from a PostgreSQL database. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS. X

* Indicates that the source type supports the preview feature, which provides fast schema inference and schema management during the full import process.

** Indicates that the source type supports the source query feature, which grants the creator direct query access to the original source data.
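
Several connectors in the table above support incremental import, and the MongoDB connector does so through a bookmark field. This guide does not describe how the solution implements that, so the following is only a minimal sketch of the general bookmark technique, assuming each record carries a comparable, monotonically increasing field (such as a timestamp). The function name `select_new_rows` and its parameters are hypothetical and are not part of Automated Data Analytics on AWS.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Any]


def select_new_rows(
    rows: Iterable[Row],
    bookmark_field: str,
    last_bookmark: Optional[Any],
) -> Tuple[List[Row], Optional[Any]]:
    """Hypothetical helper: pick rows added since the previous import.

    Returns the rows whose bookmark value is strictly greater than
    last_bookmark, together with the bookmark to store for the next run.
    A last_bookmark of None means no import has run yet (full import).
    """
    if last_bookmark is None:
        new_rows = list(rows)
    else:
        new_rows = [r for r in rows if r[bookmark_field] > last_bookmark]

    if new_rows:
        # Advance the bookmark to the largest value seen in this batch.
        next_bookmark = max(r[bookmark_field] for r in new_rows)
    else:
        # Nothing new: keep the previous bookmark unchanged.
        next_bookmark = last_bookmark
    return new_rows, next_bookmark


# Illustrative data only; ISO dates compare correctly as strings.
source = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-02"},
    {"id": 3, "updated_at": "2024-01-03"},
]

full_rows, bookmark = select_new_rows(source, "updated_at", None)
incremental_rows, next_bookmark = select_new_rows(source, "updated_at", "2024-01-02")
```

In this sketch the first call behaves like a full import, and the second call returns only the record newer than the stored bookmark. A strict comparison avoids re-importing the last record, at the cost of missing later records that share the same bookmark value.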

To connect a data source to the Automated Data Analytics on AWS solution, first create a domain, and then create a data product to import the dataset. You choose the data source when you create the data product.

Refer to the following sections for more information on the type of connectors and steps involved to ingest data from these sources.

Note

The data connectors guide only shows the steps involved in importing a data source.