Data connectors guide
You can use a guided wizard to ingest data from a range of data sources into Automated Data Analytics on AWS (ADA). The solution currently supports the following data connectors:
Note
AWS will periodically build and release connectors for additional data sources and integrate connectors built by the open source community through GitHub.
Connector | Description | Preview* | Source Query** |
---|---|---|---|
AWS CloudTrail | Source data from AWS CloudTrail. The connector can filter the import by CloudTrail event type and can import from a different account. In scheduled mode, it also supports incremental daily import. Because CloudTrail data volumes can be very large, automatic PII detection is currently unavailable for the CloudTrail connector. | X | X |
Amazon CloudWatch | Source data from Amazon CloudWatch Logs. The connector queries CloudWatch Logs log groups with the specified CloudWatch query and then imports the resulting logs into Automated Data Analytics on AWS. It also supports scheduled incremental import. | X | X |
Amazon Kinesis Data Stream | Source data from an existing Kinesis data stream in the same account and Region. If the data stream is encrypted with a customer managed key, the Automated Data Analytics on AWS application requires decrypt access to that key. | | |
Amazon Redshift | Source data from an existing Amazon Redshift cluster or Amazon Redshift Serverless table in an AWS account. | | |
Amazon S3 | Source data from existing Amazon S3 data. The connector supports the same data file types and formats as AWS Glue. The provided object path must be readable by the Automated Data Analytics on AWS application. | X | X |
DynamoDB | Source data from Amazon DynamoDB tables. The connector supports importing data from DynamoDB tables in a different account. The data is transferred into an S3 bucket that is managed by Automated Data Analytics on AWS. | X | X |
File Upload | Source data from files uploaded through the UI. The connector supports the .csv, .json, .parquet, and .gz file formats. Uploaded files are stored in a shared Amazon S3 bucket for all uploads and are accessible only through the solution. Once a file is uploaded as source data, it is treated the same as Amazon S3 sourced data. | X | X |
Google Analytics | Source data from Google Analytics. The connector supports import of analytics dimensions and metrics, with both full and incremental import. Authentication requires a provisioned service account to connect to the API for continuous import. For details on creating a service account for Google Analytics, refer to Create a client ID for Google Analytics API. | X | |
Google BigQuery | Source data from Google BigQuery. The connector supports any query that can run in BigQuery. Authentication through Automated Data Analytics on AWS requires a service account with read permission to the BigQuery API. To provision a service account in Google BigQuery, refer to Managing service accounts. | X | |
Google Storage | Source data from Google Storage with folder-level import. The connector uses the RSync API to synchronize the source bucket in Google Storage with the destination in a shared S3 bucket managed by Automated Data Analytics on AWS. Data removed from the source bucket is also removed from the destination. | | |
Microsoft SQL Server | Source data from a Microsoft SQL Server database. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS. | X | |
MongoDB | Source data from MongoDB or Amazon DocumentDB. The connector supports server TLS and client certificates. It also offers a bookmark field to support incremental import. | X | X |
MySQL5 | Source data from MySQL Server 5. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS. | X | |
Oracle | Source data from Oracle databases. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS. | | |
PostgreSQL | Source data from a PostgreSQL database. The connector uses JDBC to connect to the target database and imports the data from the specified table into an S3 bucket managed by Automated Data Analytics on AWS. | X | |
* Indicates that the source type supports the preview feature, which provides fast schema inference and schema management during the full import process.
** Indicates that the source type supports the source query feature, which grants the creator direct query access to the original source data.
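When comparing connectors, it can help to have the support matrix in a form you can filter. The following is a minimal Python sketch that transcribes the Preview and Source Query columns from the table above. The names `Capabilities`, `CONNECTORS`, and `connectors_supporting` are hypothetical and are not part of Automated Data Analytics on AWS or any AWS API. An empty cell is treated as "not supported".

```python
from typing import NamedTuple, Optional


class Capabilities(NamedTuple):
    """Feature flags for one connector (hypothetical helper type)."""
    preview: bool
    source_query: bool


# Transcribed from the connector table: an "X" becomes True,
# an empty cell becomes False.
CONNECTORS = {
    "AWS CloudTrail": Capabilities(True, True),
    "Amazon CloudWatch": Capabilities(True, True),
    "Amazon Kinesis Data Stream": Capabilities(False, False),
    "Amazon Redshift": Capabilities(False, False),
    "Amazon S3": Capabilities(True, True),
    "DynamoDB": Capabilities(True, True),
    "File Upload": Capabilities(True, True),
    "Google Analytics": Capabilities(True, False),
    "Google BigQuery": Capabilities(True, False),
    "Google Storage": Capabilities(False, False),
    "Microsoft SQL Server": Capabilities(True, False),
    "MongoDB": Capabilities(True, True),
    "MySQL5": Capabilities(True, False),
    "Oracle": Capabilities(False, False),
    "PostgreSQL": Capabilities(True, False),
}


def connectors_supporting(preview: Optional[bool] = None,
                          source_query: Optional[bool] = None) -> list:
    """Return connector names matching the requested flags.

    A flag left as None is ignored, so calling with no arguments
    returns every connector in table order.
    """
    return [
        name
        for name, caps in CONNECTORS.items()
        if (preview is None or caps.preview == preview)
        and (source_query is None or caps.source_query == source_query)
    ]


if __name__ == "__main__":
    # List the connectors that support both preview and source query.
    for name in connectors_supporting(preview=True, source_query=True):
        print(name)
```

For example, `connectors_supporting(preview=False)` lists the connectors without schema preview, which in the table are Amazon Kinesis Data Stream, Amazon Redshift, Google Storage, and Oracle.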
To connect a data source to the Automated Data Analytics on AWS solution, first create a domain, and then create a data product to import the dataset. You choose the data source when you create the data product.
Refer to the following sections for more information on each connector type and the steps to ingest data from these sources.
Note
The data connectors guide only shows the steps involved in importing a data source.