Importing data in Amazon SageMaker Canvas

You can import data into Amazon SageMaker Canvas from Amazon S3, your local machine, and external data sources. A dataset can have a maximum of 1000 columns. Currently, you can import only comma-delimited .csv files. A newline character in a .csv file must appear only at the end of a row, never inside a field. You can use an imported dataset to build a model and then make predictions on other datasets.
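Before uploading, it can help to check a file locally against the constraints described above. The following is a minimal sketch using only the Python standard library. It is not part of SageMaker Canvas, and the function name check_csv_for_canvas is a hypothetical helper introduced for illustration. It assumes the first record is a header row and that every record should have the same number of fields.

```python
import csv
import io

MAX_COLUMNS = 1000  # column limit stated in this topic


def check_csv_for_canvas(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical local pre-check, not a SageMaker Canvas API.
    """
    problems = []
    # newline="" keeps line endings untranslated so the csv module can
    # recognize newlines that sit inside quoted fields.
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    if len(header) > MAX_COLUMNS:
        problems.append(
            f"{len(header)} columns exceeds the limit of {MAX_COLUMNS}"
        )
    # Record numbers count parsed records (header = 1), not physical lines.
    for record_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"record {record_number}: expected {len(header)} fields, "
                f"found {len(row)}"
            )
        if any("\n" in field or "\r" in field for field in row):
            problems.append(
                f"record {record_number}: a field contains a newline character"
            )
    return problems
```

To check a file on disk, open it with open(path, newline="", encoding="utf-8"), pass the result of read() to the function, and fix any reported records before importing. An empty result means only that this sketch found no problems, not that the import is guaranteed to succeed.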

You can import data from the following external data sources:

  • An Amazon S3 bucket from an external account

  • An Amazon Redshift database

  • Snowflake

You can import columns of the following data types:

  • Categorical

  • Numeric

  • Text

  • Datetime

To import data from an external data source, create a connection. For more information, see Connect to an external data source.

To import data from multiple files on your local machine or from multiple Amazon S3 locations, import the data from each source and then join the imported datasets into a single dataset. For information about joining datasets, see Join data that you've imported into SageMaker Canvas.

SageMaker Canvas provides several sample datasets in your application to help you get started. To learn more about the SageMaker-provided sample datasets you can experiment with, see Use sample datasets.