Managing the training data in your datasets - Amazon Personalize

After you import data into a dataset, you can do the following:

  • Update the data in the dataset as your catalog grows. This helps maintain and improve the relevance of Amazon Personalize recommendations. You can import more data with bulk or individual data import operations. For more information, see Importing more training data into datasets.

  • Analyze the training data in the dataset. Data insights and column and row statistics describe your data and suggest actions you can take to improve it. These actions can help you meet Amazon Personalize resource requirements, such as model training requirements, or they can lead to improved recommendations. For more information, see Analyzing quality and quantity of data in datasets.

  • Export the data to an Amazon S3 bucket. You might export data to verify and inspect the data that Amazon Personalize uses to generate recommendations, view the item interaction events that you previously recorded in real time, or perform offline analysis on your data. For more information, see Exporting the training data in a dataset to Amazon S3.

  • For Items and Users datasets, you can replace the dataset's schema to add new columns of data. You might replace a dataset's schema if your data structure changed after you created the dataset. For more information, see Replacing a dataset's schema to add new columns.

  • Delete data. You can delete all of the data in a dataset, or you can delete specific users and their data, including their metadata and interactions data, from a dataset group. For more information, see Deleting a dataset to delete all of its data and Deleting users and their data with a data deletion job.
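
As an illustration of two of the operations above, the following is a minimal sketch, not a definitive implementation, of how you might prepare a bulk import that adds to existing data and an export to Amazon S3 with the AWS SDK for Python (Boto3). The helper names (build_incremental_import_request, build_export_request, submit_jobs) are hypothetical, and every ARN and S3 path shown is a placeholder. The sketch assumes an IAM role that Amazon Personalize can use to read from and write to the bucket.

```python
VALID_INGESTION_MODES = {"BULK", "PUT", "ALL"}


def build_incremental_import_request(job_name, dataset_arn, s3_path, role_arn):
    """Build keyword arguments for a bulk import that appends to existing data."""
    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        "dataSource": {"dataLocation": s3_path},
        "roleArn": role_arn,
        # INCREMENTAL appends records; FULL would overwrite existing bulk data.
        "importMode": "INCREMENTAL",
    }


def build_export_request(job_name, dataset_arn, s3_path, role_arn, ingestion_mode="ALL"):
    """Build keyword arguments for exporting a dataset to an S3 location."""
    if ingestion_mode not in VALID_INGESTION_MODES:
        raise ValueError(f"ingestion_mode must be one of {sorted(VALID_INGESTION_MODES)}")
    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        # BULK: bulk-imported data only, PUT: individually imported data only, ALL: both.
        "ingestionMode": ingestion_mode,
        "roleArn": role_arn,
        "jobOutput": {"s3DataDestination": {"path": s3_path}},
    }


def submit_jobs(import_request, export_request, region_name):
    """Submit both jobs. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builders above work without the SDK installed

    client = boto3.client("personalize", region_name=region_name)
    import_arn = client.create_dataset_import_job(**import_request)["datasetImportJobArn"]
    export_arn = client.create_dataset_export_job(**export_request)["datasetExportJobArn"]
    return import_arn, export_arn


if __name__ == "__main__":
    # Placeholder values for illustration only; nothing is sent to AWS here.
    export_request = build_export_request(
        "example-export",
        "arn:aws:personalize:us-west-2:111122223333:dataset/example/INTERACTIONS",
        "s3://amzn-s3-demo-bucket/exports/",
        "arn:aws:iam::111122223333:role/ExamplePersonalizeRole",
    )
    print(export_request)
```

Keeping the request builders separate from the submission step lets you review or log the parameters before any job is created. Both jobs run asynchronously, so you would check their status before relying on the imported or exported data.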