Step 1: Creating a dataset group - Amazon Personalize

A dataset group is a container for Amazon Personalize components, including datasets, event trackers, solutions, filters, campaigns, and batch inference jobs. A dataset group organizes your resources into independent collections, so resources from one dataset group cannot influence resources in any other dataset group.

For example, you might have one application that provides recommendations for streaming video and another that provides recommendations for audio books. In Amazon Personalize, each application would have its own dataset group. You can create a dataset group with the Amazon Personalize console, the AWS Command Line Interface (AWS CLI), or the AWS SDKs.
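As a rough illustration of keeping one dataset group per application, the sketch below looks up each name with ListDatasetGroups and calls CreateDatasetGroup only for names that are missing. The helper names `find_dataset_group_arn` and `ensure_dataset_groups` and the example group names are hypothetical; `client` is assumed to be a Boto3 Personalize client (or any object exposing the same two methods).

```python
def find_dataset_group_arn(client, name):
    """Return the ARN of the dataset group called `name`, or None if absent."""
    kwargs = {}
    while True:
        page = client.list_dataset_groups(**kwargs)
        for group in page.get("datasetGroups", []):
            if group["name"] == name:
                return group["datasetGroupArn"]
        token = page.get("nextToken")
        if not token:
            return None
        # Follow pagination until every dataset group has been checked.
        kwargs = {"nextToken": token}


def ensure_dataset_groups(client, names):
    """Map each name to a dataset group ARN, creating groups that are missing."""
    arns = {}
    for name in names:
        arn = find_dataset_group_arn(client, name)
        if arn is None:
            arn = client.create_dataset_group(name=name)["datasetGroupArn"]
        arns[name] = arn
    return arns


# Usage against the real service (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("personalize")
#   ensure_dataset_groups(client, ["video-recommendations", "audiobook-recommendations"])
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub before running them against an AWS account.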

Creating a dataset group (console)

Create a dataset group by specifying the dataset group name in the Amazon Personalize console.

To create a dataset group

  1. Open the Amazon Personalize console at https://console.aws.amazon.com/personalize/home and sign in to your account.

  2. Choose Create dataset group.

  3. If this is your first time using Amazon Personalize, on the Create dataset group page, in New dataset group, choose Get started.

  4. In Dataset group details, for Dataset group name, specify a name for your dataset group.

  5. Choose Next. The Create user-item interaction data page appears. You are now ready to add a dataset with an associated schema to your dataset group. See Creating a dataset and a schema (console).

Creating a dataset group (AWS CLI)

Create a dataset group with the following command. For more information about the CreateDatasetGroup API operation, see CreateDatasetGroup in the API reference section.

aws personalize create-dataset-group --name <dataset-group-name>

The dataset group Amazon Resource Name (ARN) is displayed as shown in the following example.

{
  "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/DatasetGroupName"
}

Record this value for future use. To display the dataset group that you created, use the describe-dataset-group command and specify the returned dataset group ARN.

aws personalize describe-dataset-group \
  --dataset-group-arn <dataset-group-arn>

The dataset group and its properties are displayed, as shown in the following example.

{
  "datasetGroup": {
    "name": "DatasetGroupName",
    "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/DatasetGroupName",
    "status": "ACTIVE",
    "creationDateTime": 1542392161.262,
    "lastUpdatedDateTime": 1542396513.377
  }
}

When the dataset group's status is ACTIVE, proceed to Creating a dataset and a schema (AWS CLI).
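If you script around the AWS CLI, the describe-dataset-group output can be checked programmatically. The following is a minimal sketch, assuming the JSON shape shown in the example output above; the function name `summarize_describe_output` and the `ready` key are hypothetical. It treats the two timestamp fields as seconds since the Unix epoch and converts them to UTC.

```python
import json
from datetime import datetime, timezone


def summarize_describe_output(raw_json):
    """Extract name, ARN, status, and UTC timestamps from describe-dataset-group output."""
    group = json.loads(raw_json)["datasetGroup"]
    return {
        "name": group["name"],
        "arn": group["datasetGroupArn"],
        "status": group["status"],
        # Only an ACTIVE dataset group is ready for the next step.
        "ready": group["status"] == "ACTIVE",
        "created": datetime.fromtimestamp(group["creationDateTime"], tz=timezone.utc),
        "updated": datetime.fromtimestamp(group["lastUpdatedDateTime"], tz=timezone.utc),
    }


# Sample copied from the describe-dataset-group example output.
sample = """
{
  "datasetGroup": {
    "name": "DatasetGroupName",
    "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/DatasetGroupName",
    "status": "ACTIVE",
    "creationDateTime": 1542392161.262,
    "lastUpdatedDateTime": 1542396513.377
  }
}
"""

summary = summarize_describe_output(sample)
print(summary["status"], summary["ready"], summary["created"].isoformat())
```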

Creating a dataset group (AWS SDKs)

The following code shows how to create a dataset group with the AWS SDK for Python (Boto3) or the SDK for Java 2.x. For more information about the API operation, see CreateDatasetGroup in the API reference section.

SDK for Python (Boto3)

import boto3

personalize = boto3.client('personalize')

# Create the dataset group and keep its ARN for later calls.
response = personalize.create_dataset_group(name='dataset group name')
dsg_arn = response['datasetGroupArn']

# Retrieve the dataset group's properties, including its current status.
description = personalize.describe_dataset_group(datasetGroupArn=dsg_arn)['datasetGroup']

print('Name: ' + description['name'])
print('ARN: ' + description['datasetGroupArn'])
print('Status: ' + description['status'])

SDK for Java 2.x

// Imports needed by the method below. Place the method inside a class of your own.
import java.time.Instant;

import software.amazon.awssdk.services.personalize.PersonalizeClient;
import software.amazon.awssdk.services.personalize.model.CreateDatasetGroupRequest;
import software.amazon.awssdk.services.personalize.model.DescribeDatasetGroupRequest;
import software.amazon.awssdk.services.personalize.model.PersonalizeException;

public static void createDatasetGroup(PersonalizeClient personalizeClient, String datasetGroupName) {

    long waitInMilliseconds = 60 * 1000; // pause between status checks

    try {
        CreateDatasetGroupRequest createDatasetGroupRequest = CreateDatasetGroupRequest.builder()
                .name(datasetGroupName)
                .build();

        String datasetGroupArn = personalizeClient.createDatasetGroup(createDatasetGroupRequest)
                .datasetGroupArn();

        long maxTime = Instant.now().getEpochSecond() + (15 * 60); // 15 minutes

        DescribeDatasetGroupRequest describeRequest = DescribeDatasetGroupRequest.builder()
                .datasetGroupArn(datasetGroupArn)
                .build();

        String status = null;

        // Poll until the dataset group is ACTIVE, creation fails, or the time limit passes.
        while (Instant.now().getEpochSecond() < maxTime) {
            status = personalizeClient.describeDatasetGroup(describeRequest)
                    .datasetGroup()
                    .status();
            System.out.println("DatasetGroup status:" + status);

            if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) {
                break;
            }
            try {
                Thread.sleep(waitInMilliseconds);
            } catch (InterruptedException e) {
                System.out.println(e.getMessage());
            }
        }
    } catch (PersonalizeException e) {
        System.out.println(e.awsErrorDetails().errorMessage());
    }
}

The DescribeDatasetGroup operation returns the dataset group's properties, including its datasetGroupArn and status. When the dataset group's status is ACTIVE, proceed to Creating a dataset and a schema (AWS SDKs).
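The Python example prints the status once, whereas the Java example polls until the dataset group is ready. A Python polling loop along the same lines might look like the sketch below. The function name `wait_for_dataset_group`, its parameters, and the choice to raise `TimeoutError` after the time limit are assumptions made for illustration; the 60-second interval and 15-minute limit mirror the Java example. `client` is assumed to be a Boto3 Personalize client.

```python
import time


def wait_for_dataset_group(client, dataset_group_arn, timeout_seconds=15 * 60,
                           poll_seconds=60, sleep=time.sleep, clock=time.monotonic):
    """Poll DescribeDatasetGroup until the status is ACTIVE or CREATE FAILED."""
    deadline = clock() + timeout_seconds
    status = None
    while clock() < deadline:
        status = client.describe_dataset_group(
            datasetGroupArn=dataset_group_arn
        )["datasetGroup"]["status"]
        print("DatasetGroup status: " + status)
        if status in ("ACTIVE", "CREATE FAILED"):
            return status
        # Not finished yet: wait before asking again.
        sleep(poll_seconds)
    raise TimeoutError(
        "Dataset group did not finish creating in time (last status: %s)" % status
    )


# Usage against the real service (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("personalize")
#   arn = client.create_dataset_group(name="my-dataset-group")["datasetGroupArn"]
#   if wait_for_dataset_group(client, arn) == "ACTIVE":
#       print("Ready to create a dataset and schema")
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays; in normal use the defaults are enough.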