Step 1: Creating a Custom dataset group - Amazon Personalize

Step 1: Creating a Custom dataset group

A Custom dataset group is a container for Amazon Personalize components and custom resources, including datasets, event trackers, solutions, filters, campaigns, and batch inference jobs. A dataset group organizes your resources into independent collections, so resources from one dataset group cannot influence resources in any other dataset group.

For example, you might have one application that provides recommendations for streaming video and another that provides recommendations for audio books. In Amazon Personalize, each application would have its own dataset group. You can create a dataset group with the Amazon Personalize console, the AWS Command Line Interface (AWS CLI), or the AWS SDKs.
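As a rough sketch of the one-dataset-group-per-application idea, the following Python helper creates one dataset group for each application name and records the returned ARNs. The helper name create_groups_for_apps and the example application names are hypothetical. The client argument is assumed to be a Boto3 Personalize client, such as boto3.client('personalize').

```python
def create_groups_for_apps(client, app_names):
    """Create one dataset group per application and map each name to its ARN.

    `client` is assumed to be a Boto3 Personalize client, for example
    boto3.client('personalize'). The function name is illustrative only.
    """
    arns = {}
    for name in app_names:
        # Each call creates an independent dataset group for one application.
        response = client.create_dataset_group(name=name)
        arns[name] = response['datasetGroupArn']
    return arns
```

With a real client, a call might look like create_groups_for_apps(boto3.client('personalize'), ['streaming-video', 'audio-books']). Keeping the ARNs in a dictionary makes it easy to address each application's resources separately later.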

Creating a dataset group (console)

Create a dataset group by specifying the dataset group name in the Amazon Personalize console.

To create a dataset group
  1. Open the Amazon Personalize console at https://console.aws.amazon.com/personalize/home and sign in to your account.

  2. Choose Create dataset group.

  3. If this is your first time using Amazon Personalize, on the Create dataset group page, in New dataset group, choose Get started.

  4. In Dataset group details, for Dataset group name, specify a name for your dataset group.

  5. For Domain, choose Custom.

  6. For Tags, optionally add any tags. For more information about tagging Amazon Personalize resources, see Tagging Amazon Personalize resources.

  7. Choose Next. The Create user-item interaction data page displays. You are now ready to add a dataset with an associated schema to your dataset group. See Creating a dataset and a schema (console).

Creating a dataset group (AWS CLI)

Create a dataset group with the following command. For more information about the CreateDatasetGroup API operation, see CreateDatasetGroup in the API reference section. You can optionally use the Tags parameter to tag resources in Amazon Personalize. For a sample, see Adding tags (AWS CLI).

aws personalize create-dataset-group --name dataset group name

The dataset group Amazon Resource Name (ARN) is displayed as shown in the following example.

{
  "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/DatasetGroupName"
}

Record this value for future use. To display the dataset group that you created, use the describe-dataset-group command and specify the returned dataset group ARN.

aws personalize describe-dataset-group \
  --dataset-group-arn dataset group arn

The dataset group and its properties are displayed, as shown in the following example.

{
  "datasetGroup": {
    "name": "DatasetGroupName",
    "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/DatasetGroupName",
    "status": "ACTIVE",
    "creationDateTime": 1542392161.262,
    "lastUpdatedDateTime": 1542396513.377
  }
}
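If you capture this output in a script, a short Python sketch like the following can pull out the fields you need. It assumes the timestamps are Unix epoch seconds, as their format in the example suggests, and that the Region is the fourth colon-separated field of the ARN. The variable names are illustrative.

```python
import json
from datetime import datetime, timezone

# Sample output copied from the describe-dataset-group example.
raw = '''
{
  "datasetGroup": {
    "name": "DatasetGroupName",
    "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/DatasetGroupName",
    "status": "ACTIVE",
    "creationDateTime": 1542392161.262,
    "lastUpdatedDateTime": 1542396513.377
  }
}
'''

group = json.loads(raw)['datasetGroup']

# Assumption: the timestamps are seconds since the Unix epoch.
created = datetime.fromtimestamp(group['creationDateTime'], tz=timezone.utc)
updated = datetime.fromtimestamp(group['lastUpdatedDateTime'], tz=timezone.utc)

# ARN layout: arn:partition:service:region:account:resource
region = group['datasetGroupArn'].split(':')[3]

print(group['status'], region, created.date(), updated.date())
```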

When the dataset group's status is ACTIVE, proceed to Creating a dataset and a schema (AWS CLI).

Creating a dataset group (AWS SDKs)

The following code shows how to create a Custom dataset group. For more information about the API operation, see CreateDatasetGroup in the API reference section. You can optionally use the Tags parameter to tag resources in Amazon Personalize. For a sample, see Adding tags (AWS SDKs).

SDK for Python (Boto3)
import boto3

personalize = boto3.client('personalize')

response = personalize.create_dataset_group(name = 'dataset group name')
dsg_arn = response['datasetGroupArn']

description = personalize.describe_dataset_group(datasetGroupArn = dsg_arn)['datasetGroup']

print('Name: ' + description['name'])
print('ARN: ' + description['datasetGroupArn'])
print('Status: ' + description['status'])
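For the optional Tags parameter mentioned earlier, a minimal Boto3 sketch might look like the following. It assumes that create_dataset_group accepts a tags argument shaped as a list of dictionaries with tagKey and tagValue entries. The helper names to_personalize_tags and create_tagged_dataset_group are hypothetical, and client is assumed to be boto3.client('personalize').

```python
def to_personalize_tags(tags):
    """Convert a plain dict into the list-of-dicts shape assumed for `tags`."""
    return [{'tagKey': key, 'tagValue': value} for key, value in tags.items()]


def create_tagged_dataset_group(client, name, tags):
    """Create a dataset group with tags and return its ARN.

    `client` is assumed to be a Boto3 Personalize client. Both helper
    names are illustrative and not part of any AWS SDK.
    """
    response = client.create_dataset_group(
        name=name,
        tags=to_personalize_tags(tags),
    )
    return response['datasetGroupArn']
```

Accepting a plain dictionary keeps call sites short, for example create_tagged_dataset_group(personalize, 'dataset group name', {'project': 'video'}).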
SDK for Java 2.x
import java.time.Instant;
import software.amazon.awssdk.services.personalize.PersonalizeClient;
import software.amazon.awssdk.services.personalize.model.CreateDatasetGroupRequest;
import software.amazon.awssdk.services.personalize.model.DescribeDatasetGroupRequest;
import software.amazon.awssdk.services.personalize.model.PersonalizeException;

// Creates a dataset group, then polls its status once a minute for up to 15 minutes.
public static void createDatasetGroup(PersonalizeClient personalizeClient, String datasetGroupName) {

    long waitInMilliseconds = 60 * 1000;

    try {
        CreateDatasetGroupRequest createDatasetGroupRequest = CreateDatasetGroupRequest.builder()
                .name(datasetGroupName)
                .build();

        String datasetGroupArn = personalizeClient.createDatasetGroup(createDatasetGroupRequest)
                .datasetGroupArn();

        long maxTime = Instant.now().getEpochSecond() + (15 * 60); // 15 minutes

        DescribeDatasetGroupRequest describeRequest = DescribeDatasetGroupRequest.builder()
                .datasetGroupArn(datasetGroupArn)
                .build();

        String status = null;

        while (Instant.now().getEpochSecond() < maxTime) {

            status = personalizeClient.describeDatasetGroup(describeRequest)
                    .datasetGroup()
                    .status();

            System.out.println("DatasetGroup status:" + status);

            // Stop polling once creation has finished, successfully or not.
            if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) {
                break;
            }
            try {
                Thread.sleep(waitInMilliseconds);
            } catch (InterruptedException e) {
                System.out.println(e.getMessage());
            }
        }
    } catch(PersonalizeException e) {
        System.out.println(e.awsErrorDetails().errorMessage());
    }
}
SDK for JavaScript v3
// Get service clients module and commands using ES6 syntax.
import { CreateDatasetGroupCommand } from "@aws-sdk/client-personalize";
import { personalizeClient } from "./libs/personalizeClients.js";

// Or, create the client here.
// const personalizeClient = new PersonalizeClient({ region: "REGION"});

// Set the dataset group parameters.
export const createDatasetGroupParam = {
  name: 'NAME' /* required */
}

export const run = async (createDatasetGroupParam) => {
  try {
    const response = await personalizeClient.send(new CreateDatasetGroupCommand(createDatasetGroupParam));
    console.log("Success", response);
    return "Run successfully"; // For unit tests.
  } catch (err) {
    console.log("Error", err);
  }
};
run(createDatasetGroupParam);

The DescribeDatasetGroup operation returns the datasetGroupArn and the status of the operation. When the dataset group's status is ACTIVE, proceed to Creating a dataset and a schema (AWS SDKs).
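The Python example prints the status once, whereas the Java example polls until creation finishes. A comparable polling loop in Python could be sketched as follows. The function name wait_until_active, its default timings (mirroring the Java example's 15-minute limit and 60-second interval), and the injectable sleep argument are assumptions for illustration. The client argument is assumed to be boto3.client('personalize').

```python
import time


def wait_until_active(client, dataset_group_arn,
                      max_wait_seconds=15 * 60, poll_seconds=60,
                      sleep=time.sleep):
    """Poll DescribeDatasetGroup until the status is ACTIVE or CREATE FAILED.

    Returns the final status, or raises TimeoutError if neither status is
    reached in time. `client` is assumed to be a Boto3 Personalize client;
    the function name and defaults are illustrative.
    """
    deadline = time.monotonic() + max_wait_seconds
    while True:
        description = client.describe_dataset_group(
            datasetGroupArn=dataset_group_arn)['datasetGroup']
        status = description['status']
        # The same two terminal statuses that the Java example checks for.
        if status in ('ACTIVE', 'CREATE FAILED'):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError('Dataset group is still ' + status)
        sleep(poll_seconds)
```

The sleep argument exists only so the loop can be exercised in tests without real delays. In normal use you would leave it at its default and continue only when the function returns ACTIVE.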