CreateFeatureGroup - Amazon SageMaker

CreateFeatureGroup

Create a new FeatureGroup. A FeatureGroup is a group of Features defined in the FeatureStore to describe a Record.

The FeatureGroup defines the schema and features contained in the FeatureGroup. A FeatureGroup definition is composed of a list of Features, a RecordIdentifierFeatureName, an EventTimeFeatureName and configurations for its OnlineStore and OfflineStore. Check AWS service quotas to see the FeatureGroups quota for your AWS account.

Note that it can take approximately 10-15 minutes to provision an OnlineStore FeatureGroup with the InMemory StorageType.

Important

You must include at least one of OnlineStoreConfig and OfflineStoreConfig to create a FeatureGroup.
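The rule above can be checked on the client side before a request is sent. The following is a minimal sketch, assuming the request is assembled as a Python dictionary keyed by the parameter names listed under Request Syntax. The helper name `has_store_config` and every sample value are hypothetical.

```python
def has_store_config(request: dict) -> bool:
    """Return True if the request names at least one of the two store configs."""
    return "OnlineStoreConfig" in request or "OfflineStoreConfig" in request


# Hypothetical request; all names and values are placeholders.
request = {
    "FeatureGroupName": "customers",
    "RecordIdentifierFeatureName": "customer_id",
    "EventTimeFeatureName": "event_time",
    "FeatureDefinitions": [
        {"FeatureName": "customer_id", "FeatureType": "String"},
        {"FeatureName": "event_time", "FeatureType": "String"},
    ],
    "OnlineStoreConfig": {"EnableOnlineStore": True},
}

# The same request with the only store config removed fails the check.
missing_both = {k: v for k, v in request.items() if k != "OnlineStoreConfig"}

print(has_store_config(request))       # True
print(has_store_config(missing_both))  # False
```

With boto3, a dictionary like this would typically be passed as keyword arguments, for example `boto3.client("sagemaker").create_feature_group(**request)`.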

Request Syntax

{
   "Description": "string",
   "EventTimeFeatureName": "string",
   "FeatureDefinitions": [
      {
         "CollectionConfig": { ... },
         "CollectionType": "string",
         "FeatureName": "string",
         "FeatureType": "string"
      }
   ],
   "FeatureGroupName": "string",
   "OfflineStoreConfig": {
      "DataCatalogConfig": {
         "Catalog": "string",
         "Database": "string",
         "TableName": "string"
      },
      "DisableGlueTableCreation": boolean,
      "S3StorageConfig": {
         "KmsKeyId": "string",
         "ResolvedOutputS3Uri": "string",
         "S3Uri": "string"
      },
      "TableFormat": "string"
   },
   "OnlineStoreConfig": {
      "EnableOnlineStore": boolean,
      "SecurityConfig": {
         "KmsKeyId": "string"
      },
      "StorageType": "string",
      "TtlDuration": {
         "Unit": "string",
         "Value": number
      }
   },
   "RecordIdentifierFeatureName": "string",
   "RoleArn": "string",
   "Tags": [
      {
         "Key": "string",
         "Value": "string"
      }
   ],
   "ThroughputConfig": {
      "ProvisionedReadCapacityUnits": number,
      "ProvisionedWriteCapacityUnits": number,
      "ThroughputMode": "string"
   }
}

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

Description

A free-form description of a FeatureGroup.

Type: String

Length Constraints: Maximum length of 128.

Required: No

EventTimeFeatureName

The name of the feature that stores the EventTime of a Record in a FeatureGroup.

An EventTime is a point in time when a new event occurs that corresponds to the creation or update of a Record in a FeatureGroup. All Records in the FeatureGroup must have a corresponding EventTime.

An EventTime can be a String or Fractional.

  • Fractional: EventTime feature values must be a Unix timestamp in seconds.

  • String: EventTime feature values must be an ISO-8601 string. The following formats are supported: yyyy-MM-dd'T'HH:mm:ssZ and yyyy-MM-dd'T'HH:mm:ss.SSSZ, where yyyy, MM, and dd represent the year, month, and day respectively, and HH, mm, ss, and, if applicable, SSS represent the hour, minute, second, and milliseconds respectively. 'T' and Z are constants.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 64.

Pattern: ^[a-zA-Z0-9]([-_]*[a-zA-Z0-9]){0,63}

Required: Yes
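The two EventTime representations described above can be produced from a Python `datetime`. This is a sketch under stated assumptions: the helper names are hypothetical, and the value is converted to UTC before formatting because the formats end in a constant Z.

```python
from datetime import datetime, timezone


def event_time_fractional(dt: datetime) -> float:
    """Unix timestamp in seconds, for an EventTime feature of type Fractional."""
    return dt.timestamp()


def event_time_string(dt: datetime, millis: bool = False) -> str:
    """ISO-8601 string for an EventTime feature of type String.

    Assumes dt is timezone-aware; it is converted to UTC so that the
    constant 'Z' suffix is accurate.
    """
    utc = dt.astimezone(timezone.utc)
    text = utc.strftime("%Y-%m-%dT%H:%M:%S")
    if millis:
        # strftime has no millisecond directive, so derive it from microseconds.
        text += f".{utc.microsecond // 1000:03d}"
    return text + "Z"


sample = datetime(2024, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc)
print(event_time_string(sample))               # 2024-01-02T03:04:05Z
print(event_time_string(sample, millis=True))  # 2024-01-02T03:04:05.678Z
print(event_time_fractional(sample))
```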

FeatureDefinitions

A list of Feature names and types. A name and a type are required for each Feature.

Valid FeatureTypes are Integral, Fractional, and String.

FeatureNames cannot be any of the following: is_deleted, write_time, api_invocation_time.

You can create up to 2,500 FeatureDefinitions per FeatureGroup.

Type: Array of FeatureDefinition objects

Array Members: Minimum number of 1 item. Maximum number of 2500 items.

Required: Yes
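The constraints on FeatureDefinitions listed above can be checked locally before calling the API. The sketch below is illustrative only: the function name and the wording of the messages are hypothetical, and it covers just the rules stated in this section (reserved names, valid types, and the 1 to 2,500 item limit).

```python
RESERVED_NAMES = {"is_deleted", "write_time", "api_invocation_time"}
VALID_TYPES = {"Integral", "Fractional", "String"}
MAX_DEFINITIONS = 2500


def check_feature_definitions(definitions: list) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not 1 <= len(definitions) <= MAX_DEFINITIONS:
        problems.append(
            f"expected 1 to {MAX_DEFINITIONS} definitions, got {len(definitions)}"
        )
    for index, definition in enumerate(definitions):
        name = definition.get("FeatureName")
        feature_type = definition.get("FeatureType")
        if not name:
            problems.append(f"definition {index}: FeatureName is missing")
        elif name in RESERVED_NAMES:
            problems.append(f"definition {index}: '{name}' is a reserved name")
        if feature_type not in VALID_TYPES:
            problems.append(f"definition {index}: invalid FeatureType {feature_type!r}")
    return problems


good = [{"FeatureName": "customer_id", "FeatureType": "String"}]
bad = [{"FeatureName": "write_time", "FeatureType": "Boolean"}]
print(check_feature_definitions(good))  # []
print(check_feature_definitions(bad))
```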

FeatureGroupName

The name of the FeatureGroup. The name must be unique within an AWS Region in an AWS account.

The name:

  • Must start and end with an alphanumeric character.

  • Can only include alphanumeric characters, underscores, and hyphens. Spaces are not allowed.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 64.

Pattern: ^[a-zA-Z0-9]([_-]*[a-zA-Z0-9]){0,63}

Required: Yes
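The naming rules and the pattern above can be applied with a regular expression. This is a minimal sketch, not the service's own validation: the helper name is hypothetical, and `fullmatch` is used so that the name must also end with an alphanumeric character, as the bullet list requires. The same pattern appears for EventTimeFeatureName and RecordIdentifierFeatureName.

```python
import re

# Character rules copied from the documented pattern; anchoring is done by fullmatch.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9]([_-]*[a-zA-Z0-9]){0,63}")


def is_valid_name(name: str) -> bool:
    """Check length (1 to 64) and the documented character rules."""
    return 1 <= len(name) <= 64 and NAME_PATTERN.fullmatch(name) is not None


for candidate in ["customers-v2", "-leading-hyphen", "trailing_", "has space"]:
    print(candidate, is_valid_name(candidate))
```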

OfflineStoreConfig

Use this to configure an OfflineFeatureStore. This parameter allows you to specify:

  • The Amazon Simple Storage Service (Amazon S3) location of an OfflineStore.

  • A configuration for an AWS Glue or AWS Hive data catalog.

  • A KMS encryption key to encrypt the Amazon S3 location used for the OfflineStore. If a KMS encryption key is not specified, by default we encrypt all data at rest using an AWS KMS key. By defining your bucket-level key for SSE, you can reduce AWS KMS request costs by up to 99 percent.

  • Format for the offline store table. Supported formats are Glue (Default) and Apache Iceberg.

To learn more about this parameter, see OfflineStoreConfig.

Type: OfflineStoreConfig object

Required: No
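As an illustration of how the options above map onto the request, the sketch below assembles an OfflineStoreConfig dictionary using the field names from the Request Syntax. The helper name, the bucket URI, and the TableFormat value "Iceberg" are assumptions; confirm the accepted TableFormat strings in the OfflineStoreConfig reference.

```python
def offline_store_config(s3_uri, kms_key_id=None, table_format=None, data_catalog=None):
    """Build an OfflineStoreConfig dict, including only the options that were given."""
    config = {"S3StorageConfig": {"S3Uri": s3_uri}}
    if kms_key_id is not None:
        config["S3StorageConfig"]["KmsKeyId"] = kms_key_id
    if table_format is not None:
        config["TableFormat"] = table_format
    if data_catalog is not None:
        # Expected keys per the Request Syntax: Catalog, Database, TableName.
        config["DataCatalogConfig"] = data_catalog
    return config


# Hypothetical bucket; "Iceberg" is an assumed TableFormat value.
example = offline_store_config("s3://example-bucket/feature-store", table_format="Iceberg")
print(example)
```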

OnlineStoreConfig

You can turn the OnlineStore on or off by setting the EnableOnlineStore flag in OnlineStoreConfig to True or False. The default value is False.

You can also include an AWS KMS key ID (KmsKeyId) for at-rest encryption of the OnlineStore.

Type: OnlineStoreConfig object

Required: No
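A sketch of an OnlineStoreConfig dictionary, using the field names from the Request Syntax, might look like the following. The helper name and the sample values are hypothetical. "InMemory" is the storage type mentioned earlier on this page; the TtlDuration unit "Days" is an assumption to verify against the OnlineStoreConfig reference.

```python
def online_store_config(enable=True, kms_key_id=None, storage_type=None, ttl=None):
    """Build an OnlineStoreConfig dict; ttl is an optional (unit, value) pair."""
    config = {"EnableOnlineStore": enable}
    if kms_key_id is not None:
        config["SecurityConfig"] = {"KmsKeyId": kms_key_id}
    if storage_type is not None:
        config["StorageType"] = storage_type
    if ttl is not None:
        unit, value = ttl
        config["TtlDuration"] = {"Unit": unit, "Value": value}
    return config


# Hypothetical values; the "Days" unit is an assumption.
example = online_store_config(storage_type="InMemory", ttl=("Days", 30))
print(example)
```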

RecordIdentifierFeatureName

The name of the Feature whose value uniquely identifies a Record defined in the FeatureStore. Only the latest record per identifier value will be stored in the OnlineStore. RecordIdentifierFeatureName must match the name of one of the feature definitions.

You use the RecordIdentifierFeatureName to access data in a FeatureStore.

This name:

  • Must start and end with an alphanumeric character.

  • Can only contain alphanumeric characters, hyphens, and underscores. Spaces are not allowed.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 64.

Pattern: ^[a-zA-Z0-9]([-_]*[a-zA-Z0-9]){0,63}

Required: Yes
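Two statements above lend themselves to a short illustration: the identifier must name a defined feature, and only the latest record per identifier value is kept in the OnlineStore. The sketch below is a conceptual model, not the service's implementation. The function names are hypothetical, and it assumes "latest" means the greatest EventTime value.

```python
def identifier_is_defined(request: dict) -> bool:
    """True if RecordIdentifierFeatureName matches one of the FeatureDefinitions."""
    defined = {d["FeatureName"] for d in request.get("FeatureDefinitions", [])}
    return request.get("RecordIdentifierFeatureName") in defined


def latest_per_identifier(records: list, id_name: str, time_name: str) -> dict:
    """Keep one record per identifier value: the one with the greatest event time."""
    latest = {}
    for record in records:
        key = record[id_name]
        if key not in latest or record[time_name] >= latest[key][time_name]:
            latest[key] = record
    return latest


# Hypothetical records with Fractional event times.
records = [
    {"customer_id": "c1", "event_time": 100.0, "score": 1},
    {"customer_id": "c1", "event_time": 200.0, "score": 2},
    {"customer_id": "c2", "event_time": 150.0, "score": 3},
]
kept = latest_per_identifier(records, "customer_id", "event_time")
print(kept["c1"]["score"])  # 2
```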

RoleArn

The Amazon Resource Name (ARN) of the IAM execution role used to persist data into the OfflineStore if an OfflineStoreConfig is provided.

Type: String

Length Constraints: Minimum length of 20. Maximum length of 2048.

Pattern: ^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$

Required: No
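The length constraints and pattern above can be applied locally to catch malformed role ARNs early. This is a sketch with a hypothetical helper name and made-up ARNs; it checks only the format, not whether the role exists or has the required permissions.

```python
import re

# Pattern copied from the RoleArn constraints above.
ROLE_ARN_PATTERN = re.compile(
    r"^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$"
)


def is_valid_role_arn(arn: str) -> bool:
    """Check the documented length range (20 to 2048) and pattern."""
    return 20 <= len(arn) <= 2048 and ROLE_ARN_PATTERN.fullmatch(arn) is not None


print(is_valid_role_arn("arn:aws:iam::123456789012:role/FeatureStoreRole"))  # True
print(is_valid_role_arn("arn:aws:iam::12345:role/FeatureStoreRole"))         # False
```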

Tags

Tags used to identify Features in each FeatureGroup.

Type: Array of Tag objects

Array Members: Minimum number of 0 items. Maximum number of 50 items.

Required: No

ThroughputConfig

Used to set the feature group throughput configuration. There are two modes: ON_DEMAND and PROVISIONED. With on-demand mode, you are charged for the data reads and writes that your application performs on your feature group. You do not need to specify read and write throughput because Feature Store accommodates your workloads as they ramp up and down. You can switch a feature group to on-demand only once in a 24-hour period. With provisioned throughput mode, you specify the read and write capacity per second that you expect your application to require, and you are billed based on those limits. Exceeding provisioned throughput will result in your requests being throttled.

Note: PROVISIONED throughput mode is supported only for feature groups that are offline-only, or use the Standard tier online store.

Type: ThroughputConfig object

Required: No
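The two modes described above can be sketched as a small builder for the ThroughputConfig dictionary. The helper name is hypothetical, and the string values "OnDemand", "Provisioned", and "Standard" are assumptions about the wire format; the text above names the modes ON_DEMAND and PROVISIONED, so confirm the exact strings in the ThroughputConfig reference.

```python
# Assumed wire values for ThroughputMode; verify against the ThroughputConfig reference.
ON_DEMAND = "OnDemand"
PROVISIONED = "Provisioned"


def throughput_config(mode, read_units=None, write_units=None, online_storage_type=None):
    """Build a ThroughputConfig dict.

    online_storage_type is the StorageType of the online store, or None when
    the feature group is offline-only or no StorageType is set explicitly.
    """
    if mode == ON_DEMAND:
        # On-demand mode needs no capacity settings.
        return {"ThroughputMode": mode}
    if mode == PROVISIONED:
        if online_storage_type not in (None, "Standard"):
            raise ValueError("provisioned mode requires offline-only or Standard tier")
        if read_units is None or write_units is None:
            raise ValueError("provisioned mode requires read and write capacity units")
        return {
            "ThroughputMode": mode,
            "ProvisionedReadCapacityUnits": read_units,
            "ProvisionedWriteCapacityUnits": write_units,
        }
    raise ValueError(f"unknown throughput mode: {mode!r}")


print(throughput_config(ON_DEMAND))
print(throughput_config(PROVISIONED, read_units=10, write_units=5))
```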

Response Syntax

{ "FeatureGroupArn": "string" }

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

FeatureGroupArn

The Amazon Resource Name (ARN) of the FeatureGroup. This is a unique identifier for the feature group.

Type: String

Length Constraints: Maximum length of 256.

Pattern: arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:feature-group/.*
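The returned ARN can be split into its parts with the pattern above. The sketch below adds named capture groups to that pattern; the group names, the helper name, and the sample ARN are hypothetical.

```python
import re

# The documented pattern with named groups added for illustration.
FEATURE_GROUP_ARN = re.compile(
    r"arn:(?P<partition>aws[a-z\-]*):sagemaker:(?P<region>[a-z0-9\-]*):"
    r"(?P<account>[0-9]{12}):feature-group/(?P<name>.*)"
)


def parse_feature_group_arn(arn: str):
    """Return the ARN components as a dict, or None if the ARN does not match."""
    if len(arn) > 256:
        return None
    match = FEATURE_GROUP_ARN.fullmatch(arn)
    return match.groupdict() if match else None


parts = parse_feature_group_arn(
    "arn:aws:sagemaker:us-east-1:123456789012:feature-group/customers"
)
print(parts)
```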

Errors

For information about the errors that are common to all actions, see Common Errors.

ResourceInUse

Resource being accessed is in use.

HTTP Status Code: 400

ResourceLimitExceeded

You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created.

HTTP Status Code: 400
