AWS::DataBrew::Dataset - AWS CloudFormation

AWS::DataBrew::Dataset

Specifies a new DataBrew dataset.

Syntax

To declare this entity in your AWS CloudFormation template, use the following syntax:

JSON

{
  "Type" : "AWS::DataBrew::Dataset",
  "Properties" : {
    "Format" : String,
    "FormatOptions" : FormatOptions,
    "Input" : Input,
    "Name" : String,
    "PathOptions" : PathOptions,
    "Source" : String,
    "Tags" : [ Tag, ... ]
  }
}

YAML

Type: AWS::DataBrew::Dataset
Properties:
  Format: String
  FormatOptions:
    FormatOptions
  Input:
    Input
  Name: String
  PathOptions:
    PathOptions
  Source: String
  Tags:
    - Tag

Properties

Format

The file format of a dataset that is created from an Amazon S3 file or folder.

Required: No

Type: String

Allowed values: CSV | JSON | PARQUET | EXCEL | ORC

Update requires: No interruption

FormatOptions

A set of options that define how DataBrew interprets the data in the dataset.

Required: No

Type: FormatOptions

Update requires: No interruption
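
As a rough sketch of how Format and FormatOptions might be combined, the fragment below declares a dataset backed by a comma-delimited file. The logical ID, dataset name, bucket, and key are placeholders. The Csv options shown (Delimiter and HeaderRow) are assumed from the FormatOptions property type, which this page references but does not spell out, so check them against that type's own reference.

```yaml
Resources:
  CsvDataset:                          # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: csv-dataset-example        # placeholder dataset name
      Format: CSV                      # one of the allowed Format values
      FormatOptions:
        Csv:
          Delimiter: ','               # assumed CsvOptions field
          HeaderRow: true              # assumed CsvOptions field: first row holds column names
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket  # placeholder bucket
          Key: data/orders.csv         # placeholder object key
```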

Input

Information on how DataBrew can find the dataset, in either the AWS Glue Data Catalog or Amazon S3.

Required: Yes

Type: Input

Update requires: No interruption
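
The examples later on this page read from Amazon S3. For the AWS Glue Data Catalog case, a minimal sketch could look like the following. It assumes the Input type exposes a DataCatalogInputDefinition with CatalogId, DatabaseName, and TableName fields. The database and table names are hypothetical.

```yaml
Resources:
  CatalogDataset:                         # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: catalog-dataset-example       # placeholder dataset name
      Input:
        DataCatalogInputDefinition:       # assumed field of the Input type
          CatalogId: !Ref 'AWS::AccountId'  # account that owns the Data Catalog
          DatabaseName: example_db        # hypothetical Glue database
          TableName: example_table        # hypothetical Glue table
```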

Name

The unique name of the dataset.

Required: Yes

Type: String

Minimum length: 1

Maximum length: 255

Update requires: Replacement

PathOptions

A set of options that defines how DataBrew interprets an Amazon S3 path of the dataset.

Required: No

Type: PathOptions

Update requires: No interruption
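
To illustrate what PathOptions can express, the sketch below points a dataset at an S3 folder and limits it to the ten most recently modified files. It assumes the PathOptions type offers a FilesLimit block with MaxFiles, Order, and OrderedBy fields. The bucket, prefix, and names are placeholders.

```yaml
Resources:
  FolderDataset:                       # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: folder-dataset-example     # placeholder dataset name
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket  # placeholder bucket
          Key: incoming/               # placeholder folder prefix
      PathOptions:
        FilesLimit:                    # assumed field of the PathOptions type
          MaxFiles: 10                 # keep at most ten files
          Order: DESCENDING            # newest first
          OrderedBy: LAST_MODIFIED_DATE
```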

Source

The location of the data for the dataset: Amazon S3, the AWS Glue Data Catalog, or a database, corresponding to the allowed values listed for this property.

Required: No

Type: String

Allowed values: S3 | DATA-CATALOG | DATABASE

Update requires: No interruption

Tags

Metadata tags to apply to the dataset.

Required: No

Type: Array of Tag

Update requires: Replacement

Return values

Ref

When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the resource name. For example:

{ "Ref": "myDataset" }

For an AWS Glue DataBrew dataset with the logical ID myDataset, Ref returns the name of the dataset.
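
One way the Ref return value might be used is to expose the dataset name as a stack output, as in this sketch. The output name, dataset name, bucket, and key are placeholders chosen for illustration.

```yaml
Resources:
  myDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: my-dataset-example         # placeholder; this is the value Ref returns
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket  # placeholder bucket
          Key: data.json               # placeholder object key
Outputs:
  DatasetName:                         # hypothetical output name
    Description: Name of the DataBrew dataset
    Value: !Ref myDataset              # resolves to the dataset name
```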

Examples

Creating datasets

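The following examples create new DataBrew datasets. The YAML example reads a multi-line JSON file from Amazon S3. The JSON example reads a named sheet from an Excel file and tags the dataset.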

YAML

Resources:
  TestDataBrewDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: dataset-name
      Input:
        S3InputDefinition:
          Bucket: !Join [ '', ['databrew-cfn-integration-tests-', !Ref 'AWS::Region', '-', !Ref 'AWS::AccountId' ] ]
          Key: cocktails.json
      FormatOptions:
        Json:
          MultiLine: True

JSON

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "This CloudFormation template specifies a DataBrew Dataset",
  "Resources": {
    "TestDataBrewDataset": {
      "Type": "AWS::DataBrew::Dataset",
      "Properties": {
        "Name": "cf-test-dataset1",
        "Input": {
          "S3InputDefinition": {
            "Bucket": "test-location",
            "Key": "test.xlsx"
          }
        },
        "FormatOptions": {
          "Excel": {
            "SheetNames": ["test"]
          }
        },
        "Tags": [
          {
            "Key": "key00AtCreate",
            "Value": "value001AtCreate"
          }
        ]
      }
    }
  }
}