AWS::DataBrew::Dataset

Specifies a new DataBrew dataset.

Syntax

To declare this entity in your AWS CloudFormation template, use the following syntax:

JSON

{
  "Type" : "AWS::DataBrew::Dataset",
  "Properties" : {
      "Format" : String,
      "FormatOptions" : FormatOptions,
      "Input" : Input,
      "Name" : String,
      "PathOptions" : PathOptions,
      "Source" : String,
      "Tags" : [ Tag, ... ]
    }
}

YAML

Type: AWS::DataBrew::Dataset
Properties:
  Format: String
  FormatOptions:
    FormatOptions
  Input:
    Input
  Name: String
  PathOptions:
    PathOptions
  Source: String
  Tags:
    - Tag

Properties

Format

The file format of a dataset that is created from an Amazon S3 file or folder.

Required: No

Type: String

Allowed values: CSV | JSON | PARQUET | EXCEL | ORC

Update requires: No interruption

FormatOptions

A set of options that defines how DataBrew interprets the data in the dataset.

Required: No

Type: FormatOptions

Update requires: No interruption
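
As a minimal sketch of how FormatOptions might be set for a CSV file in Amazon S3, the fragment below declares a delimiter and a header row. The logical ID, dataset name, bucket, and key are hypothetical placeholders. The Csv, Delimiter, and HeaderRow names are assumed from the FormatOptions and CsvOptions property types and should be verified against those references.

```yaml
Resources:
  # Hypothetical logical ID and names, for illustration only
  SalesCsvDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: sales-csv-dataset
      Format: CSV
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket   # placeholder bucket name
          Key: sales/2024.csv           # placeholder object key
      FormatOptions:
        Csv:
          Delimiter: ','     # assumed: single-character field separator
          HeaderRow: true    # assumed: treat the first row as column names
```

Only one format-specific block (Csv, Json, or Excel) is expected to apply to a given dataset, matching the value of Format.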

Input

Information on how DataBrew can find the dataset, in either the AWS Glue Data Catalog or Amazon S3.

Required: Yes

Type: Input

Update requires: No interruption
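
The description above mentions two places a dataset can come from. As a rough sketch of the AWS Glue Data Catalog case, the fragment below points Input at a catalog table instead of an S3 object. The logical ID, dataset name, database name, and table name are hypothetical. The DataCatalogInputDefinition property names are assumed from the Input property type and should be checked against that reference.

```yaml
Resources:
  # Hypothetical logical ID and names, for illustration only
  CatalogDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: catalog-dataset
      Input:
        DataCatalogInputDefinition:
          CatalogId: !Ref 'AWS::AccountId'   # assumed: account that owns the Data Catalog
          DatabaseName: sales_db             # placeholder Glue database
          TableName: orders                  # placeholder Glue table
```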

Name

The unique name of the dataset.

Required: Yes

Type: String

Minimum: 1

Maximum: 255

Update requires: Replacement

PathOptions

A set of options that defines how DataBrew interprets an Amazon S3 path of the dataset.

Required: No

Type: PathOptions

Update requires: No interruption
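
To illustrate how PathOptions might narrow an Amazon S3 folder to a subset of files, the sketch below limits the dataset to the ten most recently modified objects under a prefix. The logical ID, dataset name, bucket, and prefix are hypothetical. The FilesLimit, MaxFiles, OrderedBy, and Order names and values are assumed from the PathOptions and FilesLimit property types and should be confirmed there.

```yaml
Resources:
  # Hypothetical logical ID and names, for illustration only
  RecentLogsDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: recent-logs-dataset
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket   # placeholder bucket name
          Key: logs/                    # placeholder folder prefix
      PathOptions:
        FilesLimit:
          MaxFiles: 10                   # assumed: cap on files read from the path
          OrderedBy: LAST_MODIFIED_DATE  # assumed: sort criterion
          Order: DESCENDING              # assumed: newest files first
```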

Source

The location of the data for the dataset: Amazon S3, the AWS Glue Data Catalog, or a database.

Required: No

Type: String

Allowed values: S3 | DATA-CATALOG | DATABASE

Update requires: No interruption

Tags

Metadata tags that have been applied to the dataset.

Required: No

Type: Array of Tag

Update requires: No interruption

Return values

Ref

When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the resource name. For example:

{ "Ref": "myDataset" }

For an AWS Glue DataBrew dataset named myDataset, Ref returns the name of the dataset.
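
One way to use this return value is to expose the dataset name as a stack output. The following is a minimal sketch; the dataset name, bucket, key, and output name are hypothetical placeholders.

```yaml
Resources:
  myDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: my-dataset                  # placeholder dataset name
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket   # placeholder bucket name
          Key: data.json                # placeholder object key

Outputs:
  DatasetName:
    Description: Name of the DataBrew dataset
    # Ref on the logical ID returns the resource name, as described above
    Value: !Ref myDataset
```

The same !Ref expression can be passed to any other property in the template that expects a dataset name.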

Examples

Creating datasets

The following examples create new DataBrew datasets.

YAML

Resources:
  TestDataBrewDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: dataset-name
      Input:
        S3InputDefinition:
          Bucket: !Join [ '', ['databrew-cfn-integration-tests-', !Ref 'AWS::Region', '-', !Ref 'AWS::AccountId' ] ]
          Key: cocktails.json
      FormatOptions:
        Json:
          MultiLine: True

JSON

{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "This CloudFormation template specifies a DataBrew Dataset",
    "Resources": {
        "TestDataBrewDataset": {
            "Type": "AWS::DataBrew::Dataset",
            "Properties": {
                "Name": "cf-test-dataset1",
                "Input": {
                    "S3InputDefinition": {
                        "Bucket": "test-location",
                        "Key": "test.xlsx"
                    }
                },
                "FormatOptions": {
                    "Excel": {
                        "SheetNames": ["test"]
                    }
                },
                "Tags": [
                    {
                        "Key": "key00AtCreate",
                        "Value": "value001AtCreate"
                    }
                ]
            }
        }
    }
}
