AWS::DataBrew::Dataset
Specifies a new DataBrew dataset.
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{
  "Type" : "AWS::DataBrew::Dataset",
  "Properties" : {
    "Format" : String,
    "FormatOptions" : FormatOptions,
    "Input" : Input,
    "Name" : String,
    "PathOptions" : PathOptions,
    "Source" : String,
    "Tags" : [ Tag, ... ]
  }
}
YAML
Type: AWS::DataBrew::Dataset
Properties:
  Format: String
  FormatOptions:
    FormatOptions
  Input:
    Input
  Name: String
  PathOptions:
    PathOptions
  Source: String
  Tags:
    - Tag
Properties
Format
-
The file format of a dataset that is created from an Amazon S3 file or folder.
Required: No
Type: String
Allowed values: CSV | JSON | PARQUET | EXCEL | ORC
Update requires: No interruption
FormatOptions
-
A set of options that define how DataBrew interprets the data in the dataset.
Required: No
Type: FormatOptions
Update requires: No interruption
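As an illustration of how Format and FormatOptions might be combined, the following sketch declares a CSV dataset. The nested Csv, Delimiter, and HeaderRow names come from the FormatOptions property type rather than from this page, and the bucket, key, and dataset names are hypothetical placeholders, so verify them against the FormatOptions reference before use.

```yaml
Resources:
  ExampleCsvDataset:                      # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: example-csv-dataset           # hypothetical dataset name
      Format: CSV
      Input:
        S3InputDefinition:
          Bucket: example-bucket          # hypothetical bucket
          Key: data/example.csv           # hypothetical object key
      FormatOptions:
        Csv:                              # assumed from the FormatOptions property type
          Delimiter: ','                  # column separator used in the file
          HeaderRow: true                 # treat the first row as column names
```

FormatOptions is optional, so a template that omits it leaves the interpretation of the file to DataBrew's defaults.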
Input
-
Information on how DataBrew can find the dataset, in either the AWS Glue Data Catalog or Amazon S3.
Required: Yes
Type: Input
Update requires: No interruption
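The examples later on this page read from Amazon S3. For the AWS Glue Data Catalog case, a minimal sketch could look like the following. The DataCatalogInputDefinition, DatabaseName, and TableName names are taken from the Input property type, not from this page, and the database and table names are hypothetical.

```yaml
Resources:
  ExampleCatalogDataset:                  # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: example-catalog-dataset       # hypothetical dataset name
      Input:
        DataCatalogInputDefinition:       # assumed from the Input property type
          DatabaseName: example_db        # hypothetical Glue database
          TableName: example_table        # hypothetical Glue table
```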
Name
-
The unique name of the dataset.
Required: Yes
Type: String
Minimum: 1
Maximum: 255
Update requires: Replacement
PathOptions
-
A set of options that defines how DataBrew interprets an Amazon S3 path of the dataset.
Required: No
Type: PathOptions
Update requires: No interruption
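One possible use of PathOptions is to limit how many files DataBrew reads from an S3 folder. The sketch below assumes the FilesLimit structure (MaxFiles, Order, OrderedBy) described in the PathOptions property type, which is not shown on this page. The bucket and key prefix are hypothetical.

```yaml
Resources:
  ExampleFolderDataset:                   # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: example-folder-dataset        # hypothetical dataset name
      Input:
        S3InputDefinition:
          Bucket: example-bucket          # hypothetical bucket
          Key: sales/                     # hypothetical folder prefix
      PathOptions:
        FilesLimit:                       # assumed from the PathOptions property type
          MaxFiles: 10                    # read at most 10 files
          Order: DESCENDING               # newest first
          OrderedBy: LAST_MODIFIED_DATE   # sort by last-modified time
```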
Source
-
The location of the data for the dataset, either Amazon S3 or the AWS Glue Data Catalog.
Required: No
Type: String
Allowed values: S3 | DATA-CATALOG | DATABASE
Update requires: No interruption
Tags
-
Metadata tags that have been applied to the dataset.
Required: No
Type: Array of Tag
Update requires: Replacement
Return values
Ref
When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the resource name. For example:
{ "Ref": "myDataset" }
For an AWS Glue DataBrew dataset named myDataset, Ref returns the name of the dataset.
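A short sketch of Ref in context: the template below exposes the dataset name as a stack output. The logical ID, output name, bucket, and key are hypothetical placeholders.

```yaml
Resources:
  myDataset:                              # logical ID passed to Ref below
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: example-dataset               # hypothetical dataset name
      Input:
        S3InputDefinition:
          Bucket: example-bucket          # hypothetical bucket
          Key: data/example.json          # hypothetical object key
Outputs:
  DatasetName:                            # hypothetical output name
    Description: Name of the DataBrew dataset
    Value: !Ref myDataset                 # resolves to the dataset name, example-dataset
```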
Examples
Creating datasets
The following examples create new DataBrew datasets.
YAML
Resources:
  TestDataBrewDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: dataset-name
      Input:
        S3InputDefinition:
          Bucket: !Join [ '', ['databrew-cfn-integration-tests-', !Ref 'AWS::Region', '-', !Ref 'AWS::AccountId' ] ]
          Key: cocktails.json
      FormatOptions:
        Json:
          MultiLine: True
JSON
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "This CloudFormation template specifies a DataBrew Dataset",
  "Resources": {
    "TestDataBrewDataset": {
      "Type": "AWS::DataBrew::Dataset",
      "Properties": {
        "Name": "cf-test-dataset1",
        "Input": {
          "S3InputDefinition": {
            "Bucket": "test-location",
            "Key": "test.xlsx"
          }
        },
        "FormatOptions": {
          "Excel": {
            "SheetNames": ["test"]
          }
        },
        "Tags": [
          {
            "Key": "key00AtCreate",
            "Value": "value001AtCreate"
          }
        ]
      }
    }
  }
}