
AWS::DataBrew::Dataset

Specifies a new DataBrew dataset.

Syntax

To declare this entity in your CloudFormation template, use the following syntax:

JSON


{
  "Type" : "AWS::DataBrew::Dataset",
  "Properties" : {
      "Format" : String,
      "FormatOptions" : FormatOptions,
      "Input" : Input,
      "Name" : String,
      "PathOptions" : PathOptions,
      "Source" : String,
      "Tags" : [ Tag, ... ]
    }
}

YAML


Type: AWS::DataBrew::Dataset
Properties:
  Format: String
  FormatOptions: 
    FormatOptions
  Input: 
    Input
  Name: String
  PathOptions: 
    PathOptions
  Source: String
  Tags: 
    - Tag

Properties

Format

The file format of a dataset that is created from an Amazon S3 file or folder.

Required: No

Type: String

Allowed values: CSV | JSON | PARQUET | EXCEL | ORC

Update requires: No interruption

FormatOptions

A set of options that define how DataBrew interprets the data in the dataset.

Required: No

Type: FormatOptions

Update requires: No interruption

Input

Information on how DataBrew can find the dataset, in either the AWS Glue Data Catalog or Amazon S3.

Required: Yes

Type: Input

Update requires: No interruption

Name

The unique name of the dataset.

Required: Yes

Type: String

Minimum: 1

Maximum: 255

Update requires: Replacement

PathOptions

A set of options that define how DataBrew interprets the Amazon S3 path of the dataset.

Required: No

Type: PathOptions

Update requires: No interruption

Source

The location of the data for the dataset: Amazon S3, the AWS Glue Data Catalog, or a database.

Required: No

Type: String

Allowed values: S3 | DATA-CATALOG | DATABASE

Update requires: No interruption

Tags

Metadata tags to apply to the dataset.

Required: No

Type: Array of Tag

Update requires: No interruption

Return values

Ref

When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the resource name. For example:

{ "Ref": "myDataset" }

For an AWS Glue DataBrew dataset named myDataset, Ref returns the name of the dataset.
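
In a YAML template, the same reference can be written with the short form of Ref. The fragment below is a minimal sketch that assumes a dataset with the logical ID myDataset is declared in the Resources section of the same template; the output name DatasetName is hypothetical and chosen only for illustration.

```yaml
Outputs:
  DatasetName:                       # hypothetical output name
    Description: Name of the DataBrew dataset
    Value: !Ref myDataset            # resolves to the dataset name
```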

Examples

Creating datasets

The following examples create new DataBrew datasets.

YAML



Resources:
  TestDataBrewDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: dataset-name
      Input:
        S3InputDefinition:
          Bucket: !Join [ '', ['databrew-cfn-integration-tests-', !Ref 'AWS::Region', '-', !Ref 'AWS::AccountId' ] ]
          Key: cocktails.json
      FormatOptions:
        Json:
          MultiLine: true

JSON



{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "This CloudFormation template specifies a DataBrew Dataset",
  "Resources": {
    "TestDataBrewDataset": {
      "Type": "AWS::DataBrew::Dataset",
      "Properties": {
        "Name": "cf-test-dataset1",
        "Input": {
          "S3InputDefinition": {
            "Bucket": "test-location",
            "Key": "test.xlsx"
          }
        },
        "FormatOptions": {
          "Excel": {
            "SheetNames": ["test"]
          }
        },
        "Tags": [
          {
            "Key": "key00AtCreate",
            "Value": "value001AtCreate"
          }
        ]
      }
    }
  }
}
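
The examples above cover JSON and Excel input. As a further sketch, a dataset backed by a CSV file in Amazon S3 might combine the Format, FormatOptions, and Tags properties as shown below. The logical ID, dataset name, bucket, key, and tag values are placeholders, and the Delimiter and HeaderRow settings are assumed from the CsvOptions property type, which this page does not describe; check that property type's reference before relying on them.

```yaml
Resources:
  CsvDataBrewDataset:                     # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: csv-dataset-example           # placeholder; changing Name replaces the resource
      Format: CSV                         # one of the allowed Format values
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket     # placeholder bucket name
          Key: data/input.csv             # placeholder object key
      FormatOptions:
        Csv:
          Delimiter: ','                  # assumed CsvOptions property
          HeaderRow: true                 # assumed CsvOptions property
      Tags:
        - Key: project                    # placeholder tag
          Value: example
```

Because Name requires replacement on update, it may be safer to settle on the dataset name before deploying and to change only the other properties afterwards.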
