Note: You are viewing the documentation for an older major version of the AWS CLI (version 1).

AWS CLI version 2, the latest major version of the AWS CLI, is now stable and recommended for general use. For more information, see the AWS CLI version 2 installation instructions and migration guide.

[ aws . databrew ]

describe-dataset

Description

Returns the definition of a specific DataBrew dataset.

See also: AWS API Documentation

See 'aws help' for descriptions of global parameters.

Synopsis

  describe-dataset
--name <value>
[--cli-input-json <value>]
[--generate-cli-skeleton <value>]
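The synopsis above can be driven from a script. The following is a minimal sketch, assuming the AWS CLI is installed and credentials are configured; the helper names `describe_dataset_argv` and `describe_dataset` and the dataset name `my-dataset` are hypothetical and used only for illustration.

```python
import json
import shlex
import subprocess


def describe_dataset_argv(name):
    """Build the argument list for `aws databrew describe-dataset`."""
    if not name:
        raise ValueError("--name is required")
    return ["aws", "databrew", "describe-dataset", "--name", name]


def describe_dataset(name):
    """Run the command and parse its JSON output.

    Needs the AWS CLI on PATH and valid credentials; not called here.
    """
    result = subprocess.run(
        describe_dataset_argv(name) + ["--output", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# Show the equivalent shell command for a hypothetical dataset name.
print(shlex.join(describe_dataset_argv("my-dataset")))
```

Passing the arguments as a list avoids shell quoting problems when a dataset name contains spaces or other special characters.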

Options

--name (string)

The name of the dataset to be described.

--cli-input-json (string)

Performs service operation based on the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, the CLI values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally.

--generate-cli-skeleton (string)

Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command.
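As an illustration of how these two options fit together, the sketch below builds the argument lists for printing the input skeleton and for passing a JSON document back through --cli-input-json. The `Name` key is assumed to match what the skeleton prints for this command, and the helper name `cli_input_json_argv` and the dataset name are hypothetical.

```python
import json

# Arguments that print the sample input JSON without sending an API request.
SKELETON_ARGV = [
    "aws", "databrew", "describe-dataset",
    "--generate-cli-skeleton", "input",
]


def cli_input_json_argv(name):
    """Build arguments that pass the request as a single JSON string.

    Assumption: the input skeleton for this command has one key, "Name".
    """
    payload = json.dumps({"Name": name})
    return ["aws", "databrew", "describe-dataset", "--cli-input-json", payload]


argv = cli_input_json_argv("my-dataset")
print(argv[-1])
```

If `--name` were also given on the command line, its value would take precedence over the `Name` value in the JSON, as described above.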

Output

CreatedBy -> (string)

The identifier (user name) of the user who created the dataset.

CreateDate -> (timestamp)

The date and time that the dataset was created.

Name -> (string)

The name of the dataset.

Format -> (string)

The file format of a dataset that is created from an S3 file or folder.

FormatOptions -> (structure)

Represents a set of options that define the structure of either comma-separated value (CSV), Excel, or JSON input.

Json -> (structure)

Options that define how JSON input is to be interpreted by DataBrew.

MultiLine -> (boolean)

A value that specifies whether JSON input contains embedded new line characters.

Excel -> (structure)

Options that define how Excel input is to be interpreted by DataBrew.

SheetNames -> (list)

One or more named sheets in the Excel file that will be included in the dataset.

(string)

SheetIndexes -> (list)

One or more sheet numbers in the Excel file that will be included in the dataset.

(integer)

HeaderRow -> (boolean)

A variable that specifies whether the first row in the file is parsed as the header. If this value is false, column names are auto-generated.

Csv -> (structure)

Options that define how CSV input is to be interpreted by DataBrew.

Delimiter -> (string)

A single character that specifies the delimiter being used in the CSV file.

HeaderRow -> (boolean)

A variable that specifies whether the first row in the file is parsed as the header. If this value is false, column names are auto-generated.
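A response carries format options under one of the members Json, Excel, or Csv. The sketch below shows one way a caller might find which member is present in a parsed response. The sample dictionary is made up for illustration, and `active_format_options` is a hypothetical helper name.

```python
def active_format_options(dataset):
    """Return (member_name, settings) for the FormatOptions member present.

    Returns (None, {}) when the response has no FormatOptions.
    """
    options = dataset.get("FormatOptions") or {}
    for member in ("Json", "Excel", "Csv"):
        if member in options:
            return member, options[member]
    return None, {}


# Hypothetical parsed output of describe-dataset for a CSV dataset.
sample = {
    "Name": "sales",
    "Format": "CSV",
    "FormatOptions": {"Csv": {"Delimiter": ";", "HeaderRow": True}},
}

member, settings = active_format_options(sample)
print(member, settings)
```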

Input -> (structure)

Represents information on how DataBrew can find data, in either the AWS Glue Data Catalog or Amazon S3.

S3InputDefinition -> (structure)

The Amazon S3 location where the data is stored.

Bucket -> (string)

The S3 bucket name.

Key -> (string)

The unique name of the object in the bucket.

DataCatalogInputDefinition -> (structure)

The AWS Glue Data Catalog parameters for the data.

CatalogId -> (string)

The unique identifier of the AWS account that holds the Data Catalog that stores the data.

DatabaseName -> (string)

The name of a database in the Data Catalog.

TableName -> (string)

The name of a database table in the Data Catalog. This table corresponds to a DataBrew dataset.

TempDirectory -> (structure)

An Amazon S3 location that AWS Glue Data Catalog can use as a temporary directory.

Bucket -> (string)

The S3 bucket name.

Key -> (string)

The unique name of the object in the bucket.

DatabaseInputDefinition -> (structure)

Connection information for dataset input files stored in a database.

GlueConnectionName -> (string)

The AWS Glue Connection that stores the connection information for the target database.

DatabaseTableName -> (string)

The table within the target database.

TempDirectory -> (structure)

Represents an Amazon S3 location (bucket name and object key) where DataBrew can read input data, or write output from a job.

Bucket -> (string)

The S3 bucket name.

Key -> (string)

The unique name of the object in the bucket.
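The Input structure holds one of three definitions. As a rough sketch, a caller could turn whichever one is present into a single display string. The `glue-catalog:` and `database:` prefixes are labels invented for this example, not AWS conventions, and the sample values are hypothetical.

```python
def input_location(dataset):
    """Return a one-line description of where the dataset's data lives."""
    inp = dataset.get("Input") or {}
    if "S3InputDefinition" in inp:
        s3 = inp["S3InputDefinition"]
        return f"s3://{s3['Bucket']}/{s3.get('Key', '')}"
    if "DataCatalogInputDefinition" in inp:
        cat = inp["DataCatalogInputDefinition"]
        # Display label only; not an official URI scheme.
        return f"glue-catalog:{cat['DatabaseName']}.{cat['TableName']}"
    if "DatabaseInputDefinition" in inp:
        db = inp["DatabaseInputDefinition"]
        # Display label only; not an official URI scheme.
        return f"database:{db['GlueConnectionName']}/{db['DatabaseTableName']}"
    return None


# Hypothetical parsed responses.
s3_sample = {"Input": {"S3InputDefinition": {"Bucket": "my-bucket", "Key": "raw/sales.csv"}}}
catalog_sample = {"Input": {"DataCatalogInputDefinition": {"DatabaseName": "analytics", "TableName": "sales"}}}

print(input_location(s3_sample))
print(input_location(catalog_sample))
```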

LastModifiedDate -> (timestamp)

The date and time that the dataset was last modified.

LastModifiedBy -> (string)

The identifier (user name) of the user who last modified the dataset.

Source -> (string)

The location of the data for this dataset: either Amazon S3 or the AWS Glue Data Catalog.

PathOptions -> (structure)

A set of options that defines how DataBrew interprets an S3 path of the dataset.

LastModifiedDateCondition -> (structure)

If provided, this structure defines a date range for matching S3 objects based on their LastModifiedDate attribute in S3.

Expression -> (string)

An expression that includes condition names followed by substitution variables, possibly grouped and combined with other conditions. For example, "(starts_with :prefix1 or starts_with :prefix2) and (ends_with :suffix1 or ends_with :suffix2)". Substitution variables should start with the ':' symbol.

ValuesMap -> (map)

The map of substitution variable names to their values used in this filter expression.

key -> (string)

value -> (string)
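To make the relationship between Expression and ValuesMap concrete, the sketch below substitutes each variable with its mapped value for display. It assumes that ValuesMap keys include the leading ':' and that variable names consist of letters, digits, and underscores. It only renders text and does not evaluate the condition; `render_expression` is a hypothetical helper.

```python
import re

# Assumption: a substitution variable is ':' followed by letters, digits, or '_'.
_VARIABLE = re.compile(r":[A-Za-z0-9_]+")


def render_expression(expression, values_map):
    """Replace each substitution variable with its quoted value.

    Variables missing from values_map are left unchanged.
    """
    def substitute(match):
        name = match.group(0)
        if name in values_map:
            return repr(values_map[name])
        return name

    return _VARIABLE.sub(substitute, expression)


# Hypothetical expression and values, modelled on the example in the text.
expression = "(starts_with :prefix1 or starts_with :prefix2) and ends_with :suffix1"
values = {":prefix1": "sales", ":prefix2": "orders", ":suffix1": ".csv"}

print(render_expression(expression, values))
```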

FilesLimit -> (structure)

If provided, this structure imposes a limit on the number of files that should be selected.

MaxFiles -> (integer)

The number of S3 files to select.

OrderedBy -> (string)

The criterion used to sort S3 files before selection. Defaults to LAST_MODIFIED_DATE, which is currently the only allowed value.

Order -> (string)

The sort order applied to S3 files before selection. Defaults to DESCENDING, so the most recent files are selected first. The other possible value is ASCENDING.
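The effect of FilesLimit can be pictured with a local simulation: sort candidate files by last-modified time, then keep the first MaxFiles. This is only a sketch of the selection rule described above, not DataBrew's implementation. The file list and the helper `select_files` are hypothetical.

```python
from datetime import datetime


def select_files(files, files_limit):
    """Apply a FilesLimit structure to a list of (key, last_modified) pairs."""
    if not files_limit:
        return list(files)
    # DESCENDING is the documented default: most recent files first.
    descending = files_limit.get("Order", "DESCENDING") == "DESCENDING"
    ordered = sorted(files, key=lambda item: item[1], reverse=descending)
    return ordered[: files_limit["MaxFiles"]]


# Hypothetical S3 listing.
files = [
    ("data/jan.csv", datetime(2021, 1, 31)),
    ("data/mar.csv", datetime(2021, 3, 31)),
    ("data/feb.csv", datetime(2021, 2, 28)),
]

newest_two = select_files(files, {"MaxFiles": 2})
oldest_two = select_files(files, {"MaxFiles": 2, "Order": "ASCENDING"})
print([key for key, _ in newest_two])
print([key for key, _ in oldest_two])
```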

Parameters -> (map)

A structure that maps names of parameters used in the S3 path of a dataset to their definitions.

key -> (string)

value -> (structure)

Represents a dataset parameter that defines type and conditions for a parameter in the S3 path of the dataset.

Name -> (string)

The name of the parameter that is used in the dataset's S3 path.

Type -> (string)

The type of the dataset parameter: one of 'String', 'Number', or 'Datetime'.

DatetimeOptions -> (structure)

Additional parameter options such as a format and a timezone. Required for datetime parameters.

Format -> (string)

Required option that defines the datetime format used for a date parameter in the S3 path. Use only supported datetime specifiers and separator characters; all literal a-z or A-Z characters must be escaped with single quotes. For example, "MM.dd.yyyy-'at'-HH:mm".

TimezoneOffset -> (string)

Optional value for a timezone offset of the datetime parameter value in the S3 path. Shouldn't be used if Format for this parameter includes timezone fields. If no offset is specified, UTC is assumed.

LocaleCode -> (string)

Optional value for a non-US locale code, needed for correct interpretation of some date formats.
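To show how a Format value with quoted literals might be read, the sketch below translates a small subset of specifiers into a Python `strptime` pattern. The specifier table (yyyy, MM, dd, HH, mm, ss) is an assumption based on the example format above; the full set that DataBrew supports is not listed here, and `to_strptime` is a hypothetical helper.

```python
from datetime import datetime

# Assumed subset of specifiers; anything else raises an error.
_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H", "mm": "%M", "ss": "%S"}


def to_strptime(pattern):
    """Translate a DataBrew-style datetime format into a strptime pattern."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "'":
            # Quoted literal: copy the text between the single quotes.
            end = pattern.index("'", i + 1)
            out.append(pattern[i + 1:end].replace("%", "%%"))
            i = end + 1
            continue
        for token, directive in _TOKENS.items():
            if pattern.startswith(token, i):
                out.append(directive)
                i += len(token)
                break
        else:
            if ch.isalpha():
                raise ValueError(f"unsupported specifier at position {i}: {ch!r}")
            out.append("%%" if ch == "%" else ch)
            i += 1
    return "".join(out)


fmt = to_strptime("MM.dd.yyyy-'at'-HH:mm")
parsed = datetime.strptime("02.28.2021-at-14:30", fmt)
print(fmt, parsed.isoformat())
```

Unquoted letters outside the assumed table are rejected rather than guessed, which mirrors the rule that literal letters must be escaped with single quotes.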

CreateColumn -> (boolean)

Optional boolean value that defines whether the captured value of this parameter should be loaded as an additional column in the dataset.

Filter -> (structure)

The optional filter expression structure to apply additional matching criteria to the parameter.

Expression -> (string)

An expression that includes condition names followed by substitution variables, possibly grouped and combined with other conditions. For example, "(starts_with :prefix1 or starts_with :prefix2) and (ends_with :suffix1 or ends_with :suffix2)". Substitution variables should start with the ':' symbol.

ValuesMap -> (map)

The map of substitution variable names to their values used in this filter expression.

key -> (string)

value -> (string)

Tags -> (map)

Metadata tags associated with this dataset.

key -> (string)

value -> (string)

ResourceArn -> (string)

The Amazon Resource Name (ARN) of the dataset.
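An ARN follows the general layout arn:partition:service:region:account-id:resource, so the ResourceArn value can be split into its parts. The sketch below does that with a made-up ARN. The `dataset/my-dataset` resource form and the account ID are assumptions for illustration, and `parse_arn` is a hypothetical helper.

```python
def parse_arn(arn):
    """Split an ARN into its five named components."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    keys = ("partition", "service", "region", "account_id", "resource")
    return dict(zip(keys, parts[1:]))


# Hypothetical ARN; the resource form is assumed, not taken from the source.
sample_arn = "arn:aws:databrew:us-east-1:123456789012:dataset/my-dataset"
fields = parse_arn(sample_arn)
print(fields["service"], fields["region"], fields["resource"])
```

Splitting at most five times keeps any colons inside the resource part intact.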