CreateDataset - AWS Glue DataBrew


Creates a new DataBrew dataset.

Request Syntax

POST /datasets HTTP/1.1
Content-type: application/json

{
   "Format": "string",
   "FormatOptions": {
      "Csv": {
         "Delimiter": "string",
         "HeaderRow": boolean
      },
      "Excel": {
         "HeaderRow": boolean,
         "SheetIndexes": [ number ],
         "SheetNames": [ "string" ]
      },
      "Json": {
         "MultiLine": boolean
      }
   },
   "Input": {
      "DatabaseInputDefinition": {
         "DatabaseTableName": "string",
         "GlueConnectionName": "string",
         "QueryString": "string",
         "TempDirectory": {
            "Bucket": "string",
            "BucketOwner": "string",
            "Key": "string"
         }
      },
      "DataCatalogInputDefinition": {
         "CatalogId": "string",
         "DatabaseName": "string",
         "TableName": "string",
         "TempDirectory": {
            "Bucket": "string",
            "BucketOwner": "string",
            "Key": "string"
         }
      },
      "Metadata": {
         "SourceArn": "string"
      },
      "S3InputDefinition": {
         "Bucket": "string",
         "BucketOwner": "string",
         "Key": "string"
      }
   },
   "Name": "string",
   "PathOptions": {
      "FilesLimit": {
         "MaxFiles": number,
         "Order": "string",
         "OrderedBy": "string"
      },
      "LastModifiedDateCondition": {
         "Expression": "string",
         "ValuesMap": {
            "string" : "string"
         }
      },
      "Parameters": {
         "string" : {
            "CreateColumn": boolean,
            "DatetimeOptions": {
               "Format": "string",
               "LocaleCode": "string",
               "TimezoneOffset": "string"
            },
            "Filter": {
               "Expression": "string",
               "ValuesMap": {
                  "string" : "string"
               }
            },
            "Name": "string",
            "Type": "string"
         }
      }
   },
   "Tags": {
      "string" : "string"
   }
}
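As an illustration, the request syntax can be filled in for a CSV file stored in Amazon S3. The following Python sketch builds such a body with the standard library only. The dataset name, bucket, and key are placeholder values, and `build_create_dataset_body` is a hypothetical helper, not part of any AWS SDK.

```python
import json


def build_create_dataset_body(name, bucket, key, delimiter=",", header_row=True):
    """Return a CreateDataset request body for a CSV object in Amazon S3.

    Hypothetical helper for illustration: the field names follow the
    request syntax, the values are placeholders.
    """
    return {
        "Name": name,
        "Format": "CSV",
        "FormatOptions": {
            "Csv": {"Delimiter": delimiter, "HeaderRow": header_row},
        },
        "Input": {
            "S3InputDefinition": {"Bucket": bucket, "Key": key},
        },
    }


body = build_create_dataset_body(
    name="sales-data",        # placeholder dataset name
    bucket="example-bucket",  # placeholder bucket
    key="input/sales.csv",    # placeholder object key
)

# Serialized form that would be sent as the body of POST /datasets.
print(json.dumps(body, indent=2))
```

With the AWS SDK for Python (boto3), the same top-level keys would typically be passed as keyword arguments to the DataBrew client's `create_dataset` method. That call needs credentials and network access, so it is left out of the sketch.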

URI Request Parameters

The request does not use any URI parameters.

Request Body

The request accepts the following data in JSON format.


Input

Represents information on how DataBrew can find data, in either the AWS Glue Data Catalog or Amazon S3.

Type: Input object

Required: Yes


Name

The name of the dataset to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Required: Yes
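The character and length rules for the dataset name can be checked on the client before a request is sent. The following is a minimal sketch that assumes the rules are exactly as stated above; `is_valid_dataset_name` is a hypothetical helper, and the service may apply further checks of its own.

```python
import re

# Alphanumeric characters, hyphen, period, and space; 1 to 255 characters.
_DATASET_NAME = re.compile(r"[A-Za-z0-9.\- ]{1,255}")


def is_valid_dataset_name(name):
    """Return True if name meets the documented character and length rules."""
    return isinstance(name, str) and _DATASET_NAME.fullmatch(name) is not None
```

For example, `is_valid_dataset_name("sales-data 2024.v1")` passes, while a name containing an underscore or an empty string does not.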


Format

The file format of a dataset that is created from an Amazon S3 file or folder.

Type: String

Valid Values: CSV | JSON | PARQUET | EXCEL | ORC

Required: No


FormatOptions

Represents a set of options that define the structure of either comma-separated value (CSV), Excel, or JSON input.

Type: FormatOptions object

Required: No
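Because FormatOptions only describes CSV, Excel, and JSON input, a client can check that the options it sends match the chosen Format. The sketch below assumes that each Format value pairs with the FormatOptions key of the same name from the request syntax, and that PARQUET and ORC take no options. `check_format` is a hypothetical helper, and this pairing is a client-side convention rather than documented service behavior.

```python
# Format values from the Valid Values list, mapped to the FormatOptions key
# that describes them. PARQUET and ORC have no FormatOptions entry.
OPTIONS_KEY_BY_FORMAT = {
    "CSV": "Csv",
    "EXCEL": "Excel",
    "JSON": "Json",
    "PARQUET": None,
    "ORC": None,
}


def check_format(fmt, format_options=None):
    """Return a list of problems with a Format / FormatOptions pair."""
    if fmt not in OPTIONS_KEY_BY_FORMAT:
        return [f"unknown Format {fmt!r}"]
    expected_key = OPTIONS_KEY_BY_FORMAT[fmt]
    problems = []
    for key in format_options or {}:
        if key != expected_key:
            problems.append(f"FormatOptions key {key!r} does not apply to Format {fmt}")
    return problems
```

An empty list means the pair is consistent under these assumptions.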


PathOptions

A set of options that defines how DataBrew interprets an Amazon S3 path of the dataset.

Type: PathOptions object

Required: No
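As a rough sketch of one PathOptions use, the FilesLimit member from the request syntax can cap how many files under an S3 path are selected. The enum values `ASCENDING`, `DESCENDING`, and `LAST_MODIFIED_DATE` and the lower bound of 1 for `MaxFiles` are assumptions, since this page does not list them. `build_path_options` is a hypothetical helper.

```python
def build_path_options(max_files, newest_first=True):
    """Return a PathOptions object that limits the number of selected files.

    Assumptions (not stated on this page): "Order" accepts ASCENDING or
    DESCENDING, "OrderedBy" accepts LAST_MODIFIED_DATE, and MaxFiles must
    be at least 1.
    """
    if max_files < 1:
        raise ValueError("max_files must be at least 1")
    return {
        "FilesLimit": {
            "MaxFiles": max_files,
            "Order": "DESCENDING" if newest_first else "ASCENDING",
            "OrderedBy": "LAST_MODIFIED_DATE",
        }
    }
```

The returned dictionary would be placed under the `PathOptions` key of the request body.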


Tags

Metadata tags to apply to this dataset.

Type: String to string map

Map Entries: Maximum number of 200 items.

Key Length Constraints: Minimum length of 1. Maximum length of 128.

Value Length Constraints: Maximum length of 256.

Required: No
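The tag constraints above are simple enough to check before calling the service. The following sketch assumes only the limits listed here (at most 200 entries, keys of 1 to 128 characters, values of at most 256 characters); `validate_tags` is a hypothetical helper.

```python
def validate_tags(tags):
    """Return a list of constraint violations in a tag map; empty if none."""
    problems = []
    if len(tags) > 200:
        problems.append(f"too many tags: {len(tags)} (maximum 200)")
    for key, value in tags.items():
        if not 1 <= len(key) <= 128:
            problems.append(f"tag key {key!r} must be 1 to 128 characters long")
        if len(value) > 256:
            problems.append(f"value for tag {key!r} exceeds 256 characters")
    return problems
```

For example, `validate_tags({"team": "data"})` returns an empty list.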

Response Syntax

HTTP/1.1 200
Content-type: application/json

{
   "Name": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.


Name

The name of the dataset that you created.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.
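A caller working with the raw HTTP response could extract the dataset name as sketched below. `parse_create_dataset_response` is a hypothetical helper, and it assumes the status code and body text are already available as an integer and a string.

```python
import json


def parse_create_dataset_response(status_code, body_text):
    """Return the dataset name from a successful CreateDataset response."""
    if status_code != 200:
        raise ValueError(f"unexpected HTTP status {status_code}")
    name = json.loads(body_text)["Name"]
    # The response documents the same 1 to 255 character length constraint.
    if not 1 <= len(name) <= 255:
        raise ValueError("Name is outside the documented length constraints")
    return name
```

For example, a 200 response with the body `{"Name": "sales-data"}` yields `"sales-data"`.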


Errors

For information about the errors that are common to all actions, see Common Errors.


AccessDeniedException

Access to the specified resource was denied.

HTTP Status Code: 403


ConflictException

Updating or deleting a resource can cause an inconsistent state.

HTTP Status Code: 409


ServiceQuotaExceededException

A service quota is exceeded.

HTTP Status Code: 402


ValidationException

The input parameters for this request failed validation.

HTTP Status Code: 400
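The status codes listed above can be mapped back to their descriptions when logging or reporting a failed call. This is a minimal sketch that uses only the codes and descriptions from this page; `describe_status` is a hypothetical helper, and any other code is reported as unlisted.

```python
# Status codes and descriptions as listed for this action.
ERROR_DESCRIPTIONS = {
    400: "The input parameters for this request failed validation.",
    402: "A service quota is exceeded.",
    403: "Access to the specified resource was denied.",
    409: "Updating or deleting a resource can cause an inconsistent state.",
}


def describe_status(status_code):
    """Return a human-readable description for a CreateDataset HTTP status."""
    if status_code == 200:
        return "The action was successful."
    return ERROR_DESCRIPTIONS.get(
        status_code, "Status code not listed for this action; see Common Errors."
    )
```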
