GetDataSource
Returns a DataSource that includes metadata and data file information, as well as the current status of the DataSource.
GetDataSource provides results in normal or verbose format. The verbose format adds the schema description and the list of files pointed to by the DataSource to the normal format.
Request Syntax
{
   "DataSourceId": "string",
   "Verbose": boolean
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- DataSourceId
-
The ID assigned to the DataSource at creation.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
[a-zA-Z0-9_.-]+
Required: Yes
- Verbose
-
Specifies whether the GetDataSource operation should return DataSourceSchema.
If true, DataSourceSchema is returned.
If false, DataSourceSchema is not returned.
Type: Boolean
Required: No
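As a minimal sketch of how a client might check these constraints before sending a request, the following Python helper validates the DataSourceId against the length limits and pattern listed above and serializes the request body. The function name build_get_data_source_request is hypothetical and not part of any AWS SDK; the service performs its own validation regardless.

```python
import json
import re

# Constraints copied from the DataSourceId parameter description above.
DATA_SOURCE_ID_PATTERN = re.compile(r"[a-zA-Z0-9_.-]+")
MAX_ID_LENGTH = 64


def build_get_data_source_request(data_source_id, verbose=False):
    """Return the GetDataSource request body as a JSON string.

    Hypothetical client-side helper, not an AWS SDK function.
    """
    if not 1 <= len(data_source_id) <= MAX_ID_LENGTH:
        raise ValueError("DataSourceId must be 1 to 64 characters long")
    if DATA_SOURCE_ID_PATTERN.fullmatch(data_source_id) is None:
        raise ValueError("DataSourceId must match [a-zA-Z0-9_.-]+")
    # Verbose is optional in the API; it is always sent here to keep the
    # request explicit about whether DataSourceSchema is wanted.
    return json.dumps({"DataSourceId": data_source_id, "Verbose": bool(verbose)})
```

Calling build_get_data_source_request("17SdAv6WC6r5vACAxF7U", verbose=True) produces a body equivalent to the one in the sample request later in this section.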
Response Syntax
{
   "ComputeStatistics": boolean,
   "ComputeTime": number,
   "CreatedAt": number,
   "CreatedByIamUser": "string",
   "DataLocationS3": "string",
   "DataRearrangement": "string",
   "DataSizeInBytes": number,
   "DataSourceId": "string",
   "DataSourceSchema": "string",
   "FinishedAt": number,
   "LastUpdatedAt": number,
   "LogUri": "string",
   "Message": "string",
   "Name": "string",
   "NumberOfFiles": number,
   "RDSMetadata": {
      "Database": {
         "DatabaseName": "string",
         "InstanceIdentifier": "string"
      },
      "DatabaseUserName": "string",
      "DataPipelineId": "string",
      "ResourceRole": "string",
      "SelectSqlQuery": "string",
      "ServiceRole": "string"
   },
   "RedshiftMetadata": {
      "DatabaseUserName": "string",
      "RedshiftDatabase": {
         "ClusterIdentifier": "string",
         "DatabaseName": "string"
      },
      "SelectSqlQuery": "string"
   },
   "RoleARN": "string",
   "StartedAt": number,
   "Status": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- ComputeStatistics
-
The parameter is true if statistics need to be generated from the observation data.
Type: Boolean
- ComputeTime
-
The approximate CPU time in milliseconds that Amazon Machine Learning spent processing the DataSource, normalized and scaled on computation resources. ComputeTime is only available if the DataSource is in the COMPLETED state and ComputeStatistics is set to true.
Type: Long
- CreatedAt
-
The time that the DataSource was created. The time is expressed in epoch time.
Type: Timestamp
- CreatedByIamUser
-
The AWS user account from which the DataSource was created. The account type can be either an AWS root account or an AWS Identity and Access Management (IAM) user account.
Type: String
Pattern:
arn:aws:iam::[0-9]+:((user/.+)|(root))
- DataLocationS3
-
The location of the data file or directory in Amazon Simple Storage Service (Amazon S3).
Type: String
Length Constraints: Maximum length of 2048.
Pattern:
s3://([^/]+)(/.*)?
- DataRearrangement
-
A JSON string that represents the splitting and rearrangement requirement used when this DataSource was created.
Type: String
- DataSizeInBytes
-
The total size of observations in the data files.
Type: Long
- DataSourceId
-
The ID assigned to the DataSource at creation. This value should be identical to the value of the DataSourceId in the request.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
[a-zA-Z0-9_.-]+
- DataSourceSchema
-
The schema used by all of the data files of this DataSource.
Note: This parameter is provided as part of the verbose format.
Type: String
Length Constraints: Maximum length of 131071.
- FinishedAt
-
The epoch time when Amazon Machine Learning marked the DataSource as COMPLETED or FAILED. FinishedAt is only available when the DataSource is in the COMPLETED or FAILED state.
Type: Timestamp
- LastUpdatedAt
-
The time of the most recent edit to the DataSource. The time is expressed in epoch time.
Type: Timestamp
- LogUri
-
A link to the file containing logs of CreateDataSourceFrom* operations.
Type: String
- Message
-
The user-supplied description of the most recent details about creating the DataSource.
Type: String
Length Constraints: Maximum length of 10240.
- Name
-
A user-supplied name or description of the DataSource.
Type: String
Length Constraints: Maximum length of 1024.
Pattern:
.*\S.*|^$
- NumberOfFiles
-
The number of data files referenced by the DataSource.
Type: Long
- RDSMetadata
-
The DataSource details that are specific to Amazon RDS.
Type: RDSMetadata object
- RedshiftMetadata
-
Describes the DataSource details specific to Amazon Redshift.
Type: RedshiftMetadata object
- RoleARN
-
The Amazon Resource Name (ARN) of an AWS IAM Role, such as the following: arn:aws:iam::account:role/rolename.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 110.
- StartedAt
-
The epoch time when Amazon Machine Learning marked the DataSource as INPROGRESS. StartedAt isn't available if the DataSource is in the PENDING state.
Type: Timestamp
- Status
-
The current status of the DataSource. This element can have one of the following values:
   PENDING - Amazon ML submitted a request to create a DataSource.
   INPROGRESS - The creation process is underway.
   FAILED - The request to create a DataSource did not run to completion. It is not usable.
   COMPLETED - The creation process completed successfully.
   DELETED - The DataSource is marked as deleted. It is not usable.
Type: String
Valid Values: PENDING | INPROGRESS | FAILED | COMPLETED | DELETED
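The availability rules for StartedAt and FinishedAt depend on Status, which a client can encode directly. The following Python sketch reads both timestamps from a parsed response and returns None when the element is not expected for the current status. It assumes the epoch values are expressed in seconds, which this reference does not state explicitly, and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def to_datetime(epoch_seconds):
    """Convert an epoch timestamp to an aware UTC datetime.

    Assumption: the service reports epoch time in seconds.
    """
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def started_at(response):
    """Return StartedAt as a datetime, or None while the DataSource is PENDING."""
    if response.get("Status") == "PENDING":
        return None
    value = response.get("StartedAt")
    return None if value is None else to_datetime(value)


def finished_at(response):
    """Return FinishedAt as a datetime, or None unless COMPLETED or FAILED."""
    if response.get("Status") not in ("COMPLETED", "FAILED"):
        return None
    value = response.get("FinishedAt")
    return None if value is None else to_datetime(value)
```

Both helpers take the response as a plain dictionary, for example the result of parsing the JSON body with json.loads.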
Errors
For information about the errors that are common to all actions, see Common Errors.
- InternalServerException
-
An error on the server occurred when trying to process a request.
HTTP Status Code: 500
- InvalidInputException
-
An error on the client occurred. Typically, the cause is an invalid input value.
HTTP Status Code: 400
- ResourceNotFoundException
-
A specified resource cannot be located.
HTTP Status Code: 400
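Because InvalidInputException and ResourceNotFoundException share HTTP status code 400, a client has to distinguish them by error name rather than by status code. The following Python sketch records the codes listed above and applies one possible retry policy. Treating only 5xx errors as retryable is an assumption made for illustration, not documented service behavior, and the names used are hypothetical.

```python
# HTTP status codes as listed in the Errors section above.
ERROR_STATUS_CODES = {
    "InternalServerException": 500,
    "InvalidInputException": 400,
    "ResourceNotFoundException": 400,
}


def is_retryable(error_name):
    """Return True for server-side (5xx) errors.

    Assumption: client errors (4xx) will fail again if the same request
    is resent, so only server errors are worth retrying.
    """
    status = ERROR_STATUS_CODES.get(error_name)
    return status is not None and status >= 500
```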
Examples
The following example shows a sample request and response for the GetDataSource operation.
Sample Request
POST / HTTP/1.1
Host: machinelearning.<region>.<domain>
x-amz-Date: <Date>
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>, SignedHeaders=contenttype;date;host;user-agent;x-amz-date;x-amz-target;x-amzn-requestid,Signature=<Signature>
User-Agent: <UserAgentString>
Content-Type: application/x-amz-json-1.1
Content-Length: <PayloadSizeBytes>
Connection: Keep-Alive
X-Amz-Target: AmazonML_20141212.GetDataSource
{"DataSourceId": "17SdAv6WC6r5vACAxF7U", "Verbose": true}
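The unsigned portion of the headers in this sample request can be assembled as in the following Python sketch. It deliberately omits the Authorization and x-amz-Date headers, because Signature Version 4 signing is normally handled by an AWS SDK or a signing library. The function name and the idea of building headers by hand are assumptions for illustration only.

```python
def build_unsigned_headers(payload, host):
    """Return the headers of a GetDataSource request, without SigV4 signing.

    payload is the JSON request body as a string; host is the service
    endpoint, machinelearning.<region>.<domain> in the sample above.
    """
    body = payload.encode("utf-8")
    return {
        "Host": host,
        "Content-Type": "application/x-amz-json-1.1",
        # Content-Length counts bytes of the encoded body, not characters.
        "Content-Length": str(len(body)),
        "X-Amz-Target": "AmazonML_20141212.GetDataSource",
    }
```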
Sample Response
HTTP/1.1 200 OK
x-amzn-RequestId: <RequestId>
Content-Type: application/x-amz-json-1.1
Content-Length: <PayloadSizeBytes>
Date: <Date>
{
   "CreatedAt": 141045168.275,
   "CreatedByIamUser": "arn:aws:iam::<awsAccountId>:user/testuser",
   "DataLocationS3": "s3://eml-test-EXAMPLE/data.csv",
   "DataRearrangement": "{\"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}",
   "DataSizeInBytes": 0,
   "DataSourceId": "17SdAv6WC6r5vACAxF7U",
   "DataSourceSchema": "
   {
      \"version\":\"1.0\",
      \"recordAnnotationFieldName\":null,
      \"recordWeightFieldName\":\"weight\",
      \"targetFieldName\":\"label\",
      \"dataFormat\":\"CSV\",
      \"dataFileContainsHeader\":false,
      \"attributes\":
      [
         {\"attributeName\":\"obsId\",\"attributeType\":\"NUMERIC\"},
         {\"attributeName\":\"label\",\"attributeType\":\"BINARY\"},
         {\"attributeName\":\"weight\",\"attributeType\":\"NUMERIC\"},
         {\"attributeName\":\"x\",\"attributeType\":\"TEXT\"}
      ],
      \"excludedAttributeNames\":[]
   }",
   "DataStatisticsStatus": "COMPLETED",
   "LastUpdatedAt": 141045168.275,
   "LogUri": "https://s3bucket/locationToLogs/logname.tar.gz",
   "Name": "EXAMPLE",
   "Status": "COMPLETED",
   "ComputeTime": 185200,
   "FinishedAt": 141045168.275,
   "StartedAt": 141045168.275
}
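DataSourceSchema arrives as a JSON document encoded inside a string, so it needs a second parsing step after the response body itself has been parsed. The following Python sketch extracts the attribute names and types from a verbose response. It assumes the schema string has the shape shown in the sample response, and the function name is hypothetical.

```python
import json


def attribute_types(response):
    """Map attribute names to attribute types from a GetDataSource response.

    response is the already-parsed response body (a dict). Responses
    without DataSourceSchema, such as non-verbose ones, yield an empty dict.
    """
    schema_text = response.get("DataSourceSchema")
    if schema_text is None:
        return {}
    # The schema is JSON nested inside a JSON string, so parse it again.
    schema = json.loads(schema_text)
    return {
        attribute["attributeName"]: attribute["attributeType"]
        for attribute in schema.get("attributes", [])
    }
```

For the sample response, this would map obsId and weight to NUMERIC, label to BINARY, and x to TEXT.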
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: