SparkSQL
Specifies a transform where you enter a SQL query using Spark SQL syntax to transform the data. The output is a single DynamicFrame.
Contents
- Inputs
The data inputs identified by their node names. You can associate a table name with each input node to use in the SQL query. The name you choose must meet the Spark SQL naming restrictions.
Type: Array of strings
Array Members: Minimum number of 1 item.
Pattern: [A-Za-z0-9_-]*
Required: Yes
- Name
The name of the transform node.
Type: String
Pattern: ([^\r\n])*
Required: Yes
- SqlAliases
A list of aliases. An alias specifies the name to use in the SQL query for a given input. For example, suppose you have a data source named "MyDataSource". If you specify From as MyDataSource and Alias as SqlName, then your SQL query can use select * from SqlName, which reads data from MyDataSource.
Type: Array of SqlAlias objects
Required: Yes
- SqlQuery
A SQL query that must use Spark SQL syntax and return a single data set.
Type: String
Pattern: ([\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\s])*
Required: Yes
- OutputSchemas
Specifies the data schema for the SparkSQL transform.
Type: Array of GlueSchema objects
Required: No
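The fields above can be pictured as a plain Python dictionary. The following is a minimal sketch, not an SDK call: the helper build_spark_sql_node and the node name "FilterRows" are hypothetical, and the sketch assumes each SqlAlias object carries the From and Alias keys mentioned in the SqlAliases description. It checks only the Inputs and Name constraints listed above and leaves out the optional OutputSchemas field.

```python
import re

# Patterns copied from the Inputs and Name field descriptions above.
INPUT_PATTERN = re.compile(r"[A-Za-z0-9_-]*")
NAME_PATTERN = re.compile(r"[^\r\n]*")


def build_spark_sql_node(name, sql_query, aliases):
    """Return a dict shaped like the SparkSQL structure.

    `aliases` maps an input node name (From) to the table name (Alias)
    used in the query. Hypothetical helper, not part of any AWS SDK.
    """
    if not aliases:
        raise ValueError("Inputs requires at least 1 item")
    for node_name in aliases:
        if not INPUT_PATTERN.fullmatch(node_name):
            raise ValueError(f"invalid input node name: {node_name!r}")
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("Name must not contain line breaks")
    return {
        "Name": name,
        "Inputs": list(aliases),
        "SqlQuery": sql_query,
        "SqlAliases": [
            {"From": node_name, "Alias": alias}
            for node_name, alias in aliases.items()
        ],
    }


# Mirrors the MyDataSource / SqlName example from the SqlAliases description.
node = build_spark_sql_node(
    name="FilterRows",
    sql_query="select * from SqlName",
    aliases={"MyDataSource": "SqlName"},
)
print(node)
```

Building Inputs and SqlAliases from the same mapping keeps the two lists in step, so every input has a table name that the query can reference.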