PySpark Extension Types
This section describes the types that are used by the AWS Glue PySpark extensions.
DataType
The base class for the other AWS Glue types.
Returns the type of the AWS Glue type class (that is, the class name with "Type" removed from the end).
Returns a JSON object that contains the data type and properties of the class:
{ "dataType": typeName, "properties": properties }
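The two behaviors described above can be sketched with a small stand-in class. This is an illustration only, not the AWS Glue source: the names `SketchDataType`, `SketchIntegerType`, `type_name`, and `json_value` are hypothetical, and the `nullable` property is invented for the example.

```python
class SketchDataType(object):
    """Hypothetical stand-in for a base type class (not the AWS Glue code)."""

    def __init__(self, properties=None):
        # Copy the input so instances never share one mutable dict.
        self.properties = dict(properties or {})

    @classmethod
    def type_name(cls):
        # Class name with a trailing "Type" removed, as described above.
        name = cls.__name__
        return name[:-4] if name.endswith("Type") else name

    def json_value(self):
        # Same shape as the JSON object shown above.
        return {"dataType": self.type_name(), "properties": self.properties}


class SketchIntegerType(SketchDataType):
    """Hypothetical derived type used only for this example."""


value = SketchIntegerType({"nullable": True}).json_value()
print(value)
```

Because the name is derived from the class itself, a derived class gets its own type name without overriding anything.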
AtomicType and Simple Derivatives
Inherits from and extends the DataType class, and serves as the base class for all the AWS Glue atomic data types.
The following types are simple derivatives of the AtomicType class:
DecimalType(AtomicType)
Inherits from and extends the AtomicType class to represent a decimal number (a number expressed in decimal digits, as opposed to binary base-2 numbers).
EnumType(AtomicType)
Inherits from and extends the AtomicType class to represent an enumeration of valid options.
Collection Types
ArrayType(DataType)
__init__(elementType=UnknownType(), properties={})
ChoiceType(DataType)
__init__(choices=[], properties={})
add(new_choice)
Adds a new choice to the list of possible choices.
merge(new_choices)
Merges a list of new choices with the existing list of choices.
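The difference between adding one choice and merging a list of choices can be sketched as follows. This is a minimal illustration under stated assumptions, not the AWS Glue implementation: `SketchChoice` is a hypothetical class, choices are represented as plain strings, and silently skipping a duplicate is an assumption made for the example.

```python
class SketchChoice(object):
    """Hypothetical stand-in for a choice type (not the AWS Glue code)."""

    def __init__(self, choices=None):
        # Copy the input so the caller's list is never mutated.
        self.choices = list(choices or [])

    def add(self, new_choice):
        # Assumption: a choice that is already present is skipped.
        if new_choice not in self.choices:
            self.choices.append(new_choice)

    def merge(self, new_choices):
        # Merging is adding each new choice in turn.
        for choice in new_choices:
            self.add(choice)


choice = SketchChoice(["int"])
choice.add("string")
choice.merge(["string", "double"])
print(choice.choices)
```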
MapType(DataType)
__init__(valueType=UnknownType, properties={})
Field(object)
Creates a field object out of an object that derives from DataType.
__init__(name, dataType, properties={})
StructType(DataType)
Defines a data structure (struct).
__init__(fields=[], properties={})
hasField(field)
Returns True if this structure has a field of the same name, or False if not.
getField(field)
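A name-based field lookup of the kind described for hasField can be sketched as below. This is an illustration under stated assumptions, not the AWS Glue source: `SketchField`, `SketchStruct`, `has_field`, and `get_field` are hypothetical names, data types are shown as plain strings, accepting either a field object or a bare name is an assumption, and returning the matching field from `get_field` is a guess at the behavior because the text above does not describe it.

```python
from collections import namedtuple

# Hypothetical field record: a name, a data type, and optional properties.
SketchField = namedtuple("SketchField", ["name", "dataType", "properties"])


class SketchStruct(object):
    """Hypothetical stand-in for a struct type (not the AWS Glue code)."""

    def __init__(self, fields=None):
        self.fields = list(fields or [])
        # Index the fields by name so lookups do not scan the list.
        self.field_map = {f.name: f for f in self.fields}

    @staticmethod
    def _name_of(field):
        # Assumption: callers may pass a field object or a bare name.
        return field.name if isinstance(field, SketchField) else field

    def has_field(self, field):
        return self._name_of(field) in self.field_map

    def get_field(self, field):
        # Assumption: returns the stored field with the same name.
        return self.field_map[self._name_of(field)]


struct = SketchStruct([SketchField("id", "long", {}),
                       SketchField("name", "string", {})])
print(struct.has_field("id"), struct.has_field("missing"))
```

Comparing by name only means a field with the same name but a different data type still counts as present.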
EntityType(DataType)
__init__(entity, base_type, properties)
This class is not yet implemented.
Other Types
DataSource(object)
__init__(j_source, sql_ctx, name)
setFormat(format, **options)
getFrame()
Returns a DynamicFrame for the data source.
DataSink(object)
setFormat(format, **options)
writeFrame(dynamic_frame, info="")
write(dynamic_frame_or_dfc, info="")
Writes a DynamicFrame or a DynamicFrameCollection.