DropFields Class - AWS Glue

DropFields Class

Drops fields within a DynamicFrame.

Methods

__call__(frame, paths, transformation_ctx = "", info = "", stageThreshold = 0, totalThreshold = 0)

Drops nodes within a DynamicFrame.

  • frame – The DynamicFrame in which to drop the nodes (required).

  • paths – A list of full paths to the nodes to drop (required).

  • transformation_ctx – A unique string that is used to identify state information (optional).

  • info – A string associated with errors in the transformation (optional).

  • stageThreshold – The maximum number of errors that can occur in the transformation before it errors out (optional; the default is zero).

  • totalThreshold – The maximum number of errors that can occur overall before processing errors out (optional; the default is zero).

Returns a new DynamicFrame without the specified fields.
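The parameters above can be passed by keyword through apply. The following is a minimal sketch, assuming a Glue job environment where the awsglue package is importable. The helper name drop_fields_from and the default context string are hypothetical and used only for illustration.

```python
def drop_fields_from(frame, paths, ctx="drop_fields_ctx"):
    """Return a new DynamicFrame without the fields named in `paths`.

    `drop_fields_from` and the default `ctx` value are hypothetical names
    chosen for this illustration; they are not part of AWS Glue.
    """
    # Imported inside the function so this module can also be loaded
    # outside a Glue environment, where `awsglue` is not installed.
    from awsglue.transforms import DropFields

    return DropFields.apply(
        frame=frame,             # the DynamicFrame to drop fields from
        paths=list(paths),       # full paths, e.g. ["age", "location.county"]
        transformation_ctx=ctx,  # unique string identifying state information
    )
```

The optional info, stageThreshold, and totalThreshold arguments can be passed the same way when the defaults are not suitable.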

apply(cls, *args, **kwargs)

Inherited from GlueTransform apply.

name(cls)

Inherited from GlueTransform name.

describeArgs(cls)

Inherited from GlueTransform describeArgs.

describeReturn(cls)

Inherited from GlueTransform describeReturn.

describeTransform(cls)

Inherited from GlueTransform describeTransform.

describeErrors(cls)

Inherited from GlueTransform describeErrors.

describe(cls)

Inherited from GlueTransform describe.

Examples

Dataset used for DropFields examples

The following dataset is used for the DropFields examples:

{name: Sally, age: 23, location: {state: WY, county: Fremont}, friends: []}
{name: Varun, age: 34, location: {state: NE, county: Douglas}, friends: [{name: Arjun, age: 3}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, age: 21, location: {state: AK, county: Denali}}
{name: Sheila, age: 63, friends: [{name: Nancy, age: 22}]}

This dataset has the following schema:

root
|-- name: string
|-- age: int
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int

Example: Drop a top-level field

Use code similar to the following to drop the age field:

df_no_age = DropFields.apply(df, paths=['age'])

Resulting dataset:

{name: Sally, location: {state: WY, county: Fremont}, friends: []}
{name: Varun, location: {state: NE, county: Douglas}, friends: [{name: Arjun, age: 3}]}
{name: George, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, location: {state: AK, county: Denali}}
{name: Sheila, friends: [{name: Nancy, age: 22}]}

Resulting schema:

root
|-- name: string
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int

Example: Drop a nested field

To drop a nested field, give its full path, with each level separated by a '.'.

df_no_county = DropFields.apply(df, paths=['location.county'])

Resulting dataset:

{name: Sally, age: 23, location: {state: WY}, friends: []}
{name: Varun, age: 34, location: {state: NE}, friends: [{name: Arjun, age: 3}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, age: 21, location: {state: AK}}
{name: Sheila, age: 63, friends: [{name: Nancy, age: 22}]}

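If you drop the last remaining field of a struct type, the transform removes the entire struct. For example, dropping location.state from df_no_county, where location.county is already gone, removes location altogether:

df_no_location = DropFields.apply(df_no_county, paths=['location.state'])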

Resulting schema:

root
|-- name: string
|-- age: int
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int

Example: Drop a nested field from an array

No special syntax is needed to drop a field from a struct nested inside an array. For example, you can drop the age field from every element of the friends array with the following:

df_no_friend_age = DropFields.apply(df, paths=['friends.age'])

Resulting dataset:

{name: Sally, age: 23, location: {state: WY, county: Fremont}}
{name: Varun, age: 34, location: {state: NE, county: Douglas}, friends: [{name: Arjun}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy}]}
{name: Haruki, age: 21, location: {state: AK, county: Denali}}
{name: Sheila, age: 63, friends: [{name: Nancy}]}

Resulting schema:

root
|-- name: string
|-- age: int
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
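
The path rules shown in the examples (top-level names, '.'-qualified nested names, no extra syntax for arrays, and removal of a struct whose last field is dropped) can be modeled on plain Python dicts and lists. The following is only an illustrative sketch of those rules, not how AWS Glue implements the transform. The function name drop_path is hypothetical, and the sketch does not try to model other details such as the handling of empty arrays.

```python
import copy


def drop_path(record, path):
    """Return a copy of `record` (nested dicts/lists) without the field at `path`."""
    result = copy.deepcopy(record)  # leave the caller's record unchanged
    _drop(result, path.split("."))
    return result


def _drop(node, parts):
    # Arrays carry no path segment of their own: apply the same path to each element.
    if isinstance(node, list):
        for item in node:
            _drop(item, parts)
        return
    if not isinstance(node, dict) or parts[0] not in node:
        return  # the path does not exist in this record; nothing to drop
    head, rest = parts[0], parts[1:]
    if not rest:
        del node[head]
        return
    _drop(node[head], rest)
    # A struct left with no fields is removed along with its last field.
    if isinstance(node[head], dict) and not node[head]:
        del node[head]


# One record from the example dataset, written as a Python dict.
george = {
    "name": "George",
    "age": 52,
    "location": {"state": "NY"},
    "friends": [{"name": "Fred"}, {"name": "Amy", "age": 15}],
}

no_age = drop_path(george, "age")                # top-level field
no_location = drop_path(george, "location.state")  # last field of a struct
no_friend_age = drop_path(george, "friends.age")   # field inside an array of structs
```

In this model, no_location has no location key at all, because state was the only field left in that struct, and no_friend_age keeps both friends but without any age field.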