PIIDetection - AWS Glue

PIIDetection

Specifies a transform that identifies, removes or masks PII data.

Contents

EntityTypesToDetect

Indicates the types of entities the PIIDetection transform will identify as PII data.

PII type entities include: PERSON_NAME, DATE, USA_SSN, EMAIL, USA_ITIN, USA_PASSPORT_NUMBER, PHONE_NUMBER, BANK_ACCOUNT, IP_ADDRESS, MAC_ADDRESS, USA_CPT_CODE, USA_HCPCS_CODE, USA_NATIONAL_DRUG_CODE, USA_MEDICARE_BENEFICIARY_IDENTIFIER, USA_HEALTH_INSURANCE_CLAIM_NUMBER, CREDIT_CARD, USA_NATIONAL_PROVIDER_IDENTIFIER, USA_DEA_NUMBER, USA_DRIVING_LICENSE

Type: Array of strings

Pattern: ([\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF]|[^\S\r\n"'])*

Required: Yes

Inputs

The node ID inputs to the transform.

Type: Array of strings

Array Members: Fixed number of 1 item.

Pattern: [A-Za-z0-9_-]*

Required: Yes
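
The constraints on Inputs (exactly one item, each matching the pattern above) can be checked on the client before a job definition is submitted. The following is a minimal sketch, not part of the AWS SDK; the function name `validate_inputs` and the sample node IDs are hypothetical.

```python
import re

# Pattern copied from the Inputs field above. Note that it also matches
# the empty string, because of the trailing "*".
NODE_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]*")


def validate_inputs(inputs):
    """Return True if `inputs` satisfies the documented Inputs constraints.

    Hypothetical helper: checks that the list holds exactly one item and
    that the item matches the documented node ID pattern in full.
    """
    if len(inputs) != 1:
        return False
    return NODE_ID_PATTERN.fullmatch(inputs[0]) is not None


if __name__ == "__main__":
    print(validate_inputs(["node-1"]))            # one well-formed ID
    print(validate_inputs(["node-1", "node-2"]))  # too many items
    print(validate_inputs(["node 1"]))            # space is not allowed
```

`re.fullmatch` is used rather than `re.match` so that a string with a valid prefix and invalid trailing characters is rejected.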

Name

The name of the transform node.

Type: String

Pattern: ([\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF]|[^\r\n])*

Required: Yes

PiiType

Indicates the type of PIIDetection transform.

Type: String

Valid Values: RowAudit | RowMasking | ColumnAudit | ColumnMasking

Required: Yes
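
The four required fields described so far (EntityTypesToDetect, Inputs, Name, and PiiType) could be assembled into a Python dictionary as sketched below. This is an illustration only: the node name `DetectPII`, the node ID `node-1`, and the helper `check_required` are hypothetical, and the check covers only field presence and the PiiType valid values.

```python
# Valid values copied from the PiiType field above.
VALID_PII_TYPES = {"RowAudit", "RowMasking", "ColumnAudit", "ColumnMasking"}

# Fields marked "Required: Yes" in this reference.
REQUIRED_FIELDS = ("EntityTypesToDetect", "Inputs", "Name", "PiiType")


def check_required(node):
    """Return a list of problems found in a PIIDetection dictionary.

    Hypothetical helper: an empty list means every required field is
    present and PiiType is one of the documented values.
    """
    problems = [
        "missing required field: " + field
        for field in REQUIRED_FIELDS
        if field not in node
    ]
    pii_type = node.get("PiiType")
    if pii_type is not None and pii_type not in VALID_PII_TYPES:
        problems.append("invalid PiiType: " + repr(pii_type))
    return problems


# Example node with illustrative (assumed) values.
pii_node = {
    "Name": "DetectPII",
    "Inputs": ["node-1"],
    "PiiType": "ColumnAudit",
    "EntityTypesToDetect": ["EMAIL", "PHONE_NUMBER"],
}

if __name__ == "__main__":
    print(check_required(pii_node))
    print(check_required({"Name": "DetectPII", "PiiType": "Masking"}))
```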

MaskValue

Indicates the value that will replace the detected entity.

Type: String

Length Constraints: Minimum length of 0. Maximum length of 256.

Pattern: [*A-Za-z0-9_-]*

Required: No
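
The length and pattern constraints on MaskValue can be combined into a single check. The sketch below assumes a hypothetical helper named `is_valid_mask_value`; the pattern and the 256-character limit are taken from the field description above.

```python
import re

# Pattern and maximum length copied from the MaskValue field above.
# Inside a character class, "*" is a literal asterisk.
MASK_VALUE_PATTERN = re.compile(r"[*A-Za-z0-9_-]*")
MAX_MASK_VALUE_LENGTH = 256


def is_valid_mask_value(value):
    """Return True if `value` meets the documented MaskValue constraints."""
    if len(value) > MAX_MASK_VALUE_LENGTH:
        return False
    return MASK_VALUE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    print(is_valid_mask_value("*******"))   # asterisks are allowed
    print(is_valid_mask_value("REDACTED"))  # letters are allowed
    print(is_valid_mask_value("###"))       # "#" is outside the pattern
```

Because the minimum length is 0, an empty string also passes this check.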

OutputColumnName

Indicates the name of the output column that will contain the entity types detected in each row.

Type: String

Pattern: ([\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF]|[^\S\r\n"'])*

Required: No

SampleFraction

Indicates the fraction of the data to sample when scanning for PII entities.

Type: Double

Valid Range: Minimum value of 0. Maximum value of 1.

Required: No

ThresholdFraction

Indicates the fraction of a column's data that must be detected as PII for the column to be identified as PII data.

Type: Double

Valid Range: Minimum value of 0. Maximum value of 1.

Required: No
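
Both SampleFraction and ThresholdFraction must fall in the range 0 to 1. The sketch below validates that range and illustrates one plausible reading of how a threshold could be applied to a column. The helper names are hypothetical, and the decision rule (matching values divided by sampled values, compared with `>=`) is an assumption made for illustration; this reference does not specify the exact comparison the service performs.

```python
def validate_fraction(name, value):
    """Return `value` if it lies in [0, 1]; otherwise raise ValueError."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(name + " must be between 0 and 1, got " + repr(value))
    return value


def column_is_pii(matching_values, sampled_values, threshold_fraction):
    """Decide whether a column counts as PII under an ASSUMED rule.

    Assumption: a column is flagged when the share of sampled values
    that matched a PII entity is at least `threshold_fraction`.
    """
    validate_fraction("ThresholdFraction", threshold_fraction)
    if sampled_values == 0:
        return False  # nothing was sampled, so nothing can be flagged
    return matching_values / sampled_values >= threshold_fraction


if __name__ == "__main__":
    # 15 of 100 sampled values matched, threshold of 0.1
    print(column_is_pii(15, 100, 0.1))
    # 5 of 100 sampled values matched, threshold of 0.1
    print(column_is_pii(5, 100, 0.1))
```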
