

# Zero-ETL integrations
<a name="zero-etl-using"></a>

[Zero-ETL](https://aws.amazon.com/what-is/zero-etl/) is a set of fully managed integrations from AWS that minimizes the need to build ETL data pipelines for common ingestion and replication use cases. It makes data available in the Lakehouse architecture of Amazon SageMaker and in Amazon Redshift from multiple operational, transactional, and application sources. AWS Glue zero-ETL currently supports Amazon DynamoDB, Oracle Database@AWS, and SaaS sources such as Salesforce, SAP, and Zendesk. With a zero-ETL integration, you have fresher data for analytics, AI/ML, and reporting, and you get more accurate and timely insights for use cases like business dashboards, optimized gaming experiences, data quality monitoring, and customer behavior analysis. You can make data-driven predictions with more confidence, improve customer experiences, and promote data-driven insights across the business.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools.

Lakehouse architecture of Amazon SageMaker unifies all your data across Amazon Simple Storage Service (Amazon S3) data lakes and Amazon Redshift data warehouses, helping you build powerful analytics and AI/ML applications on a single copy of data. It gives you the flexibility to access and query your data in place with all Apache Iceberg compatible tools and engines. Additionally, you can secure your data with integrated, fine-grained access controls that are enforced across all your data in all analytics tools and engines. Define permissions once and confidently share data across your organization.

## Zero-ETL capabilities in AWS Glue
<a name="zero-etl-capabilities"></a>

Zero-ETL integrations in AWS Glue simplify data ingestion and replication from AWS data services and third-party applications to AWS destinations.

AWS services supported as zero-ETL sources in AWS Glue include:
+ Amazon DynamoDB
+ Oracle Database@AWS (ODB)

Third-party applications using AWS Glue connections include:
+ Facebook Ads
+ Instagram Ads
+ Salesforce
+ Salesforce Marketing Cloud Account Engagement
+ SAP OData
+ ServiceNow
+ Zendesk
+ Zoho CRM

AWS services supported as zero-ETL targets in AWS Glue include:
+ General purpose Amazon S3 buckets via Lakehouse architecture of Amazon SageMaker
+ Amazon S3 Tables via Lakehouse architecture of Amazon SageMaker
+ Redshift Managed Storage via Lakehouse architecture of Amazon SageMaker
+ Amazon Redshift data warehouses

# Configuring a source for a zero-ETL integration
<a name="zero-etl-sources"></a>

Setting up an integration requires some prerequisites on the source, such as configuring the IAM roles or policies that AWS Glue uses to access data from the source (in order to write to the target) and the AWS KMS keys used to encrypt data in the intermediate location.

**Topics**
+ [Configuring Amazon DynamoDB source](#zero-etl-config-source-dynamodb)
+ [Configuring connection (SaaS) source](#zero-etl-config-source-saas)
+ [Setting up Oracle Database at AWS as source](#zero-etl-config-source-oracle)

## Configuring Amazon DynamoDB source
<a name="zero-etl-config-source-dynamodb"></a>

To access data from your source Amazon DynamoDB table, AWS Glue requires permissions to describe the table and export data from it. Amazon DynamoDB supports resource-based policies, which you can use to grant this access.

The following example resource-based policy uses a wildcard (`*`) for the integration:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "1111",
      "Effect": "Allow",
      "Principal": {
        "Service": "glue.amazonaws.com"
      },
      "Resource": "*",
      "Action": [
        "dynamodb:ExportTableToPointInTime",
        "dynamodb:DescribeTable",
        "dynamodb:DescribeExport"
      ],
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "111122223333"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:glue:us-east-1:111122223333:integration:*"
        }
      }
    }
  ]
}
```
+ For the Amazon DynamoDB table that you want to replicate, paste the above resource-based policy template into **Resource-based policy for table** and fill in the fields.
+ If you want to make the policy more restrictive, update the policy after creating the integration to specify the full `integrationArn`, and use an exact-match condition (`ArnEquals`) instead of `ArnLike`.
+ Make sure that point-in-time recovery (PITR) is enabled for the Amazon DynamoDB table.
+ Make sure that the `dynamodb:DescribeExport` action is included in the resource-based policy.

You can also add the resource-based policy to the table using the following command:

```
aws dynamodb put-resource-policy \
--resource-arn arn:aws:dynamodb:region:account-id:table/ddb-table-name \
--policy file://resource-policy-with-condition.json \
--region region
```

To verify that the policy is applied correctly, use the following command to get the resource-based policy for a table:

```
aws dynamodb get-resource-policy \
--resource-arn arn:aws:dynamodb:region:account-id:table/ddb-table-name \
--region region
```
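Because the integration requires point-in-time recovery (PITR) on the source table, you can enable and then verify it with the AWS CLI. This is a sketch; the table name and Region are placeholders to replace with your own values:

```
# Enable point-in-time recovery (PITR) on the source table.
aws dynamodb update-continuous-backups \
--table-name ddb-table-name \
--point-in-time-recovery-specification PointInTimeRecoveryEnabled=true \
--region region

# Verify: PointInTimeRecoveryStatus in the response should be ENABLED.
aws dynamodb describe-continuous-backups \
--table-name ddb-table-name \
--region region
```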

## Configuring connection (SaaS) source
<a name="zero-etl-config-source-saas"></a>

### Prerequisites for setting up an integration
<a name="zero-etl-config-source-saas-prerequisites"></a>

Before creating a zero-ETL integration from SaaS sources, complete the following setup tasks:
+ Create an AWS Glue connection to the SaaS source.
+ Create the source role that provides access to the connection.
+ Associate the role with the connection.
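The source role in the steps above must be assumable by AWS Glue. A minimal sketch of a trust policy for that role follows; the account ID is a placeholder, and you should confirm the exact trust requirements for your source in the AWS Glue documentation:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "glue.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "111122223333"
        }
      }
    }
  ]
}
```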

### Configuring Salesforce source
<a name="zero-etl-config-source-salesforce"></a>

To create a connection for a Salesforce source, see [Connecting to Salesforce](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-salesforce.html).

Once you've created the connection, you can specify the source data to replicate.

![The screenshot shows selecting Salesforce source data to replicate in a zero-ETL integration.](http://docs.aws.amazon.com/glue/latest/dg/images/zero-etl-salesforce-source-data.png)


Using your zero-ETL integration, you can perform DDL operations for supported entities. For a list of entities that are not supported, see [Unsupported entities and fields for Salesforce](#zero-etl-config-source-salesforce-unsupported).

#### Additional Salesforce Configuration
<a name="zero-etl-config-source-salesforce-additional"></a>

A Salesforce zero-ETL integration requires Lake Formation permissions on the target AWS Glue database. Without them, the integration reports `IngestionFailed` with the following error in the log:

```
"errorMessage": "Insufficient lake formation permissions on Target Glue database."
```
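You can grant the required Lake Formation permissions with the AWS CLI. This is a sketch; the account ID, role name, and database name are placeholders, and the exact permission set your integration needs may differ:

```
# Grant the integration's role permissions on the target AWS Glue database.
aws lakeformation grant-permissions \
--principal DataLakePrincipalIdentifier=arn:aws:iam::111122223333:role/my-integration-role \
--resource '{ "Database": { "Name": "my_target_glue_database" } }' \
--permissions "CREATE_TABLE" "DESCRIBE" \
--region region
```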

#### Unsupported entities and fields for Salesforce
<a name="zero-etl-config-source-salesforce-unsupported"></a>

The following Salesforce entities or fields are unsupported for use in a zero-ETL integration with a Salesforce source.

```
AccountChangeEvent, AccountContactRoleChangeEvent, AccountHistory, AccountShare, ActiveFeatureLicenseMetric, ActivePermSetLicenseMetric, ActiveProfileMetric, ActivityFieldHistory, amzsec__asi_Telemetry_Data_Store__ChangeEvent, amzsec__asi_Telemetry_Data_Store__History, amzsec__asi_Telemetry_Data_Store__Share, amzsec__asi_Telemetry_Job_Log__ChangeEvent, amzsec__asi_Telemetry_Job_Log__History, amzsec__asi_Telemetry_Job_Log__Share, amzsec__asi_Telemetry_Requirement__ChangeEvent, amzsec__asi_Telemetry_Requirement__History, amzsec__asi_Telemetry_Requirement__Share, ApexClass, ApexComponent, ApexLog, ApexPage, ApexTestQueueItem, ApexTestResult, ApexTrigger, AssetChangeEvent, AssetHistory, AssetRelationshipHistory, AssetShare, AssignmentRule, AssociatedLocationHistory, AsyncApexJob, AuditTrailFileExportShare, AuthorizationFormConsentChangeEvent, AuthorizationFormConsentHistory, AuthorizationFormConsentShare, AuthorizationFormDataUseHistory, AuthorizationFormDataUseShare, AuthorizationFormHistory, AuthorizationFormShare, AuthorizationFormTextHistory, AuthProvider, AuthSession, BatchJobHistory, BatchJobPartFailedRecordHistory, BatchJobPartHistory, BatchJobShare, BrandTemplate, BriefcaseAssignmentChangeEvent, BriefcaseDefinitionChangeEvent, BusinessBrandShare, BusinessHours, BusinessProcess, CalcMatrixColumnRangeHistory, CalcProcStepRelationshipHistory, CalculationMatrixColumnHistory, CalculationMatrixHistory, CalculationMatrixRowHistory, CalculationMatrixShare, CalculationMatrixVersionHistory, CalculationProcedureHistory, CalculationProcedureShare, CalculationProcedureStepHistory, CalculationProcedureVariableHistory, CalculationProcedureVersionHistory, Calendar, CalendarViewShare, CallCenter, CallCoachConfigModifyEvent, CampaignChangeEvent, CampaignHistory, CampaignMemberChangeEvent, CampaignMemberStatusChangeEvent, CampaignShare, CaseChangeEvent, CaseHistory, CaseHistory2, CaseHistory2ChangeEvent, CaseRelatedIssueChangeEvent, CaseRelatedIssueHistory, CaseShare, CaseStatus, 
CaseTeamMember, CaseTeamRole, CaseTeamTemplate, CaseTeamTemplateMember, CaseTeamTemplateRecord, CategoryNode, ChangeRequestChangeEvent, ChangeRequestHistory, ChangeRequestRelatedIssueChangeEvent, ChangeRequestRelatedIssueHistory, ChangeRequestRelatedItemChangeEvent, ChangeRequestRelatedItemHistory, ChangeRequestShare, ChatRetirementRdyMetrics, ChatterActivity, ClientBrowser, CollaborationGroup, CollaborationGroupMember, CollaborationGroupMemberRequest, CollaborationInvitation, CommSubscriptionChannelTypeHistory, CommSubscriptionChannelTypeShare, CommSubscriptionConsentChangeEvent, CommSubscriptionConsentHistory, CommSubscriptionConsentShare, CommSubscriptionHistory, CommSubscriptionShare, CommSubscriptionTimingHistory, Community, ConnectedApplication, ContactChangeEvent, ContactHistory, ContactPointAddressChangeEvent, ContactPointAddressHistory, ContactPointAddressShare, ContactPointConsentChangeEvent, ContactPointConsentHistory, ContactPointConsentShare, ContactPointEmailChangeEvent, ContactPointEmailHistory, ContactPointEmailShare, ContactPointPhoneChangeEvent, ContactPointPhoneHistory, ContactPointPhoneShare, ContactPointTypeConsentChangeEvent, ContactPointTypeConsentHistory, ContactPointTypeConsentShare, ContactRequestShare, ContactShare, ContentDocumentChangeEvent, ContentDocumentHistory, ContentDocumentLink, ContentDocumentLinkChangeEvent, ContentDocumentSubscription, ContentFolderItem, ContentFolderLink, ContentFolderMember, ContentNote, ContentNotification, ContentTagSubscription, ContentUserSubscription, ContentVersionChangeEvent, ContentVersionComment, ContentVersionHistory, ContentVersionRating, ContentWorkspace, ContentWorkspaceMember, ContentWorkspacePermission, ContentWorkspaceSubscription, ContractChangeEvent, ContractHistory, ContractLineItemChangeEvent, ContractLineItemHistory, ContractStatus, Conversation, ConversationParticipant, CronJobDetail, CronTrigger, CustomBrand, CustomBrandAsset, CustomerShare, CustomHttpHeader, DashboardComponent, 
DataUseLegalBasisHistory, DataUseLegalBasisShare, DataUsePurposeHistory, DataUsePurposeShare, DecisionTableRecordset, DeleteEvent, DocumentAttachmentMap, Domain, DomainSite, DTRecordsetReplicaShare, EmailBounceEvent, EmailMessageChangeEvent, EmailServicesAddress, EmailServicesFunction, EmailTemplate, EmailTemplateChangeEvent, EngagementAttendeeChangeEvent, EngagementAttendeeHistory, EngagementChannelTypeHistory, EngagementChannelTypeShare, EngagementInteractionChangeEvent, EngagementInteractionHistory, EngagementInteractionShare, EngagementInterface, EngagementTopicChangeEvent, EngagementTopicHistory, EntitlementChangeEvent, EntitlementHistory, EntitlementTemplate, EntityMilestoneHistory, EntitySubscription, EventChangeEvent, EventRelationChangeEvent, EventRelayConfigChangeEvent, ExpressionSetHistory, ExpressionSetShare, ExpressionSetVersionHistory, ExternalEventMappingShare, FeedAttachment, FeedLike, FeedPollChoice, FeedPollVote, FeedSignal, FieldPermissions, FieldSecurityClassification, FiscalYearSettings, FlowInterviewLogShare, FlowInterviewShare, FlowOrchestrationEvent, FlowOrchestrationInstanceShare, FlowOrchestrationStageInstanceShare, FlowOrchestrationStepInstanceShare, FlowOrchestrationWorkItemShare, FlowRecordShare, FlowRecordVersionChangeEvent, FlowTestResultShare, Folder, Group, GroupMember, Holiday, IdeaComment, IdpEventLog, ImageHistory, ImageShare, IncidentChangeEvent, IncidentHistory, IncidentRelatedItemChangeEvent, IncidentRelatedItemHistory, IncidentShare, IndividualChangeEvent, IndividualHistory, IndividualShare, KnowledgeableUser, LeadChangeEvent, LeadHistory, LeadShare, LeadStatus, LightningExitByPageMetrics, LightningToggleMetrics, LightningUsageByAppTypeMetrics, LightningUsageByBrowserMetrics, LightningUsageByFlexiPageMetrics, LightningUsageByPageMetrics, ListEmailChangeEvent, ListEmailShare, ListView, LocationChangeEvent, LocationHistory, LocationShare, LocationTrustMeasureShare, LoginHistory, LoginIp, MacroChangeEvent, MacroHistory, 
MacroInstructionChangeEvent, MacroShare, MacroUsageShare, ManagedContentVariantChangeEvent, MessagingEndUserHistory, MessagingEndUserShare, MessagingSessionHistory, MessagingSessionShare, MilestoneType, MLEngagementEvent, ObjectPermissions, OpportunityChangeEvent, OpportunityContactRoleChangeEvent, OpportunityFieldHistory, OpportunityLineItemChangeEvent, OpportunityShare, OpportunityStage, OrderChangeEvent, OrderHistory, OrderItemChangeEvent, OrderItemHistory, OrderShare, OrderStatus, Organization, OrgEmailAddressSecurity, OrgWideEmailAddress, OutgoingEmail, OutgoingEmailRelation, PackageLicense, PartnerRole, PartyConsentChangeEvent, PartyConsentHistory, PartyConsentShare, Period, PermissionSet, PermissionSetAssignment, PermissionSetTabSetting, Pricebook2ChangeEvent, Pricebook2History, PricebookEntryChangeEvent, PricebookEntryHistory, PrivacyJobSessionShare, PrivacyObjectSessionShare, PrivacyRTBFRequestHistory, PrivacyRTBFRequestShare, PrivacySessionRecordFailureShare, ProblemChangeEvent, ProblemHistory, ProblemIncidentChangeEvent, ProblemIncidentHistory, ProblemRelatedItemChangeEvent, ProblemRelatedItemHistory, ProblemShare, ProcessDefinition, ProcessExceptionEvent, ProcessExceptionShare, ProcessInstanceChangeEvent, ProcessInstanceStep, ProcessInstanceStepChangeEvent, ProcessNode, Product2ChangeEvent, Product2History, ProductEntitlementTemplate, Profile, ProfileSkillEndorsementHistory, ProfileSkillHistory, ProfileSkillShare, ProfileSkillUserHistory, PromptActionShare, PromptErrorShare, QueueSobject, QuickTextChangeEvent, QuickTextHistory, QuickTextShare, QuickTextUsageShare, RecentlyViewed, RecommendationChangeEvent, RecordActionHistory, RecordAlertHistory, RecordAlertShare, RecordType, ScorecardShare, SellerHistory, SellerShare, ServiceContractChangeEvent, ServiceContractHistory, ServiceContractShare, SetupAuditTrail, SetupEntityAccess, SharingRecordCollectionShare, Site, SiteHistory, SiteRedirectMapping, SocialPersonaHistory, SocialPostChangeEvent, 
SocialPostHistory, SocialPostShare, SolutionHistory, SolutionStatus, StaticResource, StreamingChannelShare, TableauHostMappingShare, TaskChangeEvent, TaskPriority, TaskStatus, ThreatDetectionFeedback, TimelineObjectDefinitionChangeEvent, TodayGoalShare, Topic, TopicUserEvent, Translation, User, UserAppMenuCustomizationShare, UserChangeEvent, UserDefinedLabelAssignmentShare, UserDefinedLabelShare, UserEmailPreferredPersonShare, UserLicense, UserLogin, UserPackageLicense, UserPreference, UserPrioritizedRecordShare, UserProvisioningRequestShare, UserRole, UserShare, VideoCallChangeEvent, VideoCallParticipantChangeEvent, VideoCallRecordingChangeEvent, VideoCallShare, VisualforceAccessMetrics, VoiceCallChangeEvent, VoiceCallRecordingChangeEvent, VoiceCallShare, Vote, WebLink, WorkAccessShare, WorkBadgeDefinitionHistory, WorkBadgeDefinitionShare, WorkOrderChangeEvent, WorkOrderHistory, WorkOrderLineItemChangeEvent, WorkOrderLineItemHistory, WorkOrderLineItemStatus, WorkOrderShare, WorkOrderStatus, WorkPlanChangeEvent, WorkPlanHistory, WorkPlanShare, WorkPlanTemplateChangeEvent, WorkPlanTemplateEntryChangeEvent, WorkPlanTemplateEntryHistory, WorkPlanTemplateHistory, WorkPlanTemplateShare, WorkStepChangeEvent, WorkStepHistory, WorkStepStatus, WorkStepTemplateChangeEvent, WorkStepTemplateHistory, WorkStepTemplateShare, WorkThanksShare
```

### Configuring a Salesforce Marketing Cloud Account Engagement source
<a name="zero-etl-config-source-smcae"></a>

To create a connection for a Salesforce Marketing Cloud Account Engagement source, see [Connecting to Salesforce Marketing Cloud Account Engagement](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-salesforce-marketing-cloud-account-engagement.html).

Using your zero-ETL integration, you can perform DDL operations for the following supported entities:


| Entity label | Entity name | 
| --- | --- | 
| Campaign | campaign | 
| List | list | 
| Dynamic Content | dynamic-content | 
| List Membership | list-membership | 
| Prospect | prospect | 
| User | user | 
| EmailTemplate | email-template | 
| EngagementStudioProgram | engagement-studio-program | 
| Landing Page | landing-page | 
| List Email | list-email | 

### Configuring an SAP OData source
<a name="zero-etl-config-source-sap"></a>

To create a connection for an SAP OData source, see [Connecting to SAP OData](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-sap-odata.html).

Zero-ETL integrations with an SAP OData source support entities starting with `EntityOf`. The ability to override the primary key is currently supported only for SAP OData `EntityOf` objects. Once this property has been set, it cannot be modified.

#### Support for special SAP entities
<a name="zero-etl-config-source-sap-entityof"></a>

AWS Glue zero-ETL supports SAP OData entities that use SAP's Operational Data Provisioning (ODP) framework as well as those that do not (non-ODP entities). The list of supported entities includes ODP_SAP (Business Warehouse or BW extractors), ODP_CDS (Core Data Services or CDS Views), and non-ODP based OData services for SAP APIs. AWS Glue zero-ETL supports full snapshots and incremental change data capture for both ODP and non-ODP SAP entities. For ODP entities, incremental changes are captured using delta links. For non-ODP entities, if you select a queryable field that can be used for timestamp-based ingestion, zero-ETL uses that field for incremental ingestion.

When ingesting data from SAP entities using AWS Glue zero-ETL, note the following:
+ Zero-ETL can only ingest SAP entities that have been configured for the `GET_ENTITYSET` method in SAP.
+ For non-ODP SAP entities, if a timestamp field is not selected for incremental updates, AWS Glue zero-ETL supports full data extraction and replication with upserts only (no deletions).
+ For ODP extractor entities, AWS Glue determines the valid primary key sets during data processing. Other SAP entities, specifically those that start with `EntityOf`, require an extra step of providing the valid primary key set as input. When an `EntityOf` entity is selected, you are prompted to provide the set of primary keys.

![The screenshot shows configuring the EntityOf primary key set for an SAP OData source.](http://docs.aws.amazon.com/glue/latest/dg/images/zero-etl-settings-configure-entityof-primary-key-set.png)


### Configuring a ServiceNow source
<a name="zero-etl-config-source-servicenow"></a>

To create a connection for a ServiceNow source, see [Connecting to ServiceNow](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-servicenow.html).

#### Unsupported entities and fields for ServiceNow
<a name="zero-etl-config-source-servicenow-unsupported"></a>

The following ServiceNow entities or fields are unsupported for use in a zero-ETL integration with a ServiceNow source.

```
ais_acl_overrides, ais_async_genius_result, ais_async_request, ais_connection, ais_genius_result_configuration_parameters, ais_partition_health, ais_partition_health_response, ais_publish_history, ais_relevancy_training_execution, ais_relevancy_training_staging, ais_search_profile_relevancy_model, catalog_draft_entities, clone_log, clone_log0000, clone_log0001, clone_log0002, clone_log0003, clone_log0004, clone_log0005, clone_log0006, clone_log0007, cmdb_ie_context, cmdb_ie_log, cmdb_ie_run, cmdb_ire_partial_payloads_index, cmdb_qb_result_base$par1, cmdb$par1, discovery_log, discovery_log0000, discovery_log0001, discovery_log0002, discovery_log0003, discovery_log0004, discovery_log0005, discovery_log0006, discovery_log0007, ecc_agent_log, entitlement_data, entl_subscription_map, gs_entitlement_plugin_mapping, ih_transaction_exclusion, import_log, import_log0000, import_log0001, import_log0002, import_log0003, import_log0004, import_log0005, import_log0006, import_log0007, jrobin_archive, jrobin_database, jrobin_datasource, jrobin_definition, jrobin_graph, jrobin_graph_line, jrobin_graph_set, jrobin_graph_set_member, jrobin_shard, jrobin_shard_location, license_role_discovery_run, logger_configuration_validation, m2m_analytics_event_logger, m2m_user_consent_info, ml_artifact_object_store, np$sys_gen_ai_filter_sample, np$sys_ui_element, np$sys_ui_list_element, one_api_service_plan_feature_invocation, one_api_service_plan_invocation, open_nlu_predict_log, open_nlu_predict_log0000, open_nlu_predict_log0001, open_nlu_predict_log0002, open_nlu_predict_log0003, open_nlu_predict_log0004, open_nlu_predict_log0005, open_nlu_predict_log0006, open_nlu_predict_log0007, pa_diagnostic_log, pa_diagnostic_log0000, pa_diagnostic_log0001, pa_diagnostic_log0002, pa_diagnostic_log0003, pa_diagnostic_log0004, pa_diagnostic_log0005, pa_diagnostic_log0006, pa_diagnostic_log0007, pa_favorites, pa_job_log_rows, pa_job_log_rows0000, pa_job_log_rows0001, pa_job_log_rows0002, 
pa_job_log_rows0003, pa_job_log_rows0004, pa_job_log_rows0005, pa_job_log_rows0006, pa_job_log_rows0007, pa_migration_ignored_scores, pa_scores_l1, pa_scores_l2, pa_scores_migration_groups, par_dashboard_conversion_backup, promin_log, promin_log0000, promin_log0001, promin_log0002, promin_log0003, promin_log0004, promin_log0005, promin_log0006, promin_log0007, promin_request_object, proposed_change_verification_log, proposed_change_verification_log0000, proposed_change_verification_log0001, proposed_change_verification_log0002, proposed_change_verification_log0003, proposed_change_verification_log0004, proposed_change_verification_log0005, proposed_change_verification_log0006, proposed_change_verification_log0007, protected_table_log, protected_table_log0000, protected_table_log0001, protected_table_log0002, protected_table_log0003, protected_table_log0004, protected_table_log0005, protected_table_log0006, protected_table_log0007, pwd_history, qb_query_results, scan_log, scan_log0000, scan_log0001, scan_log0002, scan_log0003, scan_log0004, scan_log0005, scan_log0006, scan_log0007, schema_validator_error, sla_repair_log_entry, sla_repair_log_entry0000, sla_repair_log_entry0001, sla_repair_log_entry0002, sla_repair_log_entry0003, sla_repair_log_entry0004, sla_repair_log_entry0005, sla_repair_log_entry0006, sla_repair_log_entry0007, sla_repair_log_message, sla_repair_log_message0000, sla_repair_log_message0001, sla_repair_log_message0002, sla_repair_log_message0003, sla_repair_log_message0004, sla_repair_log_message0005, sla_repair_log_message0006, sla_repair_log_message0007, sn_bm_client_activity, sn_ci_analytics_st_actionable_notifs, sn_ci_analytics_st_conv_completion_by_cat, sn_ci_analytics_st_conv_dynamic_property, sn_ci_analytics_st_conversation, sn_ci_analytics_st_count_by_date, sn_ci_analytics_st_event_occurrence, sn_ci_analytics_st_event_property_value_trend, sn_ci_analytics_st_issue_auto_resolution, sn_ci_analytics_st_no_clicks, sn_ci_analytics_st_no_results, 
sn_ci_analytics_st_property_summary_by_event, sn_ci_analytics_st_session_count_per_locale, sn_ci_analytics_st_session_duration, sn_ci_analytics_st_spokes_usage, sn_ci_analytics_st_topic_execution_stats, sn_ci_analytics_st_topic_occurrence, sn_ci_analytics_st_trending_content, sn_ci_analytics_st_trending_queries, sn_ci_analytics_st_users, sn_cs_plugin_signatures, sn_cs_telemetry_log, sn_dfc_application, sn_dfc_product, sn_employee_position, sn_entitlement_st_subscription_application_family, sn_entitlement_st_subscription_application_users, sn_hr_sp_st_relevant_for_you, sn_instance_clone_log, sn_instance_clone_log0000, sn_instance_clone_log0001, sn_instance_clone_log0002, sn_instance_clone_log0003, sn_instance_clone_log0004, sn_instance_clone_log0005, sn_instance_clone_log0006, sn_instance_clone_log0007, sn_km_mr_st_kb_knowledge, sn_me_st_topic, sn_rf_conditional_definition, sn_rf_evaluation_type, sn_rf_evaluation_type_input, sn_rf_recommendation_action, sn_rf_recommendation_experience, sn_rf_recommendation_history, sn_rf_recommendation_rule, sn_rf_record_display_configuration, sn_rf_trend_definition, sn_sub_man_st_account_level_entitlement, sn_sub_man_st_gen_ai_metadata, sn_sub_man_st_instance_used_assist_count, sn_sub_man_st_now_assist_creator_instances, sn_sub_man_st_now_assists_aggregate, sn_sub_man_st_subscribed_groups, sn_sub_man_st_subscription_insights, sn_sub_man_st_subscription_license_detail_metric, sn_sub_man_st_unallocated_group_recommendation, sn_sub_man_st_unconfirmed_user_group, sn_wn_user_app_activity, sn_wn_user_content_activity, snc_monitorable_item, snpar_sched_export_v_scheduled_export_visualization, spotlight, spotlight_audit, spotlight_copy_log_row, spotlight_copy_log_row0000, spotlight_copy_log_row0001, spotlight_copy_log_row0002, spotlight_copy_log_row0003, spotlight_copy_log_row0004, spotlight_copy_log_row0005, spotlight_copy_log_row0006, spotlight_copy_log_row0007, spotlight_job_log_row, spotlight_job_log_row0000, spotlight_job_log_row0001, 
spotlight_job_log_row0002, spotlight_job_log_row0003, spotlight_job_log_row0004, spotlight_job_log_row0005, spotlight_job_log_row0006, spotlight_job_log_row0007, st_dfc_performance_metric, st_license_detail_metric, st_on_call_hour, st_sc_wizard_question, st_sys_catalog_items_and_variable_sets, st_sys_design_system_icon, subscription_instance_stats, svc_container_config, svc_environment_config, svc_layer_config, svc_model_assoc_ci, svc_model_checkpoint_attr, svc_model_obj_cluster, svc_model_obj_constraint, svc_model_obj_deployable, svc_model_obj_element, svc_model_obj_impact, svc_model_obj_impactrule, svc_model_obj_package, svc_model_obj_path, svc_model_obj_relation, svc_model_obj_service, sys_administrative_script_transaction, sys_amb_message, sys_amb_processor, sys_analytics_batch_state, sys_analytics_config, sys_analytics_data_points_error, sys_analytics_event, sys_analytics_logger, sys_analytics_logger_field, sys_app_payload_loader_rule, sys_app_payload_unloader_rule, sys_app_scan_payload, sys_app_scan_variable, sys_app_scan_variable_type, sys_archive_destroy_log, sys_archive_destroy_run, sys_archive_log, sys_archive_run, sys_atf_transaction_log, sys_attachment_doc, sys_attachment_doc_v2, sys_attachment_soft_deleted, sys_audit, sys_audit_relation, sys_auth_policy_api_allowed, sys_aw_registered_scripting_modal, sys_cache_flush, sys_data_egress_source, sys_dm_delete_count, sys_export_set_log, sys_export_set_log0000, sys_export_set_log0001, sys_export_set_log0002, sys_export_set_log0003, sys_export_set_log0004, sys_export_set_log0005, sys_export_set_log0006, sys_export_set_log0007, sys_flow_compiled_flow, sys_flow_compiled_flow_chunk, sys_flow_context_chunk, sys_flow_context_chunk_archive, sys_flow_context_inputs_chunk, sys_flow_execution_history, sys_flow_log, sys_flow_log0000, sys_flow_log0001, sys_flow_log0002, sys_flow_log0003, sys_flow_plan_context_binding, sys_flow_report_doc, sys_flow_report_doc_chunk, sys_flow_report_doc_chunk_archive, 
sys_flow_runtime_state_chunk, sys_flow_runtime_value_chunk, sys_flow_subflow_plan_chunk, sys_flow_trigger_plan_chunk, sys_flow_val_listener, sys_flow_value, sys_flow_value_chunk, sys_gen_ai_config_example, sys_gen_ai_feature_mapping, sys_gen_ai_strategy_mapping, sys_gen_ai_usage_log, sys_generative_ai_capability_definition, sys_generative_ai_log, sys_generative_ai_response_validator, sys_generative_ai_validator, sys_geo_routing, sys_geo_routing_config, sys_hop_token, sys_hub_action_plan_chunk, sys_hub_snapshot_chunk, sys_journal_field, sys_journal_field_edit, sys_json_chunk, sys_kaa_policy, sys_kaa_subidentity_assertion, sys_kaa_user_policy_mapping, sys_mapplication, sys_mass_encryption_job, sys_notification_execution_log, sys_notification_execution_log0000, sys_notification_execution_log0001, sys_notification_execution_log0002, sys_notification_execution_log0003, sys_notification_execution_log0004, sys_notification_execution_log0005, sys_notification_execution_log0006, sys_notification_execution_log0007, sys_nowmq_message, sys_nowmq_provider_param_definition, sys_orchestrator_action, sys_pd_asset_configuration, sys_pd_context_chunk, sys_pd_context_log, sys_pd_snapshot_chunk, sys_pd_trigger_license, sys_processing_framework_job, sys_query_index_hint, sys_query_rewrite, sys_query_string_log, sys_replication_queue, sys_replication_queue0, sys_replication_queue1, sys_replication_queue2, sys_replication_queue3, sys_replication_queue4, sys_replication_queue5, sys_replication_queue6, sys_replication_queue7, sys_request_performance, sys_rollback_conflict, sys_rollback_incremental, sys_rollback_log, sys_rollback_log0000, sys_rollback_log0001, sys_rollback_log0002, sys_rollback_log0003, sys_rollback_log0004, sys_rollback_log0005, sys_rollback_log0006, sys_rollback_log0007, sys_rollback_run, sys_rollback_schema_change, sys_rollback_schema_conflict, sys_rollback_sequence, sys_scheduler_assignment, sys_scheduler_memory_pressure_job_log, sys_script_adapter_rule, 
sys_script_batch_adapter_rule, sys_search_source_filter, sys_service_authentication, sys_signing_job, sys_suggestion_reader, sys_sync_history_review, sys_trend, sys_unreferenced_preview, sys_unreferenced_record_rule, sys_upgrade_manifest, sys_upgrade_state, sys_ux_asset_cache_buster, sys_ux_lib_component_prop, sys_ux_lib_presource, sys_ux_page_action, sys_ux_page_action_binding, sysevent_queue_runtime, syslog, syslog_app_scope0000, syslog_app_scope0001, syslog_app_scope0002, syslog_app_scope0003, syslog_app_scope0004, syslog_app_scope0005, syslog_app_scope0006, syslog_app_scope0007, syslog_email0000, syslog_email0001, syslog_email0002, syslog_email0003, syslog_email0004, syslog_email0005, syslog_email0006, syslog_email0007, syslog_transaction, syslog_transaction0000, syslog_transaction0001, syslog_transaction0002, syslog_transaction0003, syslog_transaction0004, syslog_transaction0005, syslog_transaction0006, syslog_transaction0007, syslog0000, syslog0001, syslog0002, syslog0003, syslog0004, syslog0005, syslog0006, syslog0007, ts_attachment, ts_chain, ts_deleted_doc, ts_document, ts_field, ts_index_stats, ts_phrase, ts_search_stats, ts_v4_attachment, ts_word, ua_app_metadata, ua_audit_stats, ua_extra_page, ua_monitor_property, ua_monitor_property_audit, ua_shared_service, ua_sn_table_inventory, ua_sp_known_bot, ua_upload_log, spotlight_criteria, pa_snapshots, pa_widget_indicators, pa_widgets, sn_cim_register, global, gsw_change_log, gsw_content, gsw_content_group, gsw_content_information, gsw_status_of_content, multi_factor_browser_fingerprint, multi_factor_criteria, pa_filters, password_policy, plan_execution, plan_mysql, plan_oracle, plan_postgres, sc_rest_api_without_access_policy, sn_actsub_activity, sn_actsub_activity_fanout, sn_actsub_activity_stream, sn_actsub_activity_type, sn_actsub_atype_attributes, sn_actsub_atype_notif_pref, sn_actsub_module, sn_actsub_notif_object, sn_actsub_subobject_stream, sn_actsub_subscribable_object, 
sn_actsub_subscription_notif_pref, sn_actsub_user_stream, sn_appclient_store_outbound_http_quota, sn_appcreator_app_template, sn_critical_update, sn_docker_spoke_images, sn_employee_app, sn_employee_app_access, sn_employee_app_access_criteria, sn_entitlement_genai_assist_counts, sn_entitlement_genai_creator_user_counts, sn_entitlement_genai_creator_users, sn_mif_instance, sn_mif_sync_data, sn_mif_sync_status, sn_mif_table_registration, sn_mif_trust_config, sn_vsc_best_practice_configurations, sn_vsc_best_practice_goals, sn_vsc_changed_hardening_settings, sn_vsc_changed_scan_findings, sn_vsc_check_security_area, sn_vsc_elevation_event, sn_vsc_event, sn_vsc_export_event, sn_vsc_export_setting, sn_vsc_harc_compliance_status_lookup, sn_vsc_hardening_compliance_scores, sn_vsc_impersonation_event, sn_vsc_instance_hardening_settings, sn_vsc_login_event, sn_vsc_scan_comparisons, sn_vsc_scan_summary, sn_vsc_security_check_categories, sn_vsc_security_check_configurations, sn_vsc_security_configuration_groups, sn_vsc_security_privacy_capabilities, sn_vsc_updated_settings, sn_vsc_user_comparisons, sys_app_hash_inventory, sys_coalesce_strategy_deferred, sys_flow_secure_data, sys_formula_function, sys_geocoding_request, sys_global_file_hash, sys_import_set_row, sys_index, sys_index_explain, sys_installation_schedule, sys_installation_schedule_item, sys_offline_app, sys_package, sys_package_dependency_item, sys_package_dependency_m2m, sys_plugins, sys_querystat, sys_reap_package, sys_scoped_plugin, sys_stage_storage_alias, sys_storage_alias, sys_storage_table_alias, sys_store_app, sys_table_partition, sys_upgrade_history_log, sys_user_public_credential, sys_webauthn_authentication_request, sys_webauthn_registration_request, syslog_app_scope, syslog_email, syslog_page_timing, ua_instance_state_config, v_expression_cache, v_private_cache, v_shared_cache, sys_amb_message0002, sys_amb_message0004, sys_amb_message0005, sys_metadata, v_par_unified_report_viz
```

### Configuring a Zendesk source
<a name="zero-etl-config-source-zendesk"></a>

To create a connection for a Zendesk source, see [Connecting to Zendesk](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-zendesk.html).

Using your zero-ETL integration, you can replicate the following operations for supported entities:


| Entity label | Entity name | Create supported | Update supported | Delete supported | 
| --- | --- | --- | --- | --- | 
| Tickets | tickets | Y | Y | Y | 
| User | users | Y | Y | Y | 
| Satisfaction Rating | satisfaction-rating | Y | Y | N | 
| Articles | articles | Y | Y | N | 
| Organization | organizations | Y | Y | Y | 
| Calls | calls | Y | Y | N | 
| Call Legs | legs | Y | Y | N | 

### Configuring a Zoho CRM source
<a name="zero-etl-config-source-zoho"></a>

To create a connection for a Zoho CRM source, see [Connecting to Zoho CRM](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-zoho-crm.html).

Using your zero-ETL integration, you can perform the following DML and DDL operations for supported entities:


| Entity label | Entity name | DML-Insert | DML-Modify | DML-Delete | DDL-Insert | DDL-Modify | DDL-Delete | 
| --- | --- | --- | --- | --- | --- | --- | --- | 
| Leads | lead | Y | Y | Y | Y | Y | Y | 
| Accounts | account | Y | Y | Y | Y | Y | Y | 
| Contacts | contact | Y | Y | Y | Y | Y | Y | 
| Campaigns | campaign | Y | Y | Y | Y | Y | Y | 
| Tasks | task | Y | Y | Y | Y | Y | Y | 
| Events | event | Y | Y | Y | Y | Y | Y | 
| Calls | call | Y | Y | Y | Y | Y | Y | 
| Solutions | solution | Y | Y | Y | Y | Y | Y | 
| Products | product | Y | Y | Y | Y | Y | Y | 
| Vendors | vendor | Y | Y | Y | Y | Y | Y | 
| Quotes | quote | Y | Y | Y | Y | Y | Y | 
| Sales Orders | sales-order | Y | Y | Y | Y | Y | Y | 
| Purchase Orders | purchase-order | Y | Y | Y | Y | Y | Y | 
| Invoices | invoice | Y | Y | Y | Y | Y | Y | 
| Cases | case | Y | Y | Y | Y | Y | Y | 
| Price Books | price-book | Y | Y | Y | Y | Y | Y | 

### Configuring a Facebook Ads source
<a name="zero-etl-config-source-facebook"></a>

To create a connection for a Facebook Ads source, see [Connecting to Facebook Ads](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-facebook-ads.html).

Using your zero-ETL integration, you can replicate the following operations for supported entities:


| Entity label | Entity name | Create supported | Update supported | Delete supported | 
| --- | --- | --- | --- | --- | 
| Adset | \$1/adsets | Y | Y | Y | 
| Campaign | \$1/campaigns | Y | Y | Y | 
| Ads | \$1/ads | Y | Y | Y | 

### Configuring an Instagram Ads source
<a name="zero-etl-config-source-instagram"></a>

To create a connection for an Instagram Ads source, see [Connecting to Instagram Ads](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-instagram-ads.html).

Using your zero-ETL integration, you can replicate the following operations for supported entities:


| Entity name | Create supported | Update supported | Delete supported | 
| --- | --- | --- | --- | 
| \$1/adsets | Y | Y | Y | 
| \$1/campaigns | Y | Y | Y | 
| \$1/ads | Y | Y | Y | 

### Setting up the source role for integration
<a name="zero-etl-config-source-role"></a>

Create a source role to allow the zero-ETL integration to access your connection. This applies only to SaaS sources and is a prerequisite for creating integrations with them.

**Note**  
To restrict access to only a few connections, you can first create the connection to obtain the connection ARN. See [Configuring a source for a zero-ETL integration](#zero-etl-sources).

Create a role that grants the integration permission to access the connection:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GlueConnections",
            "Effect": "Allow",
            "Action": [
                "glue:GetConnections",
                "glue:GetConnection"
            ],
            "Resource": [
                "arn:aws:glue:*:111122223333:catalog",
                "arn:aws:glue:us-east-1:111122223333:connection/*"
            ]
        },
        {
            "Sid": "GlueActionBasedPermissions",
            "Effect": "Allow",
            "Action": [
                "glue:ListEntities",
                "glue:RefreshOAuth2Tokens"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "CloudWatchLogging",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
```

Trust policy:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "glue.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```
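With the two policy documents above saved locally, the role can be created and attached in two CLI calls. This is a sketch; the role name, policy name, and file names (`zetl-source-role`, `source-role-trust.json`, `source-role-permissions.json`) are illustrative, not fixed values:

```shell
# Create the source role with the trust policy above (file names are hypothetical)
aws iam create-role \
  --role-name zetl-source-role \
  --assume-role-policy-document file://source-role-trust.json

# Attach the permissions policy above as an inline policy
aws iam put-role-policy \
  --role-name zetl-source-role \
  --policy-name zetl-source-permissions \
  --policy-document file://source-role-permissions.json
```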

Associate the role with the connection using the AWS Glue CLI or API:

```
aws glue create-integration-resource-property \
--resource-arn arn:aws:glue:us-east-1:123456789012:connection/connectionName \
--source-processing-properties "{\"RoleArn\" : \"arn:aws:iam::123456789012:role/rolename\"}" \
--region us-east-1
```

## Setting up Oracle Database@AWS as a source
<a name="zero-etl-config-source-oracle"></a>

Currently, AWS Glue supports Oracle Database@AWS as a source and Amazon Redshift as a target. For setup details, see [Setting up zero-ETL for Oracle Database at AWS](https://docs.aws.amazon.com/odb/latest/UserGuide/setting-up-zero-etl.html).

# Configuring a target for a zero-ETL integration
<a name="zero-etl-target"></a>

AWS Glue offers several options when configuring a target for a zero-ETL integration. The target can be an encrypted Amazon Redshift data warehouse or the lakehouse architecture of Amazon SageMaker.

Before selecting the target for the zero-ETL integration, you need to configure one of the following target resources. The configuration options for a target in a zero-ETL integration include:
+ A general purpose Amazon S3 bucket using the lakehouse architecture of Amazon SageMaker. See [Configuring a general purpose S3 bucket target](#zero-etl-config-target-regular-s3).
+ An Amazon S3 Tables bucket using the lakehouse architecture of Amazon SageMaker. See [Configuring an Amazon S3 Tables bucket target](#zero-etl-config-target-s3-tables).
+ An Amazon Redshift Managed Storage using the lakehouse architecture of Amazon SageMaker. See [Configuring an Amazon Redshift Managed Storage target](#zero-etl-config-target-redshift-managed-storage).
+ An Amazon Redshift data warehouse identified by a Redshift namespace. See [Configuring an Amazon Redshift data warehouse target](#zero-etl-config-target-redshift-data-warehouse).

**Note**  
You cannot modify the target of a zero-ETL integration after creation.

## Configuring a general purpose S3 bucket target
<a name="zero-etl-config-target-regular-s3"></a>

This section describes the prerequisites and setup steps for configuring a general purpose S3 bucket as storage for your target in a zero-ETL integration, using the Lakehouse architecture of Amazon SageMaker.

Before creating a zero-ETL integration with the Lakehouse architecture of Amazon SageMaker using general purpose S3 storage, you need to complete the following setup tasks:
+ Set up an AWS Glue database
+ Provide Catalog RBAC policy
+ Create target IAM role
+ Associate target role, KMS (optional) and Connection (optional) with target resource
+ (Optional) Configure target table properties

### Setting up an AWS Glue database
<a name="zero-etl-config-target-s3-glue-database"></a>

To set up a target database in the Data Catalog with an Amazon S3 general purpose bucket location:

1. In the AWS Glue console home page, select **Database** under Data Catalog.

1. Choose **Add database** in the top right corner. If you have already created a database, make sure that the location with Amazon S3 URI is set for the database.

1. Enter a name and a **Location** (Amazon S3 URI). The location is required for the zero-ETL integration. Choose **Create database** when done.

**Note**  
The general purpose Amazon S3 bucket must be in the same region as the AWS Glue database.

For information on creating a new database in AWS Glue, see [Getting started with the Data Catalog](https://docs.aws.amazon.com/glue/latest/dg/start-data-catalog.html).

You can also use the [create-database](https://docs.aws.amazon.com/cli/latest/reference/glue/create-database.html) AWS CLI command to create the database in AWS Glue. Note that `LocationUri` in `--database-input` is required.
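As a sketch, the database can be created in one call; the database name and S3 URI below are illustrative:

```shell
# LocationUri is required for zero-ETL integrations (names are illustrative)
aws glue create-database \
  --database-input '{"Name": "zetl-target-db", "LocationUri": "s3://amzn-s3-demo-bucket/prefix/"}' \
  --region us-east-1
```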

#### Optimizing Iceberg tables
<a name="zero-etl-config-target-s3-iceberg-optimization"></a>

Once a table is created by AWS Glue in the target database, you can enable compaction to speed up queries in Amazon Athena. For information on setting up the resources (IAM role) for compaction, see [Table optimization prerequisites](https://docs.aws.amazon.com/glue/latest/dg/optimization-prerequisites.html).

For more information on setting up compaction on the AWS Glue table created by the integration, see [Optimizing Iceberg tables](https://docs.aws.amazon.com/glue/latest/dg/table-optimizers.html).

### Providing a catalog resource-based access (RBAC) policy
<a name="zero-etl-config-target-s3-rbac-policy"></a>

For integrations that use an AWS Glue database, add the following permissions to the catalog RBAC Policy to allow for integrations between source and target.

**Note**  
For cross-account integrations, both the role policy of the user creating the integration and the catalog resource policy must allow `glue:CreateInboundIntegration` on the resource. For same-account integrations, either a resource policy or a role policy allowing `glue:CreateInboundIntegration` on the resource is sufficient. Both scenarios still require allowing `glue.amazonaws.com` to perform `glue:AuthorizeInboundIntegration`.

You can access the **Catalog settings** under **Data Catalog**. Then provide the following permissions and fill in the missing information.

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Principal": {
        "AWS": [
            "arn:aws:iam::123456789012:user/Alice"
        ]
      },
      "Effect": "Allow",
      "Action": [
        "glue:CreateInboundIntegration"
      ],
      "Resource": [
          "arn:aws:glue:us-east-1:111122223333:catalog",
          "arn:aws:glue:us-east-1:111122223333:database/database-name"
      ],
      "Condition": {
        "StringLike": {
        "aws:SourceArn": "arn:aws:dynamodb:us-east-1:444455556666:table/table-name"
        }
      }
    },
    {
      "Principal": {
        "Service": [
          "glue.amazonaws.com"
        ]
      },
      "Effect": "Allow",
      "Action": [
        "glue:AuthorizeInboundIntegration"
      ],
      "Resource": [
          "arn:aws:glue:us-east-1:111122223333:catalog",
          "arn:aws:glue:us-east-1:111122223333:database/database-name"
      ],
      "Condition": {
        "StringEquals": {
        "aws:SourceArn": "arn:aws:dynamodb:us-east-1:444455556666:table/table-name"
        }
      }
    }
  ]
}
```

### Creating a target IAM role
<a name="zero-etl-config-target-s3-iam-role"></a>

Create a target IAM role with the following permissions and trust relationships:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket",
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/prefix/*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "glue:GetDatabase"
            ],
            "Resource": [
                "arn:aws:glue:us-east-1:111122223333:catalog",
                "arn:aws:glue:us-east-1:111122223333:database/database-name"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "glue:CreateTable",
                "glue:GetTable",
                "glue:GetTables",
                "glue:DeleteTable",
                "glue:UpdateTable",
                "glue:GetTableVersion",
                "glue:GetTableVersions",
                "glue:GetResourcePolicy"
            ],
            "Resource": [
                "arn:aws:glue:us-east-1:111122223333:catalog",
                "arn:aws:glue:us-east-1:111122223333:database/database-name",
                "arn:aws:glue:us-east-1:111122223333:table/database-name/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "cloudwatch:PutMetricData"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "cloudwatch:namespace": "AWS/Glue/ZeroETL"
                }
            },
            "Effect": "Allow"
        },
        {
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}
```

Add the following trust policy to allow the AWS Glue service to assume the role:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "glue.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

### Associate target role, KMS (optional) and Connection (optional) with target resource
<a name="zero-etl-config-target-s3-associate-role"></a>

Associate the above target role with the target resource, that is, the AWS Glue database. Optionally, you can also configure a KMS key for encrypting the data before it is stored in the target Iceberg table, and a connection ARN for accessing the S3 bucket. This allows AWS Glue to access data in the target S3 location using the provided role and, optionally, to encrypt it with the provided KMS key. If the target S3 bucket is accessible only from a certain VPC, associate the connection ARN to allow AWS Glue to run the processing inside that VPC. For more information on setting up a VPC, see [Create a VPC](https://docs.aws.amazon.com/vpc/latest/userguide/create-vpc.html).

![\[The screenshot shows configuring a target in a zero-ETL integration.\]](http://docs.aws.amazon.com/glue/latest/dg/images/zero-etl-target-selection.png)


Or using the AWS Glue CLI / API:

```
aws glue create-integration-resource-property \
--resource-arn arn:aws:glue:us-east-1:123456789012:database/database-name \
--target-processing-properties '{"RoleArn": "arn:aws:iam::123456789012:role/gmi_target_role"}' \
--region us-east-1
```
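To verify the association, you can read the resource property back. This sketch assumes the `get-integration-resource-property` command is available in your AWS CLI version:

```shell
# Returns the processing properties (including the associated role ARN)
# currently configured on the target resource
aws glue get-integration-resource-property \
  --resource-arn arn:aws:glue:us-east-1:123456789012:database/database-name \
  --region us-east-1
```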

### (Optional) Configure target table properties
<a name="zero-etl-config-target-s3-table-properties"></a>

Optionally, you can configure table properties for the tables that the integration syncs to the target.

You can configure these settings in the **Output settings** section of the integration creation workflow in the AWS Glue console:

![\[The screenshot shows the Output settings section with schema unnesting options, data partitioning options, and target table name configuration.\]](http://docs.aws.amazon.com/glue/latest/dg/images/zero-etl-output-settings-unnesting.png)


When you select **Specify custom partition keys**, you can configure partition keys and their function and conversion specs:

![\[The screenshot shows the Output settings with custom partition keys configuration and Partition Spec Configuration table.\]](http://docs.aws.amazon.com/glue/latest/dg/images/zero-etl-output-settings-partitioning.png)


If the source and target are in the same account, you can complete this configuration as part of the integration creation workflow in the AWS Glue console. If the target is in a different account, this configuration must be completed before creating the integration. When using the CLI or API, complete it before invoking the `CreateIntegration` API even when the source and target are in the same account; the AWS Glue console simply encapsulates this API call for the same-account scenario.

If this is not configured, default values are used when syncing the table. You can also change this configuration at any time after the integration is created.

**Note**  
If this property is updated after the integration is created, it can trigger a full table resync when the updated configuration conflicts with the existing one, for example, changing the table unnesting from 'No-Unnest' to 'Full-Unnest', or changing the partition column.

Using CLI or API:

```
aws glue create-integration-table-properties \
--resource-arn arn:aws:glue:us-east-1:123456789012:database/database-name \
--table-name table-name \
--target-table-config '{
        "UnnestSpec": "TOPLEVEL"|"FULL"|"NOUNNEST",
        "PartitionSpec": [
            {
                "FieldName": "string",
                "FunctionSpec": "string",
                "ConversionSpec": "string"
            },
            ...
        ],
        "TargetTableName": "string"
    }' \
--region us-east-1
```
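As a filled-in sketch, the following configures top-level unnesting and a daily partition on a timestamp column. The table name, field name, and the `day` function value are illustrative assumptions, not values confirmed by this guide:

```shell
aws glue create-integration-table-properties \
--resource-arn arn:aws:glue:us-east-1:123456789012:database/database-name \
--table-name orders \
--target-table-config '{
        "UnnestSpec": "TOPLEVEL",
        "PartitionSpec": [
            {
                "FieldName": "created_at",
                "FunctionSpec": "day"
            }
        ],
        "TargetTableName": "orders_lake"
    }' \
--region us-east-1
```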

After configuring the Lakehouse architecture of Amazon SageMaker with general purpose Amazon S3 bucket storage, you can proceed to [Configuring the integration with your target](#zero-etl-config-target-configuring-the-integration) to complete the integration setup.

## Configuring an Amazon S3 Tables bucket target
<a name="zero-etl-config-target-s3-tables"></a>

This section describes the prerequisites and setup steps for configuring Amazon S3 Tables as a target for your zero-ETL integration, using the lakehouse architecture of Amazon SageMaker.

Before creating a zero-ETL integration with Amazon S3 Tables as a target, you need to complete the following setup tasks:
+ Set up an Amazon S3 Tables bucket (and analytics services integration)
+ Provide Catalog RBAC policy
+ Create target IAM role
+ Associate target role, KMS (optional) and Connection (optional) with target resource
+ (Optional) Configure target table properties

### Setting up an Amazon S3 Tables bucket (with analytics services integration)
<a name="zero-etl-config-target-s3-tables-setup"></a>

1. Create an S3 table bucket in your account by following the instructions at [Getting started with Amazon S3 Tables](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-getting-started.html).

1. Enable analytics service integrations with your S3 Tables bucket by following the instructions at [Integrating AWS services with Amazon S3 Tables](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-aws.html). This creates a new S3 Tables catalog in AWS Lake Formation.
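The first step can also be done from the CLI; the bucket name below is illustrative:

```shell
# Create the S3 table bucket and confirm it exists
aws s3tables create-table-bucket --name zetl-table-bucket --region us-east-1
aws s3tables list-table-buckets --region us-east-1
```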

### Providing a catalog RBAC policy
<a name="zero-etl-config-target-s3-tables-rbac"></a>

The following permissions must be added to the Catalog RBAC Policy to allow for integrations between source and Amazon S3 tables catalog target.

The target AWS Glue Catalog resource policy needs to grant the AWS Glue service permission to perform `glue:AuthorizeInboundIntegration`. Additionally, the `glue:CreateInboundIntegration` permission is required either on the source principal creating the integration or in the target AWS Glue resource policy.

**Note**  
For the cross-account scenario, both the source principal and the target AWS Glue Catalog resource policy need to include `glue:CreateInboundIntegration` permissions on the resource.

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Principal": {
        "AWS": [
            "arn:aws:iam::123456789012:user/Alice"
        ]
      },
      "Effect": "Allow",
      "Action": [
        "glue:CreateInboundIntegration"
      ],
      "Resource": [
          "arn:aws:glue:us-east-1:111122223333:catalog/s3tablescatalog/*"
      ],
      "Condition": {
        "StringLike": {
        "aws:SourceArn": "arn:aws:dynamodb:us-east-1:444455556666:table/table-name"
        }
      }
    },
    {
      "Principal": {
        "Service": [
          "glue.amazonaws.com"
        ]
      },
      "Effect": "Allow",
      "Action": [
        "glue:AuthorizeInboundIntegration"
      ],
      "Resource": [
      "arn:aws:glue:us-east-1:111122223333:catalog/s3tablescatalog/*"
      ],
      "Condition": {
        "StringEquals": {
        "aws:SourceArn": "arn:aws:dynamodb:us-east-1:444455556666:table/table-name"
        }
      }
    }
  ]
}
```

**Note**  
Replace `s3tablescatalog` with the parent catalog name of your S3 tables, if different. When the S3 Tables catalog is hosted in the same account, the default value is `s3tablescatalog`.

### Creating a target IAM role
<a name="zero-etl-config-target-s3-tables-iam-role"></a>

Create a target IAM role with the following permissions and trust relationships:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3tables:ListTableBuckets",
        "s3tables:GetTableBucket",
        "s3tables:GetTableBucketEncryption",
        "s3tables:GetNamespace",
        "s3tables:CreateNamespace",
        "s3tables:ListNamespaces",
        "s3tables:CreateTable",
        "s3tables:DeleteTable",
        "s3tables:GetTable",
        "s3tables:GetTableEncryption",
        "s3tables:ListTables",
        "s3tables:GetTableMetadataLocation",
        "s3tables:UpdateTableMetadataLocation",
        "s3tables:GetTableData",
        "s3tables:PutTableData"
      ],
      "Resource": "arn:aws:s3tables:us-east-1:111122223333:bucket/s3-table-bucket",
      "Effect": "Allow"
    },
    {
      "Action": [
        "cloudwatch:PutMetricData"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "cloudwatch:namespace": "AWS/Glue/ZeroETL"
        }
      },
      "Effect": "Allow"
    },
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
```

Add the following trust policy to the target IAM role to allow the AWS Glue service to assume it:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "glue.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

**Note**  
Make sure there is no explicit DENY statement for this target IAM role in the S3-Tables bucket resource policy. An explicit DENY would override any ALLOW permissions and prevent the integration from working properly.

### Associate target role, KMS (optional) and Connection (optional) with target resource
<a name="zero-etl-config-target-s3-tables-associate-role"></a>

Associate the above target role with the target resource. Optionally, you can also configure a KMS key for encrypting the data before it is stored in the target Iceberg table, and a connection ARN for accessing the target S3 bucket. If the target S3 bucket is accessible only from a certain VPC, associate the connection ARN to allow AWS Glue to run the processing inside that VPC. For more information on setting up a VPC, see [Create a VPC](https://docs.aws.amazon.com/vpc/latest/userguide/create-vpc.html).

Using the AWS Glue CLI / API:

```
aws glue create-integration-resource-property \
--resource-arn arn:aws:glue:us-east-1:123456789012:catalog/s3tablescatalog/s3-table-bucket-name \
--target-processing-properties '{
                    "RoleArn": "arn:aws:iam::123456789012:role/target_role"
                }' \
--region us-east-1
```

### (Optional) Configure target table properties
<a name="zero-etl-config-target-s3-tables-table-properties"></a>

Optionally, you can configure table properties for the tables that the integration syncs to the target. The same rules apply as described in the general purpose S3 target section.

Using CLI or API:

```
aws glue create-integration-table-properties \
--resource-arn arn:aws:glue:us-east-1:123456789012:catalog/s3tablescatalog/s3-table-bucket-name \
--table-name table-name \
--target-table-config '{
        "UnnestSpec": "TOPLEVEL"|"FULL"|"NOUNNEST",
        "PartitionSpec": [
            {
                "FieldName": "string",
                "FunctionSpec": "string",
                "ConversionSpec": "string"
            },
            ...
        ],
        "TargetTableName": "string"
    }' \
--region us-east-1
```

After configuring Amazon S3 Tables storage using the lakehouse architecture of Amazon SageMaker, you can proceed to [Configuring the integration with your target](#zero-etl-config-target-configuring-the-integration) to complete the integration setup.

## Configuring an Amazon Redshift Managed Storage target
<a name="zero-etl-config-target-redshift-managed-storage"></a>

This section describes the prerequisites and setup steps for configuring an Amazon Redshift managed storage (RMS) as a target for your zero-ETL integration, using the lakehouse architecture of Amazon SageMaker.

Before creating a zero-ETL integration with a Lakehouse architecture of Amazon SageMaker using Redshift managed storage, you need to complete the following setup tasks:
+ Set up an Amazon Redshift cluster or Serverless workgroup
+ Register the Amazon Redshift integration with Lake Formation
+ Create a managed catalog in Lake Formation
+ Configure IAM permissions

### Setting up Amazon Redshift managed storage
<a name="zero-etl-config-target-rms-setup"></a>

To set up Amazon Redshift managed storage for your zero-ETL integration:
+ Create or use an existing Amazon Redshift cluster or Serverless workgroup. Make sure the target Amazon Redshift workgroup or cluster has the `enable_case_sensitive_identifier` parameter turned on for the integration to be successful. For more information on enabling case sensitivity, see [Turn on case sensitivity for your data warehouse](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-setting-up.case-sensitivity.html) in the Amazon Redshift management guide.
+ Register an integration from Redshift into the catalog in AWS Lake Formation. See [Registering Amazon Redshift clusters and namespaces to the Data Catalog](https://docs.aws.amazon.com/redshift/latest/dg/iceberg-integration-register.html).
+ Create a federated or managed catalog in AWS Lake Formation. For more information, see: 
  + [Bringing Amazon Redshift data into the Data Catalog](https://docs.aws.amazon.com/lake-formation/latest/dg/managing-namespaces-datacatalog.html)
  + [Creating an Amazon Redshift managed catalog in the Data Catalog](https://docs.aws.amazon.com/lake-formation/latest/dg/create-rms-catalog.html)
+ Configure IAM permissions for the target role. The role needs permissions to access both Redshift and Lake Formation resources. At minimum, the role should have: 
  + Permissions to access the Redshift cluster or workgroup
  + Permissions to access the Lake Formation catalog
  + Permissions to create and manage tables in the catalog
  + CloudWatch and CloudWatch Logs permissions for monitoring

After configuring the Amazon SageMaker Lakehouse catalog with Amazon Redshift managed storage, you can proceed to [Configuring the integration with your target](#zero-etl-config-target-configuring-the-integration) to complete the integration setup.

## Configuring an Amazon Redshift data warehouse target
<a name="zero-etl-config-target-redshift-data-warehouse"></a>

This section describes the prerequisites and setup steps for configuring an Amazon Redshift data warehouse as a target for your zero-ETL integration.

Before creating a zero-ETL integration with an Amazon Redshift data warehouse target, you need to complete the following setup tasks:
+ Set up an Amazon Redshift cluster or Serverless workgroup
+ Configure case sensitivity
+ Configure IAM permissions

### Setting up the Amazon Redshift data warehouse
<a name="zero-etl-config-target-redshift-setup"></a>

To set up an Amazon Redshift data warehouse for your zero-ETL integration:

1. Navigate to the [Amazon Redshift console](https://console.aws.amazon.com/redshiftv2/home) and choose **Create cluster**, or use an existing cluster. To create an Amazon Redshift cluster, see [Creating a cluster](https://docs.aws.amazon.com/redshift/latest/mgmt/create-cluster.html). For Amazon Redshift Serverless, choose **Create workgroup**. To create an Amazon Redshift Serverless workgroup, see [Creating a workgroup with a namespace](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-console-workgroups-create-workgroup-wizard.html).

1. If creating a new cluster, choose an appropriate cluster size and ensure your cluster is encrypted. For Serverless, configure the workgroup settings according to your requirements.

1. Make sure the target Amazon Redshift workgroup or cluster has the `enable_case_sensitive_identifier` parameter turned on for the integration to be successful. For more information on enabling case sensitivity, see [Turn on case sensitivity for your data warehouse](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-setting-up.case-sensitivity.html) in the Amazon Redshift management guide.

1. Configure IAM permissions to allow the zero-ETL integration to access your Amazon Redshift data warehouse. You'll need to create an IAM role with the following permissions: 
   + Permissions to access the Amazon Redshift cluster or workgroup
   + Permissions to create and manage databases and tables in Amazon Redshift
   + CloudWatch and CloudWatch Logs permissions for monitoring

1. After the Amazon Redshift workgroup or cluster setup is complete, you need to configure your data warehouse for zero-ETL integrations. See [Getting started with zero-ETL integrations](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.setting-up.html) in the Amazon Redshift Management Guide for more information.
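For Amazon Redshift Serverless, the setup steps above can be sketched with the CLI; the namespace and workgroup names are illustrative:

```shell
# Create a namespace, then a workgroup with case sensitivity enabled
aws redshift-serverless create-namespace --namespace-name zetl-namespace

aws redshift-serverless create-workgroup \
  --workgroup-name zetl-workgroup \
  --namespace-name zetl-namespace \
  --config-parameters parameterKey=enable_case_sensitive_identifier,parameterValue=true
```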

**Note**  
When using an Amazon Redshift data warehouse as a target, the integration creates a schema in the specified database to store the replicated data. The schema name is derived from the integration name.


After configuring the Amazon Redshift data warehouse, you can proceed to [Configuring the integration with your target](#zero-etl-config-target-configuring-the-integration) to complete the integration setup.

## Configuring the integration with your target
<a name="zero-etl-config-target-configuring-the-integration"></a>

After configuring the source and target resources, follow these steps to complete the integration setup:

1. Navigate to the **Zero-ETL integrations** page and start the integration creation workflow.

1. Select the source resource configured in the previous steps.

1. Select or specify the target resource (same account or cross account) configured in the previous steps.

1. Select the target IAM role configured previously.

1. Select the **Fix it for me** option (available only when the target is in the same account). 
   + For the regular Amazon S3 (AWS Glue Database) and S3-Table (Catalog) target, this will: 
     + Apply an authorized service principal on the target Catalog resource policy.
     + Apply an authorized AWS Glue source Principal ARN to the target Catalog resource policy.
   + For the Amazon Redshift target, this will: 
     + Apply an authorized service principal on the Amazon Redshift cluster or Serverless workgroup.
     + Apply an authorized AWS Glue source ARN to the Amazon Redshift cluster or Serverless workgroup.
     + Associate a new parameter group with `enable_case_sensitive_identifier = true`.

Use the following to create the integration via API or CLI: [CreateIntegration API](https://docs.aws.amazon.com/glue/latest/webapi/API_CreateIntegration.html).
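A minimal sketch of the CLI call, reusing the example source and target ARNs from earlier sections (the integration name is illustrative):

```shell
aws glue create-integration \
  --integration-name zetl-demo-integration \
  --source-arn arn:aws:dynamodb:us-east-1:444455556666:table/table-name \
  --target-arn arn:aws:glue:us-east-1:111122223333:database/database-name \
  --region us-east-1
```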

# Configuring an integration
<a name="zero-etl-configuring-integration"></a>

When setting up a zero-ETL integration, you can configure various parameters to control how data is synchronized between your source and target systems. The following settings are currently available for DynamoDB and SaaS sources.

## Configuring Refresh Interval
<a name="zero-etl-config-refresh-interval"></a>

You can configure the refresh interval for SaaS sources at the time of integration creation; the default value is 1 hour. The refresh interval controls how frequently CDC (change data capture) pulls or incremental loads occur, giving you the flexibility to align the refresh rate with your data update patterns, system load considerations, and performance optimization goals. The interval can be set from 15 minutes to 8640 minutes (six days). When the target is Amazon Redshift, the refresh interval cannot be modified after the integration is created; for other targets, it can be modified after creation. For DynamoDB sources with refresh intervals of 24 hours or more, see [Sequential daily batches for DynamoDB sources](#zero-etl-config-refresh-interval-ddb-batches) for details about sequential daily batch processing.

You can set this in the console by updating the refresh interval within **Replication settings**.

![\[The screenshot shows the refreshInterval parameter configuration in the zero-ETL integration settings.\]](http://docs.aws.amazon.com/glue/latest/dg/images/refreshinterval.png)


The refresh interval lets you balance data freshness against system resource utilization. Currently, the refresh interval is customizable for both DynamoDB and SaaS sources:
+ **Minimum interval:** 15 minutes
+ **Maximum interval:** 8640 minutes (6 days)
+ **Default value:** 15 minutes for DynamoDB source and 60 minutes for SaaS source

Factors to consider when choosing a refresh interval:
+ **Data volatility:** How frequently your source data changes
+ **Business requirements:** How current your analytics data needs to be
+ **Cost considerations:** More frequent updates may result in higher processing and storage costs

**Note**  
The `RefreshInterval` parameter defines how frequently CDC is triggered. The actual refresh frequency may be affected by the volume of changes in your source data and the processing capacity of the target system. Monitor your integration performance and adjust the refresh interval as needed to optimize for your specific use case.

You can also set the refresh interval through the API by passing `RefreshInterval` within [IntegrationConfig](https://docs.aws.amazon.com/glue/latest/webapi/API_IntegrationConfig.html) as part of the CreateIntegration request. To modify the refresh interval programmatically, use the [ModifyIntegration API](https://docs.aws.amazon.com/glue/latest/webapi/API_ModifyIntegration.html#API_ModifyIntegration_RequestSyntax) with the `IntegrationConfig` parameter.
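The bounds above can be captured in a small helper. This is a sketch, not an official client: it assumes the API accepts the interval as a string of minutes inside `IntegrationConfig`, per the linked reference.

```python
# Sketch: validate a refresh interval against the documented 15-8640 minute
# bounds and build the IntegrationConfig payload for CreateIntegration or
# ModifyInteg­ration. The string encoding of the value is an assumption based
# on the IntegrationConfig API shape.

MIN_MINUTES, MAX_MINUTES = 15, 8640

def integration_config(refresh_interval_minutes: int) -> dict:
    """Build an IntegrationConfig dict, rejecting out-of-range intervals."""
    if not (MIN_MINUTES <= refresh_interval_minutes <= MAX_MINUTES):
        raise ValueError(
            f"RefreshInterval must be between {MIN_MINUTES} and {MAX_MINUTES} minutes"
        )
    return {"RefreshInterval": str(refresh_interval_minutes)}

config = integration_config(60)  # 60 minutes: the default for SaaS sources
```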

### Sequential daily batches for DynamoDB sources
<a name="zero-etl-config-refresh-interval-ddb-batches"></a>

For zero-ETL integrations with an Amazon DynamoDB source, when you configure a refresh interval of 1440 minutes (24 hours) or greater, the integration uses sequential daily batch processing instead of a single export operation. This behavior is due to the [DynamoDB export window limitation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ServiceQuotas.html), which has a maximum export period of 24 hours.

When the refresh interval exceeds 24 hours, the integration operates as follows:

1. The CDC process waits for the full refresh interval duration (for example, 6 days for an 8640-minute interval).

1. After the refresh interval elapses, the integration performs multiple sequential DynamoDB exports, each covering up to a 24-hour window.

1. The CDC jobs process each batch sequentially to capture all changes that occurred during the refresh interval period.

For example, if you set a refresh interval of 8640 minutes (6 days), the integration waits 6 days and then executes 6 or 7 sequential exports (including a tail export that covers the extra time spent on the export operations themselves) and the corresponding CDC jobs to synchronize all changes from that period.
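The batching arithmetic above can be sketched directly: the number of base export windows is the refresh interval divided by the 24-hour export limit, rounded up, with a possible extra tail export on top.

```python
import math

# Sketch of how many sequential 24-hour DynamoDB export windows are needed to
# cover a given refresh interval, per the export window limit described above.
# An additional "tail" export may be performed to cover time spent on the
# export operations themselves, so the actual count can be one higher.

DDB_EXPORT_WINDOW_MINUTES = 1440  # DynamoDB's maximum export period: 24 hours

def base_export_count(refresh_interval_minutes: int) -> int:
    """Number of 24-hour windows needed to span the refresh interval."""
    return math.ceil(refresh_interval_minutes / DDB_EXPORT_WINDOW_MINUTES)

print(base_export_count(8640))  # 6 windows for a 6-day interval (plus a possible tail)
```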

## On-demand Snapshot
<a name="zero-etl-config-continuous-sync"></a>

Zero-ETL includes change data capture (CDC) by default, but if your use case calls for replicating the full dataset only once, you can use the On-demand Snapshot feature. This feature, currently supported only for SaaS sources, replicates data once without continuous synchronization. It provides one-time data replication with no ongoing updates and requires manual cleanup. Once replication is complete, we recommend deleting the integration resource to avoid reaching the account integration limit.

![\[The screenshot shows the On-demand Snapshot setting configuration.\]](http://docs.aws.amazon.com/glue/latest/dg/images/ContinuousSync.png)


You can also enable this through the API by setting the `ContinuousSync` parameter to `false` within [IntegrationConfig](https://docs.aws.amazon.com/glue/latest/webapi/API_IntegrationConfig.html) as part of the CreateIntegration request.

**Note**  
The On-demand Snapshot setting cannot be modified after the integration is created. Choose this option carefully based on your data synchronization requirements.

## Modifying Refresh interval
<a name="zero-etl-config-modify-refresh-interval"></a>

This feature is currently only available for AWS Glue targets and allows you to update the refresh interval for an existing integration.

# Creating and managing integrations
<a name="zero-etl-creating-managing"></a>



## Creating an integration
<a name="zero-etl-creating"></a>

This section describes the general steps to create an integration. This example uses Amazon DynamoDB as a source.

1. On the AWS Glue console home page, select **Zero-ETL integrations**.

1. You can view all your integrations on the Zero ETL integration home page. To create a new integration, select **Create zero-ETL integration**.   
![\[The screenshot shows the main zero-ETL integration page.\]](http://docs.aws.amazon.com/glue/latest/dg/images/zero-etl-main.png)

1. You are prompted to select a **Source Type**. Select your source and click **Next**. Refer to the source configuration sections for SaaS integration sources.

1. In the **Configure source and target** page, select the tables or entities to replicate. For Amazon DynamoDB, make sure that PITR and the RBAC policy are configured.

1. Specify your integration target: 
   + For an AWS Glue Data Catalog target, select the AWS Glue database you want to replicate the data to.
   + For an Amazon Redshift data warehouse target, select the Redshift cluster namespace or Redshift Serverless workgroup namespace.

   For more information, see [Configuring the integration with your target](zero-etl-target.md#zero-etl-config-target-configuring-the-integration).

1. Provide the **Target IAM Role** that you created in the prerequisites.

1. If you want to configure an optional **Target KMS Key** for your data being stored in the target, provide an enabled KMS Key. Likewise, if you want to configure a target network connection, select an AWS Glue connection.

1. The **Fix Target** button performs some of the steps in the Prerequisites section of this documentation. Specifically, it will 1) provide a Catalog RBAC policy and 2) generate an Amazon S3 URI for you if none is provided; otherwise, it uses the URI you provided.

1. In the **Output setting** section of the **Configure source and target** page, select the schema unnesting option that you want for your data in the target. If you want to use custom partition keys for your data, select **Specify custom partition keys** and provide up to 10 keys. Otherwise, the partition keys assigned to the DynamoDB table being replicated are used.

1. In the **Security and data encryption** section, you can provide a KMS key that will be used in the intermediary process of replicating your data to the target. Otherwise, an AWS managed KMS key will be used. Enter a name for the Zero ETL integration in **Integration details**.

1. Review and make sure that all the provided details are correct. Click **Create and launch integration** once everything has been confirmed.

1. In the Zero ETL home page, you can select the integration you created and the details for your integrations will appear. The "Status" indicates the state of your integration.

## Modifying an integration
<a name="zero-etl-modifying"></a>

You can modify an existing integration.

1. Select **Edit** in the top right corner of your integration details page.

1. On the **Edit source and target** page you can change the Target IAM role and Target network connection. The other fields are not editable after integration creation. Click **Next**.

1. You can also edit the name and description of the integration in the **Edit integration and configuration** page. Click **Next**.

1. Review your edits and once confirmed, click **Update integration**.

## Deleting an integration
<a name="zero-etl-deleting"></a>

Delete is a terminal state for an integration. Once deleted, the integration cannot be revived. Deleting an integration clears up all internal metadata and any intermediate stored data.

During this process any running tasks which are writing data to a target table are terminated. AWS Glue will not delete or cleanup the target AWS Glue database (in the Data Catalog) and the associated data in the Amazon S3 bucket in your account. You need to explicitly clean those up if required.

To delete an integration:

1. In the integration details page, click **Delete**.

1. Enter "Delete" and click **Delete**. Note: This is an irreversible action.

1. In the integration details page, the status shows "Deleting". Once the integration is actually deleted, it will no longer appear on the Zero ETL integration home page.

## Integration states
<a name="zero-etl-integration-states"></a>

An integration goes through various states from creation to deletion:
+ `CREATING` - The first state when integration creation is initiated. In this state, AWS Glue performs initialization. The integration should move quickly to the `ACTIVE` state unless some configurations are missing.
+ `ACTIVE` - Once the integration reaches this state, AWS Glue starts the data transfer (initial full load). Unless there are permission issues, periodic change data capture follows after the initial full load completes.
+ `MODIFYING` - When you modify an integration, it enters this state. Once the modification is applied, the integration returns to `ACTIVE` if the modification was successful, or moves to `NEEDS_ATTENTION` or `FAILED` if there were any issues.
+ `NEEDS_ATTENTION` - The integration moves into this state when there is either a user error or a system error. User errors include missing permissions, missing source or target resources, and unsupported data errors. System errors include internal system errors. For both error types, AWS Glue Zero-ETL keeps retrying the data sync for 7 days before marking the integration as `FAILED`. If you fix the issue before then, the integration becomes `ACTIVE` again and resumes transferring data.
+ `SYNCING` - The integration moves into this state when AWS Glue Zero-ETL detects data type changes in the incoming schema for columns within one or more tables. In that case, AWS Glue Zero-ETL requests a fresh set of snapshots for all affected tables. The integration remains in the `SYNCING` state until the newly requested snapshots are available for ingestion, then transitions back to `ACTIVE`.
+ `FAILED` - A non-recoverable state. Once the integration moves into this state, it cannot be recovered; the only way to restart the data transfer from source to target is to delete and re-create the integration. If a user or system error has not been fixed within 7 days and all retries are exhausted, AWS Glue Zero-ETL marks the integration as `FAILED`.
+ `DELETING` - When you invoke the DeleteIntegration API, AWS Glue first moves the integration into this state. After all metadata is cleared and internal processing is terminated, AWS Glue moves the integration into the `DELETED` state.
+ `DELETED` - The terminal state for an integration. An integration cannot move from this state into any other state. If data transfer from the same source to the same target is needed again, create a new integration.
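The lifecycle described above can be summarized as a transition table. This is a sketch inferred from the state descriptions in this section, not an AWS API; it is useful for reasoning about which states are recoverable.

```python
# A transition table for the zero-ETL integration lifecycle, inferred from
# the state descriptions above. The exact set of allowed transitions is an
# assumption; AWS Glue manages these states internally.

TRANSITIONS = {
    "CREATING":        {"ACTIVE", "FAILED"},
    "ACTIVE":          {"MODIFYING", "NEEDS_ATTENTION", "SYNCING", "DELETING"},
    "MODIFYING":       {"ACTIVE", "NEEDS_ATTENTION", "FAILED"},
    "NEEDS_ATTENTION": {"ACTIVE", "FAILED", "DELETING"},
    "SYNCING":         {"ACTIVE", "DELETING"},
    "FAILED":          {"DELETING"},  # only recourse is delete and re-create
    "DELETING":        {"DELETED"},
    "DELETED":         set(),         # terminal state
}

def is_terminal(state: str) -> bool:
    """A state is terminal when no further transitions are possible."""
    return not TRANSITIONS[state]
```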

# Monitoring an integration
<a name="zero-etl-monitoring"></a>



## Viewing Amazon CloudWatch logs for an integration
<a name="zero-etl-cloudwatch-logs"></a>

AWS Glue zero-ETL integrations generate CloudWatch logs for visibility into your data movement. Log events for each successful ingestion, as well as for failures caused by problematic data records at the source or by data write errors due to schema changes or insufficient permissions, are emitted to a default log group created in your account.

For each integration, log events are collected under `/aws-glue/zeroETL-integrations/logs/` in CloudWatch. Within the log group, messages are split into log streams; each integration has a dedicated log stream to which all of its logs are written.

**Note**  
For a cross-account scenario, source processing logs are emitted in the source account where the integration exists and target processing logs are emitted in the target account where the target database exists.

### IAM permissions required to enable logging
<a name="zero-etl-cloudwatch-logs-iam"></a>

When creating your integration, the following IAM permissions are needed by the source and target roles to enable CloudWatch logging for an integration. AWS Glue zero-ETL integrations use these permissions provided in the source and target roles to emit CloudWatch logs to customer accounts.

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
```

------

### Log messages
<a name="zero-etl-cloudwatch-logs-messages"></a>

Log format: zero-ETL integrations emit four types of log messages:

```
// Ingestion started
{
    "integrationArn": "arn:aws:glue:us-east-2:123456789012:integration/1a012bba-123a-1bba-ab1c-173de3b12345",
...
    "messageType": "IngestionStarted",
    "details": {
        "tableName": "testDDBTable",
        "message": "Ingestion Job started"
    }
}
// Data processing stats on successful table ingestion
{
...
    "messageType": "IngestionProcessingStats",
    "details": {
        "tableName": "testDDBTable",
        "insert_count": 100,
        "update_count": 10,
        "delete_count": 10
    }
}
// Ingestion failure logs for failed table-processing
{
...
    "messageType": "IngestionFailed",
    "details": {
        "tableName": "testDDBTable",
        "errorMessage": "Failed to ingest data with error: Target Glue database not found.",
        "error_code" : "client_error"
    }
}
// Ingestion completed notification with lastSyncedTimestamp
{
...
    "messageType": "IngestionCompleted",
    "details": {
        "tableName": "testDDBTable",
        "message": "Ingestion Job completed",
        "lastSyncedTimestamp": "1132344255745"
    }
}
```
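The log messages above are plain JSON, so they are easy to aggregate programmatically. The sketch below tallies ingestion results by dispatching on `messageType`; the message shapes follow the examples shown, not a formal schema.

```python
import json

# Sketch: summarize zero-ETL CloudWatch log messages by messageType, following
# the example message shapes above (IngestionProcessingStats, IngestionFailed).

def summarize(log_lines):
    """Tally insert/update/delete counts and failures from raw log lines."""
    totals = {"inserts": 0, "updates": 0, "deletes": 0, "failures": 0}
    for line in log_lines:
        event = json.loads(line)
        details = event.get("details", {})
        if event.get("messageType") == "IngestionProcessingStats":
            totals["inserts"] += details.get("insert_count", 0)
            totals["updates"] += details.get("update_count", 0)
            totals["deletes"] += details.get("delete_count", 0)
        elif event.get("messageType") == "IngestionFailed":
            totals["failures"] += 1
    return totals

sample = [
    '{"messageType": "IngestionProcessingStats", "details": {"tableName": "t", "insert_count": 100, "update_count": 10, "delete_count": 10}}',
    '{"messageType": "IngestionFailed", "details": {"tableName": "t", "errorMessage": "err", "error_code": "client_error"}}',
]
```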

## Viewing Amazon CloudWatch metrics for an integration
<a name="zero-etl-cloudwatch-metrics"></a>

Once an integration completes, you can see these CloudWatch metrics generated in your account for each AWS Glue job run:

CloudWatch metrics namespace: "AWS/Glue/ZeroETL"

Metrics dimensions:
+ `integrationArn`
+ `loadType`
+ `tableName`

Metric names:
+ `InsertCount` - number of records inserted in the target Iceberg table.
+ `UpdateCount` - number of records updated in the target Iceberg table.
+ `DeleteCount` - number of records deleted from the target Iceberg table.
+ `IngestionSucceeded` - emits 1 if the ingestion succeeded for the integration.
+ `IngestionFailed` - emits 1 if the ingestion failed for the integration.
+ `LastSyncTimestamp` - the timestamp up to which the source has been synced to the target.
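These metrics can be retrieved with the standard CloudWatch GetMetricData API. The sketch below only builds the query payload; the integration ARN, table name, and `loadType` value shown are hypothetical placeholders, as are the period and statistic choices.

```python
# Sketch of a CloudWatch GetMetricData query for the InsertCount metric in
# the AWS/Glue/ZeroETL namespace. The ARN, table name, loadType value,
# period, and statistic are assumptions for illustration.

def insert_count_query(integration_arn: str, table_name: str, load_type: str) -> dict:
    """Build one MetricDataQuery entry for the InsertCount metric."""
    return {
        "Id": "inserts",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Glue/ZeroETL",
                "MetricName": "InsertCount",
                "Dimensions": [
                    {"Name": "integrationArn", "Value": integration_arn},
                    {"Name": "loadType", "Value": load_type},
                    {"Name": "tableName", "Value": table_name},
                ],
            },
            "Period": 3600,   # hypothetical: hourly aggregation
            "Stat": "Sum",
        },
    }

query = insert_count_query(
    "arn:aws:glue:us-east-2:123456789012:integration/example",  # hypothetical ARN
    "testDDBTable",
    "CDC",  # hypothetical loadType value
)

# import boto3
# cloudwatch = boto3.client("cloudwatch")
# cloudwatch.get_metric_data(MetricDataQueries=[query], StartTime=..., EndTime=...)
```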

## Managing event notifications with Amazon EventBridge
<a name="zero-etl-eventbridge-notifications-setup"></a>

Zero-ETL integrations use Amazon EventBridge to manage event notifications to keep you up-to-date regarding changes in your integrations. Amazon EventBridge is a serverless event bus service that you can use to connect your applications with data from a variety of sources. In this case, the event source is AWS Glue. Events, which are monitored changes in an environment, are sent to EventBridge from AWS Glue automatically. Events are delivered in near real time.

To capture all Zero-ETL notifications, create an EventBridge rule which matches the following:

```
{
  "source": [{
    "prefix": "aws.glue-zero-etl"
  }],
  "detail-type": [{
    "prefix": "Glue Zero ETL"
  }]
}
```
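To see what this rule matches, the prefix semantics can be replicated locally. EventBridge performs this matching server-side; the sketch below only mirrors the two prefix filters in the pattern above, and the sample event shape is an assumption.

```python
# Local sketch of the EventBridge prefix matching used by the rule above.
# EventBridge evaluates patterns server-side; this mirrors only the two
# prefix filters shown (source and detail-type).

def matches_zero_etl_rule(event: dict) -> bool:
    """Return True if the event would match the zero-ETL rule pattern."""
    return (
        event.get("source", "").startswith("aws.glue-zero-etl")
        and event.get("detail-type", "").startswith("Glue Zero ETL")
    )

sample_event = {  # hypothetical event shape
    "source": "aws.glue-zero-etl",
    "detail-type": "Glue Zero ETL Ingestion Completed",
}
```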

The following table includes zero-ETL integration events:


| Detail type | Explanation | 
| --- | --- | 
| AWS Glue Zero ETL Ingestion Completed | Individual execution for an entity has completed successfully. | 
| AWS Glue Zero ETL Ingestion Failed | Individual execution for an entity has completed unsuccessfully (either with a client or system error). | 
| AWS Glue Zero ETL Integration Resynced | Integration has been RESYNCED. | 
| AWS Glue Zero ETL Integration Failed | Integration status has changed to FAILED due to an error. | 
| AWS Glue Zero ETL Integration Needs Attention | Integration status has changed to NEEDS\_ATTENTION due to an error. | 
| AWS Glue Zero ETL Ingestion In Progress | Individual execution for an entity has made partial progress towards completion. | 

# Schema unnesting & data partitioning
<a name="zero-etl-partition-schema-unnesting"></a>

 When working with NoSQL data sources like DynamoDB and SaaS applications, data often presents unique challenges for analytics: 

1. Records within the same table may have different schemas

1. Nested records within the same table can be represented differently

1. Complex nested structures like maps and arrays require transformation for efficient querying

1. Optimal data organization is needed to ensure query performance at scale

 AWS Glue Zero-ETL integrations address these challenges through two powerful capabilities: 
+  **Schema Unnesting:** Automatically flattens complex nested data structures into analytics-friendly formats, with configurable levels of unnesting to balance between preserving data structure and optimizing for query simplicity. 
+  **Data Partitioning:** Organizes data into logical partitions based on specified columns or time-based dimensions, improving query performance and reducing costs by enabling partition pruning during query execution. 

 To query such data sources effectively, AWS Glue Zero-ETL provides out-of-the-box schema handling and partitioning schemes for source data replicated to the target AWS Glue database. You can configure schema unnesting and partitioning settings for each table through the CreateIntegrationTableProperties API, allowing fine-tuned control over how data is structured and organized for analytics workloads. 

## Default unnesting & partitioning behavior
<a name="default-behavior"></a>

1. AWS Glue Zero-ETL defaults to FULL unnesting when no unnesting options are provided for the target table

1. AWS Glue Zero-ETL defaults to bucket partitioning when no `PartitionSpec` is provided for the target table

# Schema unnesting
<a name="zero-etl-ddb-schema-unnesting"></a>

 When integrating with analytics services through Zero-ETL, you can choose how nested structures are represented in the target tables. AWS Glue Zero-ETL provides schema unnesting options to flatten complex data structures into more analytics-friendly formats. 

## Unnesting options
<a name="unnesting-options"></a>

 When creating a Zero-ETL integration with a source, you can choose from the following unnesting options. These options correspond to specific enumeration values that you'll use when calling the CreateIntegrationTableProperties API. For all unnest options, AWS Glue traverses to the innermost layer and maps each DynamoDB type to a target Spark/Iceberg primitive type on a best-effort basis. The type mapping between the DynamoDB source and the target table is as follows: 


| DDB source data type | Target table data type | 
| --- | --- | 
| "S" | StringType | 
| "B" | BinaryType | 
| "N" | DoubleType | 
| "BOOL" | BooleanType | 
| "SS" | ArrayType(StringType) | 
| "NS" | ArrayType(DoubleType) | 
| "BS" | ArrayType(BinaryType) | 
| "L" | ArrayType(StringType) | 
| "NULL" | Ignore | 
| "M" | StructType (TOP/NOUNNEST) | 

No unnesting - NO\_UNNEST  
 **API value: `NO_UNNEST`**   
 Preserves the original nested structure of Amazon DynamoDB items. Maps and lists are stored as structured columns in the target.   
 Best for: Preserving the exact structure of your Amazon DynamoDB data when your analytics tools can work with nested data. 

Top level - TOP\_LEVEL  
 **API value: `TOP_LEVEL`**   
 Flattens the top level of nested maps into individual columns. List structures remain nested.   
 Preserves the structure of your Amazon DynamoDB data for analytics tools that can work with nested data, with all DynamoDB type information removed.   
 Best for: Balancing between data structure preservation and query simplicity when your Amazon DynamoDB table items have a consistent schema. 

Unnest all levels - FULL (default)  
 **API value: `FULL`**   
 Recursively flattens all nested structures (maps and lists) into individual columns with dot notation for naming.   
 Best for: Maximizing query simplicity when working with deeply nested structures and analytics tools that prefer flat schemas.   
 Full unnesting can lead to very wide tables with many columns if your DynamoDB data has variable or deeply nested structures. 

**Example Using unnesting options in the API**  
 When configuring schema unnesting through the CreateIntegrationTableProperties API, specify the unnesting option in the `UnnestSpec` parameter:   

```
aws glue create-integration-table-properties \
  --resource-arn "arn:aws:glue:us-east-1:123456789012:database/my_db" \
  --table-name "my-table" \
  --cli-input-json '{
      "TargetTableConfig": {
          "UnnestSpec": "FULL",
          "TargetTableName": "my-target-table"
      }
  }'
```

## Unnesting examples
<a name="unnesting-examples"></a>

 Consider a DynamoDB item with the following structure: 

```
// Input DynamoDB Record
{
  "Item": {
    "col_1": {
      "S": "value_1"
    },
    "col_2": {
      "M": {
        "col_3": {
          "M": {
            "id": {
              "S": "value_3"
            }
          }
        },
        "col_4": {
          "BOOL": true
        }
      }
    }
  }
}
```

### NO\_UNNEST example
<a name="no-unnesting-example"></a>

 With NO\_UNNEST, the entire row is stored within one column plus the primary key. DynamoDB type information is preserved. This maintains compatibility with Redshift querying patterns. 

Resulting Iceberg table (assuming col\_1 is the primary key):


| col\_1 (string) | value (struct) | 
| --- | --- | 
| value\_1 | <pre>{<br />  "col_2": {<br />    "M": {<br />      "col_3": {<br />        "M": {<br />          "id": {<br />            "S": "value_3"<br />          }<br />        }<br />      },<br />      "col_4": {<br />        "BOOL": true<br />      }<br />    }<br />  }<br />}</pre> | 

Queries would need to use struct and array access patterns:

```
SELECT 
  col_1,
  value.col_2.M.col_3.M.id.S,
  value.col_2.M.col_4.BOOL
FROM product_table;
```

### TOP\_LEVEL example
<a name="unnest-one-level-example"></a>

 With TOP\_LEVEL, only the top-level fields are unnested, while nested fields are kept intact as structs. DynamoDB type information is removed and native typing is maintained. Values are converted to string type when schema conflicts occur. 

Resulting Glue table after replication:


| col\_1 (string) | col\_2 (struct) | 
| --- | --- | 
| value\_1 | <pre>{<br />  "col_3": {<br />    "id": "value_3"<br />  },<br />  "col_4": true<br />}</pre> | 

Queries would be simplified for the first level:

```
SELECT 
  col_1, 
  col_2.col_3.id,
  col_2.col_4
FROM product_table;
```

### FULL example
<a name="unnest-all-levels-example"></a>

 With FULL unnesting, both top-level fields and nested struct/map fields are flattened. Dot notation is used for nested fields (e.g., "col\_2.col\_3.id"). Array elements remain nested. Each leaf node becomes a top-level column. 

Resulting Glue table after replication:


| col\_1 (string) | col\_2.col\_3.id (string) | col\_2.col\_4 (boolean) | 
| --- | --- | --- | 
| value\_1 | value\_3 | TRUE | 

Queries would be fully flattened:

```
SELECT 
  col_1, 
  "col_2.col_3.id",
  "col_2.col_4"
FROM product_table;
```
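The FULL transformation above can be sketched as a small recursive flatten: type wrappers (`"S"`, `"M"`, `"BOOL"`, and so on) are stripped and nested map keys are joined with dots. This is a simplified illustration; array handling and schema-conflict rules are omitted.

```python
# Sketch of FULL unnesting applied to the DynamoDB item shown earlier in this
# section. Type wrappers are stripped and map keys are joined with dots;
# arrays and type-conflict handling are simplified away.

SCALAR_TYPES = {"S", "N", "B", "BOOL", "SS", "NS", "BS", "L"}

def full_unnest(item: dict, prefix: str = "") -> dict:
    """Flatten a DynamoDB-JSON item into dot-notation leaf columns."""
    flat = {}
    for key, wrapped in item.items():
        name = f"{prefix}{key}"
        (ddb_type, value), = wrapped.items()  # each attribute has one type tag
        if ddb_type == "M":
            flat.update(full_unnest(value, prefix=f"{name}."))
        elif ddb_type in SCALAR_TYPES:
            flat[name] = value
        # "NULL" attributes are ignored, matching the type-mapping table

    return flat

item = {
    "col_1": {"S": "value_1"},
    "col_2": {"M": {
        "col_3": {"M": {"id": {"S": "value_3"}}},
        "col_4": {"BOOL": True},
    }},
}
```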

# Data partitioning
<a name="zero-etl-data-partitioning"></a>

## What is data partitioning?
<a name="partitioning-overview"></a>

 Data partitioning is a technique that divides large datasets into smaller, more manageable segments called partitions. In the context of AWS Glue Zero-ETL integrations, partitioning organizes your data in the target location based on specific column values or transformations of those values. 

### Benefits of data partitioning
<a name="partitioning-benefits"></a>

 Effective data partitioning provides several key benefits for analytics workloads: 
+  **Improved query performance:** Queries can skip irrelevant partitions (partition pruning), reducing the amount of data that needs to be scanned. 
+  **Reduced costs:** By scanning less data, you can lower compute and I/O costs for your analytics queries. 
+  **Better scalability:** Partitioning allows parallel processing of data segments, enabling more efficient scaling of analytics workloads. 
+  **Simplified data lifecycle management:** You can manage retention policies at the partition level, making it easier to archive or delete older data. 

### Key partitioning concepts
<a name="partitioning-concepts"></a>

Partition columns  
 Columns in your data that are used to determine how records are organized into partitions. Effective partition columns should align with common query patterns and have appropriate cardinality. 

Partition functions  
 Transformations applied to partition column values to create the actual partition boundaries. Examples include identity (using the raw value) and time-based functions (year, month, day, hour). 

Partition pruning  
 The process where the query engine identifies and skips partitions that don't contain relevant data for a query, significantly improving performance. 

Partition granularity  
 The level of detail at which data is partitioned. Finer granularity (more partitions) can improve query performance but may increase metadata overhead. Coarser granularity (fewer partitions) reduces metadata overhead but may result in scanning more data than necessary. 

### Partitioning in AWS Glue Zero-ETL integrations
<a name="partitioning-in-zero-etl"></a>

 AWS Glue Zero-ETL integrations use Apache Iceberg table format, which provides advanced partitioning capabilities. When you create a Zero-ETL integration, you can: 
+ Use default partitioning strategies optimized for your data source
+ Define custom partitioning specifications tailored to your query patterns
+ Apply transformations to partition columns (especially useful for timestamp-based partitioning)
+ Combine multiple partition strategies for multi-level partitioning

 Partitioning configurations are specified through the `CreateIntegrationTableProperty` API when setting up your Zero-ETL integration. Once configured, AWS Glue automatically applies these partitioning strategies to organize your data in the target location. 

## Partition specification API reference
<a name="partition-api-reference"></a>

Use the following parameters in the CreateIntegrationTableProperties API to configure partitioning:

PartitionSpec  
An array of partition specifications that defines how data is partitioned in the target location.  

```
{
  "partitionSpec": [
    {
      "fieldName": "timestamp_col",
      "functionSpec": "month",
      "conversionSpec": "epoch_milli"
    },
    {
      "fieldName": "category",
      "functionSpec": "identity"
    }
  ]
}
```

FieldName  
A UTF-8 string (1-128 bytes) specifying the column name to use for partitioning.

FunctionSpec  
Specifies the partitioning function. Valid values:  
+ `identity` - Uses source values directly without transformation
+ `year` - Extracts the year from timestamp values (e.g., 2023)
+ `month` - Extracts the month from timestamp values (e.g., 2023-01)
+ `day` - Extracts the day from timestamp values (e.g., 2023-01-15)
+ `hour` - Extracts the hour from timestamp values (e.g., 2023-01-15-14)
 Time-based functions (`year`, `month`, `day`, `hour`) require the `ConversionSpec` parameter to specify the source timestamp format. 

ConversionSpec  
 A UTF-8 string that specifies the timestamp format of the source data. Valid values are:   
+ `epoch_sec` - Unix epoch timestamp in seconds
+ `epoch_milli` - Unix epoch timestamp in milliseconds
+ `iso` - ISO 8601 formatted timestamp
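The rules above (valid function names, the 1-128 byte `FieldName` limit, and the `ConversionSpec` requirement for time-based functions) can be checked client-side before calling the API. This validator is a sketch mirroring the documented rules, not the service's actual validation.

```python
# Sketch validator for the partitionSpec shape described above. It mirrors the
# documented constraints only; the service may enforce additional rules.

TIME_FUNCS = {"year", "month", "day", "hour"}
FUNCS = TIME_FUNCS | {"identity"}
CONVERSIONS = {"epoch_sec", "epoch_milli", "iso"}

def validate_partition_spec(spec: list) -> None:
    """Raise ValueError if any field violates the documented constraints."""
    for field in spec:
        if not (1 <= len(field["fieldName"].encode("utf-8")) <= 128):
            raise ValueError("fieldName must be 1-128 bytes")
        if field["functionSpec"] not in FUNCS:
            raise ValueError(f"unknown functionSpec: {field['functionSpec']}")
        if field["functionSpec"] in TIME_FUNCS and field.get("conversionSpec") not in CONVERSIONS:
            raise ValueError("time-based functions require a valid conversionSpec")

# The example spec from above passes validation:
validate_partition_spec([
    {"fieldName": "timestamp_col", "functionSpec": "month", "conversionSpec": "epoch_milli"},
    {"fieldName": "category", "functionSpec": "identity"},
])
```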

## Partitioning strategies
<a name="partitioning-strategies"></a>

### Default partitioning
<a name="default-partitioning"></a>

 When no partition columns are specified, AWS Glue Zero-ETL applies default partitioning strategies optimized for your data source: 
+  **Primary key-based partitioning:** For sources with primary keys (like DynamoDB tables), AWS Glue Zero-ETL automatically partitions data using the primary key with bucketing to prevent partition explosion. 

 Default partitioning is designed to work well for common query patterns without requiring manual configuration. However, for specific query patterns or performance requirements, you may want to define custom partitioning strategies. 

### User-defined partitioning strategies
<a name="user-defined-partitioning"></a>

 AWS Glue Zero-ETL allows you to define custom partitioning strategies using the `PartitionSpec` parameter. You can specify one or more partition columns and apply different partitioning functions to each column. 

 **Identity partitioning** uses the raw values from a column to create partitions. This strategy is useful for columns with low to medium cardinality, such as category, region, or status fields. 

**Example Identity partitioning example**  

```
{
  "partitionSpec": [
    {
      "fieldName": "category",
      "functionSpec": "identity"
    }
  ]
}
```
 This creates separate partitions for each unique value in the "category" column. 

**Warning**  
 Avoid using identity partitioning with high-cardinality columns (like primary keys or timestamps) as it can lead to partition explosion, which degrades performance and increases metadata overhead. 

 **Time-based partitioning** organizes data based on timestamp values at different granularities (year, month, day, or hour). This strategy is ideal for time-series data and enables efficient time-range queries. 

 When using time-based partitioning, AWS Glue Zero-ETL can automatically convert various timestamp formats to a standardized format before applying the partition function. This conversion is specified using the `ConversionSpec` parameter. 

**Example Time-based partitioning example**  

```
{
  "partitionSpec": [
    {
      "fieldName": "created_at",
      "functionSpec": "month",
      "conversionSpec": "epoch_milli"
    }
  ]
}
```
 This partitions data by month based on the "created\_at" column, which contains Unix epoch timestamps in milliseconds. 

 AWS Glue Zero-ETL supports the following time-based partition functions: 
+  **year:** Partitions data by year (e.g., 2023, 2024) 
+  **month:** Partitions data by month (e.g., 2023-01, 2023-02) 
+  **day:** Partitions data by day (e.g., 2023-01-01, 2023-01-02) 
+  **hour:** Partitions data by hour (e.g., 2023-01-01-01, 2023-01-01-02) 

 AWS Glue Zero-ETL supports the following timestamp formats through the `ConversionSpec` parameter: 
+  **epoch_sec:** Unix epoch timestamps in seconds 
+  **epoch_milli:** Unix epoch timestamps in milliseconds 
+  **iso:** ISO 8601 formatted timestamps 
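As a rough illustration of how a `ConversionSpec` is applied before a time-based partition function, the following Python sketch derives a month partition key from each supported format. The helper is hypothetical and not part of AWS Glue:

```python
from datetime import datetime, timezone

def month_partition(value, conversion):
    """Derive a month partition key (YYYY-MM) from a raw column value.

    Mirrors functionSpec="month"; conversion is one of the
    ConversionSpec values listed above.
    """
    if conversion == "epoch_sec":
        ts = datetime.fromtimestamp(value, tz=timezone.utc)
    elif conversion == "epoch_milli":
        ts = datetime.fromtimestamp(value / 1000, tz=timezone.utc)
    elif conversion == "iso":
        ts = datetime.fromisoformat(value)
    else:
        raise ValueError(f"unknown ConversionSpec: {conversion}")
    return ts.strftime("%Y-%m")

print(month_partition(1672531200000, "epoch_milli"))  # 2023-01-01 UTC -> "2023-01"
```

Note that the epoch conversions interpret values in UTC, consistent with the timezone behavior described in the best practices below.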

**Note**  
 The original column values remain unchanged in your source data. AWS Glue only transforms partition column values to Timestamp Type in the target database table. The transformations only apply to the partitioning process. 

 **Multi-level partitioning** combines multiple partition strategies to create a hierarchical partitioning scheme. This is useful for optimizing different types of queries against the same dataset. 

**Example Multi-level partitioning example**  

```
{
  "partitionSpec": [
    {
      "fieldName": "created_at",
      "functionSpec": "month",
      "conversionSpec": "iso"
    },
    {
      "fieldName": "region",
      "functionSpec": "identity"
    }
  ]
}
```
 This creates a two-level partitioning scheme: first by month (from the "created_at" column), then by region. This enables efficient queries that filter by date ranges, specific regions, or a combination of these dimensions. 

 When designing multi-level partitioning schemes, consider: 
+  Placing higher-selectivity columns first in the partition hierarchy 
+  Balancing partition granularity with the number of partitions 
+  Aligning the partitioning scheme with your most common query patterns 
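Under the two-level scheme above, each record maps to a (month, region) key. This hypothetical sketch mirrors that behavior for an ISO timestamp column:

```python
from datetime import datetime

def partition_key(record):
    """Two-level key matching the partitionSpec above: the month of the
    ISO "created_at" value, then the raw "region" value."""
    month = datetime.fromisoformat(record["created_at"]).strftime("%Y-%m")
    return (month, record["region"])

key = partition_key({"created_at": "2023-03-10T08:30:00+00:00", "region": "eu-west-1"})
print(key)  # ('2023-03', 'eu-west-1')
```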

## Best practices
<a name="best-practices"></a>

### Partition column selection
<a name="best-practices-partition-column-selection"></a>
+  Do not use high-cardinality columns with the `identity` partition function. Using high-cardinality columns with identity partitioning creates many small partitions, which can significantly degrade ingestion performance. High-cardinality columns may include: 
  + Primary keys
  + Timestamp fields (such as `LastModifiedTimestamp`, `CreatedDate`)
  + System-generated timestamps
+  Do not select multiple timestamp partitions on the same column. For example: 

  ```
  "partitionSpec": [
        {"fieldName": "col1", "functionSpec": "year", "conversionSpec" : "epoch_milli"},
        {"fieldName": "col1", "functionSpec": "month", "conversionSpec" : "epoch_milli"},
        {"fieldName": "col1", "functionSpec": "day", "conversionSpec" : "epoch_milli"},
        {"fieldName": "col1", "functionSpec": "hour", "conversionSpec" : "epoch_milli"}
  ]
  ```
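A client-side pre-check can catch this mistake before you create the integration. The following hypothetical validator (not part of the AWS Glue API) enforces one partition entry per column, plus the 10-column maximum noted under Limitations:

```python
def validate_partition_spec(spec):
    """Raise ValueError if a column appears more than once in the spec,
    or if the spec exceeds the 10-partition-column maximum."""
    if len(spec) > 10:
        raise ValueError("a partitionSpec may contain at most 10 columns")
    seen = set()
    for entry in spec:
        name = entry["fieldName"]
        if name in seen:
            raise ValueError(f"duplicate partition column: {name}")
        seen.add(name)

# The anti-pattern above fails fast:
bad = [
    {"fieldName": "col1", "functionSpec": "year", "conversionSpec": "epoch_milli"},
    {"fieldName": "col1", "functionSpec": "month", "conversionSpec": "epoch_milli"},
]
try:
    validate_partition_spec(bad)
except ValueError as err:
    print(err)  # duplicate partition column: col1
```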

### Partition FunctionSpec/ConversionSpec selection
<a name="best-practices-partition-functionspec-conversionspec-selection"></a>
+  Specify the correct `ConversionSpec` (`epoch_sec`, `epoch_milli`, or `iso`) that represents the format of the column values chosen for timestamp-based partitioning. AWS Glue Zero-ETL uses this parameter to correctly transform source data into timestamp format before partitioning. 
+  Use appropriate granularity (year/month/day/hour) based on data volume. 
+  Consider timezone implications when using ISO timestamps. AWS Glue Zero-ETL writes all values of the chosen timestamp column in UTC. 

## Error handling
<a name="error-handling"></a>

### NEEDS_ATTENTION state
<a name="needs-attention-state"></a>

 An integration enters the NEEDS_ATTENTION state when: 
+ Specified partition columns do not exist in the source
+ Timestamp conversion fails for partition columns

# Limitations
<a name="limitations"></a>

## Partitioning limitations
<a name="partitioning-limitations"></a>
+  Partition specifications cannot be changed after an integration is created. To use a different partitioning strategy, you must create a new integration. 
+  The maximum number of partition columns is limited to 10. 

## Cross-account integration limitations
<a name="cross-account-limitations"></a>
+  When creating cross-account integrations, the AWS Glue console does not invoke the CreateIntegrationTableProperties API to configure UnnestSpec and PartitionSpec for target AWS Glue tables hosted in a different account from the integration. 

   **Workaround:** Invoke the CreateIntegrationTableProperties API from the account where the target database exists. 

## Multiple integrations limitations
<a name="multiple-integrations-limitations"></a>
+  If you need to replicate the same source with different schema unnest or partition configurations, create a separate AWS Glue database for each integration. Then invoke CreateIntegrationTableProperties for each table in each AWS Glue database with the desired schema unnesting and partitioning configurations. 

# Creating integrations using the APIs
<a name="zero-etl-using-apis"></a>

You can use the following APIs to create and manage Zero-ETL integrations in AWS Glue:
+ CreateIntegration
+ CreateIntegrationTableProperties
+ CreateIntegrationResourceProperty
+ UpdateIntegrationTableProperties
+ UpdateIntegrationResourceProperty
+ ModifyIntegration
+ DeleteIntegration
+ DeleteIntegrationTableProperties
+ DescribeIntegrations
+ DescribeInboundIntegrations
+ GetIntegrationTableProperties
+ GetIntegrationResourceProperty

For more information, see [Integration APIs in AWS Glue](aws-glue-api-integrations.md).
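For orientation, a CreateIntegrationTableProperties request that configures a PartitionSpec might look roughly like the following. The field names, ARN, and table name here are illustrative assumptions; confirm the exact request shape in the linked API reference before use.

```
{
  "ResourceArn": "arn:aws:glue:us-east-1:111122223333:integration:example",
  "TableName": "orders",
  "TargetTableConfig": {
    "PartitionSpec": [
      {
        "FieldName": "created_at",
        "FunctionSpec": "day",
        "ConversionSpec": "epoch_milli"
      }
    ]
  }
}
```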

# Limitations
<a name="zero-etl-limitations"></a>

The following are general limitations of or considerations about zero-ETL integrations:
+ Resource properties have a one-to-one relationship with the corresponding resource. Consequently, all integrations created using that resource must adhere to the singular resource property. Modifying a resource property will therefore impact all integrations associated with that resource.
+ Table properties have a one-to-one relationship with the corresponding table or object within a resource. As a result, all integrations that process the same table to or from the same resource must adhere to the singular table property.
+ You cannot rename a column at the source. If a column is renamed, there is no guarantee that AWS Glue will detect the schema accurately, and the repercussions for the integration are undefined.
+ The following consideration applies to how the integration works with AWS Lake Formation managed tables: by default, you use IAM and AWS Glue policies to manage your tables and databases.
+ If you want to use AWS Lake Formation to manage table creation in that database, make sure the role has sufficient Lake Formation permissions to create, modify, and delete the table and database.
+ The zero-ETL summary page does not contain any metrics at this time.

The following are source-specific limitations of zero-ETL integrations:
+ Zero-ETL integrations with an SAP OData source support entities whose names start with `EntityOf`. The ability to override the primary key is currently supported only for SAP OData `EntityOf` objects. Once this property is set, it cannot be modified.
+ Zero-ETL integrations from Amazon DynamoDB to Amazon SageMaker Lakehouse (via Amazon S3) support a maximum DynamoDB table size of 100 TB.
+ The source DynamoDB table must be encrypted with either an AWS owned key or a customer managed AWS KMS key. AWS managed keys are not supported for the source DynamoDB table.
+ SAP OData works using a delta token, where the combination of an OAuth client plus an entity, or basic authentication plus an entity, can have only a single delta token. Avoid using the same entity in two different integrations with the same client.
+ The following Salesforce entities or fields are unsupported for use in a zero-ETL integration with a Salesforce source. See [Unsupported entities and fields for Salesforce](zero-etl-sources.md#zero-etl-config-source-salesforce-unsupported).
+ The following ServiceNow entities or fields are unsupported for use in a zero-ETL integration with a ServiceNow source. See [Unsupported entities and fields for ServiceNow](zero-etl-sources.md#zero-etl-config-source-servicenow-unsupported).

The following are target-specific limitations of zero-ETL integrations:
+ Due to a limitation in the Data Catalog, AWS Glue Zero-ETL can only replicate tables whose names are shorter than 255 characters.
+ Due to a limitation in Athena, source column and nested field names must not contain the special character ":", which causes the target table schema metadata to appear incorrectly. In such cases, AWS Glue Zero-ETL moves the integration to the FAILED state.
+ Due to a limitation in Iceberg, an AWS Glue Zero-ETL target can only replicate up to 1,000 columns per table. If a chosen table in any integration has more than 1,000 columns, AWS Glue Zero-ETL moves the integration to the FAILED state.
+ Due to a limitation in the Data Catalog, AWS Glue Zero-ETL can only replicate tables with a schema size of up to 10 MB. If a chosen table in any integration has a schema larger than 10 MB, AWS Glue Zero-ETL moves the integration to the FAILED state.
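These target-side limits can be checked client-side before an integration is created. The following helper is a hypothetical sketch that mirrors the four limits above; it is not an AWS Glue API:

```python
def check_target_limits(table_name, column_names, schema_json):
    """Pre-flight checks mirroring the target-side limits above.

    Returns a list of problems; an empty list means the table passes.
    """
    problems = []
    if len(table_name) >= 255:
        problems.append("table name must be shorter than 255 characters")
    if len(column_names) > 1000:
        problems.append("Iceberg targets support at most 1000 columns per table")
    if any(":" in name for name in column_names):
        problems.append('column and nested field names must not contain ":"')
    if len(schema_json.encode("utf-8")) > 10 * 1024 * 1024:
        problems.append("table schema must be 10 MB or smaller")
    return problems

print(check_target_limits("orders", ["id", "created_at"], "{}"))  # []
```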

# Service quotas
<a name="zero-etl-service-quotas"></a>

AWS Glue Zero-ETL has the following service quotas:
+ Each account can create 25 integrations by default.
+ Each source can be associated with up to 15 integrations.
+ Each target can be associated with up to 50 integrations.

# Supported Regions
<a name="zero-etl-supported-regions"></a>

The following Regions are available with AWS Glue Zero-ETL:


| Region | AWS Glue Zero-ETL | 
| --- | --- | 
| Africa (Cape Town) | Not available | 
| Asia Pacific (Hong Kong) | Available | 
| Asia Pacific (Tokyo) | Available | 
| Asia Pacific (Seoul) | Available | 
| Asia Pacific (Osaka) | Not available | 
| Asia Pacific (Mumbai) | Not available | 
| Asia Pacific (Hyderabad) | Not available | 
| Asia Pacific (Singapore) | Available | 
| Asia Pacific (Sydney) | Available | 
| Asia Pacific (Jakarta) | Not available | 
| Asia Pacific (Melbourne) | Not available | 
| Asia Pacific (Malaysia) | Not available | 
| Canada (Central) | Available | 
| Canada West (Calgary) | Not available | 
| China (Beijing) | Not available | 
| China (Ningxia) | Not available | 
| Europe (Frankfurt) | Available | 
| Europe (Zurich) | Not available | 
| Europe (Stockholm) | Available | 
| Europe (Milan) | Not available | 
| Europe (Spain) | Not available | 
| Europe (Ireland) | Available | 
| Europe (London) | Available | 
| Europe (Paris) | Not available | 
| Israel (Tel Aviv) | Not available | 
| Middle East (UAE) | Not available | 
| Middle East (Bahrain) | Not available | 
| South America (São Paulo) | Available | 
| US East (N. Virginia) | Available | 
| US East (Ohio) | Available | 
| US West (N. California) | Not available | 
| US West (Oregon) | Available | 
| AWS GovCloud (US-East) | Not available | 
| AWS GovCloud (US-West) | Not available | 

# Troubleshooting zero-ETL integrations
<a name="zero-etl-troubleshooting"></a>

Use the following sections to help troubleshoot problems that you have with AWS Glue zero-ETL integrations.

## Troubleshooting zero-ETL integrations with Amazon DynamoDB source
<a name="zero-etl-troubleshoot-dynamodb"></a>

### Missing RBAC policy or point-in-time recovery on source DynamoDB table
<a name="zero-etl-troubleshoot-dynamodb-rbac"></a>

Before creating the integration, the source must be configured properly. If the source DynamoDB table is missing an RBAC policy with the appropriate permissions, or if point-in-time recovery is disabled, the integration goes into the NEEDS_ATTENTION state. To resolve this issue, fix the permissions and/or enable point-in-time recovery. The integration should recover automatically some time after you fix the missing configuration.

## Troubleshooting zero-ETL integrations with SaaS sources (using AWS Glue connection)
<a name="zero-etl-troubleshoot-saas"></a>

### Connection not configured properly
<a name="zero-etl-troubleshoot-saas-connection"></a>

If the AWS Glue connection is not configured properly, the integration may fail to access the SaaS source. Verify that the connection credentials are valid and that the source role has the appropriate permissions to access the connection.

## Troubleshooting zero-ETL integrations with general purpose Amazon S3 target
<a name="zero-etl-troubleshoot-s3-target"></a>

### Target-role is missing permissions
<a name="zero-etl-troubleshoot-s3-target-role"></a>

If the target role is missing the appropriate permissions or is set up incorrectly, the integration goes into the NEEDS_ATTENTION state. Refer to the target role configuration section to fix the issue. The integration should recover automatically some time after you fix the issue.

### Target Catalog RBAC policy is incorrectly configured
<a name="zero-etl-troubleshoot-s3-target-rbac"></a>

If the target catalog resource policy is incorrectly configured, the integration also goes into the NEEDS_ATTENTION state. Refer to the target role configuration section to fix the issue. The integration should recover automatically some time after you fix the issue.

## Troubleshooting zero-ETL integrations with Amazon S3-Table target
<a name="zero-etl-troubleshoot-s3-tables-target"></a>

### Target-role is missing permissions
<a name="zero-etl-troubleshoot-s3-tables-target-role"></a>

If the target role is missing the appropriate permissions or is set up incorrectly, the integration goes into the NEEDS_ATTENTION state. Refer to the target role configuration section to fix the issue. The integration should recover automatically some time after you fix the issue.

### Target Catalog RBAC policy is incorrectly configured
<a name="zero-etl-troubleshoot-s3-tables-target-rbac"></a>

If the target catalog resource policy is incorrectly configured, the integration also goes into the NEEDS_ATTENTION state. Refer to the target role configuration section to fix the issue. The integration should recover automatically some time after you fix the issue.

## General troubleshooting guide for AWS Glue zero-ETL integration errors
<a name="zero-etl-troubleshoot-general"></a>

All integrations emit CloudWatch logs after the completion of each process, such as a full data load or change data capture. You can refer to those logs to determine the exact root cause of a failure or error.

Additionally, AWS Glue creates a system table in the target AWS Glue database or S3-Table. While the integration remains operational (that is, not in the FAILED or DELETED state), AWS Glue appends a status entry for each individual operation on the target, such as completion of a full data load or change data capture, along with statistics such as the number of records, insertions, and deletions.