在 Amazon AWS Glue 與 Athena 使用時,表格不兼容 QuickSight - Amazon QuickSight

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

在 Amazon AWS Glue 與 Athena 使用時,表格不兼容 QuickSight

如果您在 Athena 搭配 Amazon 使用資料AWS Glue表時收到錯誤 QuickSight,可能是因為您遺漏了一些中繼資料。請依照下列步驟確定您的表格是否沒有 Amazon 所 QuickSight 需的 Athena 連接器運作的TableType屬性。一般來說,這些表格的中繼資料不會遷移到 AWS Glue 資料目錄。如需詳細資訊,請參閱《AWS Glue 開發人員指南》中的逐步升級到 AWS Glue 資料目錄

如果您目前不想遷移到 AWS Glue 資料目錄,您有兩個選項。您可以透過 AWS Glue 管理主控台重新建立每個 AWS Glue 表格。或者,您可以使用以下程序中的 AWS CLI 指令碼,來識別和更新資料表中遺漏的 TableType 屬性。

如果您偏好使用 CLI 來執行此操作,請使用下列程序來協助您設計您的指令碼。

若要使用 CLI 設計指令碼
  1. 使用 CLI 來了解哪些 AWS Glue 表格沒有 TableType 屬性。

    aws glue get-tables --database-name <your_datebase_name>;

    例如,您可以在 CLI 中執行下列命令。

    aws glue get-table --database-name "test_database" --name "table_missing_table_type"

    以下是輸出的範例。您可以看到表格 "table_missing_table_type" 沒有宣告 TableType 屬性。

    { "TableList": [ { "Retention": 0, "UpdateTime": 1522368588.0, "PartitionKeys": [ { "Name": "year", "Type": "string" }, { "Name": "month", "Type": "string" }, { "Name": "day", "Type": "string" } ], "LastAccessTime": 1513804142.0, "Owner": "owner", "Name": "table_missing_table_type", "Parameters": { "delimiter": ",", "compressionType": "none", "skip.header.line.count": "1", "sizeKey": "75", "averageRecordSize": "7", "classification": "csv", "objectCount": "1", "typeOfData": "file", "CrawlerSchemaDeserializerVersion": "1.0", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "recordCount": "9", "columnsOrdered": "true" }, "StorageDescriptor": { "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "SortColumns": [], "StoredAsSubDirectories": false, "Columns": [ { "Name": "col1", "Type": "string" }, { "Name": "col2", "Type": "bigint" } ], "Location": "s3://myAthenatest/test_dataset/", "NumberOfBuckets": -1, "Parameters": { "delimiter": ",", "compressionType": "none", "skip.header.line.count": "1", "columnsOrdered": "true", "sizeKey": "75", "averageRecordSize": "7", "classification": "csv", "objectCount": "1", "typeOfData": "file", "CrawlerSchemaDeserializerVersion": "1.0", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "recordCount": "9" }, "Compressed": false, "BucketColumns": [], "InputFormat": "org.apache.hadoop.mapred.TextInputFormat", "SerdeInfo": { "Parameters": { "field.delim": "," }, "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe" } } } ] }
  2. 在您的編輯器中編輯表格定義,將 "TableType": "EXTERNAL_TABLE" 新增到表格定義,如以下範例所示。

    { "Table": { "Retention": 0, "TableType": "EXTERNAL_TABLE", "PartitionKeys": [ { "Name": "year", "Type": "string" }, { "Name": "month", "Type": "string" }, { "Name": "day", "Type": "string" } ], "UpdateTime": 1522368588.0, "Name": "table_missing_table_type", "StorageDescriptor": { "BucketColumns": [], "SortColumns": [], "StoredAsSubDirectories": false, "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "SerdeInfo": { "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "Parameters": { "field.delim": "," } }, "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "Columns": [ { "Name": "col1", "Type": "string" }, { "Name": "col2", "Type": "bigint" } ], "Compressed": false, "InputFormat": "org.apache.hadoop.mapred.TextInputFormat", "NumberOfBuckets": -1, "Location": "s3://myAthenatest/test_date_part/" }, "Owner": "owner", "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "LastAccessTime": 1513804142.0 } }
  3. 您可以調整以下指令碼更新表格輸入,以便包含 TableType 屬性。

    aws glue update-table --database-name <your_datebase_name> --table-input <updated_table_input>

    下列顯示一個範例。

    aws glue update-table --database-name test_database --table-input ' { "Retention": 0, "TableType": "EXTERNAL_TABLE", "PartitionKeys": [ { "Name": "year", "Type": "string" }, { "Name": "month", "Type": "string" }, { "Name": "day", "Type": "string" } ], "Name": "table_missing_table_type", "StorageDescriptor": { "BucketColumns": [], "SortColumns": [], "StoredAsSubDirectories": false, "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "SerdeInfo": { "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "Parameters": { "field.delim": "," } }, "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "Columns": [ { "Name": "col1", "Type": "string" }, { "Name": "col2", "Type": "bigint" } ], "Compressed": false, "InputFormat": "org.apache.hadoop.mapred.TextInputFormat", "NumberOfBuckets": -1, "Location": "s3://myAthenatest/test_date_part/" }, "Owner": "owner", "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "LastAccessTime": 1513804142.0 }'