Amazon QuickSight 中搭配 Athena 使用 AWS Glue 時資料表不相容 - Amazon QuickSight

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

Amazon QuickSight 中搭配 Athena 使用 AWS Glue 時資料表不相容

如果您在將 Athena 中的 AWS Glue 資料表與 Amazon QuickSight 搭配使用時發生錯誤,可能是因為您缺少一些中繼資料。請依照這些步驟來了解您的資料表是否沒有 Amazon QuickSight 讓 Athena 連接器運作所需的 TableType 屬性。一般來說,這些表格的中繼資料不會遷移到 AWS Glue 資料目錄。如需詳細資訊,請參閱《 開發人員指南》中的Step-by-Step升級至 AWS Glue Data Catalog AWS Glue

如果您目前不想遷移至 AWS Glue Data Catalog,您有兩個選項。您可以透過 AWS Glue 管理主控台重新建立每個 AWS Glue 資料表。或者,您可以使用下列程序中列出的 AWS CLI 指令碼來識別和更新缺少TableType屬性的資料表。

如果您偏好使用 CLI 來執行此操作,請使用下列程序來協助您設計您的指令碼。

若要使用 CLI 設計指令碼
  1. 使用 CLI 來了解哪些 AWS Glue 資料表沒有TableType屬性。

    aws glue get-tables --database-name <your_datebase_name>;

    例如,您可以在 CLI 中執行下列命令。

    aws glue get-table --database-name "test_database" --name "table_missing_table_type"

    以下是輸出的範例。您可以看到表格 "table_missing_table_type" 沒有宣告 TableType 屬性。

    { "TableList": [ { "Retention": 0, "UpdateTime": 1522368588.0, "PartitionKeys": [ { "Name": "year", "Type": "string" }, { "Name": "month", "Type": "string" }, { "Name": "day", "Type": "string" } ], "LastAccessTime": 1513804142.0, "Owner": "owner", "Name": "table_missing_table_type", "Parameters": { "delimiter": ",", "compressionType": "none", "skip.header.line.count": "1", "sizeKey": "75", "averageRecordSize": "7", "classification": "csv", "objectCount": "1", "typeOfData": "file", "CrawlerSchemaDeserializerVersion": "1.0", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "recordCount": "9", "columnsOrdered": "true" }, "StorageDescriptor": { "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "SortColumns": [], "StoredAsSubDirectories": false, "Columns": [ { "Name": "col1", "Type": "string" }, { "Name": "col2", "Type": "bigint" } ], "Location": "s3://myAthenatest/test_dataset/", "NumberOfBuckets": -1, "Parameters": { "delimiter": ",", "compressionType": "none", "skip.header.line.count": "1", "columnsOrdered": "true", "sizeKey": "75", "averageRecordSize": "7", "classification": "csv", "objectCount": "1", "typeOfData": "file", "CrawlerSchemaDeserializerVersion": "1.0", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "recordCount": "9" }, "Compressed": false, "BucketColumns": [], "InputFormat": "org.apache.hadoop.mapred.TextInputFormat", "SerdeInfo": { "Parameters": { "field.delim": "," }, "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe" } } } ] }
  2. 在您的編輯器中編輯表格定義,將 "TableType": "EXTERNAL_TABLE" 新增到表格定義,如以下範例所示。

    { "Table": { "Retention": 0, "TableType": "EXTERNAL_TABLE", "PartitionKeys": [ { "Name": "year", "Type": "string" }, { "Name": "month", "Type": "string" }, { "Name": "day", "Type": "string" } ], "UpdateTime": 1522368588.0, "Name": "table_missing_table_type", "StorageDescriptor": { "BucketColumns": [], "SortColumns": [], "StoredAsSubDirectories": false, "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "SerdeInfo": { "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "Parameters": { "field.delim": "," } }, "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "Columns": [ { "Name": "col1", "Type": "string" }, { "Name": "col2", "Type": "bigint" } ], "Compressed": false, "InputFormat": "org.apache.hadoop.mapred.TextInputFormat", "NumberOfBuckets": -1, "Location": "s3://myAthenatest/test_date_part/" }, "Owner": "owner", "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "LastAccessTime": 1513804142.0 } }
  3. 您可以調整以下指令碼更新表格輸入,以便包含 TableType 屬性。

    aws glue update-table --database-name <your_datebase_name> --table-input <updated_table_input>

    下列顯示一個範例。

    aws glue update-table --database-name test_database --table-input ' { "Retention": 0, "TableType": "EXTERNAL_TABLE", "PartitionKeys": [ { "Name": "year", "Type": "string" }, { "Name": "month", "Type": "string" }, { "Name": "day", "Type": "string" } ], "Name": "table_missing_table_type", "StorageDescriptor": { "BucketColumns": [], "SortColumns": [], "StoredAsSubDirectories": false, "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "SerdeInfo": { "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "Parameters": { "field.delim": "," } }, "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "Columns": [ { "Name": "col1", "Type": "string" }, { "Name": "col2", "Type": "bigint" } ], "Compressed": false, "InputFormat": "org.apache.hadoop.mapred.TextInputFormat", "NumberOfBuckets": -1, "Location": "s3://myAthenatest/test_date_part/" }, "Owner": "owner", "Parameters": { "classification": "csv", "CrawlerSchemaSerializerVersion": "1.0", "UPDATED_BY_CRAWLER": "crawl_date_table", "columnsOrdered": "true", "averageRecordSize": "7", "objectCount": "1", "sizeKey": "75", "delimiter": ",", "compressionType": "none", "recordCount": "9", "CrawlerSchemaDeserializerVersion": "1.0", "typeOfData": "file", "skip.header.line.count": "1" }, "LastAccessTime": 1513804142.0 }'