自 2025 年 11 月 1 日起,Amazon Redshift 將不再支援建立新的 Python UDFs。如果您想要使用 Python UDFs,請在該日期之前建立 UDFs。現有的 Python UDFs將繼續如常運作。如需詳細資訊,請參閱部落格文章
本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。
在 Amazon Redshift 中使用半結構化資料的範例
下列範例示範如何使用 PartiQL 語法在 Amazon Redshift 中使用半結構化資料。您將建立範例資料表以載入一組半結構化資料,然後在各種使用案例中查詢半結構化資料物件。
注意
建議您先設定 enable_case_sensitive_identifier
和 enable_case_sensitive_super_attribute
組態選項,再使用 SUPER 資料類型。如需詳細資訊,請參閱enable_case_sensitive_identifier及enable_case_sensitive_super_attribute。
載入半結構化資料
下列陳述式會建立範例資料表,並將範例 JSON 物件載入 SUPER all_data
資料欄。
DROP TABLE IF EXISTS test_json; SET enable_case_sensitive_super_attribute TO true; SET enable_case_sensitive_identifier TO true; CREATE TABLE test_json (all_data SUPER); INSERT INTO test_json VALUES (JSON_PARSE(' { "data":{ "pnr":{ "type":"pnr", "pnrid":"123PQRS-2024-09-20", "bookingIdentifier":"123PQRS", "version":"5", "triggerType":"", "events":[ { "eventType":"UPDATED", "type":"PART", "id":"123PQRS-2024-09-20-HO-1" }, { "eventType":"CREATED", "type":"ABC", "id":"123PQRS-2024-09-20-OT-38" } ], "create":{ "pnrCreateDate":"2024-09-20T16:56:00Z", "officeID":"OFFCID1234", "officeIDCategory":"Email" }, "lastModification":{ "dateTime":"2024-09-20T17:09:00Z" }, "PARTDetails":[ { "path":"1", "TrainPARTs":[ { "PARTID":"123PQRS-2024-09-20-HO-1", "departure":{ "departureStation":"XYZ", "departureTimeLocal":"2024-10-03T06:30:00", "departureTimeGMT":"2024-10-03T10:30:00Z" }, "arrival":{ "arrivalStation":"ABC", "arrivalTimeLocal":"2024-10-03T08:20:00", "arrivalTimeGMT":"2024-10-03T15:20:00Z" }, "marketing":{ "carrierCode":"XX", "TrainNumber":"100" }, "operating":{ "carrierCode":"YY", "TrainNumber":"100-A" }, "status":"ON", "aircraft":{ "code":"222" }, "class":"WC", "first":"Y", "seating":[ ] } ] } ], "commuterInformation":[ { "commuterID":"2", "commuterPNR":"123PQRS-2024-09-20-RO-2", "commuterTypeCode":"DOM", "firstName":"JOHN", "lastName":"MILLER" } ], "contactDetail":[ { "emailContacts":[ { "id":"123PQRS-2024-09-20-OT-4", "contact":"JOHNMILLER@EXAMPLE.COM", "purpose":[ "BUSINESS" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ], "language":"EN" }, { "id":"123PQRS-2024-09-20-OT-5", "contact":"HARVEYCORMIER@EXAMPLE.COM", "purpose":[ "NOTIFICATION" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ], "language":"EN" } ] }, { "phoneContacts":[ { "id":"123PQRS-2024-09-20-OT-3", "contact":"1234567890", "purpose":[ "NOTIFICATION" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ], "language":"" } ] }, { "addressInfo":[ { "id":"123PQRS-2024-09-20-OT-6", "addressline":[ "112 PORT STREET" ], "provinceState":"CA", "cityName":"SAN JOSE", "postalCode":"12345", "countryCode":"USA", "purpose":[ "MAILING" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ] } ] } ], "PendingService":[ { "id":"123PQRS-2024-09-20-OT-26", "code":"MONO", "status":"", "text":"Broken Seat at Coach-No XYZ123 Seat-No 567", "trainCode":"WC-1", "TrainsArray":[ "123PQRS-2024-09-20-HO-1" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ] }, { "id":"123PQRS-2024-09-20-OT-27", "code":"OTHS", "status":"", "text":"Broken Seat at Coach-No XYZ567 Seat-No 111", "trainCode":"WC-1", "TrainsArray":[ "123PQRS-2024-09-20-HO-1" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ] }, { "id":"123PQRS-2024-09-20-OT-28", "code":"OTHS", "status":"", "text":"Broken Seat at Coach-No XYZ890 Seat-No 123", "trainCode":"WC-1", "TrainsArray":[ "123PQRS-2024-09-20-HO-1" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ] }, { "id":"123PQRS-2024-09-20-OT-29", "code":"OTHS", "status":"", "text":"Broken Seat at Coach-No XYZ111 Seat-No 333", "trainCode":"WC-1", "TrainsArray":[ "123PQRS-2024-09-20-HO-1" ], "commuter":[ "123PQRS-2024-09-20-RO-2" ] } ], "parts": [ { "partname": "prop", "manufacturer": "local parts co", "quality": 2, "price": 10.00 }, { "partname": "prop", "manufacturer": "big parts co", "quality": null, "price": 9.00 }, { "partname": "prop", "manufacturer": "small parts co", "quality": 1, "price": 12.00 }, { "partname": "rudder", "manufacturer": "local parts co", "quality": 1, "price": 2.50 }, { "partname": "rudder", "manufacturer": "big parts co", "quality": 2, "price": 3.75 }, { "partname": "rudder", "manufacturer": "small parts co", "quality": null, "price": 1.90 }, { "partname": "wing", "manufacturer": "local parts co", "quality": null, "price": 7.50 }, { "partname": "wing", "manufacturer": "big parts co", "quality": 1, "price": 15.20 }, { "partname": "wing", "manufacturer": "small parts co", "quality": null, "price": 11.80 } ], "count_by_color": [ { "quality": "high", "red": 15, "green": 20, "blue": 7 }, { "quality": "normal", "red": 35, "green": null, "blue": 40 }, { "quality": "low", "red": 10, "green": 23, "blue": null } ] } }, "id":"abcdefgh-ijklmnop-qrstuvwxyz123", "mainIds":[ { "ID":"pqrstuvwxyz-aabbcc123", "Source":"NYC" } ] } '));
查詢巢狀半結構化資料
下列陳述式使用 PartiQL 的點表示法來擷取 欄位,該pnrid
欄位在最上層all_data
物件深處巢狀三層。
select all_data.data.pnr.pnrid::varchar from test_json;
pnrid -------------------- 123PQRS-2024-09-20
下列陳述式使用 PartiQL 的括號表示法,只指定和擷取最上層物件內巢狀events
陣列的第一個元素。
SELECT all_data.data.pnr.events[0] FROM test_json;
events --------------------------------- { "eventType":"UPDATED", "type":"PART", "id":"123PQRS-2024-09-20-HO-1" }
下列陳述式只會從events
陣列中擷取指定元素的 eventType
屬性。
SELECT all_data.data.pnr.events[0].eventType FROM test_json;
eventtype ----------- "UPDATED"
下列陳述式
使用 enable_case_sensitive_identifier
和 enable_case_sensitive_super_attribute
搭配半結構化資料
下列範例顯示用於查詢半結構化資料時,組態選項 enable_case_sensitive_identifier 和 有何enable_case_sensitive_super_attribute不同。如需這些組態選項的詳細資訊,請參閱 使用大寫和混合大小寫欄位名稱或屬性存取 JSON 欄位。
在下列陳述式中,將兩個組態選項重設為預設值 false 會使查詢傳回 NULL。
RESET enable_case_sensitive_identifier; RESET enable_case_sensitive_super_attribute; SELECT all_data.data.pnr.events[0].eventType FROM test_json;
eventtype ----------- NULL
在下列範例中,範例查詢會在您以雙引號包裝區分大小寫屬性並enable_case_sensitive_identifier
設為 true 之後傳回所需的結果。
RESET enable_case_sensitive_identifier; RESET enable_case_sensitive_super_attribute; SELECT all_data.data.pnr.events[0]."eventType" FROM test_json;
eventtype ----------- NULL
SET enable_case_sensitive_identifier TO true; SELECT all_data.data.pnr.events[0]."eventType" FROM test_json;eventtype ----------- "UPDATED"
在下列範例中,範例查詢會在您將 enable_case_sensitive_super_attribute
設定為 true 之後傳回所需的結果,而不以雙引號括住區分大小寫的屬性。
RESET enable_case_sensitive_identifier; RESET enable_case_sensitive_super_attribute; SELECT all_data.data.pnr.events[0].eventType FROM test_json;
eventtype ----------- NULL
SET enable_case_sensitive_super_attribute TO true; SELECT all_data.data.pnr.events[0].eventType FROM test_json;eventtype ----------- "UPDATED"
篩選半結構化資料
下列陳述式使用陳述式 WHERE 子句中的 PartiQL 語法,計算 類型的事件UPDATED
,以從陣列中擷取特定屬性的資料。您可以在通常參考資料欄的查詢任何部分使用此語法。
SELECT COUNT(*) FROM test_json WHERE all_data.data.pnr.events[0].eventType = 'UPDATED';
count ------ 1
下列範例在 GROUP BY 和 ORDER BY 子句中使用 PartiQL 的括號和點語法。
SELECT all_data.data.pnr.events[0].eventType::varchar, COUNT(*) FROM test_json WHERE all_data.data.pnr.events[0].eventType IS NOT NULL GROUP BY all_data.data.pnr.events[0].eventType ORDER BY all_data.data.pnr.events[0].eventType;
eventtype | count -----------+------- "UPDATED" | 1
解除半結構化資料的巢狀化
下列陳述式使用 PartiQL 聯結來取消events
陣列巢狀。請注意,即使陣列的索引數目不是靜態的,此聯結仍然有效。
如需使用 FROM 子句中的 UNNEST 將半結構化資料解除巢狀化的範例,請參閱 UNNEST 範例。
SELECT a.all_data.data.pnr.type::varchar type_info, a.all_data.data.pnr.pnrid::varchar pnr_id , a.all_data.data.pnr.bookingIdentifier::varchar booking_id, a.all_data.data.pnr.version::varchar version_info, b.eventType::varchar event_type, b.id::varchar event_id FROM test_json a, a.all_data.data.pnr.events b;
type_info | pnr_id | booking_id | version_info | event_type | event_id -----------+---------------------+------------+--------------+------------+------------------------- pnr | 123PQRS-2024-09-20 | 123PQRS | 5 | UPDATED | 123PQRS-2024-09-20-HO-1 pnr | 123PQRS-2024-09-20 | 123PQRS | 5 | CREATED | 123PQRS-2024-09-20-OT-38
解除巢狀陣列巢狀
下列陳述式使用 PartiQL 聯結來取消巢狀在另一個陣列中的陣列。
如需使用 FROM 子句中的 UNNEST 將半結構化資料解除巢狀化的範例,請參閱 UNNEST 範例。
SELECT a.all_data.data.pnr.type::varchar type_info, a.all_data.data.pnr.pnrid::varchar pnr_id , a.all_data.data.pnr.bookingIdentifier::varchar booking_id, a.all_data.data.pnr.version::varchar version_info, d.id::varchar email_record_id, d.contact::varchar email_contact, e::varchar email_purpose, f::varchar email_commuter FROM test_json a, a.all_data.data.pnr.contactDetail c, c."emailContacts" d, d.purpose e, d.commuter f;
type_info | pnr_id | booking_id | version_info | email_record_id | email_contact | email_purpose | email_commuter -----------+---------------------+------------+--------------+-------------------------+---------------------------+---------------+------------------------- pnr | 123PQRS-2024-09-20 | 123PQRS | 5 | 123PQRS-2024-09-20-OT-4 | JOHNMILLER@EXAMPLE.COM | BUSINESS | 123PQRS-2024-09-20-RO-2 pnr | 123PQRS-2024-09-20 | 123PQRS | 5 | 123PQRS-2024-09-20-OT-5 | HARVEYCORMIER@EXAMPLE.COM | NOTIFICATION | 123PQRS-2024-09-20-RO-2
在子查詢中使用半結構化資料
下列陳述式使用 WHERE 子句中的子查詢,僅傳回先前範例結果的子區段。
SELECT a.all_data.data.pnr.type::varchar type_info, a.all_data.data.pnr.pnrid::varchar pnr_id , a.all_data.data.pnr.bookingIdentifier::varchar booking_id, a.all_data.data.pnr.version::varchar version_info, d.id::varchar email_record_id, d.contact::varchar email_contact FROM test_json a, a.all_data.data.pnr.contactDetail c, c."emailContacts" d WHERE (SELECT COUNT(*) FROM d.purpose e WHERE e = 'BUSINESS') > 0;
type_info | pnr_id | booking_id | version_info | email_record_id | email_contact | email_purpose | email_commuter -----------+---------------------+------------+--------------+-------------------------+---------------------------+---------------+------------------------- pnr | 123PQRS-2024-09-20 | 123PQRS | 5 | 123PQRS-2024-09-20-OT-4 | JOHNMILLER@EXAMPLE.COM | BUSINESS | 123PQRS-2024-09-20-RO-2
使用半結構化資料彙總查詢
下列陳述式使用 COUNT 函數來彙總PendingService
陣列中的元素數目。
SELECT a.all_data.data.pnr.type::varchar type_info, a.all_data.data.pnr.pnrid::varchar pnr_id , a.all_data.data.pnr.bookingIdentifier::varchar booking_id, a.all_data.data.pnr.version::varchar version_info, COUNT(*) AS total_pending_service FROM test_json a, a.all_data.data.pnr.PendingService c GROUP BY 1,2,3,4;
type_info | pnr_id | booking_id | version_info | total_pending_service -----------+--------------------+------------+--------------+----------------------- pnr | 123PQRS-2024-09-20 | 123PQRS | 5 | 4
在具體化視觀表中使用半結構化資料
下列陳述式使用上一個範例中的 陳述式來建立具體化視觀表。當基礎資料表取得新資料時,具體化檢視會自動重新整理待定服務的數量。
CREATE MATERIALIZED VIEW mv_total_pending_service AUTO REFRESH YES AS SELECT a.all_data.data.pnr.type::varchar type_info, a.all_data.data.pnr.pnrid::varchar pnr_id , a.all_data.data.pnr.bookingIdentifier::varchar booking_id, a.all_data.data.pnr.version::varchar version_info, COUNT(*) AS total_pending_service FROM test_json a, a.all_data.data.pnr.PendingService c GROUP BY 1,2,3,4;
搭配半結構化資料使用 PIVOT 和 UNPIVOT
下列陳述式在 partname
欄上使用 PIVOT 來傳回每個部分的平均價格。
SELECT * FROM ( SELECT c.partname::varchar, c.price FROM test_json a, a.all_data.data.pnr.parts c) PIVOT (AVG(price) for partname IN ('prop', 'rudder', 'wing'));
prop | rudder | wing ------------+--------------------+-------- 10.33 | 2.71 | 11.50
在上一個範例中,結果會轉換為資料欄。下列範例顯示 GROUP BY 查詢,以資料列傳回平均價格,而非資料欄。
SELECT partname, avg(price) FROM ( SELECT c.partname::varchar, c.price FROM test_json a, a.all_data.data.pnr.parts c) WHERE partname IN ('prop', 'rudder', 'wing') GROUP BY partname;
partname | avg ----------+------- prop | 10.33 rudder | 2.71 wing | 11.50
以下是使用 manufacturer
做為隱含資料欄的 PIVOT 範例。
SELECT * FROM ( SELECT c.quality, c.manufacturer::varchar FROM test_json a, a.all_data.data.pnr.parts c) PIVOT ( count(*) FOR quality IN (1, 2, NULL) );
manufacturer | 1 | 2 | null -------------------+----+----+------ local parts co | 1 | 1 | 1 big parts co | 1 | 1 | 1 small parts co | 1 | 0 | 2
以下是 quality
欄上的 UNPIVOT 範例。
SELECT * FROM ( SELECT c.quality as quality FROM test_json a, a.all_data.data.pnr.parts c) UNPIVOT (cnt FOR column_header IN (quality));
column_header | cnt -----------------+---- quality | 2 quality | 1 quality | 1 quality | 2 quality | 1