UNLOAD例子 - Amazon Redshift

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

UNLOAD例子

這些範例會示範UNLOAD指令的各種參數。範TICKIT例資料用於許多範例中。如需詳細資訊,請參閱範本資料庫

注意

這些範例包含換行以方便閱讀。請勿在 credentials-args 字串中包含換行或空格。

卸載VENUE到以管道分隔的文件(默認分隔符)

下列範例會卸載資料VENUE表並將資料寫入:s3://amzn-s3-demo-bucket/unload/

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

依預設,UNLOAD會為每個磁碟片段寫入一或多個檔案。假設有一個雙節點叢集,且每個節點有兩個分割,則上方範例會在 amzn-s3-demo-bucket 中建立這些檔案:

unload/0000_part_00 unload/0001_part_00 unload/0002_part_00 unload/0003_part_00

為了能更清楚區分輸出檔案,您可以在位置中包含字首。下列範例會卸載資料VENUE表並將資料寫入:s3://amzn-s3-demo-bucket/unload/venue_pipe_

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

結果會是 unload 資料夾中的這四個檔案,同樣假設有四個分割。

venue_pipe_0000_part_00 venue_pipe_0001_part_00 venue_pipe_0002_part_00 venue_pipe_0003_part_00

卸載LINEITEM表到分區的實木複合地板文件

l_shipdate列範例會以 Parquet 格式卸載資料LINEITEM表,並依資料行分割。

unload ('select * from lineitem') to 's3://amzn-s3-demo-bucket/lineitem/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' PARQUET PARTITION BY (l_shipdate);

假設有四個分割,則結果產生的 Parquet 檔案會以動態方式分成不同的資料夾。

s3://amzn-s3-demo-bucket/lineitem/l_shipdate=1992-01-02/0000_part_00.parquet 0001_part_00.parquet 0002_part_00.parquet 0003_part_00.parquet s3://amzn-s3-demo-bucket/lineitem/l_shipdate=1992-01-03/0000_part_00.parquet 0001_part_00.parquet 0002_part_00.parquet 0003_part_00.parquet s3://amzn-s3-demo-bucket/lineitem/l_shipdate=1992-01-04/0000_part_00.parquet 0001_part_00.parquet 0002_part_00.parquet 0003_part_00.parquet ...
注意

在某些情況下,UNLOAD命令使用的INCLUDE選項顯示在下面的SQL語句。

unload ('select * from lineitem') to 's3://amzn-s3-demo-bucket/lineitem/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' PARQUET PARTITION BY (l_shipdate) INCLUDE;

在這些情況下,l_shipdate 資料欄也在 Parquet 檔案的資料中。否則,l_shipdate 資料欄資料不在 Parquet 檔案中。

將VENUE表格卸載至檔案 JSON

下列範例會卸載資料VENUE表,並以JSON格式將資料寫入。s3://amzn-s3-demo-bucket/unload/

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' JSON;

以下是從VENUE表中的示例行。

venueid | venuename | venuecity | venuestate | venueseats --------+----------------------------+-----------------+------------+----------- 1 | Pinewood Racetrack | Akron | OH | 0 2 | Columbus "Crew" Stadium | Columbus | OH | 0 4 | Community, Ballpark, Arena | Kansas City | KS | 0

卸載到之後JSON,檔案的格式類似於以下內容。

{"venueid":1,"venuename":"Pinewood Racetrack","venuecity":"Akron","venuestate":"OH","venueseats":0} {"venueid":2,"venuename":"Columbus \"Crew\" Stadium ","venuecity":"Columbus","venuestate":"OH","venueseats":0} {"venueid":4,"venuename":"Community, Ballpark, Arena","venuecity":"Kansas City","venuestate":"KS","venueseats":0}

卸載VENUE到文件 CSV

下列範例會卸載資料VENUE表,並以CSV格式將資料寫入。s3://amzn-s3-demo-bucket/unload/

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' CSV;

假設該VENUE表包含以下行。

venueid | venuename | venuecity | venuestate | venueseats --------+----------------------------+-----------------+------------+----------- 1 | Pinewood Racetrack | Akron | OH | 0 2 | Columbus "Crew" Stadium | Columbus | OH | 0 4 | Community, Ballpark, Arena | Kansas City | KS | 0

卸載檔案看起來類似下列。

1,Pinewood Racetrack,Akron,OH,0 2,"Columbus ""Crew"" Stadium",Columbus,OH,0 4,"Community, Ballpark, Arena",Kansas City,KS,0

使用分隔符卸載VENUE到CSV文件

下列範例會卸載資料VENUE表,並使用直線字元 (|) 做為分隔符號以CSV格式寫入資料。卸載的檔案會寫入到 s3://amzn-s3-demo-bucket/unload/。此範例中的VENUE表格包含第一列 (Pinewood Race|track) 值中的直線字元。它這樣做是為了表明結果中的值用雙引號括住。雙引號會逸出雙引號,而整個欄位會用雙引號括住。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' CSV DELIMITER AS '|';

假設該VENUE表包含以下行。

venueid | venuename | venuecity | venuestate | venueseats --------+----------------------------+-----------------+------------+------------- 1 | Pinewood Race|track | Akron | OH | 0 2 | Columbus "Crew" Stadium | Columbus | OH | 0 4 | Community, Ballpark, Arena | Kansas City | KS | 0

卸載檔案看起來類似下列。

1|"Pinewood Race|track"|Akron|OH|0 2|"Columbus ""Crew"" Stadium"|Columbus|OH|0 4|Community, Ballpark, Arena|Kansas City|KS|0

使用清單VENUE文件卸載

若要建立資訊清單檔案,請包含選MANIFEST項。下列範例會卸載資料VENUE表,並將資訊清單檔案與資料檔案一起寫入 s3://amzn-s3-demo-bucket/venue_pipe_:

重要

如果您使用MANIFEST選項卸載檔案,則在載入檔案時應該搭配COPY指令使用該MANIFEST選項。如果您使用相同的前綴來加載文件,但未指定MANIFEST選項,則COPY會失敗,因為它假定清單文件是數據文件。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest;

結果會是這五個檔案:

s3://amzn-s3-demo-bucket/venue_pipe_0000_part_00 s3://amzn-s3-demo-bucket/venue_pipe_0001_part_00 s3://amzn-s3-demo-bucket/venue_pipe_0002_part_00 s3://amzn-s3-demo-bucket/venue_pipe_0003_part_00 s3://amzn-s3-demo-bucket/venue_pipe_manifest

以下示範資訊清單檔案的內容。

{ "entries": [ {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0000_part_00"}, {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0001_part_00"}, {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0002_part_00"}, {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0003_part_00"} ] }

卸載 VENUE MANIFEST VERBOSE

當您指定MANIFESTVERBOSE選項時,資訊清單檔案包含下列區段:

  • entries 區段會列出每個檔案的 Amazon S3 路徑、檔案大小和資料列計數。

  • schema 區段會列出每個資料欄的資料欄名稱、資料類型和維度。

  • meta 區段會顯示所有檔案的檔案大小總計和資料列計數。

下列範例會使用MANIFESTVERBOSE選項卸載資VENUE料表。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload_venue_folder/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest verbose;

以下示範資訊清單檔案的內容。

{ "entries": [ {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0000_part_00", "meta": { "content_length": 32295, "record_count": 10 }}, {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0001_part_00", "meta": { "content_length": 32771, "record_count": 20 }}, {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0002_part_00", "meta": { "content_length": 32302, "record_count": 10 }}, {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0003_part_00", "meta": { "content_length": 31810, "record_count": 15 }} ], "schema": { "elements": [ {"name": "venueid", "type": { "base": "integer" }}, {"name": "venuename", "type": { "base": "character varying", 25 }}, {"name": "venuecity", "type": { "base": "character varying", 25 }}, {"name": "venuestate", "type": { "base": "character varying", 25 }}, {"name": "venueseats", "type": { "base": "character varying", 25 }} ] }, "meta": { "content_length": 129178, "record_count": 55 }, "author": { "name": "Amazon Redshift", "version": "1.0.0" } }

VENUE使用標題卸載

下列範例會卸載VENUE標題列。

unload ('select * from venue where venueseats > 75000') to 's3://amzn-s3-demo-bucket/unload/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' header parallel off;

以下示範資訊具有標題列的輸出檔案的內容。

venueid|venuename|venuecity|venuestate|venueseats 6|New York Giants Stadium|East Rutherford|NJ|80242 78|INVESCO Field|Denver|CO|76125 83|FedExField|Landover|MD|91704 79|Arrowhead Stadium|Kansas City|MO|79451

卸載VENUE到較小的文件

根據預設,檔案大小的上限為 6.2 GB。如果卸載資料大於 6.2 GB,則會為每個 6.2 GB 資料區段UNLOAD建立一個新檔案。若要建立較小的檔案,請包括MAXFILESIZE參數。假設上一個範例中的資料大小為 20 GB,下列UNLOAD命令會建立 20 個檔案,每個檔案大小為 1 GB。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' maxfilesize 1 gb;

順序卸載 VENUE

若要依序卸載,請指定。PARALLEL OFFUNLOAD然後一次寫入一個文件,每個文件最多 6.2 GB。

下列範例會卸載資料VENUE表,並依序將資料寫入。s3://amzn-s3-demo-bucket/unload/

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/venue_serial_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' parallel off;

結果會產生一個名為 venue_serial_000 的檔案。

如果卸載資料大於 6.2 GB,則會為每個 6.2 GB 資料區段UNLOAD建立一個新檔案。下列範例會卸載資料LINEORDER表,並依序將資料寫入。s3://amzn-s3-demo-bucket/unload/

unload ('select * from lineorder') to 's3://amzn-s3-demo-bucket/unload/lineorder_serial_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' parallel off gzip;

結果會是下方的一系列檔案。

lineorder_serial_0000.gz lineorder_serial_0001.gz lineorder_serial_0002.gz lineorder_serial_0003.gz

為了能更清楚區分輸出檔案,您可以在位置中包含字首。下列範例會卸載資料VENUE表並將資料寫入:s3://amzn-s3-demo-bucket/venue_pipe_

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/unload/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

結果會是 unload 資料夾中的這四個檔案,同樣假設有四個分割。

venue_pipe_0000_part_00 venue_pipe_0001_part_00 venue_pipe_0002_part_00 venue_pipe_0003_part_00

VENUE從卸載文件加載

要從一組卸載文件中加載表,只需使用COPY命令反轉過程即可。下列範例會建立新資料表LOADVENUE,並從上一個範例中建立的資料檔載入表格。

create table loadvenue (like venue); copy loadvenue from 's3://amzn-s3-demo-bucket/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

如果您使用MANIFEST此選項建立包含卸載檔案的資訊清單檔案,您可以使用相同的資訊清單檔案載入資料。您可以使用帶有該MANIFEST選項的COPY命令來執行此操作。下列範例會使用資訊清單檔案載入資料。

copy loadvenue from 's3://amzn-s3-demo-bucket/venue_pipe_manifest' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest;

卸載VENUE到加密文件

下列範例會使用 AWS KMS 金鑰將VENUE資料表卸載至一組加密檔案。如果您使用ENCRYPTED選項指定資訊清單檔案,資訊清單檔案也會加密。如需詳細資訊,請參閱卸載加密的資料檔案

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/venue_encrypt_kms' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' kms_key_id '1234abcd-12ab-34cd-56ef-1234567890ab' manifest encrypted;

下列範例會使用根對稱金鑰將VENUE資料表卸載至一組加密檔案。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/venue_encrypt_cmk' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' master_symmetric_key 'EXAMPLEMASTERKEYtkbjk/OpCwtYSx/M4/t7DMCDIK722' encrypted;

VENUE從加密檔案載入

若要從UNLOAD搭配ENCRYPT選項使用建立的一組檔案載入表格,請使用COPY指令反轉程序。使用該命令,使用ENCRYPTED選項並指定用於該UNLOAD命令的相同根對稱金鑰。下列範例會從前一個範例中建立的加密資料檔案載入資料LOADVENUE表。

create table loadvenue (like venue); copy loadvenue from 's3://amzn-s3-demo-bucket/venue_encrypt_manifest' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' master_symmetric_key 'EXAMPLEMASTERKEYtkbjk/OpCwtYSx/M4/t7DMCDIK722' manifest encrypted;

將VENUE資料卸載到以 Tab 字元分隔的檔案

unload ('select venueid, venuename, venueseats from venue') to 's3://amzn-s3-demo-bucket/venue_tab_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' delimiter as '\t';

輸出資料檔案看起來像這樣:

1 Toyota Park Bridgeview IL 0 2 Columbus Crew Stadium Columbus OH 0 3 RFK Stadium Washington DC 0 4 CommunityAmerica Ballpark Kansas City KS 0 5 Gillette Stadium Foxborough MA 68756 ...

卸載VENUE到固定寬度的數據文件

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/venue_fw_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' fixedwidth as 'venueid:3,venuename:39,venuecity:16,venuestate:2,venueseats:6';

輸出資料檔案看起來會像下面這樣。

1 Toyota Park Bridgeview IL0 2 Columbus Crew Stadium Columbus OH0 3 RFK Stadium Washington DC0 4 CommunityAmerica BallparkKansas City KS0 5 Gillette Stadium Foxborough MA68756 ...

卸載VENUE到一組以 Tab 字元分GZIP隔的壓縮檔

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/venue_tab_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' delimiter as '\t' gzip;

卸載VENUE到GZIP壓縮的文本文件

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/venue_tab_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' extension 'txt.gz' gzip;

卸載包含分隔符號的資料

此範例使用選ADDQUOTES項來卸載逗號分隔的資料,其中一些實際資料欄位包含逗號。

首先,請建立包含引號的資料表。

create table location (id int, location char(64)); insert into location values (1,'Phoenix, AZ'),(2,'San Diego, CA'),(3,'Chicago, IL');

然後,使用該ADDQUOTES選項卸載數據。

unload ('select id, location from location') to 's3://amzn-s3-demo-bucket/location_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' delimiter ',' addquotes;

卸載的資料檔案看起來像這樣:

1,"Phoenix, AZ" 2,"San Diego, CA" 3,"Chicago, IL" ...

卸載聯結查詢的結果

以下範例會卸載包含視窗函數之聯結查詢的結果。

unload ('select venuecity, venuestate, caldate, pricepaid, sum(pricepaid) over(partition by venuecity, venuestate order by caldate rows between 3 preceding and 3 following) as winsum from sales join date on sales.dateid=date.dateid join event on event.eventid=sales.eventid join venue on event.venueid=venue.venueid order by 1,2') to 's3://amzn-s3-demo-bucket/tickit/winsum' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

輸出檔案看起來像這樣:

Atlanta|GA|2008-01-04|363.00|1362.00 Atlanta|GA|2008-01-05|233.00|2030.00 Atlanta|GA|2008-01-06|310.00|3135.00 Atlanta|GA|2008-01-08|166.00|8338.00 Atlanta|GA|2008-01-11|268.00|7630.00 ...

使用 NULL AS 卸載

UNLOAD默認情況下輸出 null 值作為空字符串。下列範例說明如何使用 NULL AS 來取代空值的文字字串。

對於這些示例,我們將一些 null 值添加到VENUE表中。

update venue set venuestate = NULL where venuecity = 'Cleveland';

從VENUE其中選取VENUESTATE為 null,以驗證資料行是否包含NULL。

select * from venue where venuestate is null; venueid | venuename | venuecity | venuestate | venueseats ---------+--------------------------+-----------+------------+------------ 22 | Quicken Loans Arena | Cleveland | | 0 101 | Progressive Field | Cleveland | | 43345 72 | Cleveland Browns Stadium | Cleveland | | 73200

現UNLOAD在,使VENUE用 NULL AS 選項用字符串 'fred' 替換空值的表。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/nulls/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' null as 'fred';

下列來自卸載檔案的範例顯示,null 值已取代為 fred。事實證明,的某些值也為 null,並被替換為fred。VENUESEATS即使的資料類型VENUESEATS為整數,請將值UNLOAD轉換為卸載檔案中的文字,然後COPY將它們轉換回整數。如果您要卸載到固定寬度的檔案,NULLAS 字串不得大於欄位寬度。

248|Charles Playhouse|Boston|MA|0 251|Paris Hotel|Las Vegas|NV|fred 258|Tropicana Hotel|Las Vegas|NV|fred 300|Kennedy Center Opera House|Washington|DC|0 306|Lyric Opera House|Baltimore|MD|0 308|Metropolitan Opera|New York City|NY|0 5|Gillette Stadium|Foxborough|MA|5 22|Quicken Loans Arena|Cleveland|fred|0 101|Progressive Field|Cleveland|fred|43345 ...

若要從卸載檔案載入表格,請使用具有相同 NULL AS 選項的COPY指令。

注意

如果您嘗試將空值載入定義為的資料行 NOTNULL,則COPY命令會失敗。

create table loadvenuenulls (like venue); copy loadvenuenulls from 's3://amzn-s3-demo-bucket/nulls/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' null as 'fred';

要驗證列包含 null,而不僅僅是空字符串,請從中選擇LOADVENUENULLS並過濾 null。

select * from loadvenuenulls where venuestate is null or venueseats is null; venueid | venuename | venuecity | venuestate | venueseats ---------+--------------------------+-----------+------------+------------ 72 | Cleveland Browns Stadium | Cleveland | | 73200 253 | Mirage Hotel | Las Vegas | NV | 255 | Venetian Hotel | Las Vegas | NV | 22 | Quicken Loans Arena | Cleveland | | 0 101 | Progressive Field | Cleveland | | 43345 251 | Paris Hotel | Las Vegas | NV | ...

您可以UNLOAD使用預設 AS 行為包含空值的資料表,然後COPY使用預設 NULL AS 行為將資料回到資料表中;不過,目標資料表中的任何非數值欄位都包含空字串,而非空值。NULL默認情況下,空值UNLOAD轉換為空字符串(空格或零長度)。COPY將空字串轉換NULL為數值欄,但將空字串插入非數值欄中。下列範例會示範如何執行UNLOAD後續COPY使用預設 NULL AS 行為。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/nulls/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' allowoverwrite; truncate loadvenuenulls; copy loadvenuenulls from 's3://amzn-s3-demo-bucket/nulls/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

在這種情況下,當您篩選空值時,只有VENUESEATS包含空值的列。凡在表中包VENUESTATE含空值 (VENUE),VENUESTATE在目標表 (LOADVENUENULLS) 中包含空字符串。

select * from loadvenuenulls where venuestate is null or venueseats is null; venueid | venuename | venuecity | venuestate | venueseats ---------+--------------------------+-----------+------------+------------ 253 | Mirage Hotel | Las Vegas | NV | 255 | Venetian Hotel | Las Vegas | NV | 251 | Paris Hotel | Las Vegas | NV | ...

若要將空字串載入至非數值欄NULL,請包括EMPTYASNULL或選BLANKSASNULL項。也可以兩者都使用。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/nulls/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' allowoverwrite; truncate loadvenuenulls; copy loadvenuenulls from 's3://amzn-s3-demo-bucket/nulls/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' EMPTYASNULL;

要驗證列是否包含NULL,而不僅僅是空白或空字符串,請從中選擇LOADVENUENULLS並過濾 null。

select * from loadvenuenulls where venuestate is null or venueseats is null; venueid | venuename | venuecity | venuestate | venueseats ---------+--------------------------+-----------+------------+------------ 72 | Cleveland Browns Stadium | Cleveland | | 73200 253 | Mirage Hotel | Las Vegas | NV | 255 | Venetian Hotel | Las Vegas | NV | 22 | Quicken Loans Arena | Cleveland | | 0 101 | Progressive Field | Cleveland | | 43345 251 | Paris Hotel | Las Vegas | NV | ...

使ALLOWOVERWRITE用參數卸載

依預設,UNLOAD不會覆寫目標值區中的現有檔案。例如,如果您執行相同的UNLOAD陳述式兩次,而不修改目的地值區中的檔案,則第二個陳述式UNLOAD會失敗。若要覆寫現有檔案 (包括資訊清單檔案),請指定ALLOWOVERWRITE選項。

unload ('select * from venue') to 's3://amzn-s3-demo-bucket/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest allowoverwrite;

使用PARALLEL和MANIFEST參數卸載EVENT表

您可以 parallel 地UNLOAD建立資料表並產生資訊清單檔案。Amazon S3 資料檔案都是在相同層級建立的,名稱字尾為模式 0000_part_00。清單檔案與資料檔案位於相同的資料夾層級,並加上字尾 manifest。以下內容將SQL卸載EVENT表並創建具有基本名稱的文件 parallel

unload ('select * from mytickit1.event') to 's3://amzn-s3-demo-bucket/parallel' iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole' parallel on manifest;

Amazon S3 檔案清單類似以下內容。

Name Last modified Size parallel0000_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 52.1 KB parallel0001_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 53.4 KB parallel0002_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 52.1 KB parallel0003_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 51.1 KB parallel0004_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 54.6 KB parallel0005_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 53.4 KB parallel0006_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 54.1 KB parallel0007_part_00 - August 2, 2023, 14:54:39 (UTC-07:00) 55.9 KB parallelmanifest - August 2, 2023, 14:54:39 (UTC-07:00) 886.0 B

parallelmanifest 檔案內容類似以下內容。

{ "entries": [ {"url":"s3://amzn-s3-demo-bucket/parallel0000_part_00", "meta": { "content_length": 53316 }}, {"url":"s3://amzn-s3-demo-bucket/parallel0001_part_00", "meta": { "content_length": 54704 }}, {"url":"s3://amzn-s3-demo-bucket/parallel0002_part_00", "meta": { "content_length": 53326 }}, {"url":"s3://amzn-s3-demo-bucket/parallel0003_part_00", "meta": { "content_length": 52356 }}, {"url":"s3://amzn-s3-demo-bucket/parallel0004_part_00", "meta": { "content_length": 55933 }}, {"url":"s3://amzn-s3-demo-bucket/parallel0005_part_00", "meta": { "content_length": 54648 }}, {"url":"s3://amzn-s3-demo-bucket/parallel0006_part_00", "meta": { "content_length": 55436 }}, {"url":"s3://amzn-s3-demo-bucket/parallel0007_part_00", "meta": { "content_length": 57272 }} ] }

使用PARALLELOFF和MANIFEST參數卸載EVENT表

您可以連續UNLOAD一個表(PARALLELOFF)並生成清單文件。Amazon S3 資料檔案都是在相同層級建立的,名稱字尾為模式 0000。清單檔案與資料檔案位於相同的資料夾層級,並加上字尾 manifest

unload ('select * from mytickit1.event') to 's3://amzn-s3-demo-bucket/serial' iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole' parallel off manifest;

Amazon S3 檔案清單類似以下內容。

Name Last modified Size serial0000 - August 2, 2023, 15:54:39 (UTC-07:00) 426.7 KB serialmanifest - August 2, 2023, 15:54:39 (UTC-07:00) 120.0 B

serialmanifest 檔案內容類似以下內容。

{ "entries": [ {"url":"s3://amzn-s3-demo-bucket/serial000", "meta": { "content_length": 436991 }} ] }

使用 PARTITION BY 和MANIFEST參數卸載EVENT表

您可以UNLOAD按分區劃分表並生成清單文件。Amazon S3 中會建立一個新資料夾,其中包含子分割區資料夾,而子資料夾中的資料檔案名稱模式類似於 0000_par_00。清單檔案與名稱為 manifest 的子資料夾位於相同的資料夾層級。

unload ('select * from mytickit1.event') to 's3://amzn-s3-demo-bucket/partition' iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole' partition by (eventname) manifest;

Amazon S3 檔案清單類似以下內容。

Name Type Last modified Size partition Folder

partition 資料夾中的是具有分割區名稱和清單檔案的子文件夾。下面顯示的是資料夾 partition 中資料夾清單的底端,類似以下內容。

Name Type Last modified Size ... eventname=Zucchero/ Folder eventname=Zumanity/ Folder eventname=ZZ Top/ Folder manifest - August 2, 2023, 15:54:39 (UTC-07:00) 467.6 KB

eventname=Zucchero/ 資料夾中的資料檔案類似於以下內容。

Name Last modified Size 0000_part_00 - August 2, 2023, 15:59:19 (UTC-07:00) 70.0 B 0001_part_00 - August 2, 2023, 15:59:16 (UTC-07:00) 106.0 B 0002_part_00 - August 2, 2023, 15:59:15 (UTC-07:00) 70.0 B 0004_part_00 - August 2, 2023, 15:59:17 (UTC-07:00) 141.0 B 0006_part_00 - August 2, 2023, 15:59:16 (UTC-07:00) 35.0 B 0007_part_00 - August 2, 2023, 15:59:19 (UTC-07:00) 108.0 B

manifest 檔案內容底部類似以下內容。

{ "entries": [ ... {"url":"s3://amzn-s3-demo-bucket/partition/eventname=Zucchero/007_part_00", "meta": { "content_length": 108 }}, {"url":"s3://amzn-s3-demo-bucket/partition/eventname=Zumanity/007_part_00", "meta": { "content_length": 72 }} ] }

使用MAXFILESIZE、ROWGROUPSIZE和MANIFEST參數卸載EVENT表格

您可以 parallel 地UNLOAD建立資料表並產生資訊清單檔案。Amazon S3 資料檔案都是在相同層級建立的,名稱字尾為模式 0000_part_00。產生的 Parquet 資料檔案會限制為 256 MB,資料列群組大小為 128 MB。清單檔案與資料檔案位於相同的資料夾層級,並加上字尾 manifest

unload ('select * from mytickit1.event') to 's3://amzn-s3-demo-bucket/eventsize' iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole' maxfilesize 256 MB rowgroupsize 128 MB parallel on parquet manifest;

Amazon S3 檔案清單類似以下內容。

Name Type Last modified Size eventsize0000_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 24.5 KB eventsize0001_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 24.8 KB eventsize0002_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 24.4 KB eventsize0003_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 24.0 KB eventsize0004_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 25.3 KB eventsize0005_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 24.8 KB eventsize0006_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 25.0 KB eventsize0007_part_00.parquet parquet August 2, 2023, 17:35:21 (UTC-07:00) 25.6 KB eventsizemanifest - August 2, 2023, 17:35:21 (UTC-07:00) 958.0 B

eventsizemanifest 檔案內容類似以下內容。

{ "entries": [ {"url":"s3://amzn-s3-demo-bucket/eventsize0000_part_00.parquet", "meta": { "content_length": 25130 }}, {"url":"s3://amzn-s3-demo-bucket/eventsize0001_part_00.parquet", "meta": { "content_length": 25428 }}, {"url":"s3://amzn-s3-demo-bucket/eventsize0002_part_00.parquet", "meta": { "content_length": 25025 }}, {"url":"s3://amzn-s3-demo-bucket/eventsize0003_part_00.parquet", "meta": { "content_length": 24554 }}, {"url":"s3://amzn-s3-demo-bucket/eventsize0004_part_00.parquet", "meta": { "content_length": 25918 }}, {"url":"s3://amzn-s3-demo-bucket/eventsize0005_part_00.parquet", "meta": { "content_length": 25362 }}, {"url":"s3://amzn-s3-demo-bucket/eventsize0006_part_00.parquet", "meta": { "content_length": 25647 }}, {"url":"s3://amzn-s3-demo-bucket/eventsize0007_part_00.parquet", "meta": { "content_length": 26256 }} ] }