診斷查詢以進行查詢調校 - Amazon Redshift

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

診斷查詢以進行查詢調校

使用下列查詢來識別查詢的問題或可能影響查詢效能的基礎資料表。建議您使用這些查詢中討論的查詢調校程序分析和改善查詢

識別用於調校的最高候選項目查詢

下列查詢會識別過去 7 天中已執行的前 50 個最耗時的陳述式。您可以使用結果來識別需要非常長的查詢。您可以也識別經常執行的查詢 (在結果集中出現超過一次的那些)。這些查詢往往是進行調校以改善系統效能的良好候選項目。

此查詢也提供與所識別的每個查詢關聯的提醒事件計數。這些提醒提供的詳細資訊可供您用來改善查詢的效能。如需詳細資訊,請參閱 檢閱查詢提醒

select trim(database) as db, count(query) as n_qry, max(substring (qrytext,1,80)) as qrytext, min(run_minutes) as "min" , max(run_minutes) as "max", avg(run_minutes) as "avg", sum(run_minutes) as total, max(query) as max_query_id, max(starttime)::date as last_run, sum(alerts) as alerts, aborted from (select userid, label, stl_query.query, trim(database) as database, trim(querytxt) as qrytext, md5(trim(querytxt)) as qry_md5, starttime, endtime, (datediff(seconds, starttime,endtime)::numeric(12,2))/60 as run_minutes, alrt.num_events as alerts, aborted from stl_query left outer join (select query, 1 as num_events from stl_alert_event_log group by query ) as alrt on alrt.query = stl_query.query where userid <> 1 and starttime >= dateadd(day, -7, current_date)) group by database, label, qry_md5, aborted order by total desc limit 50;

識別具有資料扭曲或未排序資料列的資料表

下列查詢會識別具有不均資料配送 (資料偏度) 或高比例未排序資料列的資料表。

低的 skew 值指出資料表資料已正確配送。如果資料表有 4.00 或更高的 skew 值,請考慮修改其資料配送樣式。如需詳細資訊,請參閱 次佳資料分佈

如果資料表的 pct_unsorted 值大於 20%,請考慮執行 VACUUM 命令。如需詳細資訊,請參閱 未排序或排序錯誤的資料列

同時另外檢閱每個資料表的 mbytespct_of_total 值。這些資料欄會識別資料表的大小,以及資料表耗用的原始磁碟空間百分比。原始磁碟空間包括 Amazon Redshift 保留供內部使用的空間,因此會大於名目磁碟容量,它是可供使用者使用的磁碟空間容量。使用此資訊驗證您的可用磁碟空間至少是您最大資料表的 2.5 倍。有此可用空間可讓系統在處理複雜查詢時將中繼結果寫入至磁碟。

select trim(pgn.nspname) as schema, trim(a.name) as table, id as tableid, decode(pgc.reldiststyle,0, 'even',1,det.distkey ,8,'all') as distkey, dist_ratio.ratio::decimal(10,4) as skew, det.head_sort as "sortkey", det.n_sortkeys as "#sks", b.mbytes, decode(b.mbytes,0,0,((b.mbytes/part.total::decimal)*100)::decimal(5,2)) as pct_of_total, decode(det.max_enc,0,'n','y') as enc, a.rows, decode( det.n_sortkeys, 0, null, a.unsorted_rows ) as unsorted_rows , decode( det.n_sortkeys, 0, null, decode( a.rows,0,0, (a.unsorted_rows::decimal(32)/a.rows)*100) )::decimal(5,2) as pct_unsorted from (select db_id, id, name, sum(rows) as rows, sum(rows)-sum(sorted_rows) as unsorted_rows from stv_tbl_perm a group by db_id, id, name) as a join pg_class as pgc on pgc.oid = a.id join pg_namespace as pgn on pgn.oid = pgc.relnamespace left outer join (select tbl, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl inner join (select attrelid, min(case attisdistkey when 't' then attname else null end) as "distkey", min(case attsortkeyord when 1 then attname else null end ) as head_sort , max(attsortkeyord) as n_sortkeys, max(attencodingtype) as max_enc from pg_attribute group by 1) as det on det.attrelid = a.id inner join ( select tbl, max(mbytes)::decimal(32)/min(mbytes) as ratio from (select tbl, trim(name) as name, slice, count(*) as mbytes from svv_diskusage group by tbl, name, slice ) group by tbl, name ) as dist_ratio on a.id = dist_ratio.tbl join ( select sum(capacity) as total from stv_partitions where part_begin=0 ) as part on 1=1 where mbytes is not null order by mbytes desc;

識別具有巢狀迴圈的查詢

下列查詢可識別已針對巢狀迴路記錄提醒事件的查詢。如需如何修正巢狀迴路條件的資訊,請參閱巢狀迴圈

select query, trim(querytxt) as SQL, starttime from stl_query where query in ( select distinct query from stl_alert_event_log where event like 'Nested Loop Join in the query plan%') order by starttime desc;

檢閱查詢的佇列等候時間

下列查詢顯示最近的查詢在運行之前等候查詢隊列中開放位置的時間。如果您看見較高的等候時間趨勢,您可能想要修改您的查詢佇列組態以獲得更好的傳輸量。如需詳細資訊,請參閱 實作手動 WLM

select trim(database) as DB , w.query, substring(q.querytxt, 1, 100) as querytxt, w.queue_start_time, w.service_class as class, w.slot_count as slots, w.total_queue_time/1000000 as queue_seconds, w.total_exec_time/1000000 exec_seconds, (w.total_queue_time+w.total_Exec_time)/1000000 as total_seconds from stl_wlm_query w left join stl_query q on q.query = w.query and q.userid = w.userid where w.queue_start_Time >= dateadd(day, -7, current_Date) and w.total_queue_Time > 0 and w.userid >1 and q.starttime >= dateadd(day, -7, current_Date) order by w.total_queue_time desc, w.queue_start_time desc limit 35;

依資料表檢閱查詢提醒

下列查詢可識別已為其記錄提醒事件的資料表,也可識別最常引發的提醒類型。

如果minutes已識別資料表的資料列的值較高,請檢查資料表,以查看它是否需要例行維護、例如ANALYZE或者VACUUM運行反對它。

如果資料列的 count 值較高但 table 值為 null,請對 STL_ALERT_EVENT_LOG 執行查詢,以取得關聯的 event 值,調查為何這麼常引發該提醒。

select trim(s.perm_table_name) as table, (sum(abs(datediff(seconds, s.starttime, s.endtime)))/60)::numeric(24,0) as minutes, trim(split_part(l.event,':',1)) as event, trim(l.solution) as solution, max(l.query) as sample_query, count(*) from stl_alert_event_log as l left join stl_scan as s on s.query = l.query and s.slice = l.slice and s.segment = l.segment and s.step = l.step where l.event_time >= dateadd(day, -7, current_Date) group by 1,3,4 order by 2 desc,6 desc;

識別具有遺漏統計資訊的資料表

下列查詢提供您要對遺漏統計資料的資料表執行的查詢計數。如果此查詢傳回任何資料列,請查看 plannode 值來判斷受影響的資料表,然後對其執行ANALYZE

select substring(trim(plannode),1,100) as plannode, count(*) from stl_explain where plannode like '%missing statistics%' group by plannode order by 2 desc;