卡卡米的記憶體: sql 的 where in 查詢效能問題

2017/8/22

sql 的 where in 查詢效能問題

markdown 問題：當我想要where in 一個很大的陣列，或者子查詢，該怎麼寫呢？在[Performance issue in update query](https://stackoverflow.com/questions/24647503/performance-issue-in-update-query)提到：各種方法，從慢到快： * 打Ｎ次query，用 where = 每次只查一個值 * 打1次query，用 where in 或是 where exists * 使用 join values * (當資料量足夠大時) 建立一個暫時表並且加上索引，然後使用join 在[SQL: When it comes to NOT IN and NOT EQUAL TO, which is more efficient and why?](https://stackoverflow.com/questions/17037508/sql-when-it-comes-to-not-in-and-not-equal-to-which-is-more-efficient-and-why/17038097#17038097)提到，各種 query 從慢到快： * where field = v1 and field = v2 and ... and field = vn * where field in (v1,v2...vn) * 使用 CTE 做 values join 或 values subquery 都比前面快100倍以上 * 暫存表 join 跟暫存表 subquery 跟 CTE values join 差不多快 (也許是資料量大小的問題) CTE:Common Table Expression 是一種臨時定義一個 table 的語法，只能在下一個 select 使用，文章中有提到： ``` WITH excluded(item) AS ( VALUES('item1'), ('item2'), ('item3'), ('item4'),('item5') ) SELECT * FROM thetable t WHERE NOT EXISTS(SELECT 1 FROM excluded e WHERE t.item = e.item); ``` 其中的 WITH AS 就是建立 CTE 的語法在[Optimizing a Postgres query with a large IN](https://dba.stackexchange.com/questions/91247/optimizing-a-postgres-query-with-a-large-in)中也提到了： * IN 最慢 * 使用 join values 是很好的 * 資料量太大就建立暫存表，並且加上index，再去join 結論：使用 with values 建立 CTE 再拿去做 join 應該是最簡單又有效的作法。

沒有留言:

張貼留言