作者:何文超,分享 MySQL 和 OceanBase 相关技术博文。 个人博客【CSDN | 雅俗数据库】
爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。
本文约 2000 字,预计阅读需要 6 分钟
在 SQL 优化的过程中,经常会通过 指定驱动表 或 修改表的关联方式 来实现。下面将以案例的形式来介绍他们的不同之处以及使用场景需要满足的条件。
SQL 耗时:11.25s
-- SQL中 IN 的条件很多,内容中已简化
SELECTDISTINCT
STORE_ID,
PRODUCT_ID
FROM (
SELECTDISTINCT
ASP.PRODUCT_ID,
t.STORE_ID
FROM
CT_ACT A
JOIN
CT_ACT_STAGE CAS ON A.ACT_ID = CAS.ACT_ID
JOIN
CT_ACT_STAGE_PRODUCT ASP ON CAS.STAGE_ID = ASP.STAGE_ID
AND ASP.PRODUCT_STATUS = '1'
AND ASP.PRODUCT_ID IN (
'10185219',
'10382854'
)
JOIN
CT_STORE_PRODUCT_REL t ON ASP.PRODUCT_ID = t.PRODUCT_ID
AND t.RELATIONSHIP_STATUS = '01'
AND t.STORE_ID IN (
'299800000149313',
'299800000148811',
'a2f162ae0fbe47c9b7b762ed27deb9b1',
'7787a5cb102744088f46b381ee667fd9'
)
);
===============================================================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-----------------------------------------------------------------------------------------------
|0 |MERGEDISTINCT | |1 |565 |
|1 | SORT | |1 |565 |
|2 | NESTED-LOOPJOIN | |1 |565 |
|3 | NESTED-LOOPJOIN | |1 |553 |
|4 | NESTED-LOOPJOIN| |1 |276 |
|5 | TABLESCAN |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID) |1 |264 |
|6 | TABLEGET |CAS |1 |24 |
|7 | TABLESCAN |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|1 |552 |
|8 | TABLESCAN |A(UK_CT_ACT_ACT_ID) |1 |24 |
===============================================================================================
OutlineData:
BEGIN_OUTLINE_DATA
NO_USE_HASH_AGGREGATION(@"SEL$2")
LEADING(@"SEL$2" ((("nctmbasedb.ASP"@"SEL$2""nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.t"@"SEL$2" )"nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
PQ_DISTRIBUTE(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ) LOCALLOCAL)
NO_USE_NL_MATERIALIZATION(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
PQ_DISTRIBUTE(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ) LOCALLOCAL)
NO_USE_NL_MATERIALIZATION(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
PQ_DISTRIBUTE(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ) LOCALLOCAL)
NO_USE_NL_MATERIALIZATION(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
INDEX(@"SEL$2""nctmbasedb.ASP"@"SEL$2""IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID")
FULL(@"SEL$2""nctmbasedb.CAS"@"SEL$2")
INDEX(@"SEL$2""nctmbasedb.t"@"SEL$2""IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID")
INDEX(@"SEL$2""nctmbasedb.A"@"SEL$2""UK_CT_ACT_ACT_ID")
END_OUTLINE_DATA
Optimization Info:
-------------------------------------
ASP:table_rows:9737755, physical_range_rows:1, logical_range_rows:1, index_back_rows:1, output_rows:0, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[IDX_CT_ACT_STAGE_PRODUCT_STAGE_ID,IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_STATUS,IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID], unstable_index_name[ct_act_stage_product], estimation info[table_id:1101710651081578, (table_type:1, version:0-1729706411005113-1729706411005113, logical_rc:0, physical_rc:0), (table_type:7, version:1729706401874541-1729706401874541-1729706436349595, logical_rc:0, physical_rc:0), (table_type:5, version:1729706401874541-1729706401874541-1729706436349595, logical_rc:0, physical_rc:0), (table_type:0, version:1729706436349595-1729706436349595-9223372036854775807, logical_rc:0, physical_rc:0)]
CAS:table_rows:116467, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:1, est_method:local_storage, optimization_method=rule_based, heuristic_rule=unique_index_without_indexback
t:table_rows:6563720, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:1, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[IDX_CT_STORE_PRODUCT_REL_PRODUCT_ID,IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID], pruned_index_name[IDX_CT_STORE_PRODUCT_REL_AGREEMENT_ID,IDX_CT_STORE_PRODUCT_REL_ADD_UPD_STATUS,IDX_CT_STORE_PRODUCT_REL_RELATIONSHIP_STATUS,IDX_CT_STORE_PRODUCT_REL_STORE_ID,ct_store_product_rel], estimation info[table_id:1101710651081772, (table_type:1, version:0-1729706411005113-1729706411005113, logical_rc:136, physical_rc:136), (table_type:7, version:1729706401450022-1729706401450022-1729706435320181, logical_rc:0, physical_rc:0), (table_type:5, version:1729706401450022-1729706401450022-1729706435320181, logical_rc:0, physical_rc:0), (table_type:0, version:1729706435320181-1729706435320181-9223372036854775807, logical_rc:0, physical_rc:0)]
A:table_rows:9912, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:1, est_method:local_storage, optimization_method=rule_based, heuristic_rule=unique_index_without_indexback
表名 | 别名 | 数据量 |
---|---|---|
CT_ACT | A | 9912 |
CT_ACT_STAGE | CAS | 116467 |
CT_ACT_STAGE_PRODUCT | ASP | 9737755 |
CT_STORE_PRODUCT_REL | t | 6563720 |
CAS
表走的主键索引,故先不检查关联字段情况ASP
表是驱动表,大表作为驱动表,非最优ASP
、t
表,where
条件过滤性都较高查看 Outline Data:
注意:ASP
表是驱动表,所以不显示关联。
LEADING(@"SEL$2" ((("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.t"@"SEL$2" )"nctmbasedb.A"@"SEL$2" ))
小结
A
,t
,CAS
三张表走的LNESTED-LOOP JOIN
关联,其中A
表,CAS
表数据量不大,NLJ
关联 符合预期;t
表是大表,且where
过滤条件中,t.STORE_ID
是有效的过滤条件,故考虑让 t
表走hash
关联;ASP
表是驱动表,大表作为驱动表,非最优。/*+leading(A) use_nl(A,CAS,ASP,t) */
SQL 执行时间超过 30s,人为中断。
执行计划等价于:
LEADING(@"SEL$2" ((("nctmbasedb.A"@"SEL$2" "nctmbasedb.ASP"@"SEL$2" )"nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" ))
t
是大表,走了 USE_NL
关联,故 SQL 执行超时;A
虽然是小表,但是无直接 WHERE 过滤条件,故不能通过索引快速匹配,不适合作为驱动表;===========================================================================================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
-----------------------------------------------------------------------------------------------------------
|0 |MERGEDISTINCT | |1 |127753|
|1 | SORT | |1 |127753|
|2 | NESTED-LOOPJOIN | |1 |127753|
|3 | NESTED-LOOPJOIN | |1 |127474|
|4 | NESTED-LOOPJOIN CARTESIAN| |4956 |5911 |
|5 | TABLESCAN |A(UK_CT_ACT_ACT_ID) |9912 |3835 |
|6 | MATERIAL | |1 |264 |
|7 | TABLESCAN |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID) |1 |264 |
|8 | TABLEGET |CAS |1 |24 |
|9 | TABLESCAN |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|1 |552 |
===========================================================================================================
Used Hint:
-------------------------------------
LEADING(@"SEL$2" ((("nctmbasedb.A"@"SEL$2" ))))
USE_NL(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" ))
执行耗时:0.19s
/*+leading(t (ASP,CAS)A) use_nl(t,A,CAS,ASP) */
如上等价于如下:
LEADING(@"SEL$2" (("nctmbasedb.t"@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" ))"nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
表 t
虽然是大表,但存在有效的过滤条件。
===============================================================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-----------------------------------------------------------------------------------------------
|0 |MERGEDISTINCT | |1 |4624|
|1 | SORT | |1 |4624|
|2 | NESTED-LOOPJOIN | |1 |4624|
|3 | HASHJOIN | |1 |4612|
|4 | TABLESCAN |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|3414 |1848|
|5 | NESTED-LOOPJOIN| |1 |276 |
|6 | TABLESCAN |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID) |1 |264 |
|7 | TABLEGET |CAS |1 |24 |
|8 | TABLESCAN |A(UK_CT_ACT_ACT_ID) |1 |24 |
===============================================================================================
Used Hint:
-------------------------------------
LEADING(@"SEL$2" (("nctmbasedb.t"@"SEL$2" ("nctmbasedb.ASP"@"SEL$2""nctmbasedb.CAS"@"SEL$2" ))"nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
添加 /*+use_hash(t,ASP) */
,耗时 0.06S。
查看 Used Hint 发现:
/*+use_hash(t,ASP) */ 等价于 USE_HASH(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
可考虑加 /*+ USE_HASH(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))*/ t 表大表
可考虑加 /*+ USE_HASH(@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" ))*/ ASP 表大表,但是驱动表,且执行计划未显示 NLJ
可考虑加 /*+ USE_HASH(@"SEL$2" ("nctmbasedb.t"@"SEL$2" )) USE_HASH(@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" ))*/ 两张大表走 hash join
===============================================================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-----------------------------------------------------------------------------------------------
|0 |MERGEDISTINCT | |1 |4351|
|1 | SORT | |1 |4351|
|2 | HASHJOIN | |1 |4351|
|3 | NESTED-LOOPJOIN | |1 |289 |
|4 | NESTED-LOOPJOIN| |1 |276 |
|5 | TABLESCAN |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID) |1 |264 |
|6 | TABLEGET |CAS |1 |24 |
|7 | TABLESCAN |A(UK_CT_ACT_ACT_ID) |1 |24 |
|8 | TABLESCAN |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|3414 |1848|
===============================================================================================
Used Hint:
-------------------------------------
USE_HASH(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
OutlineData:
-------------------------------------
BEGIN_OUTLINE_DATA
NO_USE_HASH_AGGREGATION(@"SEL$2")
LEADING(@"SEL$2" ((("nctmbasedb.ASP"@"SEL$2""nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.A"@"SEL$2" )"nctmbasedb.t"@"SEL$2" ))
USE_HASH(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
PQ_DISTRIBUTE(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ) LOCALLOCAL)
USE_NL(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
PQ_DISTRIBUTE(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ) LOCALLOCAL)
NO_USE_NL_MATERIALIZATION(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
PQ_DISTRIBUTE(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ) LOCALLOCAL)
NO_USE_NL_MATERIALIZATION(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
INDEX(@"SEL$2""nctmbasedb.ASP"@"SEL$2""IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID")
FULL(@"SEL$2""nctmbasedb.CAS"@"SEL$2")
INDEX(@"SEL$2""nctmbasedb.A"@"SEL$2""UK_CT_ACT_ACT_ID")
INDEX(@"SEL$2""nctmbasedb.t"@"SEL$2""IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID")
END_OUTLINE_DATA
优化方式 | hint | 效果 |
---|---|---|
指定小表(A表)为驱动表 | /*+leading(A) use_nl(A,CAS,ASP,t) */ | SQL执行超过30s,人为中断 |
指定大表(t表)为驱动表 | /*+leading(t (ASP,CAS)A) use_nl(t,A,CAS,ASP) */ | 执行耗时 0.19s |
改为hash关联方式 | /*+use_hash(t,ASP) */ | 耗时 0.06S |
以上两个条件需要都满足。
LEADING(@"SEL$2" ((("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.t"@"SEL$2" )"nctmbasedb.A"@"SEL$2" ))
在 LEADING
提示所定义的连接顺序里,最外层括号中处于最左边的表就是驱动表。
回答:小表通常适合作为驱动表,但并非在所有情况下都绝对适合。
以下是一些需要考虑的因素:
回答:大表不一定不适合作为驱动表。
在一些特定情况下,大表作为驱动表也能获得较好的查询性能
本文关键字:#OceanBase #SQL优化