前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >OB 运维 | 存储过程慢?可能是 plan cache 失效了

OB 运维 | 存储过程慢?可能是 plan cache 失效了

作者头像
爱可生开源社区
发布2024-09-19 19:17:01
560
发布2024-09-19 19:17:01
举报
文章被收录于专栏:爱可生开源社区

作者:程柳润,爱可生 DBA 团队成员,OceanBase 数据库技术爱好者。

爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。

本文约 1000 字,预计阅读需要 3 分钟。


1问题描述

业务反馈一个存储过程执行慢,需要排查原因。由于存储过程过长且涉及多张表,所以主要分享排查思路。

2分析过程

1. 查找该存储过程的 trace_id

每次执行存储过程后会有两条记录,两条记录 SQL 的请求类型不同,先产生的记录是预编译的记录(执行耗时很小),另一个是实际执行的记录。

代码语言:javascript
复制
obclient [oceanbase]> select usec_to_time(request_time),svr_ip,trace_id,query_sql,elapsed_time from gv$sql_audit where query_sql like 'call TEST_CHENG%' order by request_time;
+----------------------------+----------------+-----------------------------------+-------------------+--------------+--------------+
| usec_to_time(request_time) | svr_ip         | trace_id                          | query_sql         | elapsed_time | request_type |
+----------------------------+----------------+-----------------------------------+-------------------+--------------+--------------+
| 2023-12-16 18:20:31.741933 | xx.xxx.xxx.xxx | YB420ABA3D1F-00060A149FDC35FD-0-0 | call TEST_CHENG() |           73 |             5|
| 2023-12-16 18:20:31.742838 | xx.xxx.xxx.xxx | YB420ABA3D1F-00060A149CFDF57C-0-0 | call TEST_CHENG() |      2362075 |             6|
+----------------------------+----------------+-----------------------------------+-------------------+--------------+--------------+
2 rows in set (0.614 sec)

2. 分析存储中较慢的 SQL

根据上一步中执行较慢的 trace_id 查看该存储过程中执行较慢的 SQL。发现存储过程中的几个 UPDATE 耗时较长。另外,IS_HIT_PLAN 都为 0,表示没有命中 plan cache。然而其 GET_PLAN_TIME(获取执行计划时间)也为 0,是不正常的,这一点会在后面解释。

代码语言:javascript
复制
obclient [oceanbase]> select usec_to_time(request_time),query_sql,elapsed_time,execute_time,AFFECTED_ROWS,EVENT,TOTAL_WAIT_TIME_MICRO,WAIT_TIME_MICRO,TOTAL_WAITS,GET_PLAN_TIME,QUEUE_TIME,IS_HIT_PLAN,plan_id,svr_ip from gv$sql_audit where trace_id='YB420ABA3D1F-00060A149CFDF57C-0-0' order by elapsed_time desc limit 6\G
*************************** 1. row ***************************
usec_to_time(request_time): 2023-12-16 18:20:31.742838
                 query_sql: call TEST_CHENG()
              elapsed_time: 2362075
              execute_time: 2362015
......
             GET_PLAN_TIME: 21
                QUEUE_TIME: 15
               IS_HIT_PLAN: 0
......
*************************** 2. row ***************************
usec_to_time(request_time): 2023-12-16 18:20:32.359304
                 query_sql: update "CC"."TEST1" "T" set "T"."CITY" = '1',"T"."COUNTY" = (select 
              elapsed_time: 630735
              execute_time: 630735
......
             GET_PLAN_TIME: 0
                QUEUE_TIME: 0
               IS_HIT_PLAN: 0
......
*************************** 3. row ***************************
usec_to_time(request_time): 2023-12-16 18:20:32.999169
                 query_sql: update "CC"."TEST1" "T" set "T"."CITY" = '0',"T"."COUNTY" = (select 
              elapsed_time: 400111
              execute_time: 400111
......
             GET_PLAN_TIME: 0
                QUEUE_TIME: 0
               IS_HIT_PLAN: 0
......
*************************** 4. row ***************************
usec_to_time(request_time): 2023-12-16 18:20:33.731726
                 query_sql: update "CC"."TEST1" "T" set "T"."CITY" = '0',"T"."ROAD" = (select 
              elapsed_time: 335918
              execute_time: 335918
......
             GET_PLAN_TIME: 0
                QUEUE_TIME: 0
               IS_HIT_PLAN: 0
......
*************************** 5. row ***************************
usec_to_time(request_time): 2023-12-16 18:20:33.402608
                 query_sql: update "CC"."TEST1" "T" set "T"."CITY" = '0',"T"."ROAD" = (select 
              elapsed_time: 319150
              execute_time: 319150
......
             GET_PLAN_TIME: 0
                QUEUE_TIME: 0
               IS_HIT_PLAN: 0
......
*************************** 6. row ***************************
usec_to_time(request_time): 2023-12-16 18:20:32.086302
                 query_sql: truncate Table  TEST1
              elapsed_time: 142128
              execute_time: 142128
......
             GET_PLAN_TIME: 0
                QUEUE_TIME: 0
               IS_HIT_PLAN: 0
......
6 rows in set (5.342 sec)

3. 查看存储过程执行情况

发现调用该存储过程之后,对 __all_tenant_dependency 等系统视图有增删改查的操作(篇幅有限,这里只截取了一部分)。这是由于编译 PL 对象会对系统视图做一些操作,gv$sql_audit 会记录下来,可以用 tenant_id=业务租户 ID 来过滤掉这些记录。

另外,在 UPDATE 前有 TRUNCATE TEST1 的操作。该操作会导致两个问题:

  • 整个存储过程的 plan cache 无法命中,每次都要重新编译。
  • 存储过程中这张表相关的 SQL 的 plan cache 也会失效,每次执行也需要重新获取执行计划。
代码语言:javascript
复制
obclient [oceanbase]> select substr(query_sql,1,60),usec_to_time(request_time),elapsed_time,svr_ip from gv$sql_audit 
where trace_id='YB420ABA3D1F-00060A149CFDF57C-0-0' order by request_time;
+---------------------------------------------------+----------------------------+--------------+----------------+
| substr(query_sql,1,60)                            | usec_to_time(request_time) | elapsed_time | svr_ip         |
+---------------------------------------------------+----------------------------+--------------+----------------+
| call TEST_CHENG();                                | 2023-12-16 18:50:30.086159 |      2514812 | xx.xxx.xxx.xxx |
| START TRANSACTION                                 | 2023-10-07 18:50:30.472741 |          194 | xx.xxx.xxx.xxx |
| delete FROM __all_tenant_dependency WHERE dep_obj_| 2023-10-07 18:50:30.473589 |         1570 | xx.xxx.xxx.xxx |
| INSERT INTO __all_tenant_dependency (tenant_id, de| 2023-10-07 18:50:30.476450 |          243 | xx.xxx.xxx.xxx |
|......                                                                                                                                                                                                                         |
|......                                                                                                                                                                                                                         |
| insert into "CC"."TEST5"( "LOG_DATE", "LOG_LEVEL",| 2023-12-16 18:50:30.504996 |          393 | xx.xxx.xxx.xxx |
| truncate Table  TEST1                             | 2023-12-16 18:50:30.505998 |       569828 | xx.xxx.xxx.xxx |
| select max("T"."CREATE_TIME") from "CC"."TEST6    | 2023-12-16 18:50:31.176795 |          295 | xx.xxx.xxx.xxx |
| select max("T"."CREATE_TIME") from "CC"."TEST6    | 2023-12-16 18:50:31.177108 |          180 | xx.xxx.xxx.xxx |
| insert into "CC"."TEST1"( "ID", "NUMBER           | 2023-12-16 18:50:31.178344 |         8039 | xx.xxx.xxx.xxx |
| SELECT * FROM __all_tenant_meta_table WHERE tenant| 2023-12-16 18:50:31.184392 |          328 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."ID" =            | 2023-12-16 18:50:31.187246 |         4696 | xx.xxx.xxx.xxx |
| SELECT partition_id, object_type, row_cnt as row_c| 2023-12-16 18:50:31.189366 |          181 | xx.xxx.xxx.xxx |
| SELECT tenant_id, table_id, partition_id, column_i| 2023-12-16 18:50:31.190145 |          224 | xx.xxx.xxx.xxx |
| SELECT tenant_id, table_id, partition_id, column_i| 2023-12-16 18:50:31.190774 |          131 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."CITY" =          | 2023-12-16 18:50:31.192182 |         4577 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."CITY" =          | 2023-12-16 18:50:31.197606 |         3896 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."CITY" =          | 2023-12-16 18:50:31.202269 |       347793 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."CITY" =          | 2023-12-16 18:50:31.559553 |       338883 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."CITY" =          | 2023-12-16 18:50:31.901602 |       311209 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."CITY" =          | 2023-12-16 18:50:32.218374 |         6085 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."CITY" =          | 2023-12-16 18:50:32.225213 |       338044 | xx.xxx.xxx.xxx |
| update "CC"."TEST1" "T" set "T"."ORGAN_CODE" =    | 2023-12-16 18:50:32.566794 |        15760 | xx.xxx.xxx.xxx |
| insert into "CC"."TEST8"( "MANAGER_ID", "APART_ID | 2023-12-16 18:50:32.593934 |         5915 | xx.xxx.xxx.xxx |
+---------------------------------------------------+----------------------------+--------------+----------------+
154 rows in set (4.125 sec)

4. 手工执行慢的几个 Update

手工执行慢的几个 update,发现命中 plan cache 时只需要几毫秒,而没有命中 plan cache 时需要 300 多毫秒。

手工执行时会命中执行计划
代码语言:javascript
复制
obclient [oceanbase]> select usec_to_time(request_time),query_sql,elapsed_time,execute_time,AFFECTED_ROWS,EVENT,TOTAL_WAIT_TIME_MICRO,WAIT_TIME_MICRO,TOTAL_WAITS,GET_PLAN_TIME,QUEUE_TIME,IS_HIT_PLAN,plan_id,svr_ip from gv$sql_audit where query_sql like 'update "CC"."TEST1" "T" set "T"."CITY"%' and GET_PLAN_TIME>0 order by request_time\G
usec_to_time(request_time): 2023-12-16 19:12:30.063271
                 query_sql: update "CC"."TEST1" "T" set "T"."CITY" = '1',"T"."COUNTY" = (select 
              elapsed_time: 8810
              execute_time: 8810
......
             GET_PLAN_TIME: 0
                QUEUE_TIME: 0
               IS_HIT_PLAN: 1
......
手工执行 update 前先做 truncate 操作

不会命中执行计划,此时 GET_PLAN_TIME 为 372ms。

代码语言:javascript
复制
obclient [oceanbase]> select usec_to_time(request_time),query_sql,elapsed_time,execute_time,AFFECTED_ROWS,EVENT,TOTAL_WAIT_TIME_MICRO,WAIT_TIME_MICRO,TOTAL_WAITS,GET_PLAN_TIME,QUEUE_TIME,IS_HIT_PLAN,plan_id,svr_ip from gv$sql_audit where query_sql like 'update "CC"."TEST1" "T" set "T"."CITY"%' and GET_PLAN_TIME>0 order by request_time\G
*************************** 1. row ***************************
usec_to_time(request_time): 2023-12-16 19:21:53.069019
                 query_sql: update "CC"."TEST1" "T" set "T"."CITY" = '1',"T"."COUNTY" = (select 
              elapsed_time: 378148
              execute_time: 6036
......
             GET_PLAN_TIME: 372055
                QUEUE_TIME: 28
               IS_HIT_PLAN: 0
......

注意:存储过程里的 SQL,即使没有命中 plan cache,在 gv$sql_audit 中还是会显示 GET_PLAN_TIME: 0,都体现在 execute_time。正常来说(比如手工执行的 SQL),如果没有命中 plan cacheGET_PLAN_TIME 值会比较大,并且不会计算到 execute_time 里。

3结论

TRUNCATE 导致 UPDATE 无法命中 plan cache,所以存储过程慢。

4解决方案

将存储过程里的TRUNCATE 改成了 DELETE

本文关键字:#OceanBase# #存储过程# #SQL优化# #plan cache#

本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2024-09-18,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 爱可生开源社区 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 1问题描述
  • 2分析过程
    • 1. 查找该存储过程的 trace_id
      • 2. 分析存储中较慢的 SQL
        • 3. 查看存储过程执行情况
          • 4. 手工执行慢的几个 Update
            • 手工执行时会命中执行计划
            • 手工执行 update 前先做 truncate 操作
        • 3结论
        • 4解决方案
        领券
        问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档