前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >Oracle批量灌数后自动收集统计信息(Online Statistics Gathering for Bulk Loads)

Oracle批量灌数后自动收集统计信息(Online Statistics Gathering for Bulk Loads)

作者头像
mingjie
发布2023-10-13 10:27:00
2730
发布2023-10-13 10:27:00
举报
文章被收录于专栏:Postgresql源码分析

Online Statistics Gathering for Bulk Loads

Oracle12c后,在大批量灌数后,提供了自动收集统计信息的方式。

大批量灌数后立即查询是很多数据库的痛点,通常都会因为统计信息不准导致计划出问题,下面总结下Oracle的解决方法:

什么场景会触发自动收集统计信息?(已验证)

  1. create table as。
  2. insert into select必须带APPEND hint,普通的insert into不行。
  3. delete清空表,做bulk insert不会收集统计信息,因为表有segment,必须是空表才行。

分区表:遵循上述规则的前提下

  1. 对父表的insert只会收集父表统计信息。
  2. 对子表的insert只会收集子表统计信息。

收集哪些统计信息?

  1. 行数
  2. 页面数
  3. 平均行长度
代码语言:javascript
复制
SELECT table_name, num_rows, BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS     BLOCKS  AVG_SPACE  CHAIN_CNT AVG_ROW_LEN
-------------------- ---------- ---------- ---------- ---------- -----------
TAB1                       1000          7          0          0          23

什么时候不会触发自动收集统计信息?

  1. 非空segments(delete ×,truncate ok)
  2. 用户自定义的schema。
  3. 嵌套表、索引组织表、external。
  4. 带虚列的表。
  5. 表的统计信息已经锁定。
  6. PUBLISH为false。
  7. multitable INSERT statement,即insert all into a … into b … into c …。

文档:

  • Bulk Loads后在线收集统计信息
    • “Bulk Loads统计信息收集” 具体触发场景
      • INSERT INTO ... SELECT
      • CREATE TABLE AS SELECT
    • “Bulk Loads统计信息收集” 目标:更准确的统计信息,提升性能
      • 分析性业务往往需要灌入大量数据,老版本中是会建议用户灌完数后主动收集统计信息。
      • 但在实践中,由于疏忽或等维护窗口没有收集统计信息,是生成错误计划的主要原因。
    • “Bulk Loads统计信息收集” 分区表
      • 如果插入父表,则收集父表统计信息,不会收集子表。
      • 如果插入子表,则收集子表统计信息,不会手机父表。
    • “Bulk Loads统计信息收集” 后的直方图?
      • 数据库不会自动生成直方图,需要手动执行GATHER_TABLE_STATS。
      • 注意在bulk loads统计信息收集中,Oracle不会收集列统计信息。
    • “Bulk Loads统计信息收集” 限制,下面场景不会自动收集统计信息:
      • 目标对象已经有数据了。
      • oracle自己的schema。
      • 表类型限制:nested table, index-organized table (IOT), external table, or global temporary table。
      • 表的PUBLISH为false。
      • 表的统计信息已经锁定了。
      • multitable INSERT statement,即insert all into a … into b … into c …
    • “Bulk Loads统计信息收集” 的hint开关:
      • CREATE TABLE employees2 AS SELECT /*+NO_GATHER_OPTIMIZER_STATISTICS*/ * FROM employees
      • CREATE TABLE employees2 AS SELECT /*+GATHER_OPTIMIZER_STATISTICS*/ * FROM employees

一些实验和SQL

场景一:create table as

代码语言:javascript
复制
DROP TABLE tab1 PURGE;
CREATE TABLE tab1 AS SELECT level AS id, 'Description of '||level AS description FROM dual CONNECT BY level <= 1000;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                       1000


SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
  NUM_ROWS AVG_ROW_LEN     BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
      1000          23          7            0 25-JUN-23

删除统计信息

代码语言:javascript
复制
SELECT table_name, num_rows, BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS     BLOCKS  AVG_SPACE  CHAIN_CNT AVG_ROW_LEN
-------------------- ---------- ---------- ---------- ---------- -----------
TAB1                       1000          7          0          0          23


EXEC DBMS_STATS.delete_table_stats(USER, 'TAB1');
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'TAB1', options => 'GATHER AUTO');


SELECT table_name, num_rows, BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS     BLOCKS  AVG_SPACE  CHAIN_CNT AVG_ROW_LEN
-------------------- ---------- ---------- ---------- ---------- -----------
TAB1                       1000          7          0          0          23

场景二:insert into select(必须带APPEND才能收集)

带APPEND直插:统计信息会更新

代码语言:javascript
复制
TRUNCATE TABLE tab1;
INSERT /*+ APPEND */ INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 500;
COMMIT;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        500

SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
  NUM_ROWS AVG_ROW_LEN     BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
       500          23          5            0 25-JUN-23

不带APPEND:统计信息未更新

代码语言:javascript
复制
TRUNCATE TABLE tab1;
INSERT INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 800;
COMMIT;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        500

SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
  NUM_ROWS AVG_ROW_LEN     BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
       500          23          5            0 25-JUN-23

select count(*) from TAB1;
  COUNT(*)
----------
       800

场景三:delete清空表不会收集统计信息,表必须是空的(no segments)

使用DELET清空表,表中其实还是有数据的。表示空的(没有段)是触发收集的必要条件。

代码语言:javascript
复制
DELETE FROM tab1;
COMMIT;
INSERT /*+ APPEND */ INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 900;
COMMIT;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        500

SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
  NUM_ROWS AVG_ROW_LEN     BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
       500          23          5            0 25-JUN-23

使用truncate清空,会收集统计信息。

代码语言:javascript
复制
TRUNCATE TABLE tab1;
INSERT /*+ APPEND */ INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 1200;
COMMIT;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                       1200

SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
  NUM_ROWS AVG_ROW_LEN     BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
      1200          23          8            0 25-JUN-23

场景四:使用hint显示关闭该功能

NO_GATHER_OPTIMIZER_STATISTICS可以关闭该功能。

代码语言:javascript
复制
TRUNCATE TABLE tab1;
INSERT /*+ APPEND */ INTO tab1 SELECT /*+ NO_GATHER_OPTIMIZER_STATISTICS */ level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 200;
COMMIT;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                       1200

SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
  NUM_ROWS AVG_ROW_LEN     BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
      1200          23          8            0 25-JUN-23

分区表场景一:父表insert select append只收集父表统计信息

代码语言:javascript
复制
DROP TABLE tab1 PURGE;

CREATE TABLE tab1(id NUMBER, created_date DATE)
PARTITION BY RANGE (created_date)
(PARTITION tab1_2015 VALUES LESS THAN (TO_DATE('01/01/2016', 'DD/MM/YYYY')) TABLESPACE users,
 PARTITION tab1_2016 VALUES LESS THAN (TO_DATE('01/01/2017', 'DD/MM/YYYY')) TABLESPACE users);

INSERT /*+ APPEND */ INTO tab1
SELECT level, TO_DATE('01/01/2016', 'DD/MM/YYYY')
FROM dual CONNECT BY level <= 100;
COMMIT;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        100

COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20
SELECT table_name, partition_name, num_rows FROM user_tab_partitions WHERE table_name = 'TAB1' ORDER BY partition_name;
TABLE_NAME           PARTITION_NAME         NUM_ROWS
-------------------- -------------------- ----------
TAB1                 TAB1_2015
TAB1                 TAB1_2016

删除统计信息

代码语言:javascript
复制
EXEC DBMS_STATS.delete_table_stats(USER, 'TAB1');

SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1

分区表场景二:直接插入子分区表只收集子表统计信息

代码语言:javascript
复制
TRUNCATE TABLE tab1;
INSERT /*+ APPEND */ INTO tab1 PARTITION (TAB1_2016)
SELECT level, TO_DATE('01/01/2016', 'DD/MM/YYYY') FROM dual CONNECT BY level <= 100;
COMMIT;

COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1


COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20
SELECT table_name, partition_name, num_rows FROM user_tab_partitions WHERE table_name = 'TAB1' ORDER BY partition_name;
TABLE_NAME           PARTITION_NAME         NUM_ROWS
-------------------- -------------------- ----------
TAB1                 TAB1_2015
TAB1                 TAB1_2016                   100
本文参与 腾讯云自媒体同步曝光计划,分享自作者个人站点/博客。
原始发表:2023-06-26,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • Online Statistics Gathering for Bulk Loads
  • 一些实验和SQL
    • 场景一:create table as
      • 场景二:insert into select(必须带APPEND才能收集)
        • 场景三:delete清空表不会收集统计信息,表必须是空的(no segments)
          • 场景四:使用hint显示关闭该功能
            • 分区表场景一:父表insert select append只收集父表统计信息
              • 分区表场景二:直接插入子分区表只收集子表统计信息
              领券
              问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档