作者 | 高鹏(网名八怪)
出品 | 《深入理解MySQL主从原理32讲》
本文并不准备说明如何开启记录慢查询,只是将一些重要的部分进行解析。如何记录慢查询可以自行参考官方文档:
本文使用了Percona 版本开启来了参数log_slow_verbosity,得到了更详细的慢查询信息。通常情况下信息没有这么多,但是一定是包含关系,本文也会使用Percona的参数解释说明一下这个参数的含义。
实际上慢查询中的时间就是时钟时间,是通过操作系统的命令获得的时间,如下是Linux中获取时间的方式
while (gettimeofday(&t, NULL) != 0)
{}
newtime= (ulonglong)t.tv_sec * 1000000 + t.tv_usec;
return newtime;
实际上就是通过OS的API gettimeofday函数获得的时间。
本文主要讨论long_query_time参数的含义。
如果我们将语句的执行时间定义为如下:
实际消耗时间 = 实际执行时间+锁等待消耗时间
那么long_query_time实际上界定的是实际执行时间,所以有些情况下虽然语句实际消耗的时间很长但是是因为锁等待时间较长而引起的,那么实际上这种语句也不会记录到慢查询。
我们看一下log_slow_applicable函数的代码片段:
res= cur_utime - thd->utime_after_lock;
if (res > thd->variables.long_query_time)
thd->server_status|= SERVER_QUERY_WAS_SLOW;
else
thd->server_status&= ~SERVER_QUERY_WAS_SLOW;
这里实际上清楚的说明了上面的观点,是不是慢查询就是通过这个函数进行的判断的,非常重要。我可以清晰的看到如下公式:
实际上在慢查询中记录的正是
但是是否是慢查询其评判标准却是实际执行时间及Query_time - Lock_time
其中锁等待消耗时间( Lock_time)我现在已经知道的包括:
我们可以看到在公式中utime_after_lock( 锁等待消耗时间Lock_time)的记录也就成了整个公式的关键,那么我们试着进行debug。
不管是 MDL LOCK等待消耗的时间还是 MyISAM表锁消耗的时间都是在MySQL层记录的,实际上它只是记录在函数mysql_lock_tables的末尾会调用的THD::set_time_after_lock进行的记录时间而已如下:
void set_time_after_lock()
{
utime_after_lock= my_micro_time();
MYSQL_SET_STATEMENT_LOCK_TIME(m_statement_psi, (utime_after_lock - start_utime));
}
那么这里可以解析为代码运行到mysql_lock_tables函数的末尾之前的所有的时间都记录到utime_after_lock时间中,实际上并不精确。但是实际上MDL LOCK的获取和MyISAM表锁的获取都包含在里面。所以即便是select语句也会看到Lock_time并不为0。下面是栈帧:
#0 THD::set_time_after_lock (this=0x7fff28012820) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_class.h:3414
#1 0x0000000001760d6d in mysql_lock_tables (thd=0x7fff28012820, tables=0x7fff28c16b58, count=1, flags=0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/lock.cc:366
#2 0x000000000151dc1a in lock_tables (thd=0x7fff28012820, tables=0x7fff28c165b0, count=1, flags=0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_base.cc:6700
#3 0x00000000017c4234 in Sql_cmd_delete::mysql_delete (this=0x7fff28c16b50, thd=0x7fff28012820, limit=18446744073709551615)
at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_delete.cc:136
#4 0x00000000017c84ba in Sql_cmd_delete::execute (this=0x7fff28c16b50, thd=0x7fff28012820) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_delete.cc:1389
#5 0x00000000015a7814 in mysql_execute_command (thd=0x7fff28012820, first_level=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:3729
#6 0x00000000015adcd6 in mysql_parse (thd=0x7fff28012820, parser_state=0x7ffff035b600) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5836
#7 0x00000000015a1b95 in dispatch_command (thd=0x7fff28012820, com_data=0x7ffff035bd70, command=COM_QUERY)
at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1447
#8 0x00000000015a09c6 in do_command (thd=0x7fff28012820) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1010
InnoDB引擎层调用通过thd_set_lock_wait_time调用thd_storage_lock_wait函数完成的栈帧如下:
#0 thd_storage_lock_wait (thd=0x7fff2c000bc0, value=9503561) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_class.cc:798
#1 0x00000000019a4b2a in thd_set_lock_wait_time (thd=0x7fff2c000bc0, value=9503561)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:1784
#2 0x0000000001a4b50f in lock_wait_suspend_thread (thr=0x7fff2c088200) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/lock/lock0wait.cc:363
#3 0x0000000001b0ec9b in row_mysql_handle_errors (new_err=0x7ffff0317d54, trx=0x7ffff2f2e5d0, thr=0x7fff2c088200, savept=0x0)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/row/row0mysql.cc:772
#4 0x0000000001b4fe61 in row_search_mvcc (buf=0x7fff2c087640 "\377", mode=PAGE_CUR_G, prebuilt=0x7fff2c087ac0, match_mode=0, direction=0)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/row/row0sel.cc:5940
#5 0x00000000019b3051 in ha_innobase::index_read (this=0x7fff2c087100, buf=0x7fff2c087640 "\377", key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9104
#6 0x00000000019b4374 in ha_innobase::index_first (this=0x7fff2c087100, buf=0x7fff2c087640 "\377")
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9551
#7 0x00000000019b462c in ha_innobase::rnd_next (this=0x7fff2c087100, buf=0x7fff2c087640 "\377")
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9656
#8 0x0000000000f66f1b in handler::ha_rnd_next (this=0x7fff2c087100, buf=0x7fff2c087640 "\377") at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:3099
#9 0x00000000014c61b6 in rr_sequential (info=0x7ffff03189e0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/records.cc:520
#10 0x00000000017c56c3 in Sql_cmd_delete::mysql_delete (this=0x7fff2c006ae8, thd=0x7fff2c000bc0, limit=1)
at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_delete.cc:454
#11 0x00000000017c84ba in Sql_cmd_delete::execute (this=0x7fff2c006ae8, thd=0x7fff2c000bc0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_delete.cc:1389
函数本身还是很简单自己看看就知道了就是相加而已如下:
void thd_storage_lock_wait(THD *thd, long long value)
{
thd->utime_after_lock+= value;
}
这是Percona的解释:
Specifies how much information to include in your slow log. The value is a comma-delimited string, and can contain any combination of the following values:
总之在Percona中可以修改这个参数获得更加详细的信息大概的格式如下:
# Time: 2018-05-30T09:30:12.039775Z
# User@Host: root[root] @ localhost [] Id: 10
# Schema: test Last_errno: 1317 Killed: 0
# Query_time: 19.254508 Lock_time: 0.001043 Rows_sent: 0 Rows_examined: 0 Rows_affected: 0
# Bytes_sent: 44 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 0
SET timestamp=1527672612;
select count(*) from z1 limit 1;
本节将会进行详细的解释,全部的慢查询的输出都来自于函数File_query_log::write_slow ,有兴趣的同学可以自己看看,我这里也会给出输出的位置和含义,其中含义部分可能给出的是源码中的注释。
# Time: 2018-05-30T09:30:12.039775Z
对应的代码:
my_snprintf(buff, sizeof buff,"# Time: %s\n", my_timestamp);
其中my_timestamp取值来自于
thd->current_utime();
实际上就是:
while (gettimeofday(&t, NULL) != 0)
{}
newtime= (ulonglong)t.tv_sec * 1000000 + t.tv_usec;
return newtime;
可以看到实际就是调用gettimeofday系统调用得到的系统当前时间。
注意:对于5.6来讲还有一句判断
if (current_time != last_time)
如果两次打印的时间秒钟一致则不会输出时间,只有通过后面介绍的
SET timestamp=1527753496;
来判断时间,5.7.14没有看到这样的代码。
# User@Host: root[root] @ localhost [] Id: 10
对应的代码:
buff_len= my_snprintf(buff, 32, "%5u", thd->thread_id());
if (my_b_printf(&log_file, "# User@Host: %s Id: %s\n", user_host, buff)
== (uint) -1)
goto err;
}
user_host是一串字符串,参考代码:
size_t user_host_len= (strxnmov(user_host_buff, MAX_USER_HOST_SIZE,
sctx->priv_user().str
? sctx->priv_user().str : "",
"[", sctx_user.length ? sctx_user.str :
(thd->slave_thread ? "SQL_SLAVE" : ""),
"] @ ",
sctx_host.length ? sctx_host.str : "", " [",
sctx_ip.length ? sctx_ip.str : "", "]",
NullS) - user_host_buff);
解释如下:
# Schema: test Last_errno: 1317 Killed: 0
对应的代码:
"# Schema: %s Last_errno: %u Killed: %u\n"
(thd->db().str ? thd->db().str : ""),
thd->last_errno, (uint) thd->killed,
这部分可能是大家最关心的部分,很多信息也是默认输出都会输出的。
# Query_time: 19.254508 Lock_time: 0.001043 Rows_sent: 0 Rows_examined: 0 Rows_affected: 0
# Bytes_sent: 44 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: 0
对应代码:
my_b_printf(&log_file,
"# Schema: %s Last_errno: %u Killed: %u\n"
"# Query_time: %s Lock_time: %s Rows_sent: %llu"
" Rows_examined: %llu Rows_affected: %llu\n"
"# Bytes_sent: %lu",
(thd->db().str ? thd->db().str : ""),
thd->last_errno, (uint) thd->killed,
query_time_buff, lock_time_buff,
(ulonglong) thd->get_sent_row_count(),
(ulonglong) thd->get_examined_row_count(),
(thd->get_row_count_func() > 0)
? (ulonglong) thd->get_row_count_func() : 0,
(ulong) (thd->status_var.bytes_sent - thd->bytes_sent_old)
my_b_printf(&log_file,
" Tmp_tables: %lu Tmp_disk_tables: %lu "
"Tmp_table_sizes: %llu",
thd->tmp_tables_used, thd->tmp_tables_disk_used,
thd->tmp_tables_size)
snprintf(buf, 20, "%llX", thd->innodb_trx_id);及thd->innodb_trx_id
我们来看看Query_time和Lock_time的源码来源,它们来自于Query_logger::slow_log_write函数如下:
query_utime= (current_utime > thd->start_utime) ?
(current_utime - thd->start_utime) : 0;
lock_utime= (thd->utime_after_lock > thd->start_utime) ?
(thd->utime_after_lock - thd->start_utime) : 0;
下面是数据current_utime 的来源,
current_utime= thd->current_utime();
实际上就是:
while (gettimeofday(&t, NULL) != 0)
{}
newtime= (ulonglong)t.tv_sec * 1000000 + t.tv_usec;
return newtime;
获取当前时间而已
对于thd->utime_after_lock的获取我已经在前文进行了描述,不再解释。
以上三个指标来自于:
thd->tmp_tables_used
thd->tmp_tables_disk_used
thd->tmp_tables_size
这三个指标增加的位置对应在free_tmp_table函数中如下:
thd->tmp_tables_used++;
if (entry->file)
{
thd->tmp_tables_size += entry->file->stats.data_file_length;
if (entry->file->ht->db_type != DB_TYPE_HEAP)
thd->tmp_tables_disk_used++;
}
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
这一行来自于如下代码:
my_b_printf(&log_file,
"# QC_Hit: %s Full_scan: %s Full_join: %s Tmp_table: %s "
"Tmp_table_on_disk: %s\n" \
"# Filesort: %s Filesort_on_disk: %s Merge_passes: %lu\n",
((thd->query_plan_flags & QPLAN_QC) ? "Yes" : "No"),
((thd->query_plan_flags & QPLAN_FULL_SCAN) ? "Yes" : "No"),
((thd->query_plan_flags & QPLAN_FULL_JOIN) ? "Yes" : "No"),
((thd->query_plan_flags & QPLAN_TMP_TABLE) ? "Yes" : "No"),
((thd->query_plan_flags & QPLAN_TMP_DISK) ? "Yes" : "No"),
((thd->query_plan_flags & QPLAN_FILESORT) ? "Yes" : "No"),
((thd->query_plan_flags & QPLAN_FILESORT_DISK) ? "Yes" : "No"),
这里注意一个处理的技巧,这里query_plan_flags中每一位都代表一个含义,这样存储既能存储足够多的信息同时存储空间也很小,是C/C++中常用的方式。
考虑如下的执行计划
mysql> desc select *,sleep(1) from testuin a,testuin1 b where a.id1=b.id1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 100.00 | NULL |
| 1 | SIMPLE | b | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 20.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
如此输出如下:
# QC_Hit: No Full_scan: Yes Full_join: Yes
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 0
这一行来自于如下代码:
char buf[3][20];
snprintf(buf[0], 20, "%.6f", thd->innodb_io_reads_wait_timer / 1000000.0);
snprintf(buf[1], 20, "%.6f", thd->innodb_lock_que_wait_timer / 1000000.0);
snprintf(buf[2], 20, "%.6f", thd->innodb_innodb_que_wait_timer / 1000000.0);
if (my_b_printf(&log_file,
"# InnoDB_IO_r_ops: %lu InnoDB_IO_r_bytes: %llu "
"InnoDB_IO_r_wait: %s\n"
"# InnoDB_rec_lock_wait: %s InnoDB_queue_wait: %s\n"
"# InnoDB_pages_distinct: %lu\n",
thd->innodb_io_reads, thd->innodb_io_read,
buf[0], buf[1], buf[2], thd->innodb_page_access)
== (uint) -1)
SET timestamp=1527753496;
这一句来自源码,注意源码注释解释就是获取的服务器的当前的时间(current_utime)。
/*
This info used to show up randomly, depending on whether the query
checked the query start time or not. now we always write current
timestamp to the slow log
*/
end= my_stpcpy(end, ",timestamp=");
end= int10_to_str((long) (current_utime / 1000000), end, 10);
if (end != buff)
{
*end++=';';
*end='\n';
if (my_b_write(&log_file, (uchar*) "SET ", 4) ||
my_b_write(&log_file, (uchar*) buff + 1, (uint) (end-buff)))
goto err;
}
本文通过查询源码解释了一些关于MySQL慢查询的相关的知识,主要解释了慢查询是基于什么标准进行记录的,同时输出中各个指标的含义,当然这仅仅是我自己得出的结果,如果有不同意见可以一起讨论。
备注栈帧1:本栈帧主要跟踪Rows_examined的变化及 join->examined_rows++;的变化
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000ebd5f3 in main(int, char**) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/main.cc:25
breakpoint already hit 1 time
4 breakpoint keep y 0x000000000155b94f in do_select(JOIN*) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:872
breakpoint already hit 5 times
5 breakpoint keep y 0x000000000155ca39 in evaluate_join_record(JOIN*, QEP_TAB*) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:1473
breakpoint already hit 20 times
6 breakpoint keep y 0x00000000019b4313 in ha_innobase::index_first(uchar*)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9547
breakpoint already hit 4 times
7 breakpoint keep y 0x00000000019b45cd in ha_innobase::rnd_next(uchar*)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9651
8 breakpoint keep y 0x00000000019b2ba6 in ha_innobase::index_read(uchar*, uchar const*, uint, ha_rkey_function)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9004
breakpoint already hit 3 times
9 breakpoint keep y 0x00000000019b4233 in ha_innobase::index_next(uchar*)
at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9501
breakpoint already hit 5 times
#0 ha_innobase::index_next (this=0x7fff2cbc6b40, buf=0x7fff2cbc7080 "\375\n") at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:9501
#1 0x0000000000f680d8 in handler::ha_index_next (this=0x7fff2cbc6b40, buf=0x7fff2cbc7080 "\375\n") at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:3269
#2 0x000000000155fa02 in join_read_next (info=0x7fff2c007750) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:2660
#3 0x000000000155c397 in sub_select (join=0x7fff2c007020, qep_tab=0x7fff2c007700, end_of_records=false)
at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:1274
#4 0x000000000155bd06 in do_select (join=0x7fff2c007020) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:944
#5 0x0000000001559bdc in JOIN::exec (this=0x7fff2c007020) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:199
#6 0x00000000015f9ea6 in handle_query (thd=0x7fff2c000b70, lex=0x7fff2c003150, result=0x7fff2c006cd0, added_options=0, removed_options=0)
at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_select.cc:184
#7 0x00000000015acd05 in execute_sqlcom_select (thd=0x7fff2c000b70, all_tables=0x7fff2c006688) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5391
#8 0x00000000015a5320 in mysql_execute_command (thd=0x7fff2c000b70, first_level=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:2889
#9 0x00000000015adcd6 in mysql_parse (thd=0x7fff2c000b70, parser_state=0x7ffff035b600) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5836
#10 0x00000000015a1b95 in dispatch_command (thd=0x7fff2c000b70, com_data=0x7ffff035bd70, command=COM_QUERY)
at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1447
#11 0x00000000015a09c6 in do_command (thd=0x7fff2c000b70) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1010
#12 0x00000000016e29d0 in handle_connection (arg=0x3859ae0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/conn_handler/connection_handler_per_thread.cc:312
#13 0x0000000001d7bfdc in pfs_spawn_thread (arg=0x38607b0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/perfschema/pfs.cc:2188
#14 0x0000003f74807aa1 in start_thread () from /lib64/libpthread.so.0
#15 0x0000003f740e8bcd in clone () from /lib64/libc.so.6
— 本文结束 —