MySql 字符集不同导致 left join 慢查询的问题解决_Mysql

在 mysql 建表时候一般会指定字符集，大多数情况下为了更好的兼容性无脑选了 utf8mb4。但是有时会因为选错，或历史遗留问题，导致使用了 utf8 字符集。当两个表的字符集不一样，在使用字符型字段进行表连接查询时，就需要特别注意下查询耗时是否符合预期。

有次使用 left join 写一个 sql，发现用时明显超过预期，经过一顿折腾才发现是两个表字符集不一样，特此记录一下。

问题分析

mysql> select count( *) from app_bind_rel t left join app_config_control_sn p on t.host_sn = p.host_sn ;
+-----------+
| count( *) |
+-----------+
|     13447 |
+-----------+
1 row in set (0.89 sec)

例如上面的 sql，左表 1w 条数据，右表 400 多条数据，在 host_sn 字段上都有索引，查询竟然用了近 900ms，怎么会这么慢？

mysql> explain select count( *) from app_bind_rel t left join app_config_control_sn p on t.host_sn = p.host_sn ;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows  | filtered | extra                                                           |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-----------------------------------------------------------------+
|  1 | simple      | t     | null       | index | null          | idx_host_sn | 122     | null | 10791 |   100.00 | using index                                                     |
|  1 | simple      | p     | null       | index | null          | idx_host_sn | 152     | null |   457 |   100.00 | using where; using index; using join buffer (block nested loop) |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-----------------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

查看下执行计划，的确是使用了索引，但是细看 extra 列发现较正常的连表查询多了“using join buffer (block nested loop)”这一信息，这个具体是什么意思我们后面再说。
然后我们再看下详细的执行计划，使用 explain formart=json。

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "988640.52"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "t",
          "access_type": "index",
          "key": "idx_host_sn",
          "used_key_parts": [
            "host_sn"
          ],
          "key_length": "122",
          "rows_examined_per_scan": 10791,
          "rows_produced_per_join": 10791,
          "filtered": "100.00",
          "using_index": true,
          "cost_info": {
            "read_cost": "161.00",
            "eval_cost": "2158.20",
            "prefix_cost": "2319.20",
            "data_read_per_join": "2m"
          },
          "used_columns": [
            "host_sn"
          ]
        }
      },
      {
        "table": {
          "table_name": "p",
          "access_type": "index",
          "key": "idx_host_sn",
          "used_key_parts": [
            "host_sn"
          ],
          "key_length": "152",
          "rows_examined_per_scan": 457,
          "rows_produced_per_join": 4931487,
          "filtered": "100.00",
          "using_index": true,
          "using_join_buffer": "block nested loop",
          "cost_info": {
            "read_cost": "23.92",
            "eval_cost": "986297.40",
            "prefix_cost": "988640.52",
            "data_read_per_join": "865m"
          },
          "used_columns": [
            "host_sn"
          ],
          "attached_condition": "<if>(is_not_null_compl(p), (`db0`.`t`.`host_sn` = convert(`db0`.`p`.`host_sn` using utf8mb4)), true)"
        }
      }
    ]
  }
}

特别需要关注的是这一对 kv

"attached_condition": "<if>(is_not_null_compl(p), (`collection_bullet_0000`.`t`.`host_sn` = convert(`collection_bullet_0000`.`p`.`host_sn` using utf8mb4)), true)"

看字面意思就是当 p 表不为空的时候，执行表连接需要先将 p 表的 host_sn 字段转变为 utf8mb4 字符集。我们应该都知道在表连接中使用了函数的话，是无法使用索引的。
所以再回到上面我看到的“using join buffer (block nested loop)”问题，来解释下这是一个什么过程。

nested-loop join

mysql 官网对 nested-loop join 有做过解释，其实做开发的同学看到名字就大体知道是啥，不就是循环嵌套嘛。

mysql 中分为 nested-loop join 算法跟 block nested-loop join 算法。

例如，有如下三个表，t1、t2、t3 使用了这三种 join type。

table join type
t1 range
t2 ref
t3 all

当使用 nested-loop join 算法时，其 join 过程如下所示，其实就是简单的三层循环。

for each row in t1 matching range {
  for each row in t2 matching reference key {
    for each row in t3 {
      if row satisfies join conditions, send to client
    }
  }
}

block nested-loop join（bnl）算法是对 nested-loop join 算法的一种优化。bnl 算法缓冲外部循环中读取的行来减少内部循环中读取表的次数。例如，将 10 行数据读取到缓冲器中，并且将缓冲器传递到下一个循环内部，内部循环中读取的每一行与缓冲器中的所有 10 行进行比较。这将使读取内部表的次数减少一个数量级。

for each row in t1 matching range {
  for each row in t2 matching reference key {
    store used columns from t1, t2 in join buffer
    if buffer is full {
      for each row in t3 {
        for each t1, t2 combination in join buffer {
          if row satisfies join conditions, send to client
        }
      }
      empty join buffer
    }
  }
}

if buffer is not empty {
  for each row in t3 {
    for each t1, t2 combination in join buffer {
      if row satisfies join conditions, send to client
    }
  }
}

算法实现如上，只有当 “join buffer” 满的时候才会触发 t3 表的读取，如果 “join buffer” 的 size = 10 那么就可以减少 10 倍的 t3 表被读取次数，从内存中读取数据的效率显然要比从磁盘读取的效率高的多。从而提升 join 的效率。

但其实再好的优化毕竟也是嵌套循环，做开发的同学应该都知道 o(n²) 的时间复杂度是无法接受的。这也是我们这个查询这么慢的根因。

解决办法

解决办法其实很简单，修改右表的字符集就可以解决。

在变更数据集之前我们先用 show table status 查看下当前表的状态。

mysql> show table status like 'app_config_control_sn';
+-----------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| name                  | engine | version | row_format | rows | avg_row_length | data_length | max_data_length | index_length | data_free | auto_increment | create_time         | update_time         | check_time | collation       | checksum | create_options | comment |
+-----------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| app_config_control_sn | innodb |      10 | dynamic    |  457 |            143 |       65536 |               0 |        32768 |         0 |           1041 | 2023-04-17 03:25:45 | 2023-04-17 03:27:24 | null       | utf8_general_ci |     null |                | sn      |
+-----------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
1 row in set (0.00 sec)

接着使用如下命令变更表的字符集。

mysql> alter table app_config_control_sn convert to character set utf8mb4 collate utf8mb4_general_ci;
query ok, 457 rows affected (0.09 sec)
records: 457  duplicates: 0  warnings: 0

再次使用 show table status 命令查看下表的状态。

mysql> show table status like 'app_config_control_sn';
+-----------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+--------------------+----------+----------------+---------+
| name                  | engine | version | row_format | rows | avg_row_length | data_length | max_data_length | index_length | data_free | auto_increment | create_time         | update_time         | check_time | collation          | checksum | create_options | comment |
+-----------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+--------------------+----------+----------------+---------+
| app_config_control_sn | innodb |      10 | dynamic    |  457 |            143 |       65536 |               0 |        32768 |         0 |           1041 | 2023-04-17 03:50:11 | 2023-04-17 03:50:11 | null       | utf8mb4_general_ci |     null |                | sn      |
+-----------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+--------------------+----------+----------------+---------+
1 row in set (0.01 sec)

可以看到表的字符集已经发生了变化，那我们再次执行开始的 sql 及 explain 语句，确认下问题是否已经解决。

mysql> select count( *) from app_bind_rel t left join app_config_control_sn p on t.host_sn = p.host_sn ;
+-----------+
| count( *) |
+-----------+
|     13447 |
+-----------+
1 row in set (0.03 sec)

mysql> explain select count( *) from app_bind_rel t left join app_config_control_sn p on t.host_sn = p.host_sn ;
+----+-------------+-------+------------+-------+---------------+-------------+---------+---------------+-------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref           | rows  | filtered | extra                    |
+----+-------------+-------+------------+-------+---------------+-------------+---------+---------------+-------+----------+--------------------------+
|  1 | simple      | t     | null       | index | null          | idx_host_sn | 122     | null          | 10791 |   100.00 | using index              |
|  1 | simple      | p     | null       | ref   | idx_host_sn   | idx_host_sn | 202     | db0.t.host_sn |     2 |   100.00 | using where; using index |
+----+-------------+-------+------------+-------+---------------+-------------+---------+---------------+-------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)

可以看到耗时已经只需要 30ms 左右，这个就比较符合预期，而在执行计划中也不再会有“using join buffer (block nested loop)”信息。

其他

mysql> select count( *) from app_bind_rel t join app_config_control_sn p on t.host_sn = p.host_sn ;
+-----------+
| count( *) |
+-----------+
|       730 |
+-----------+
1 row in set (0.01 sec)

在没有变更字符集之前，当我们将 left join 修改为 join 的时候会发现耗时减少了 100 倍，只用了 10 ms，这是为什么呢？

mysql> explain select count( *) from app_bind_rel t join app_config_control_sn p on t.host_sn = p.host_sn ;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows | filtered | extra                    |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+--------------------------+
|  1 | simple      | p     | null       | index | null          | idx_host_sn | 152     | null |  457 |   100.00 | using index              |
|  1 | simple      | t     | null       | ref   | idx_host_sn   | idx_host_sn | 122     | func |    1 |   100.00 | using where; using index |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)

查看执行计划，发现使用 join 的时候不会有 “using join buffer (block nested loop)”。再细看执行计划，发现驱动表已经由 t 表变为了 p 表。

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "643.80"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "p",
          "access_type": "index",
          "key": "idx_host_sn",
          "used_key_parts": [
            "host_sn"
          ],
          "key_length": "152",
          "rows_examined_per_scan": 457,
          "rows_produced_per_join": 457,
          "filtered": "100.00",
          "using_index": true,
          "cost_info": {
            "read_cost": "4.00",
            "eval_cost": "91.40",
            "prefix_cost": "95.40",
            "data_read_per_join": "82k"
          },
          "used_columns": [
            "host_sn"
          ]
        }
      },
      {
        "table": {
          "table_name": "t",
          "access_type": "ref",
          "possible_keys": [
            "idx_host_sn"
          ],
          "key": "idx_host_sn",
          "used_key_parts": [
            "host_sn"
          ],
          "key_length": "122",
          "ref": [
            "func"
          ],
          "rows_examined_per_scan": 1,
          "rows_produced_per_join": 457,
          "filtered": "100.00",
          "using_index": true,
          "cost_info": {
            "read_cost": "457.00",
            "eval_cost": "91.40",
            "prefix_cost": "643.80",
            "data_read_per_join": "117k"
          },
          "used_columns": [
            "host_sn"
          ],
          "attached_condition": "(`db0`.`t`.`host_sn` = convert(`db0`.`p`.`host_sn` using utf8mb4))"
        }
      }
    ]
  }
}

查看详细的执行计划，可以看到

"attached_condition": "(`collection_bullet_0000`.`t`.`host_sn` = convert(`collection_bullet_0000`.`p`.`host_sn` using utf8mb4))"

这对 kv 依然是存在的，但是 "using_join_buffer": "block nested loop" 已经不存在了。这个其实主要是因为当 p 表变为驱动表的时候，会先将自己的 host_sn 字段转为 utf8mb4 字符集，再与 t 表进行关联。t 表由于本来就是 utf8mb4 字符集且存在索引，就可以正常走数据库索引了，所以查询耗时也就大大降低。而使用 left join 时候，t 表作为驱动表是无法优化改变的。

可见在表连接中即使使用了函数也不一定就没法走索引，关键还是要看用法及明确处理过程。
记得刚学习数据库的时候，老师还特别强调驱动表一定要写在左边，而随着数据库技术的不断迭代发展，数据库已经能更智能的自动帮我们优化处理过程，之前很多的数据库规则也不需要了。

到此这篇关于mysql 字符集不同导致 left join 慢查询的问题解决的文章就介绍到这了,更多相关mysql left join 慢查询内容请搜索代码网以前的文章或继续浏览下面的相关文章希望大家以后多多支持代码网！