免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 1994 | 回复: 0
打印 上一主题 下一主题

mysql表的列数量与效率之间的关系 [复制链接]

论坛徽章:
9
每日论坛发贴之星
日期:2016-01-04 06:20:00数据库技术版块每日发帖之星
日期:2016-01-04 06:20:00每日论坛发贴之星
日期:2016-01-04 06:20:00数据库技术版块每日发帖之星
日期:2016-01-04 06:20:00IT运维版块每日发帖之星
日期:2016-01-04 06:20:00IT运维版块每日发帖之星
日期:2016-01-04 06:20:00综合交流区版块每日发帖之星
日期:2016-01-04 06:20:00综合交流区版块每日发帖之星
日期:2016-01-04 06:20:00数据库技术版块每周发帖之星
日期:2016-03-07 16:30:25
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2009-10-01 16:52 |只看该作者 |倒序浏览

文章的大致结论:
MyISAM表,
fixed_row_size时,列数量对效率会产生微小的影响。
dynamic_row_size时,列数量过多,会对效率产生巨大的负面作用
innodb表,
由于不存在row-size的区别,char与varchar之间对效率影响小
但是,过多的列会长生更为巨大的负面作用
It is pretty understood the tables which have long rows tend to be
slower than tables with short rows. I was interested to check if the
row length is the only thing what matters or if number of columns we
have to work with also have an important role. I was interested in peak
row processing speed so I looked at full table scan in case data fits
in OS cache completely. I created 3 tables - First containing single
tinyint column which is almost shortest type possible (CHAR(0) could be
taking less space), table with 1 tinyint column and char(99) column and
table with 100 tinyint columns. The former two tables have the same row
length but have number of column different 50 times. Finally I have
created 4th table which is also 100 columns but one of them is VARCHAR
causes raw format to be dynamic.
More specially:
PLAIN TEXT
SQL:
CREATE TABLE `t1` (
  `t1` tinyint(3) UNSIGNED NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
PLAIN TEXT
SQL:
CREATE TABLE `t1c99` (
  `t1` tinyint(3) UNSIGNED NOT NULL,
  `c99` char(99) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
PLAIN TEXT
SQL:
CREATE TABLE `t100` (
  `t1` tinyint(3) UNSIGNED NOT NULL,
  `t2` tinyint(3) UNSIGNED NOT NULL,
...
  `t99` tinyint(3) UNSIGNED NOT NULL,
  `t100` tinyint(3) UNSIGNED NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
PLAIN TEXT
SQL:
CREATE TABLE `t99v1` (
  `t1` tinyint(3) UNSIGNED NOT NULL,
  `t2` tinyint(3) UNSIGNED NOT NULL,
...
  `t99` tinyint(3) UNSIGNED NOT NULL,
  `v1` varchar(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
I populated each of the tables with 12M rows. Getting  7 bytes row size for first table  and 101 byte for second and third.
I used simple scan query: select max(t1) from t100;  for the test.
The result was as follows:
t1            -   1.00 sec      (12M rows/sec  ;  80MB/sec)
t1c99       -   1.71 sec      (7M rows/sec ;    676MB/sec)
t100        -   1.77 sec      (7M rows/sec ;    653MB/sec)
t99v1       -  12.36 sec      (1M rows/sec  ;   93MB/sec)
This shows there is surely the problem with dynamic row format table
with many columns. But is it because of large number of columns or
dynamic format on its own is slave ?
I have tested yet another table structure:
PLAIN TEXT
SQL:
CREATE TABLE `t1v1` (
  `t1` tinyint(3) UNSIGNED NOT NULL,
  `v` varchar(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
This table has row length of 20 (which was a bit of surprise to me) and it has:
t1v1           - 1.83 sec  (6.5M rows/sec;  125M/sec)
So there is surely the penalty for dynamic rows, however it is not
very significant if number of columns is small. For large number of
columns dynamic rows become very expensive and you have to watch out.
I have not looked at the code and would appreciate any developers
comments but I guess for dynamic rows tables certain conversion has to
take place when internal data structures are populated (everything but
TEXTs/BLOBs is fixed length when it is being processed). This
conversion process depends on number of columns while for fixed rows
the MyISAM storage format matches internal one so you can basically do
memory copy which does not depends on number of columns.
Another interesting observation is access speed to different
columns. the max(t1) and max(t99) were taking the same time which means
there is no penalty for accessing column which is in the end of the
table rather than at the start when it comes to MyISAM.
The common workaround working with such wide tables is to use
covering indexes. I added one to t99v1 table and repeated the query:
PLAIN TEXT
SQL:
mysql [localhost] {msandbox} (test)> SELECT max(t1+0) FROM t99v1;
+-----------+
| max(t1+0) |
+-----------+
|         0 |
+-----------+
1 row IN SET (3.26 sec)

mysql [localhost] {msandbox} (test)> EXPLAIN SELECT max(t1+0) FROM t99v1;
+----+-------------+-------+-------+---------------+------+---------+------+----------+-------------+
| id | select_type | TABLE | type  | possible_keys | KEY  | key_len | ref  | rows     | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+----------+-------------+
|  1 | SIMPLE      | t99v1 | INDEX | NULL          | t1   | 1       | NULL | 12000000 | USING INDEX |
+----+-------------+-------+-------+---------------+------+---------+------+----------+-------------+
1 row IN SET (0.00 sec)
As you can see the index scan is not as fast as table scan scanning about 3.7M rows/sec but which is still pretty fast.
So this is all about MyISAM, what is about Innodb ? Here are results
for Innodb with all data in buffer pool, to measure peak speed as well
The results for Innodb were:
t1                          -    5.11 sec      (2.3M rows/sec)
t1c99                     -    5.74 sec      (2.1M rows/sec)
t100                       -   15.16 sec     (0.8M rows/sec)
t99v1                      -  14.93 sec      (0.8M rows/sec)
t1v1                        -  5.26 sec       (2.3M rows/sec)
t99v1 (covering idx)   -  5.62 sec       (2.1M rows/sec)
As you can see Innodb is a lot slower and has behavior similar to
Dynamic Row tables in both cases. This is because Innodb does not store
data in native MyISAM format and conversion is needed in all cases. We
can also see the table scan speed can be up to 5 times slower, for very
short rows - some of this goes back to the fact Innodb rows have a lot
of transaction control overhead attached to them.
Also note the covering index scan speed is very similar to full
table scan speed - this is rather expected as table data is stored in
BTREE index very similarly to how indexes are stored.
Summary: Beware of dynamic row format tables with
many columns they might bite you with surprise slowdown. MyISAM is much
faster than Innodb when it comes to in memory full table scan.
P.S Tests were done on MySQL 5.4.2  on Intel(R) Xeon(R) CPU           E5405  @ 2.00GHz CPU.
               
               
               

本文来自ChinaUnix博客,如果查看原文请点:http://blog.chinaunix.net/u3/90603/showart_2063340.html
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP