Chinaunix

标题: B-Tree indexes [打印本页]

作者: xfwduke 时间: 2011-12-21 08:42
标题: B-Tree indexes

When people talk about an index without mentioning a type,they're probably referring to a B-Tree index,which typically uses a B-Tree data structure to store its data.

Many storage engines actually use a B+Tree index,in which each leaf node contains a link to the next for fast range traversals through nodes.

许多存储引擎实际上使用的是B+Tree索引——添加了到相邻叶子节点的指针以加速范围查找。

Storage engines store B-Tree indexes in various ways on disk,which can affect performance.

存储引擎有好几种方式把B-Tree索引存储在磁盘上——不同的存储方式对性能会有影响。

MyISAM使用了前缀压缩技术（prefix compression technique）来减小索引；InnoDB为了某些优化措施，没有压缩索引。MyISAM的索引标示的是行的存储位置；InnoDB则标示行的主键值。这些不同的实现方式各有优缺点。

The general idea of a B-Tree is that all the values are stored in order, and each leaf page is the same distance from the root.

B-Tree索引有两个通用的思想：所有的数据有序存储；所有叶子节点到根节点的距离相同。

Leaf pages are special,because they have pointers to the indexed data instead of pointers to other pages.(Diffrent storage engines have different types of "pointers" to the data.)

叶子节点存放了指向数据的指针，而不是指向其他节点（不同的存储引擎有不同的“指针”实现，如在InnoDB的聚簇索引中，叶子节点里面存的是实际的数据值——所以，这里所说的“指针”，应该是指可以直接获取到最终数据）。

结合一个例子来说明B-Tree索引的使用细则

CREATE TABLE People (

last_name varchar(50) not null,

first_name varchar(50) not null,

dob date not null,

gender enum('m','f') not null,

key(last_name,first_name,dob

);

在表上建了一个联合索引，联合索引的数据组织方式有点类似Perl的多值排序——先按第一关键字排序，第一关键字相同再按第二关键字排序。

Types of queries that can use a B-Tree index

Match the full value

全索引匹配。如可以很快找到名叫Cuba Allen，生日是1960-01-01的人。单项查找。

Match a leftmost prefix

最大左前缀匹配（直译）。如可以很快找到last_name是Allen的所有人。

这里感觉应该是：从左开始的全字匹配。即，只使用联合索引第一字段做完全匹配查找，或者只使用联合索引第一、二字段做完全匹配查找——按联合索引定义时的字段顺序逐字段的全字匹配都能使用索引进行加速。

如，可以很快找到last_name是Allen，first_name是Cuba的所有人。

Match a column prefix

字段部分匹配。如，可以很快找到last_name以 J 开头的所有人。

这里详细点应该是：部分匹配字段要么是联合索引第一字段，要么排在前面的字段做的是全字匹配。

如，可以很快找到last_name是Allen，first_name以J开头的所有人。

Match a range of values

范围匹配。如，可以很快找到last_name在Allen和Barrymore之间的所有人。

Match one part exactly and match a range on another part

做范围匹配的字段要么是联合索引的第一字段，要么排在前面的字段做的是全字匹配。

Index-only queries

只查找索引字段的数据——后文详解。

Limitation of B-Tree indexes

They are not useful if the lookup does not start from the leftmost side of the indexed columns.

如，不能用索引加速查找”last_name以en结尾“的所有人。

因为不是从索引最左（最开头）开始做匹配。

select * from People where last_name like '%en';

You can't skip columns in the index.

如，不能用索引加速查找”first_name是Cuba，生日是1960-01-01的所有人“。

因为这个查询跳过了第一字段。

The storage engine can't optimize accesses with any clumns to the right of the first range condition.

在查询中如果使用了通配符”%“，那么只有在通配符右边的部分会使用索引加速。

如，

select * from People where last_name='Smith' and first_name like 'J%' and dob='1970-02-12';

这个查询中，只有last_name和first_name会使用索引，因为first_name中有通配符，而dob在联合索引定义时，排在了first_name右边（后面）。

由于B-Tree index的使用细则和索引定义时的列顺序关系很大，所以为了提高性能，有时需要在相同的几列上按不同的顺序定义好几个索引，以供丌同的情况使用。

欢迎光临 Chinaunix (http://bbs.chinaunix.net/)