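While reading HBase: The Definitive Guide today, I came across the following passage. HBase stores its data in an LSM-tree, and compared with a B-tree: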
The stored data is always in an optimized layout. So, you have a predictable and consistent boundary on the number of disk seeks to access a key, and reading any number of records following that key doesn’t incur any extra seeks. In general, what could be emphasized about an LSM-tree-based system is cost transparency: you know that if you have five storage files, access will take a maximum of five disk seeks, whereas you have no way to determine the number of disk seeks an RDBMS query will take, even if it is indexed.
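To make the quoted claim concrete for myself, here is a minimal sketch in Python. It is a toy model, not the HBase API: `StoreFile` and `lookup` are hypothetical names, each storage file is modeled as a sorted in-memory list, and one probe of a file is counted as one disk seek. Real HBase store files also carry block indexes and optional Bloom filters, which this sketch ignores.

```python
from bisect import bisect_left


class StoreFile:
    """Toy stand-in for one immutable, sorted storage file (hypothetical)."""

    def __init__(self, rows):
        # Keys are kept sorted, as they would be inside a flushed file.
        self.keys = sorted(rows)
        self.rows = dict(rows)

    def get(self, key):
        # Binary search over the sorted keys; modeled as a single seek.
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.rows[key]
        return None


def lookup(store_files, key):
    """Probe files from newest to oldest; return (value, seeks_used)."""
    seeks = 0
    for store_file in store_files:
        seeks += 1  # assumption: one seek per file probed
        value = store_file.get(key)
        if value is not None:
            return value, seeks
    return None, seeks


# Five files, newest first; "row3" exists in two of them.
files = [
    StoreFile({"row3": "c-new"}),
    StoreFile({"row1": "a", "row3": "c-old"}),
    StoreFile({"row2": "b"}),
    StoreFile({"row4": "d"}),
    StoreFile({"row5": "e"}),
]

print(lookup(files, "row3"))  # newest version, found in the first file
print(lookup(files, "row9"))  # missing key: every file is probed once
```

In this model a lookup never uses more seeks than there are files, which is the "five storage files, at most five disk seeks" bound from the quote. No probe scans a whole file, because the keys inside each file are sorted.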
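Why does retrieving data by rowkey still incur such a large cost? Is it because HBase does not keep separate index information, so that every query has to do a full table scan? And is the conclusion therefore that real-time queries over small amounts of data perform poorly, while bulk data processing and write performance improve substantially?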