论坛徽章:: 0

电梯直达

1楼 [收藏(0)] [报告]

发表于 2013-11-09 12:49 |只看该作者 |倒序浏览

我一直纠结于Linux Kernel中使用最为频繁的两棵树：Radix Tree 和 Red Black Tree，目前仅对于其数据结构的操作和应用场景有肤浅的了解，仅作抛砖引玉，引用Red Black Tree的Documentation在Kernel Source如下：

Red-black trees are a type of self-balancing binary search tree, used for storing sortable key/value data pairs.  This differs from radix trees (which are used to efficiently store sparse arrays and thus use long integer indexes to insert/access/delete nodes) and hash tables (which are not kept sorted to be easily traversed in order, and must be tuned for a specific size and hash function where rbtrees scale gracefully storing arbitrary keys).

问题：
Red Black在维持平衡二叉树的特点上又有极强的可扩展性(High Scalability)，我在Ceph，BtrFS和Hadoop的HDFS中都看到了它，莫非正因为它的扩展性很好的符合大数据和云存储环境？
Radix Tree作为前缀树(Prefix Tree或Trie)的变种，莫非就主要用于存储稀疏的有公共前缀的Key的数据？为什么不用HASH？难道因为HASH的及其差劲的可扩展，只适合固定的Input Size？

在Memory Management的重要数据结构 Page Cache的原型中发现这两棵树：
struct address_space {
            struct inode                      *host;                /* owner: inode, block_device */
            struct radix_tree_root page_tree;       /* radix tree of all pages */
            spinlock_t                         tree_lock;          /* and lock protecting it */
            unsigned int                   i_mmap_writable;/* count VM_SHARED mappings */
            struct rb_root                   i_mmap;                            /* tree of private and shared mappings */
            struct list_head             i_mmap_nonlinear;/*list VM_NONLINEAR mappings */
            struct mutex                   i_mmap_mutex;             /* protect tree, count, list */
            /* Protected by tree_lock together with the radix tree */
            unsigned long                nrpages;             /* number of total pages */
            pgoff_t                               writeback_index;/* writeback starts here */
            const struct address_space_operations *a_ops;             /* methods */
            unsigned long                flags;                   /* error bits/gfp mask */
            struct backing_dev_info *backing_dev_info; /* device readahead, etc */
            spinlock_t                         private_lock;    /* for use by the address_space */
            struct list_head             private_list;    /* ditto */
            void                                     *private_data;  /* ditto */
} __attribute__((aligned(sizeof(long))));

各位谁能解释为什么存储所有关于某个Mapping的地址空间的Physical Pages用Radix Tree (成员page_tree)，而成员i_mmap用Red Black Tree？万谢啊！！！

无独有偶，我在Hadoop的核心HDFS的Native C用于实现File System中又发现了Red Black Tree。。。

文库|博客

zjyy1017

白手起家

论坛徽章:: 0

2楼 [报告]

发表于 2013-11-11 14:32 |只看该作者

在Wikipedia中查到的Radix Tree和一般的平衡二叉搜索树以及HASH的区别：
(In the following comparisons, it is assumed that the keys are of length k and the data structure contains n members.)
Unlike balanced trees, radix trees permit lookup, insertion, and deletion in O(k) time rather than O(log n). This doesn't seem like an advantage, since normally k ≥ log n, but in a balanced tree every comparison is a string comparison requiring O(k) worst-case time, many of which are slow in practice due to long common prefixes (in the case where comparisons begin at the start of the string). In a trie, all comparisons require constant time, but it takes m comparisons to look up a string of length m. Radix trees can perform these operations with fewer comparisons, and require many fewer nodes.
Radix trees also share the disadvantages of tries, however: as they can only be applied to strings of elements or elements with an efficiently reversible mapping to strings, they lack the full generality of balanced search trees, which apply to any data type with a total ordering. A reversible mapping to strings can be used to produce the required total ordering for balanced search trees, but not the other way around. This can also be problematic if a data type only provides a comparison operation, but not a (de)serialization operation.
Hash tables are commonly said to have expected O(1) insertion and deletion times, but this is only true when considering computation of the hash of the key to be a constant time operation. When hashing the key is taken into account, hash tables have expected O(k) insertion and deletion times, but may take longer in the worst-case depending on how collisions are handled. Radix trees have worst-case O(k) insertion and deletion. The successor/predecessor operations of radix trees are also not implemented by hash tables.

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

asuka2001

富足长乐

论坛徽章:: 17

3楼 [报告]

发表于 2013-11-11 19:57 |只看该作者

回复 1# zjyy1017

i_mmap是用作区间树，主要目的是用做 object-based reverse-mapping。可以详见：
http://lwn.net/Articles/509994/
http://lwn.net/Articles/23732/

而基树上挂的是从属于这个文件的 page cache，用文件偏移量检索对应的 page cache页，两者的使用用途完全不一样！

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

debugfan

白手起家

论坛徽章:: 0

4楼 [报告]

发表于 2015-11-07 19:09 |只看该作者

本帖最后由 debugfan 于 2015-11-07 19:10 编辑

之所以用radix树不用hash表，应该是内核性能要求非常高，按wiki上的说明：hash的插入和删除时间通常认为是O(1)，但这个没有把计算查询key的hash值的时间考虑进去，如果考虑进去，就是O(k)，k为查询的key长度，hash还有可能遇到较坏和最坏情况。而radix树最坏的插入和删除时间是O(k)。也就是说：hash表最好的查询和删除时间与radix最坏的情况相当。因此，在查询和删除速度要求非常高的情况下都尽量用radix树而不用hash表。
至于i_mmap用红黑树，可能是历史原因，早期用红黑树简单实现，fork出去的代码就没管这块了，或者是觉得它的查询和删除速度不是很重要。而我现在看到的i_mmap也用的是radix树，叫做prio_tree，一种扩展的radix树，全称：radix priority search tree。

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

返回列表

Chinaunix › 论坛 › 程序设计 › 内核源码 › 请教：Radix Tree 和 Red Black Tree 的具体应用场景

[数据结构] 请教：Radix Tree 和 Red Black Tree 的具体应用场景 [复制链接]

浏览过的版块