Fast Set Intersection in Memory
Bolin Ding and Arnd Christian König
26 January 2011
Set intersection is a fundamental operation in information retrieval and database systems. This paper introduces linear space data structures to represent sets such that their intersection can be computed in a worst-case efficient way. In general, given k (preprocessed) sets with n elements in total, we will show how to compute their intersection in expected time O(n / sqrt(w) + kr), where r is the intersection size and w is the number of bits in a machine word. In addition, we introduce a very simple version of this algorithm that has weaker asymptotic guarantees but performs even better in practice; both algorithms outperform the state-of-the-art techniques for both synthetic and real data sets and workloads.
Author: cokeboL Time: 2013-04-01 17:09
Could the OP post code for the various algorithms, along with actual test results?
Author: shan_ghost Time: 2013-04-01 17:23
According to Microsoft's documentation, its algorithm is based on sorted lists (Algorithms based on Ordered Lists).
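
For anyone who wants a concrete baseline to test against, here is a minimal sketch of a generic ordered-list intersection (a linear merge of sorted lists). It is not the code from the paper or from Microsoft's documentation. The names `intersect_sorted` and `intersect_all` are made up for illustration, and the inputs are assumed to be ascending lists of distinct integers.

```python
def intersect_sorted(a, b):
    """Intersect two ascending lists of distinct integers with a linear merge."""
    i, j = 0, 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Element present in both lists: keep it and advance both cursors.
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] is too small to match anything left in b
        else:
            j += 1  # b[j] is too small to match anything left in a
    return out


def intersect_all(sets):
    """Intersect k sorted lists, starting from the shortest one."""
    if not sets:
        return []
    ordered = sorted(sets, key=len)
    result = list(ordered[0])
    for s in ordered[1:]:
        result = intersect_sorted(result, s)
        if not result:
            break  # the intersection is already empty, no need to continue
    return result


if __name__ == "__main__":
    print(intersect_all([[1, 2, 3, 4], [2, 4, 6], [0, 2, 4, 8]]))
```

The merge touches every element at most once, so it runs in time linear in the total input size. That makes it a reasonable reference point when timing anything fancier.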