Chinaunix


Dalí main-memory database system architecture

Post 1, posted 2003-07-01 21:59
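I translated this a long time ago, and I still think the material is good, so I am sharing it with everyone.

Some parts of the translation may be poor; corrections are welcome.

Dalí is a main-memory database product from Bell Labs. It has many new features, such as support for databases larger than main memory.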

Post 2, posted 2003-07-01 21:59

The Architecture of the Dalí Main Memory Storage Manager
DALI SYSTEM DESIGN INFORMATION
Abstract
1. INTRODUCTION
2. ARCHITECTURE
2.1  Layers of Abstraction
2.2  Pointers and Offsets
2.3  Storage Allocation
3. TRANSACTION MANAGEMENT IN DALI
3.3  Transactions and Operations
3.4  Logging Model
3.5  Ping-pong Checkpointing
3.6  Abort Processing
3.7  Recovery
4. FAULT TOLERANCE
4.1  Protection from Bad Writes
4.2  Protection from Process Death
5. COLLECTIONS AND INDEXES
5.2  Extendible Hash
5.3  T-tree Indexes
6. DALI RELATIONAL MANAGER
7. CONCLUSION
7.1  Acknowledgments
THE ARCHITECTURE OF THE DALI MAIN MEMORY STORAGE MANAGER
9. Dali: Motivation and Principles
Motivation
Principles
Transactional Model
Build A High-Performance Application

Post 3, posted 2003-07-01 22:00

Abstract


The performance needs of many database applications require that the entire database be stored in main memory. The Dalí system is a main memory storage manager designed to provide the persistence (that is, the retention of data after a crash), availability, and safety guarantees that users typically expect from a disk-resident database, including support for transactions. Because it is tuned to support in-memory data, Dalí offers very high performance. User processes map the entire database into their address space and access data directly, thereby avoiding the expensive remote procedure calls and buffer manager interactions typical of accesses in the disk-resident commercial systems available today. Dalí recovers from a system or process failure by restoring the database to a consistent state. It also provides unique concurrency control and memory protection features, as well as index management and a relational application programming interface.

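Abbreviations and terms
ATT: active transaction table
CLR: compensation log record
Free tree: a tree that stores free memory space
Latch: a short-term lock implemented by a high-speed mutual exclusion mechanism
Logical undo log record: a record describing how to undo an operation
Physical undo log record: a record describing how to undo a physical change
S: shared mode
SQL: Structured Query Language
X: exclusive mode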

Post 4, posted 2003-07-01 22:01

1. INTRODUCTION
A number of database applications, particularly in the telecommunications industry and other industries involved in real-time content delivery, require very high performance access to data. Such applications typically need high transaction rates, coupled with very low latency for transactions, and they impose stringent durability and availability requirements. Consider, for example, a real phone-company application in which phone call data is recorded and queries can be issued against the data. The application must be able to process several thousand (albeit small) requests (lookups/updates) per second, with less than 50 milliseconds latency for lookups and less than a few minutes of down time per year.

The increasing availability of large, relatively cheap memory suggests that more database applications could reside entirely or almost entirely in main memory. The performance of these types of applications will benefit from having data cached in main memory. If, in addition, the storage manager supporting such applications is tailored to main memory, performance can be increased significantly, as shown in Lehman et al. [1]

The storage manager of a database system provides its core functionality, such as concurrency control, recovery mechanisms, storage allocation/free space management, transaction management, and index management. Numerous storage managers have been implemented for disk-resident data, including the storage managers of Exodus [2] and Starburst [1]. With the exception of the Starburst main-memory storage component [1], however, we are not aware of any storage manager that is tailored for main-memory resident data. (System M, described by Salem and Garcia-Molina [3], is a transaction processing testbed for memory-resident data, but it is not a full-feature storage manager.)

Post 5, posted 2003-07-01 22:01

The Dalí system [4], named in honor of Salvador Dalí for his famous painting The Persistence of Memory, is a storage manager implemented at Bell Labs. It was developed and optimized for environments in which the database resides in the main memory. A number of principles that have evolved with Dalí over the past three years now guide its design and evolution, including:
·        Direct access to data,
·        No interprocess communication,
·        Fault tolerance, and
·        Consistency of response time.

Post 6, posted 2003-07-01 22:01

The first of these principles is direct access to data. Dalí uses a memory-mapped architecture in which the database is mapped into the virtual address space of the process. This allows the user to acquire pointers directly to information stored in the database. A related principle is "no interprocess communication" for basic system services. All concurrency control and logging services are provided through shared memory rather than communication with a server.

Another guiding principle of Dalí is its ability to create fault-tolerant applications. This is best exemplified by its use of the transactional paradigm, the dominant technology for providing fault tolerance in critical applications. In fact, Dalí provides an advanced, explicitly multilevel transaction model that has facilitated the production of high-concurrency indexing and storage structures. To help ensure the integrity of data stored in shared memory, Dalí also supports recovery from process and system failures and helps prevent corruption of data through its use of code words and memory protection.
A key requirement for applications that expect to store all their data in main memory is consistency of response time. Dalí supports this consistency by providing fine-grained concurrency control and by keeping latching-related interference with the checkpointer to a minimum. Other principles that have guided Dalí's implementation have been a toolkit approach and support for multiple interface levels. The toolkit approach implies, for example, that logging facilities can be turned off if data does not need to be persistent and that locking can be turned off if data is private to a process. To optimize critical application components with special implementations, Dalí exposes low-level components to the user. The developers of most applications, however, will prefer the high-level relational interface.
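The toolkit idea, where logging and locking are services that can be switched off, can be illustrated with a small C++ sketch. This is not Dalí's API: the names `Store`, `NoLocking`, `NoLogging`, and `VectorLogging` are hypothetical, and the "log" here is only an in-memory list of old values. The sketch assumes the choice is made at compile time through template parameters, which is one possible design among several.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical policy: data is private to one process, so locking is a no-op.
struct NoLocking {
    void lock() {}
    void unlock() {}
};

// Hypothetical policy: data need not be persistent, so nothing is logged.
struct NoLogging {
    void record(int, int) {}
    std::size_t size() const { return 0; }
};

// Hypothetical policy: keep (slot, old value) undo records in memory.
struct VectorLogging {
    std::vector<std::pair<int, int>> entries;
    void record(int slot, int old_value) { entries.emplace_back(slot, old_value); }
    std::size_t size() const { return entries.size(); }
};

// A tiny fixed-size store whose locking and logging behavior is chosen
// by the two policy types. Any type with lock()/unlock() works as Locking,
// including std::mutex.
template <class Locking, class Logging>
class Store {
public:
    void put(int slot, int value) {
        assert(slot >= 0 && slot < kSlots);
        std::lock_guard<Locking> guard(locking_);  // no-op with NoLocking
        logging_.record(slot, data_[slot]);        // no-op with NoLogging
        data_[slot] = value;
    }
    int get(int slot) const {
        assert(slot >= 0 && slot < kSlots);
        return data_[slot];
    }
    std::size_t log_size() const { return logging_.size(); }

private:
    static constexpr int kSlots = 8;
    int data_[kSlots] = {};
    Locking locking_;
    Logging logging_;
};
```

With this shape, `Store<NoLocking, NoLogging>` pays for neither service, while `Store<std::mutex, VectorLogging>` takes a lock and writes an undo record on every update.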

Post 7, posted 2003-07-01 22:02

Although Dalí can be used in systems where the database is larger than its main memory (as long as the database fits in the virtual address space of the process), the architecture of Dalí, from storage allocation and indexing to its recovery facilities, has been designed to deliver high performance when the database fits into the physical main memory of the computer.

This paper provides a brief overview of the components that constitute the Dalí storage manager. We provide a more detailed description of Dalí's architecture in Bohannon et al. [5] and on our Web page at http://www.bell-labs.com/org/project/dali. DataBlitz, a new product based on the Dalí system, will become available in March 1997.

Post 8, posted 2003-07-01 22:43

2. ARCHITECTURE
The database in the Dalí architecture consists of one or more database files and a special system database file. User data is stored in database files, but all data related to database support, such as log and lock data, is stored in the system database file. This enables the uniform use of storage allocation routines for (persistent) user data, as well as (nonpersistent) system data such as locks and logs. The system database file also stores information about the other database files in the system.

Database files opened by a process are directly mapped into the address space of that process, as shown in Figure 1. In Dalí, either memory-mapped files or shared-memory segments can be used to provide this mapping. Different processes may map different sets of database files and may map the same database file to different locations in their address space. This flexibility precludes using virtual memory addresses as physical pointers to data in database files, but it does provide two important benefits. First, a database file can be resized easily. Second, the total active database space on the system may exceed the addressing space of a single process. This is useful on machines with 32-bit addressing in which physical memory can significantly exceed the amount of memory addressable by a single process (for example, Sun's SPARCCenter product line).
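Why the same file can end up at different addresses, and why a raw virtual address therefore cannot serve as a stored pointer, can be seen with a short sketch. It assumes a POSIX system and uses `mmap` on a scratch file standing in for a database file; the function name `same_file_two_views` is made up for this illustration, and a single process maps the file twice only to keep the example short (in Dalí the two mappings would normally belong to different processes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdlib.h>    // mkstemp (POSIX)
#include <sys/mman.h>  // mmap, munmap
#include <unistd.h>    // ftruncate, unlink, close

// Maps one scratch file twice with MAP_SHARED and checks that the two views
// start at different addresses but show the same bytes at the same offset.
bool same_file_two_views() {
    const std::size_t kSize = 4096;
    char path[] = "/tmp/dali_sketch_XXXXXX";  // stand-in for a database file
    int fd = mkstemp(path);
    if (fd == -1) return false;
    unlink(path);  // the file stays alive until fd and mappings are released
    if (ftruncate(fd, static_cast<off_t>(kSize)) != 0) {
        close(fd);
        return false;
    }

    void* a = mmap(nullptr, kSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, kSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, kSize);
        if (b != MAP_FAILED) munmap(b, kSize);
        close(fd);
        return false;
    }

    // Write through the first view at a file offset, read through the second.
    const std::size_t offset = 128;
    std::strcpy(static_cast<char*>(a) + offset, "record");
    bool ok = (a != b) &&
              std::strcmp(static_cast<char*>(b) + offset, "record") == 0;

    munmap(a, kSize);
    munmap(b, kSize);
    close(fd);
    return ok;
}
```

The offset (128 here) identifies the record in both views, while the absolute address differs between them. That is the reason a position inside a database file has to be expressed relative to the file.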

However, using Dalí in a 64-bit machine may significantly change both of these considerations, leading us to consider using physical addressing. If a single database file can be limited to something like 64 GB, then each process could still map close to a billion database files, which can be expected to far exceed the total database space.
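As a rough check of that estimate, the count can be worked out directly. The sketch assumes the entire 64-bit address space is available for mappings (real systems expose less) and takes 64 GB as exactly 2^36 bytes; the constant names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: all 64 address bits can be used for mapping database files.
constexpr int kAddressBits = 64;

// 64 GB = 2^36 bytes, the per-file limit used as an example in the text.
constexpr int kFileBits = 36;

// Number of 64 GB files that fit in the address space: 2^(64 - 36) = 2^28.
constexpr std::uint64_t kMaxFiles =
    std::uint64_t{1} << (kAddressBits - kFileBits);

static_assert(kMaxFiles == 268435456ULL, "2^28 files of 64 GB each");
```

Under these assumptions the result is 2^28, about 2.7 × 10^8 files, so "close to a billion" is best read as an order-of-magnitude figure. A smaller per-file limit raises the count proportionally.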

[Attached image: image011.jpg]

Post 9, posted 2003-07-01 22:44

2.1  Layers of Abstraction
Dalí's architecture, illustrated in Figure 2, is organized in multiple layers of abstraction to support the toolkit approach discussed earlier. At the highest level, users can interact with Dalí's relational manager. Below that level is what we call the heap-file/indexing layer, which provides support for fixed-length and variable-length collections, as well as template-based indexing abstractions. In general, at this level, a user does not need to interact with individual locks or latches. (A latch is a short-term lock implemented by a high-speed mutual exclusion mechanism.) Instead, the user specifies a policy such as "no locking" or "lock-plus-handle-phantoms" to the lower level.

Services for logging, locking, latching, multilevel recovery, and storage allocation are exposed at the lowest level. New indexing methods can be built on this layer, as can special-purpose data structures for either an application or a database management system. By definition, this level has the most complex user interface, but it has proven itself during the creation of the higher-level interfaces and database systems described above.

[Attached image: image013.jpg]

Post 10, posted 2003-07-01 22:44

2.2  Pointers and Offsets

It is crucial to the performance of Dalí that database pointers be mapped efficiently to virtual memory addresses. In Dalí, each process maintains a database-offset table, which specifies the location in memory where each database file is mapped. The table is implemented as an array indexed by the (integer) database file identifier.
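A database-offset table of this kind might look like the following minimal sketch. The class name `DbOffsetTable`, the alias `DbFileId`, and the method names are assumptions made for illustration; the source only says that the table is an array indexed by the integer file identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using DbFileId = int;  // hypothetical name for the integer file identifier

// Per-process table: for each database file id, the address at which this
// process has mapped the file (nullptr when the file is not mapped).
class DbOffsetTable {
public:
    // Record that file `id` is mapped at `base` in this process.
    void set_base(DbFileId id, void* base) {
        assert(id >= 0);
        if (static_cast<std::size_t>(id) >= bases_.size()) {
            bases_.resize(static_cast<std::size_t>(id) + 1, nullptr);
        }
        bases_[static_cast<std::size_t>(id)] = static_cast<char*>(base);
    }

    // Look up the mapping address of file `id`: a single array access.
    char* base(DbFileId id) const {
        if (id < 0 || static_cast<std::size_t>(id) >= bases_.size()) {
            return nullptr;
        }
        return bases_[static_cast<std::size_t>(id)];
    }

private:
    std::vector<char*> bases_;
};
```

Because the lookup is a plain array index, translating a file identifier to its base address costs one memory access, which fits the efficiency requirement stated above.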

The primary type of database pointer in Dalí contains a database file identifier and an offset within the database file. Dereferencing a database pointer p simply involves adding the offset contained in p to the virtual memory address at which the database file is mapped, looked up from the offset table. A second form of database pointer is available if the database file is known from context. For example, all pointers from a certain index might reside in a particular database file. In this case, we may store just the offset within the database file as the pointer. Both offsets and full pointers are implemented as simple C++ template classes, which allow them to be used as "smart pointers."
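The two pointer forms can be sketched as C++ template classes. This is an illustration only, not Dalí's actual classes: `DbPtr`, `DbOffset`, `db_bases`, and `kMaxDbFiles` are hypothetical names, and the offset table is reduced to a fixed-size array of base addresses.

```cpp
#include <cassert>
#include <cstddef>

constexpr int kMaxDbFiles = 16;  // assumed upper bound on mapped files

// Per-process offset table: base address of each mapped database file,
// indexed by file id (all entries start as nullptr).
inline char** db_bases() {
    static char* bases[kMaxDbFiles] = {};
    return bases;
}

// Full database pointer: a file id plus an offset within that file.
template <typename T>
class DbPtr {
public:
    DbPtr(int file_id, std::size_t offset)
        : file_id_(file_id), offset_(offset) {}

    // Dereference: base address from the table plus the stored offset.
    T* get() const {
        assert(file_id_ >= 0 && file_id_ < kMaxDbFiles);
        return reinterpret_cast<T*>(db_bases()[file_id_] + offset_);
    }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }

private:
    int file_id_;
    std::size_t offset_;
};

// Offset-only pointer: used when the file is known from context, so only
// the offset is stored and the caller supplies the file id.
template <typename T>
class DbOffset {
public:
    explicit DbOffset(std::size_t offset) : offset_(offset) {}

    T* in(int file_id) const {
        assert(file_id >= 0 && file_id < kMaxDbFiles);
        return reinterpret_cast<T*>(db_bases()[file_id] + offset_);
    }

private:
    std::size_t offset_;
};
```

Overloading `operator*` and `operator->` is what lets such a class be used like an ordinary pointer. The offset-only form saves the space of the file identifier in structures, such as an index, whose pointers all refer to one file.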