yangPSO posted on 2014-11-28 16:35
Reply to 9# humjb_1983
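The flow I gave earlier was slightly wrong (only a page that already carries the dirty flag can enter that path). Writeback should not set the page's dirty flag based on the pte's dirty flag.
The mechanism that actually guarantees an mmap dirty page gets written to disk should be the following:
After mmap, the first write triggers a page fault, followed by mkdirty. From then on the dirty flag is set in both the pte and the page, so the dirty page is guaranteed to be written to disk. (No doubt about this flow; it was already clear.)
Later writes fall into two cases:
1. On the next write, the original dirty page has not been written back yet. The write simply modifies the contents of the existing page. Since the dirty flags in both the pte and the page are still set, the dirty page is still guaranteed to be written.
2. On the next write, the original dirty page has already been written back and the dirty flags have been cleared. This is the case under discussion here, and it should be guaranteed by the following mechanism:
During writeback of the dirty page after the first write, the page is write-protected, as the poster above described: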
clear_page_dirty_for_io
page_mkclean
page_mkclean_file
page_mkclean_one
entry = pte_wrprotect(entry);
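Then, when the page is modified again, a page fault is triggered. The page fault path checks for the write-protected mmap page case. In that case the page is marked dirty again (along with the pte) and the pte is made writable again: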
__do_page_fault
handle_pte_fault
do_wp_page
...
else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
(VM_WRITE|VM_SHARED))) {
/*
* Only catch write-faults on shared writable pages,
* read-only shared pages can get COWed by
* get_user_pages(.write=1, .force=1).
*/
if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
struct vm_fault vmf;
int tmp;
vmf.virtual_address = (void __user *)(address &
PAGE_MASK);
vmf.pgoff = old_page->index;
vmf.flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
vmf.page = old_page;
/*
* Notify the address space that the page is about to
* become writable so that it can prohibit this or wait
* for the page to get into an appropriate state.
*
* We do this without the lock held, so that it can
* sleep if it needs to.
*/
page_cache_get(old_page);
pte_unmap_unlock(page_table, ptl);
tmp = vma->vm_ops->page_mkwrite(vma, &vmf);
...
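To make the two flows above easier to follow, here is a toy user-space model of the flags involved. It is only a sketch: `toy_pte`, `toy_page` and the `toy_*` helpers are hypothetical names invented for illustration, not kernel structures, and each helper only mirrors the role of the kernel function named in its comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the pte and struct page flags discussed above. */
struct toy_pte  { bool present; bool writable; bool dirty; };
struct toy_page { bool dirty; int write_faults; };

/* Write fault: covers the first touch and the later write-protect fault
 * (the role played by do_wp_page -> page_mkwrite -> set_page_dirty). */
static void toy_write_fault(struct toy_pte *pte, struct toy_page *page)
{
    assert(pte && page);
    page->write_faults++;
    page->dirty = true;    /* page is dirty again, so writeback will pick it up */
    pte->present = true;
    pte->writable = true;  /* pte becomes writable again */
    pte->dirty = true;     /* pte marked dirty */
}

/* A store through the mapping: it faults only when the pte is not writable. */
static void toy_store(struct toy_pte *pte, struct toy_page *page)
{
    if (!pte->present || !pte->writable)
        toy_write_fault(pte, page);
    else
        pte->dirty = true; /* case 1: no fault, flags are still set */
}

/* Writeback: the role of clear_page_dirty_for_io -> page_mkclean_one.
 * Returns true if the page was dirty and would have been submitted for I/O. */
static bool toy_writeback(struct toy_pte *pte, struct toy_page *page)
{
    if (!page->dirty)
        return false;
    pte->writable = false; /* pte_wrprotect */
    pte->dirty = false;    /* the kernel also cleans the pte at this point */
    page->dirty = false;
    return true;
}
```

In this model a store after `toy_writeback` always goes through `toy_write_fault`, which is why the page can never be modified without being marked dirty first.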
Probing with SystemTap gives the following stack:
Returning from: 0xffffffff811292b0 : set_page_dirty+0x0/0x70 [kernel]
Returning to : 0xffffffff811b0a1f : __block_page_mkwrite+0xdf/0x120 [kernel]
0xffffffffa022d39a : ext4_page_mkwrite+0x17a/0x390 [ext4]
0xffffffff8113f89e : do_wp_page+0x5ee/0x8d0 [kernel]
0xffffffff8114036d : handle_pte_fault+0x2dd/0xb70 [kernel]
0xffffffff81140de4 : handle_mm_fault+0x1e4/0x2b0 [kernel]
0xffffffff81042b39 : __do_page_fault+0x139/0x490 [kernel]
0xffffffff814fbfce : do_page_fault+0x3e/0xa0 [kernel]
0xffffffff814f9325 : page_fault+0x25/0x30 [kernel]
ffffea0003b22398
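For anyone who wants to reproduce case 2 from user space, the following is a minimal sketch, assuming a Linux system and a scratch file on a disk-backed filesystem such as ext4. The function name `rewrite_after_writeback` and the `path` argument are made up for this example. It writes through a shared mapping, forces writeback with `msync(MS_SYNC)`, then writes again. Note that the `pread` checks only confirm the data is visible through the file (they are served from the page cache); seeing the second fault itself still requires a probe like the SystemTap one above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if both writes are visible in the file, -1 on error or mismatch.
 * path is a hypothetical scratch file; it is created and removed here. */
int rewrite_after_writeback(const char *path)
{
    char buf[8] = {0};
    char *map = MAP_FAILED;
    int rc = -1;
    long psz = sysconf(_SC_PAGESIZE);
    size_t len;

    if (psz <= 0)
        return -1;
    len = (size_t)psz;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0)
        goto out_close;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    memcpy(map, "first", 5);              /* first write: fault, page dirtied */
    if (msync(map, len, MS_SYNC) != 0)    /* force writeback of the dirty page */
        goto out_unmap;
    if (pread(fd, buf, 5, 0) != 5 || memcmp(buf, "first", 5) != 0)
        goto out_unmap;

    memcpy(map, "again", 5);              /* write after writeback (case 2) */
    if (msync(map, len, MS_SYNC) != 0)
        goto out_unmap;
    if (pread(fd, buf, 5, 0) != 5 || memcmp(buf, "again", 5) != 0)
        goto out_unmap;

    rc = 0;
out_unmap:
    munmap(map, len);
out_close:
    close(fd);
    unlink(path);
    return rc;
}
```

Calling `rewrite_after_writeback("/tmp/mmap_demo.bin")` from a small `main` while the probe is active should produce a `set_page_dirty` hit for the second `memcpy`, provided the file lives on a filesystem that implements `page_mkwrite`.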