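A question about the page_launder() function
Last edited by _nosay on 2016-02-22 17:30
The page_launder() function in the Linux-2.4.0 source:
At first I thought this function scans the list in two passes because it implements the "second-chance algorithm", but apparently it does not. Once a page is on inactive_dirty_list it has effectively been "sentenced to death": it has to be written to swap once before it becomes clean, so there is no point in ordering "you die first, I die later" the way second chance does. My guess is therefore that the "second-chance algorithm" is used to pick the pages that get moved onto inactive_dirty_list, not to pick which pages on inactive_dirty_list get written to swap.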
The comment on this function contains this sentence:
Since we want to refill those pages as soon as possible, we'll make two loops over the inactive list, one to move the already cleaned pages to the inactive_clean lists and one to (often asynchronously) clean the dirty inactive pages.
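If free pages are still short after the first pass, the function runs a second pass that writes all the pages on the list out to swap (i.e. launders them clean). For some time afterwards, when this function runs again, the first pass alone can usually resolve the shortage, and all it has to do is move the pages laundered by the previous second pass straight onto page->zone->inactive_clean_list. That is what keeps the function fast most of the time.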
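I have been reading this function for a long time and finally feel I somewhat understand it. Is my understanding correct?
Even if it is, I still have questions:
① Where is the benefit in sacrificing one call to speed up most of the others? It seems to have a drawback instead: during the second pass, if inactive_dirty_list is long, won't the kernel "stall"? And with pages being moved back and forth between lists, overall performance also takes a hit (unless the kernel can guarantee that the second pass mostly runs while the CPU is idle).
② The second pass only writes the pages on the list out to swap; it does not also move them onto page->zone->inactive_clean_list. So the number of pages laundered by this call, cleaned_pages, is still the result of the first pass. Couldn't that make the subsequent code take some "detours" because the laundered pages still do not meet demand?
③ Logically, shouldn't clean pages form at most one contiguous run on inactive_dirty_list, and always at the front? If so, once the first pass hits a dirty page, everything after it must be dirty too, and moving each of them in turn to the tail of inactive_dirty_list just goes round in a circle and restores the original order, doesn't it? Moreover, the second pass launders all the remaining pages on the list rather than stopping partway, so what is the point of "hiding" at the back, let alone the fact that, because every page tries to "hide", the original order is restored anyway?
I would like to help take a look, but my kernel source no longer has a page_launder() function. Could the original poster paste the function here?
Reply to 2# Buddy_Zhang1
I wonder whether posting code to the forum like this will get me in trouble with my company! I took a big risk, so please do think this through with me.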
/**
 * page_launder - clean dirty inactive pages, move to inactive_clean list
 * @gfp_mask: what operations we are allowed to do
 * @sync: should we wait synchronously for the cleaning of pages
 *
 * When this function is called, we are most likely low on free +
 * inactive_clean pages. Since we want to refill those pages as
 * soon as possible, we'll make two loops over the inactive list,
 * one to move the already cleaned pages to the inactive_clean lists
 * and one to (often asynchronously) clean the dirty inactive pages.
 *
 * In situations where kswapd cannot keep up, user processes will
 * end up calling this function. Since the user process needs to
 * have a page before it can continue with its allocation, we'll
 * do synchronous page flushing in that case.
 *
 * This code is heavily inspired by the FreeBSD source code. Thanks
 * go out to Matthew Dillon.
 */
#define MAX_LAUNDER (4 * (1 << page_cluster))
int page_launder(int gfp_mask, int sync)
{
    int launder_loop, maxscan, cleaned_pages, maxlaunder;
    int can_get_io_locks;
    struct list_head * page_lru;
    struct page * page;

    /*
     * We can only grab the IO locks (eg. for flushing dirty
     * buffers to disk) if __GFP_IO is set.
     */
    can_get_io_locks = gfp_mask & __GFP_IO;

    launder_loop = 0;
    maxlaunder = 0;
    cleaned_pages = 0;
dirty_page_rescan:
    spin_lock(&pagemap_lru_lock);
    maxscan = nr_inactive_dirty_pages;
    while ((page_lru = inactive_dirty_list.prev) != &inactive_dirty_list &&
                maxscan-- > 0) {
        page = list_entry(page_lru, struct page, lru);

        /* Wrong page on list?! (list corruption, should not happen) */
        if (!PageInactiveDirty(page)) {
            printk("VM: page_launder, wrong page on list.\n");
            list_del(page_lru);
            nr_inactive_dirty_pages--;
            page->zone->inactive_dirty_pages--;
            continue;
        }

        /* Page is or was in use? Move it to the active list. */
        if (PageTestandClearReferenced(page) || page->age > 0 ||
                (!page->buffers && page_count(page) > 1) ||
                page_ramdisk(page)) {
            del_page_from_inactive_dirty_list(page);
            add_page_to_active_list(page);
            continue;
        }

        /*
         * The page is locked. IO in progress?
         * Move it to the back of the list.
         */
        if (TryLockPage(page)) {
            list_del(page_lru);
            list_add(page_lru, &inactive_dirty_list);
            continue;
        }

        /*
         * Dirty swap-cache page? Write it out if
         * last copy..
         */
        if (PageDirty(page)) {
            int (*writepage)(struct page *) = page->mapping->a_ops->writepage;
            int result;

            if (!writepage)
                goto page_active;

            /* First time through? Move it to the back of the list */
            if (!launder_loop) {
                list_del(page_lru);
                list_add(page_lru, &inactive_dirty_list);
                UnlockPage(page);
                continue;
            }

            /* OK, do a physical asynchronous write to swap.*/
            ClearPageDirty(page);
            page_cache_get(page);
            spin_unlock(&pagemap_lru_lock);

            result = writepage(page);
            page_cache_release(page);

            /* And re-start the thing.. */
            spin_lock(&pagemap_lru_lock);
            if (result != 1)
                continue;
            /* writepage refused to do anything */
            set_page_dirty(page);
            goto page_active;
        }

        /*
         * If the page has buffers, try to free the buffer mappings
         * associated with this page. If we succeed we either free
         * the page (in case it was a buffercache only page) or we
         * move the page to the inactive_clean list.
         *
         * On the first round, we should free all previously cleaned
         * buffer pages
         */
        if (page->buffers) {
            int wait, clearedbuf;
            int freed_page = 0;
            /*
             * Since we might be doing disk IO, we have to
             * drop the spinlock and take an extra reference
             * on the page so it doesn't go away from under us.
             */
            del_page_from_inactive_dirty_list(page);
            page_cache_get(page);
            spin_unlock(&pagemap_lru_lock);

            /* Will we do (asynchronous) IO? */
            if (launder_loop && maxlaunder == 0 && sync)
                wait = 2;   /* Synchronous IO */
            else if (launder_loop && maxlaunder-- > 0)
                wait = 1;   /* Async IO */
            else
                wait = 0;   /* No IO */

            /* Try to free the page buffers. */
            clearedbuf = try_to_free_buffers(page, wait);

            /*
             * Re-take the spinlock. Note that we cannot
             * unlock the page yet since we're still
             * accessing the page_struct here...
             */
            spin_lock(&pagemap_lru_lock);

            /* The buffers were not freed. */
            if (!clearedbuf) {
                add_page_to_inactive_dirty_list(page);

            /* The page was only in the buffer cache. */
            } else if (!page->mapping) {
                atomic_dec(&buffermem_pages);
                freed_page = 1;
                cleaned_pages++;

            /* The page has more users besides the cache and us. */
            } else if (page_count(page) > 2) {
                add_page_to_active_list(page);

            /* OK, we "created" a freeable page. */
            } else /* page->mapping && page_count(page) == 2 */ {
                add_page_to_inactive_clean_list(page);
                cleaned_pages++;
            }

            /*
             * Unlock the page and drop the extra reference.
             * We can only do it here because we are accessing
             * the page struct above.
             */
            UnlockPage(page);
            page_cache_release(page);

            /*
             * If we're freeing buffer cache pages, stop when
             * we've got enough free memory.
             */
            if (freed_page && !free_shortage())
                break;
            continue;
        } else if (page->mapping && !PageDirty(page)) {
            /*
             * If a page had an extra reference in
             * deactivate_page(), we will find it here.
             * Now the page is really freeable, so we
             * move it to the inactive_clean list.
             */
            del_page_from_inactive_dirty_list(page);
            add_page_to_inactive_clean_list(page);
            UnlockPage(page);
            cleaned_pages++;
        } else {
page_active:
            /*
             * OK, we don't know what to do with the page.
             * It's no use keeping it here, so we move it to
             * the active list.
             */
            del_page_from_inactive_dirty_list(page);
            add_page_to_active_list(page);
            UnlockPage(page);
        }
    }
    spin_unlock(&pagemap_lru_lock);

    /*
     * If we don't have enough free pages, we loop back once
     * to queue the dirty pages for writeout. When we were called
     * by a user process (that /needs/ a free page) and we didn't
     * free anything yet, we wait synchronously on the writeout of
     * MAX_SYNC_LAUNDER pages.
     *
     * We also wake up bdflush, since bdflush should, under most
     * loads, flush out the dirty pages before we have to wait on
     * IO.
     */
    if (can_get_io_locks && !launder_loop && free_shortage()) {
        launder_loop = 1;
        /* If we cleaned pages, never do synchronous IO. */
        if (cleaned_pages)
            sync = 0;
        /* We only do a few "out of order" flushes. */
        maxlaunder = MAX_LAUNDER;
        /* Kflushd takes care of the rest. */
        wakeup_bdflush(0);
        goto dirty_page_rescan;
    }

    /* Return the number of pages moved to the inactive_clean list. */
    return cleaned_pages;
}
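Reply to 2# Buddy_Zhang1
The book says this function is executed by multiple threads.
If it is not proprietary code, you can simply cite a public LXR, although there are probably not many left that still carry 2.4.
Reply to 5# nswcfd
I am reading 《Linux内核源代码情景分析》, which covers version 2.4.0.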