_nosay posted on 2016-02-22 16:31

Questions about the page_launder() function

Last edited by _nosay on 2016-02-22 17:30

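This is about the page_launder() function in the Linux-2.4.0 source:
At first I assumed this function scans in two passes because it implements the "second-chance algorithm", but that does not seem to be the case. Once a page has reached inactive_dirty_list it has effectively been sentenced to "death": it has to be written out to the swap area once before it can become clean, so there is no point in ordering "who dies first" the way second chance does. My guess is therefore that second chance is used to select the pages that get moved onto inactive_dirty_list, not to select which pages on inactive_dirty_list get written to swap.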
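
For reference, the textbook second-chance (clock) policy mentioned above can be sketched roughly as follows. This is the generic algorithm only, not code taken from Linux 2.4.0; `struct sc_frame` and `second_chance_pick()` are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame descriptor, for illustration only. */
struct sc_frame {
    int id;          /* page identifier */
    int referenced;  /* set when the page has been accessed */
};

/*
 * Pick a victim among n frames arranged in a circle, starting at *hand.
 * A referenced frame has its bit cleared and is skipped once (its
 * "second chance"); the first unreferenced frame becomes the victim.
 * Returns the victim's index and leaves *hand just past it.
 */
size_t second_chance_pick(struct sc_frame *frames, size_t n, size_t *hand)
{
    assert(frames != NULL && hand != NULL && n > 0 && *hand < n);

    for (;;) {
        size_t idx = *hand;

        *hand = (*hand + 1) % n;
        if (!frames[idx].referenced)
            return idx;
        frames[idx].referenced = 0;   /* spare it this time round */
    }
}
```

The loop always terminates: after at most one full circle every reference bit has been cleared, so the next frame visited is chosen.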

The comment above this function contains the following sentence:
Since we want to refill those pages as soon as possible, we'll make two loops over the inactive list, one to move the already cleaned pages to the inactive_clean lists and one to (often asynchronously) clean the dirty inactive pages.

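If free pages are still short after the first pass, the function runs a second pass that writes every page left on the list out to the swap area (i.e. launders them clean). For some time afterwards, whenever this function runs again, the first pass alone can usually resolve the shortage, and all it has to do is move the pages laundered during the previous second pass straight to page->zone->inactive_clean_list. That keeps the function fast most of the time.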
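
The two-pass behaviour described above can be sketched with a small user-space toy model. Everything here is an assumption made for illustration: the `toy_*` names, the array standing in for the list, the `wanted` parameter standing in for the kernel's free_shortage() check, and the simplification that a queued write makes the page clean by the time of the next call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PAGES 16

struct toy_page {
    int id;
    int dirty;
};

/* dirty_list[0] is the head; dirty_list[ndirty - 1] is the tail being scanned. */
struct toy_lists {
    struct toy_page dirty_list[TOY_MAX_PAGES];
    size_t ndirty;   /* pages on the toy inactive-dirty list */
    size_t nclean;   /* pages moved to the toy inactive-clean list */
    size_t nwrites;  /* write-outs queued so far */
};

/* Analogue of list_del() + list_add(): the tail page goes to the head. */
static void toy_tail_to_head(struct toy_lists *l)
{
    struct toy_page tail;

    assert(l->ndirty > 0 && l->ndirty <= TOY_MAX_PAGES);
    tail = l->dirty_list[l->ndirty - 1];
    memmove(&l->dirty_list[1], &l->dirty_list[0],
            (l->ndirty - 1) * sizeof tail);
    l->dirty_list[0] = tail;
}

/* One scan from the tail; returns how many pages moved to the clean list. */
static size_t toy_scan(struct toy_lists *l, int launder_loop)
{
    size_t cleaned = 0;
    size_t maxscan = l->ndirty;
    size_t i;

    for (i = 0; i < maxscan && l->ndirty > 0; i++) {
        struct toy_page *tail = &l->dirty_list[l->ndirty - 1];

        if (!tail->dirty) {          /* already clean: move it over */
            l->ndirty--;
            l->nclean++;
            cleaned++;
            continue;
        }
        if (launder_loop) {          /* second pass: queue the write */
            tail->dirty = 0;         /* toy assumption: clean by next call */
            l->nwrites++;
        }
        toy_tail_to_head(l);         /* dirty page goes to the back */
    }
    return cleaned;
}

/* First pass always runs; the second only if too few pages were cleaned. */
size_t toy_page_launder(struct toy_lists *l, size_t wanted)
{
    size_t cleaned = toy_scan(l, 0);

    if (cleaned < wanted)
        cleaned += toy_scan(l, 1);
    return cleaned;
}
```

In this model a call that needs the second pass still returns only the first-pass count, while the following call finds the previously written pages clean and finishes in a single pass without queuing any new writes.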

I have stared at this function for a long time and finally feel I partly understand it. Is my understanding correct?

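Even if it is, I still have questions:
① What is the actual benefit of sacrificing one call to speed up most of the others? There is also a drawback: if inactive_dirty_list is long, won't the kernel "stall" during the second pass? And shuffling pages around the list costs performance overall (unless the kernel can guarantee that the second pass mostly runs while the CPU is idle).
② The second pass only writes the pages on the list out to swap; it does not also move them to page->zone->inactive_clean_list. So cleaned_pages, the number of pages laundered by this call to page_launder(), is still just the result of the first pass. Couldn't that send the subsequent code on a "detour", because the number of laundered pages still falls short of the demand?
③ Logically, shouldn't the clean pages on inactive_dirty_list form at most one contiguous region, and always at the very front? If so, once the first pass hits a dirty page, everything after it must be dirty too, and moving each of those pages in turn to the back of inactive_dirty_list just goes around in a circle and restores the original order. Besides, the second pass launders all the remaining pages on the list rather than stopping partway, so what is the point of "hiding" at the back, especially when everyone "hides" and the original formation simply comes back?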
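
The rotation described in question ③ can be checked with a short sketch. It assumes a list holding only dirty pages, none of which is locked or moved to the active list; the function names and the use of a plain `int` array of page ids are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Move the last element of ids[0..n-1] to the front. This mirrors taking
 * inactive_dirty_list.prev and re-adding it with list_add() at the head.
 */
void move_tail_to_head(int *ids, size_t n)
{
    int tail;

    assert(ids != NULL && n > 0);
    tail = ids[n - 1];
    memmove(&ids[1], &ids[0], (n - 1) * sizeof ids[0]);
    ids[0] = tail;
}

/*
 * Simulate a first pass that finds only dirty pages: maxscan equals the
 * list length, and every page scanned is moved from the tail to the head.
 */
void first_pass_all_dirty(int *ids, size_t n)
{
    size_t maxscan = n;
    size_t i;

    for (i = 0; i < maxscan; i++)
        move_tail_to_head(ids, n);
}
```

Under these assumptions a full first pass is a complete rotation, so the list ends up in its original order, which matches the observation in ③.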

Buddy_Zhang1 posted on 2016-02-22 17:29

I'd like to help you look at this, but my kernel tree no longer has a page_launder() function. Could you paste the function here?

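_nosay posted on 2016-02-22 17:33

Reply to 2# Buddy_Zhang1

I keep posting code to the forum like this lately; I wonder whether my company will come after me for it! I took a big risk here, so please think it over carefully for me.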
/**
 * page_launder - clean dirty inactive pages, move to inactive_clean list
 * @gfp_mask: what operations we are allowed to do
 * @sync: should we wait synchronously for the cleaning of pages
 *
 * When this function is called, we are most likely low on free +
 * inactive_clean pages. Since we want to refill those pages as
 * soon as possible, we'll make two loops over the inactive list,
 * one to move the already cleaned pages to the inactive_clean lists
 * and one to (often asynchronously) clean the dirty inactive pages.
 *
 * In situations where kswapd cannot keep up, user processes will
 * end up calling this function. Since the user process needs to
 * have a page before it can continue with its allocation, we'll
 * do synchronous page flushing in that case.
 *
 * This code is heavily inspired by the FreeBSD source code. Thanks
 * go out to Matthew Dillon.
 */
#define MAX_LAUNDER                 (4 * (1 << page_cluster))
int page_launder(int gfp_mask, int sync)
{
        int launder_loop, maxscan, cleaned_pages, maxlaunder;
        int can_get_io_locks;
        struct list_head * page_lru;
        struct page * page;

        /*
         * We can only grab the IO locks (eg. for flushing dirty
         * buffers to disk) if __GFP_IO is set.
         */
        can_get_io_locks = gfp_mask & __GFP_IO;

        launder_loop = 0;
        maxlaunder = 0;
        cleaned_pages = 0;

dirty_page_rescan:
        spin_lock(&pagemap_lru_lock);
        maxscan = nr_inactive_dirty_pages;
        while ((page_lru = inactive_dirty_list.prev) != &inactive_dirty_list &&
                                maxscan-- > 0) {
                page = list_entry(page_lru, struct page, lru);

                /* Wrong page on list?! (list corruption, should not happen) */
                if (!PageInactiveDirty(page)) {
                        printk("VM: page_launder, wrong page on list.\n");
                        list_del(page_lru);
                        nr_inactive_dirty_pages--;
                        page->zone->inactive_dirty_pages--;
                        continue;
                }

                /* Page is or was in use? Move it to the active list. */
                if (PageTestandClearReferenced(page) || page->age > 0 ||
                                (!page->buffers && page_count(page) > 1) ||
                                page_ramdisk(page)) {
                        del_page_from_inactive_dirty_list(page);
                        add_page_to_active_list(page);
                        continue;
                }

                /*
                 * The page is locked. IO in progress?
                 * Move it to the back of the list.
                 */
                if (TryLockPage(page)) {
                        list_del(page_lru);
                        list_add(page_lru, &inactive_dirty_list);
                        continue;
                }

                /*
                 * Dirty swap-cache page? Write it out if
                 * last copy..
                 */
                if (PageDirty(page)) {
                        int (*writepage)(struct page *) = page->mapping->a_ops->writepage;
                        int result;

                        if (!writepage)
                                goto page_active;

                        /* First time through? Move it to the back of the list */
                        if (!launder_loop) {
                                list_del(page_lru);
                                list_add(page_lru, &inactive_dirty_list);
                                UnlockPage(page);
                                continue;
                        }

                        /* OK, do a physical asynchronous write to swap. */
                        ClearPageDirty(page);
                        page_cache_get(page);
                        spin_unlock(&pagemap_lru_lock);

                        result = writepage(page);
                        page_cache_release(page);

                        /* And re-start the thing.. */
                        spin_lock(&pagemap_lru_lock);
                        if (result != 1)
                                continue;
                        /* writepage refused to do anything */
                        set_page_dirty(page);
                        goto page_active;
                }

                /*
                 * If the page has buffers, try to free the buffer mappings
                 * associated with this page. If we succeed we either free
                 * the page (in case it was a buffercache only page) or we
                 * move the page to the inactive_clean list.
                 *
                 * On the first round, we should free all previously cleaned
                 * buffer pages
                 */
                if (page->buffers) {
                        int wait, clearedbuf;
                        int freed_page = 0;
                        /*
                         * Since we might be doing disk IO, we have to
                         * drop the spinlock and take an extra reference
                         * on the page so it doesn't go away from under us.
                         */
                        del_page_from_inactive_dirty_list(page);
                        page_cache_get(page);
                        spin_unlock(&pagemap_lru_lock);

                        /* Will we do (asynchronous) IO? */
                        if (launder_loop && maxlaunder == 0 && sync)
                                wait = 2;        /* Synchronous IO */
                        else if (launder_loop && maxlaunder-- > 0)
                                wait = 1;        /* Async IO */
                        else
                                wait = 0;        /* No IO */

                        /* Try to free the page buffers. */
                        clearedbuf = try_to_free_buffers(page, wait);

                        /*
                         * Re-take the spinlock. Note that we cannot
                         * unlock the page yet since we're still
                         * accessing the page_struct here...
                         */
                        spin_lock(&pagemap_lru_lock);

                        /* The buffers were not freed. */
                        if (!clearedbuf) {
                                add_page_to_inactive_dirty_list(page);

                        /* The page was only in the buffer cache. */
                        } else if (!page->mapping) {
                                atomic_dec(&buffermem_pages);
                                freed_page = 1;
                                cleaned_pages++;

                        /* The page has more users besides the cache and us. */
                        } else if (page_count(page) > 2) {
                                add_page_to_active_list(page);

                        /* OK, we "created" a freeable page. */
                        } else /* page->mapping && page_count(page) == 2 */ {
                                add_page_to_inactive_clean_list(page);
                                cleaned_pages++;
                        }

                        /*
                         * Unlock the page and drop the extra reference.
                         * We can only do it here because we are accessing
                         * the page struct above.
                         */
                        UnlockPage(page);
                        page_cache_release(page);

                        /*
                         * If we're freeing buffer cache pages, stop when
                         * we've got enough free memory.
                         */
                        if (freed_page && !free_shortage())
                                break;
                        continue;
                } else if (page->mapping && !PageDirty(page)) {
                        /*
                         * If a page had an extra reference in
                         * deactivate_page(), we will find it here.
                         * Now the page is really freeable, so we
                         * move it to the inactive_clean list.
                         */
                        del_page_from_inactive_dirty_list(page);
                        add_page_to_inactive_clean_list(page);
                        UnlockPage(page);
                        cleaned_pages++;
                } else {
page_active:
                        /*
                         * OK, we don't know what to do with the page.
                         * It's no use keeping it here, so we move it to
                         * the active list.
                         */
                        del_page_from_inactive_dirty_list(page);
                        add_page_to_active_list(page);
                        UnlockPage(page);
                }
        }
        spin_unlock(&pagemap_lru_lock);

        /*
         * If we don't have enough free pages, we loop back once
         * to queue the dirty pages for writeout. When we were called
         * by a user process (that /needs/ a free page) and we didn't
         * free anything yet, we wait synchronously on the writeout of
         * MAX_SYNC_LAUNDER pages.
         *
         * We also wake up bdflush, since bdflush should, under most
         * loads, flush out the dirty pages before we have to wait on
         * IO.
         */
        if (can_get_io_locks && !launder_loop && free_shortage()) {
                launder_loop = 1;
                /* If we cleaned pages, never do synchronous IO. */
                if (cleaned_pages)
                        sync = 0;
                /* We only do a few "out of order" flushes. */
                maxlaunder = MAX_LAUNDER;
                /* Kflushd takes care of the rest. */
                wakeup_bdflush(0);
                goto dirty_page_rescan;
        }

        /* Return the number of pages moved to the inactive_clean list. */
        return cleaned_pages;
}
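
The way the function above chooses between no IO, asynchronous IO and synchronous IO for buffer pages can be isolated into a small sketch. The conditions are copied from the listing; the `toy_*` names, the enum, and passing `page_cluster` and `maxlaunder` as parameters are assumptions made so the logic can be run on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three values assigned to "wait" above. */
enum toy_wait { TOY_NO_IO = 0, TOY_ASYNC_IO = 1, TOY_SYNC_IO = 2 };

/* Same formula as the MAX_LAUNDER macro, with page_cluster passed in. */
int toy_max_launder(int page_cluster)
{
    assert(page_cluster >= 0 && page_cluster < 16);
    return 4 * (1 << page_cluster);
}

/*
 * Mirror of the if/else-if/else chain in the listing. *maxlaunder is
 * decremented whenever the second condition is evaluated in the second
 * pass, exactly as "maxlaunder-- > 0" does in the original.
 */
int toy_pick_wait(int launder_loop, int *maxlaunder, int sync)
{
    assert(maxlaunder != NULL);

    if (launder_loop && *maxlaunder == 0 && sync)
        return TOY_SYNC_IO;
    else if (launder_loop && (*maxlaunder)-- > 0)
        return TOY_ASYNC_IO;
    else
        return TOY_NO_IO;
}
```

Read this way, the first pass never starts IO for buffer pages, the second pass starts at most MAX_LAUNDER asynchronous flushes, and synchronous waiting begins only once that budget is used up and sync is still set. Note that sync is cleared before the second pass if the first pass already cleaned something.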

_nosay posted on 2016-02-22 18:04

Reply to 2# Buddy_Zhang1

The book says this function is executed by multiple threads.


   

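nswcfd posted on 2016-02-23 10:52

If it isn't proprietary code, you can just link to a public LXR, although there are probably not many that still host 2.4 these days.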

_nosay posted on 2016-02-23 14:07

Reply to 5# nswcfd

I'm reading 《Linux内核源代码情景分析》, which is based on version 2.4.0.