免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 1402 | 回复: 0
打印 上一主题 下一主题

Anatomizing Linux scheduler: CFS 2 [复制链接]

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2009-08-26 10:36 |只看该作者 |倒序浏览
Linux uses sched_entity to represent a scheduling entity, which is contained by task_struct. So when a sched_entity is used, we can easily locate the corresponding task.
   struct task_struct  {
      ...
      struct sched_entity se;
      ...
   };
When a pointer to scheduling entity se, the task p is obtained:
   p = container_of(se, struct task_struct, se);
Now how cfs_rq.fair_clock and se.wait_runtime to calculate? Firstly let us talk about the design principle.
A task is in run queue or in sleep queue. When it is ready, it is in run queue. This implies that the task is ready to run. It may not execute. When it wants some unavailable resources(e.g. reading data from disk), it will dequeue from run queue and enqueue to sleep queue. The CfS scheduler is going to garantee the above-mentioned fairness in the runqueues. Now let's assume that there are 3 tasks(A, B, C) in the runqueue. Also we assume that the time interval achieved fairness is delta_time.
        A, B, C -- task
        delta_time -- time interval achieved fairness
        Note: A, B, C also represent the corresponding sched_entity of task A, B, C if not specified
Now we can get the following equation:

        delta_time = delta_exec_A + delta_exec_B + delta_exec_C
           where delta_X is the executing time of task A, B, or C
The wait_runtime of A, B, C is:
        delta_wait_A = delta_time - delta_exec_A
        delta_wait_B = delta_time - delta_exec_B
        delta_wait_C = delta_time - delta_exec_C
For efficiecy, we should calculate all needed values in task context other than tranversing the run queues. This means that we should not do as follows:
        delta_wait_A = delta_exec_B + delta_exec_C;
        delta_wait_C = delta_exec_A + delta_exec_B;
The fair_key of sched_entity is as above-mentioned:
        se.fair_key = cfs_rq.fair_clock - se.run_waittime
So the delta_wait_A is larger, se.fair_key is smaller. Note that cfs_rq.fair_clock is a clock to cfs scheduler. Since cfs_rq.fair_clock is clock, it increments with real elapsed time. delta_exec_A, delta_excu_B, delta_exec_C is real time. The cfs_rq.fair_clock is calculated as follows:
       cfs_rq.fair_clock += delta_exec_time * NICE_O_LOAD / cfs_rq.raw_weighted_load
         where cfs_rq.raw_weighted_load = se(A).load_weight + se(B).load_weight + se(C).load_weight;
The time interval can be calculated as follows: now let's consider the scheduling procedure.  A(1) -> B(1) -> C(1) -> A(2) -> ...
As above mentioned, the complete fairness can be achieved if in any time interval A, B, C should execute according to the load_weight(representing the priority). A(1) is executed at time T1 and successive executed at Time T2. The time interval is :
       T2 - T1
T2 can get from system jiffies. It is 'now' time and equals to current jiffies. T1 can be stored in the previous executed time. Then the time interval is:
       now - prev_stored_now
So when A is executed,
       cfs_rq.fair_clock += (Now - prev_stored_now) * NICE_0_LOAD / cfs_rq.raw_weighted_load
when B is executed,
       cfs_rq.fair_clock += (Now - prev_stored_now) * NICE_0_LOAD / cfs_rq.raw_weighted_load
and etc.
to be continued...


本文来自ChinaUnix博客,如果查看原文请点:http://blog.chinaunix.net/u3/103132/showart_2037281.html
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP