免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 1075 | 回复: 0

The analysis of I/O schedulers(2) [复制链接]

论坛徽章:
0
发表于 2010-02-05 16:02 |显示全部楼层

                The analysis of I/O schedulers(2)
       
       
       
       
AS:
The full
name of this I/O scheduler is anticipatory scheduling. In a simple
word, when it is the time to select which the next request to be
dispatched, the I/O scheduler will make a decision whether it is
worth to delay to dispatch the next request which has been selected
because another next request might arrive soon and this request might
be more worth to be dispatched than the next request which has been
selected currently. For example, suppose there are two processes
creating I/O requests concurrently, at any moment, the disk head can
just stay at one position, let's suppose at this time the I/O
scheduler is serving process A. It is usual that a process at first
reads some data from disk, then waits for a moment, then reads again.
The time during the two read operation is called deceptive idleness.
Why the author used the word “deceptive”is that this process is
not in the real state of idleness, because very soon, the process
will create I/O requests at once. Let's go back to our example,
suppose the disk head is serving process A. But a moment later, there
is no requests from the process A, at the same time, process B is
creating requests. The I/O scheduler should make decision now. Should
the disk head move to serve process B at once or should the disk head
stay for a moment to wait for additional requests from A? It is
obvious that if the process A will create requests at once, the disk
head will avoid unnecessary movement between A and B. And the shrink
of disk movement between A and B will be decreased.
Above is
the basic background of anticipatory scheduling. In Linux kernel, AS
is based on DEADLINE. In another word, the basic design idea of AS is
CSCAN, with the component of anticipatory.
So, there
are also two queues like DEADLINE in AS. One is sorted queue, the
other is expired queue( I use deadline queue when I describe
DEADLINE). The basic principle of AS is the same with DEADLINE,
whenever the I/O scheduler selects the requests to be dispatched, it
first checks whether the request on the expired queue has expired, if
yes,  then dispatch this request at once. If no, the request on the
sorted queue will be selected, then I/O scheduler will run a routine
called as_can_anticipate to decide whether anticipation should do.
Let's
discuss it in detail.
First, the
I/O scheduler should get the next request to dispatch. But how to get
the next request? Which request in the sorted queue should be the
next request? The routine as_find_next_rq will do this job. In a
simple word, the heuristic to get the next request is below: the next
request should be the nearest request to the current disk head. In
DEADLINE, nearest means the next request of the sorted queue, in AS,
it is a little more complicated, the I/O scheduler will compare the
left and right(or back and front) requests of the current one, if the
distance between left and the current is smaller than the half
distance between the right and current, the left request will be
selected. The design idea behind this is very simple, in the original
CSCAN, the disk head will move in the same direction, but in AS, the
disk head has the chance to move backward.
Once the
I/O scheduler has selected the request to be dispatched from the
sorted queue, it is still not the time to dispatch. Then, AS will
look at the expired queue to check whether the time of the first
request has expired. As I have said, if it is, the request of the
expired queue will be selected and dispatched at once. Here, let's
suppose the time is not expired. Since the request has been selected,
AS now will check whether the anticipation should be done. The
routine as_can_anticipate will do this job. Thought I do not
understand all of these conditions, I just list what I understand
about them, maybe the best way is to follow the source code.
/*
there
are some other tricks in the routine of as_can_anticipate, such as if
the last request is write, anticipation will not be done. Here I just
analysis the key part of this routine, actually, as_can_anticipate
invokes the routine as_can_break_anticipation, in another word, if
as_can_break_anticipation returns 1, which means the anticipation
will not be done.
[color="#0000ff"]*/
static int
as_can_break_anticipation(struct as_data *ad, struct request *rq)
{
        struct
io_context *ioc;
        struct
as_io_context *aic;
        ioc
= ad->io_context;
        BUG_ON(!ioc);
        spin_lock(&ioc->lock);
[color="#0000ff"]/*
this
is what I understand of AS, if the current request is from the same
process, anticipation will not be done, this request should be
dispatched into device driver at once.
[color="#0000ff"]*/
        if
(rq && ioc == RQ_IOC(rq)) {
                /*
request from same process */
               
spin_unlock(&ioc->lock);
                return
1;
        }
/*
each
time when a request is dispatched into device driver,
ad->ioc_finished will be set to 0, when the request has been
finished, and if the latest dispatched request is from the same
process with this request, this value will be set to 1. I think we
can simply understand it as when ad->ioc_finished equals 1, it
means the previous requests from the process has been finished. So,
this condition can be understood as if the anticipation time the
process has been expired, the anticipation will not be done.
*/
        if
(ad->ioc_finished && as_antic_expired(ad)) {
                /*
                 *
In this situation status should really be FINISHED,
                 *
however the timer hasn't had the chance to run yet.
                 */
               
spin_unlock(&ioc->lock);
                return
1;
        }
/*
I
am not sure what's the meaning of these codes, why is aic NULL? Is it
because of the first request from the process?
*/
        aic
= ioc->aic;
        if
(!aic) {
               
spin_unlock(&ioc->lock);
                return
0;
        }
/*
aic->nr_queued
means more than one request have been queued in AS' queue. Here, if
the request is synchronize, there is at most one request in the queue
for a process. Also, aic->nr_dispatched means more than one
request have been dispatched into device driver and they are not
finished yet.
What
the anticipation wants to do is to wait the next request, since there
have been more than one from a process in the queue, it is
unnecessary to do anticipation any more.
*/
        if
(atomic_read(&aic->nr_queued) > 0) {
                /*
process has more requests queued */
               
spin_unlock(&ioc->lock);
                return
1;
        }
        if
(atomic_read(&aic->nr_dispatched) > 0) {
                /*
process has more requests dispatched */
               
spin_unlock(&ioc->lock);
                return
1;
        }
/*
If
the request is sync and it is near the last request, anticipation
will not be done.
*/
        if
(rq && rq_is_sync(rq) && as_close_req(ad, aic, rq))
{
                /*
                 *
Found a close request that is not one of ours.
                 *
                 *
This makes close requests from another process update
                 *
our IO history. Is generally useful when there are
                 *
two or more cooperating processes working in the same
                 *
area.
                 */
/*
Here
the AS updates ad->exit_prob and ad->exit_no_coop, as its
definition, ad->exit_prob means the probability a task will exit
while being waited on. ad->exit_no_coop means probability an
exited task will not be part of a later cooperating request. Though I
am not quiet sure about the deep meaning of this two parameters, I
can use a very simple sentence to describe, if the chance that the
process has been died is very high, it is worthless to wait.
*/
                if
(!test_bit(AS_TASK_RUNNING, &aic->state)) {
                        if
(aic->ttime_samples == 0)
                              
ad->exit_prob = (7*ad->exit_prob +
256)/8;
                       
ad->exit_no_coop =
(7*ad->exit_no_coop)/8;
                }
                as_update_iohist(ad,
aic, rq);
               
spin_unlock(&ioc->lock);
                return
1;
        }
        if
(!test_bit(AS_TASK_RUNNING, &aic->state)) {
                /*
process anticipated on has exited */
                if
(aic->ttime_samples == 0)
                        ad->exit_prob
= (7*ad->exit_prob + 256)/8;
                if
(ad->exit_no_coop > 128) {
                       
spin_unlock(&ioc->lock);
                        return
1;
                }
        }
        if
(aic->ttime_samples == 0) {
                if
(ad->new_ttime_mean > ad->antic_expire) {
                       
spin_unlock(&ioc->lock);
                        return
1;
                }
                if
(ad->exit_prob * ad->exit_no_coop > 128*256) {
                       
spin_unlock(&ioc->lock);
                        return
1;
                }
        }
else if (aic->ttime_mean > ad->antic_expire) {
                /*
the process thinks too much between requests */
               
spin_unlock(&ioc->lock);
                return
1;
        }
        spin_unlock(&ioc->lock);
        return
0;
}
Even I
list the source code above, I do not answer the question about when
to do anticipation exactly. The first condition is thinktime.
Thinktime means the time between two requests. If the previous
request is being served in device driver while the next request has
been decided to dispatch into device driver, thinktime would be 0.
Please notice that the next request should not of course be the
request from the same process, it might be near the current disk head
while it belongs to another process. On the other hand, if the
precious request has been served, the time difference between the
previous request and the current request to be dispatched is the
thinktime. AS use such mathematic formulations to calculate
ttime_mean:
aic->ttime_samples
= (7*aic->ttime_samples + 256) / 8;

aic->ttime_total
= (7*aic->ttime_total + 256*ttime) / 8; // ttime is thinktime

aic->ttime_mean
= (aic->ttime_total + 128) / aic->ttime_samples;

It is
obvious to notice that if thinktime is large, ttime_mean will be
large, if ttime_mean is larger than antic_expire which is the maximum
time of anticipation, it means no anticipation should be done.
Actually, you can understand it in this way, if the process is I/O
insensitive, the interval time between requests from the process is
very small, even 0(if the requests are sync, 0 is almost impossible),
as a result, think_time will be small, it will always be smaller than
antic_expire, which means it is worth to be waited.
The other
element to decide whether it is worth to be waited is seek_mean. As
we know, locality I/O accession is a default property of process, but
this is not always true, especially for random I/O. So, the AS should
always consider the distance between the two requests. If the
distance of the two requests is small, it might be worth to wait for
the next request when the previous request has been served. The
mathematic formulation is below:
aic->seek_samples
= (7*aic->seek_samples + 256) / 8;

aic->seek_total
= (7*aic->seek_total + (u64)256*sdist) / 8;

total
= aic->seek_total + (aic->seek_samples/2);

do_div(total,
aic->seek_samples);

aic->seek_mean
= (sector_t)total;

the
mathematic model to calculate seek_mean and ttime_mean is the same.
As you may notice I don't explain a routine as_close_req in the
source codes above, actually, if the distance between two requests is
smaller than seek_mean, it means these two requests are near to each
other. Of course, there is a limitation for seek_mean, you can find
it yourself in source code.
As
considers the thinktime, seek distance the exit probability of this
process to evaluate whether it is worth to be waited. I think these 3
elements, especially the first two is the key component of the
algorithm AS. Each time when a new request is added to the queue,
ttime_mean and seek_mean of the process is updated. At the same time,
the AS will check whether the anticipation should be broken if there
it is. In a simple word, whenever there is a new request, the AS
should check whether this one is the right request being waited.

               
               
               
               
               

本文来自ChinaUnix博客,如果查看原文请点:http://blog.chinaunix.net/u2/86301/showart_2178611.html
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP