Block device hang (task blocked in open), asking the experts for help

#1, posted 2012-02-20 15:37
Symptom:
Feb 20 15:33:33 localhost kernel: INFO: task blkid:2610 blocked for more than 120 seconds.
Feb 20 15:33:33 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 20 15:33:33 localhost kernel: blkid         D ffff88007f024500     0  2610      1 0x00000080
Feb 20 15:33:33 localhost kernel: ffff880078e47c18 0000000000000082 0000000000000000 ffffe8ffffa28988
Feb 20 15:33:33 localhost kernel: ffff880078e47c38 ffffffff814c88f5 ffff880078e47bf8 00000001001ceeb1
Feb 20 15:33:33 localhost kernel: ffff88007a0986b8 ffff880078e47fd8 0000000000010518 ffff88007a0986b8
Feb 20 15:33:33 localhost kernel: Call Trace:
Feb 20 15:33:33 localhost kernel: [<ffffffff814c88f5>] ? thread_return+0x6bd/0x778
Feb 20 15:33:33 localhost kernel: [<ffffffff814c97ae>] __mutex_lock_slowpath+0x13e/0x180
Feb 20 15:33:33 localhost kernel: [<ffffffff8125afca>] ? kobject_get+0x1a/0x30
Feb 20 15:33:33 localhost kernel: [<ffffffff81248a1d>] ? get_disk+0x7d/0xf0
Feb 20 15:33:33 localhost kernel: [<ffffffff814c964b>] mutex_lock+0x2b/0x50
Feb 20 15:33:33 localhost kernel: [<ffffffff811a4338>] __blkdev_get+0x68/0x3c0
Feb 20 15:33:33 localhost kernel: [<ffffffff811a46b0>] ? blkdev_open+0x0/0xc0
Feb 20 15:33:33 localhost kernel: [<ffffffff811a46a0>] blkdev_get+0x10/0x20
Feb 20 15:33:33 localhost kernel: [<ffffffff811a4721>] blkdev_open+0x71/0xc0
Feb 20 15:33:33 localhost kernel: [<ffffffff81169b80>] __dentry_open+0x110/0x370
Feb 20 15:33:33 localhost kernel: [<ffffffff81208442>] ? selinux_inode_permission+0x72/0xb0
Feb 20 15:33:33 localhost kernel: [<ffffffff8120051f>] ? security_inode_permission+0x1f/0x30
Feb 20 15:33:33 localhost kernel: [<ffffffff81169ef7>] nameidata_to_filp+0x57/0x70
Feb 20 15:33:33 localhost kernel: [<ffffffff8117d313>] do_filp_open+0x5f3/0xd40
Feb 20 15:33:33 localhost kernel: [<ffffffff8126c118>] ? __percpu_counter_add+0x68/0x90
Feb 20 15:33:33 localhost kernel: [<ffffffff811892f2>] ? alloc_fd+0x92/0x160
Feb 20 15:33:33 localhost kernel: [<ffffffff81169929>] do_sys_open+0x69/0x140
Feb 20 15:33:33 localhost kernel: [<ffffffff81169a40>] sys_open+0x20/0x30
Feb 20 15:33:33 localhost kernel: [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
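
A note on reading the trace above: entries marked `?` are addresses the kernel found on the stack but could not confirm as part of the current call chain, so they are usually set aside when reconstructing the path. What remains runs from `sys_open` down to `mutex_lock` inside `__blkdev_get`, which suggests `blkid` is waiting for a mutex while opening the block device. The helper below is only a minimal sketch for filtering such lines; `reliable_frame` is a name made up for illustration, and it assumes the `[<address>] symbol+offset/length` layout seen in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the symbol name of a call-trace line into fn
 * and return 1, or return 0 if the line is not a frame or is marked '?'.
 * Assumes the "[<address>] symbol+offset/length" layout shown in the log.
 */
int reliable_frame(const char *line, char *fn, size_t fn_size)
{
    const char *p;
    size_t len;

    assert(line != NULL && fn != NULL);

    p = strstr(line, ">] ");        /* end of the "[<address>]" field */
    if (p == NULL || fn_size == 0)
        return 0;
    p += 3;

    if (*p == '?')                  /* unverified stack entry: skip it */
        return 0;

    len = strcspn(p, "+ \n");       /* symbol name ends at '+' */
    if (len == 0 || len >= fn_size)
        return 0;

    memcpy(fn, p, len);
    fn[len] = '\0';
    return 1;
}
```

Applied to the lines above, this keeps `__mutex_lock_slowpath`, `mutex_lock`, `__blkdev_get`, `blkdev_get`, `blkdev_open`, `__dentry_open`, `nameidata_to_filp`, `do_filp_open`, `do_sys_open`, `sys_open` and `system_call_fastpath`. Which task currently holds the mutex cannot be read from this trace alone.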

blk_queue_make_request(pDeiosQueue,host_make_request);

int host_make_request(struct request_queue *q, struct bio *bio)
{
        int i = 0;       
        ds_io_obj_t *pIO = NULL;
        ds_sgl_t *pSgl = NULL;
        struct scsi_device *pScsiDev = q->queuedata;
       
        struct bio_vec *pBVec = NULL;

        pIO = kmalloc(sizeof(ds_io_obj_t),GFP_KERNEL);
        pSgl = ds_alloc_sgl();
        if(pIO==NULL || pSgl==NULL)
        {
                set_bit(BIO_EOF, &bio->bi_flags);
                bio_endio(bio,bio->bi_flags);
                return 0;
        }
       
        bio_for_each_segment(pBVec, bio, i)
        {               
                pSgl[i].length = pBVec->bv_len;
                pSgl[i].virtAddr = (uint64_t)((uint64_t)page_address(pBVec->bv_page) + pBVec->bv_offset);
        }
        if (i>1)
        {
                pIO->requestBuffer = (void*)pSgl;
        }
        else
        {
                pIO->requestBuffer = (void*)pSgl[0].virtAddr;
                ds_free_sgl(pSgl);
        }
   
        pIO->useSgl = i;
       
        if (bio_data_dir(bio) == WRITE)
        {
                pIO->opCode = DS_WRITE;
        }
        else
        {
                pIO->opCode = DS_READ;
        }
        pIO->lba = bio->bi_sector;
        pIO->requestBufflen = bio->bi_size;
        pIO->errorCode = DS_RW_SUCCESS;
        pIO->cmnd_done = host_cmd_done;
        pIO->special = (void*)bio;
        pIO->dsDev = (void*)pScsiDev;
        ds_fd_obj_t * pFdDev=NULL;
        pFdDev = hal_find_fd_devicePointer((void*)pScsiDev);
        pIO->dispatchcommand = pFdDev->io_fn_proc;
   
        pIO->dispatchcommand(pIO);

   
        return 0;
}
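
Separately from the hang, the allocation-failure branch in the function above may deserve a second look. If only one of the two allocations fails, the other is never freed, and `bio_endio()` is handed `bio->bi_flags`, whereas the two-argument form used here takes an error code (0 or a negative errno) as its second argument. The following is a minimal userspace sketch of the cleanup pattern, not driver code: `fake_bio`, `fake_alloc`, `fake_free`, `fake_bio_endio` and `fake_make_request` are hypothetical stand-ins for the kernel objects and calls, and the "I/O" completes immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bio; not kernel API. */
struct fake_bio {
    int completed;   /* how many times the bio was ended */
    int error;       /* error code passed at completion */
};

static int live_allocs;  /* outstanding allocations, for leak checking */

/* Stand-in for kmalloc()/ds_alloc_sgl(); 'fail' forces a NULL return. */
static void *fake_alloc(size_t size, int fail)
{
    void *p = fail ? NULL : malloc(size);
    if (p != NULL)
        live_allocs++;
    return p;
}

/* Stand-in for kfree()/ds_free_sgl(); tolerates NULL. */
static void fake_free(void *p)
{
    if (p != NULL) {
        live_allocs--;
        free(p);
    }
}

/* Stand-in for bio_endio(bio, error). */
static void fake_bio_endio(struct fake_bio *bio, int error)
{
    bio->completed++;
    bio->error = error;
}

int outstanding_allocs(void)
{
    return live_allocs;
}

int fake_make_request(struct fake_bio *bio, int fail_io, int fail_sgl)
{
    void *io, *sgl;

    assert(bio != NULL);
    io = fake_alloc(64, fail_io);
    sgl = fake_alloc(256, fail_sgl);

    if (io == NULL || sgl == NULL) {
        /* Free whichever allocation succeeded, then fail the bio once. */
        fake_free(sgl);
        fake_free(io);
        fake_bio_endio(bio, -ENOMEM);
        return 0;
    }

    /* A real driver would dispatch here and end the bio from its
     * completion callback; this sketch completes it on the spot. */
    fake_bio_endio(bio, 0);
    fake_free(sgl);
    fake_free(io);
    return 0;
}
```

The property to check in the real driver is the same one the sketch enforces: on every path, each bio is ended exactly once and nothing allocated for it is left behind. Whether that has anything to do with the hang reported here is an open question.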

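This looks very strange to me. From what I have read, the cause may be that some pages are never released, which ends up blocking the task.
The case at http://219.148.35.28/archiver/tid-3554016.html looks similar to mine, except that I am using a queue I set up myself. Any pointers would be appreciated; I have been at this for almost a week and still cannot crack it.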

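#2, posted 2012-02-21 11:38
Found the cause, and it has nothing to do with the code above: I was misusing a pointer to a SCSI command. The symptom was so far removed from the actual problem that it took me this long to track down.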

#3, posted 2012-02-21 13:54
What a tragedy: the problem has reproduced again. It seems determined to finish me off.