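[Help] Block driver reads/writes hang the machine
Last edited by ggmove on 2012-04-25 15:37
Could someone experienced please help? Once reads and writes to my block device reach a certain volume (roughly 400 MB), the machine hangs. Unlike the earlier pointer errors, nothing is printed: after the hang the machine does not respond to anything, and after a reboot the log contains almost nothing useful. The log below is from one of the rare occasions when something was recorded; normally the log shows only the printk output from my driver, which is no help at all. I don't know whether too much memory is in use or something else is wrong. Isn't Linux supposed to hang rarely? What kinds of operations in a block driver typically cause a hang like this? I'm completely lost. Thanks!
The driver runs under Fedora 4 (kernel 2.6.18) in a VMware virtual machine.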
Apr 25 10:07:19 localhost kernel: ------------[ cut here ]------------
Apr 25 10:07:19 localhost kernel: kernel BUG at arch/i386/mm/highmem.c:43!
Apr 25 10:07:19 localhost kernel: invalid opcode: 0000 [#1]
Apr 25 10:07:19 localhost kernel: Modules linked in: pns pxeinfo autofs4 hidp rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables acpiphp video button battery asus_acpi ac ipv6 lp parport_pc parport floppy nvram ehci_hcd uhci_hcd sg snd_ens1371 gameport snd_rawmidi snd_ac97_codec snd_ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event pcnet32 snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore mii snd_page_alloc pcspkr i2c_piix4 i2c_core dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd mptspi scsi_transport_spi mptscsih sd_mod scsi_mod mptbase
Apr 25 10:07:19 localhost kernel: CPU: 0
Apr 25 10:07:19 localhost kernel: EIP: 0060:[<c011559d>] Not tainted VLI
Apr 25 10:07:19 localhost kernel: EFLAGS: 00010206 (2.6.18 #1)
Apr 25 10:07:19 localhost kernel: EIP is at kmap_atomic+0x55/0x79
Apr 25 10:07:19 localhost kernel: eax: 0000001c ebx: fffb5000 ecx: c17836c0 edx: c0002ed4
Apr 25 10:07:19 localhost kernel: esi: 00000007 edi: e112abe8 ebp: d800a000 esp: dfef2e0c
Apr 25 10:07:19 localhost kernel: ds: 007b es: 007b ss: 0068
Apr 25 10:07:19 localhost kernel: Process kjournald (pid: 3936, ti=dfef2000 task=db75b5b0 task.ti=dfef2000)
Apr 25 10:07:20 localhost kernel: Stack: d5b207c0 eee921c0 f8bcbe07 f8bcc578 001bf168 00000000 00000008 eee921c0
Apr 25 10:07:20 localhost kernel: 00000000 001bf168 00000000 eee9224a 00000000 d5b207c0 eee921c0 e112abe8
Apr 25 10:07:20 localhost kernel: d5b207c0 f8bcbf59 f8bcc38d 00000008 000000ff c01b4b5a d5b207c0 d5b207c0
Apr 25 10:07:20 localhost kernel: Call Trace:
Apr 25 10:07:20 localhost kernel: [<f8bcbe07>] pns_xfer_bio+0x20a/0x334
Apr 25 10:07:20 localhost kernel: [<f8bcbf59>] pns_make_request+0x28/0x45
Apr 25 10:07:20 localhost kernel: [<c01b4b5a>] generic_make_request+0x176/0x186
Apr 25 10:07:20 localhost kernel: [<c01034fa>] common_interrupt+0x1a/0x20
Apr 25 10:07:20 localhost kernel: [<c013ebe6>] mempool_alloc+0x37/0xd3
Apr 25 10:07:20 localhost kernel: [<c01b5f1f>] submit_bio+0x9e/0xa3
Apr 25 10:07:20 localhost kernel: [<c01597d6>] bio_alloc_bioset+0x9b/0xf3
Apr 25 10:07:20 localhost kernel: [<c015699a>] submit_bh+0xe1/0xff
Apr 25 10:07:20 localhost kernel: [<c015789a>] ll_rw_block+0x88/0xa4
Apr 25 10:07:20 localhost kernel: [<f8870303>] journal_commit_transaction+0x372/0xe8f
Apr 25 10:07:20 localhost kernel: [<c02d7b63>] _spin_unlock_irq+0x5/0x7
Apr 25 10:07:20 localhost kernel: [<c02d628d>] schedule+0x4a7/0x503
Apr 25 10:07:20 localhost kernel: [<c0128e5f>] autoremove_wake_function+0x0/0x2d
Apr 25 10:07:20 localhost kernel: [<f8873dfb>] kjournald+0xb5/0x205
Apr 25 10:07:20 localhost kernel: [<c0128e5f>] autoremove_wake_function+0x0/0x2d
Apr 25 10:07:20 localhost kernel: [<f8873d46>] kjournald+0x0/0x205
Apr 25 10:07:20 localhost kernel: [<c0128dad>] kthread+0xad/0xd8
Apr 25 10:07:20 localhost kernel: [<c0128d00>] kthread+0x0/0xd8
Apr 25 10:07:20 localhost kernel: [<c0101005>] kernel_thread_helper+0x5/0xb
Apr 25 10:07:20 localhost kernel: Code: 09 5b 89 c8 5e e9 fd f4 02 00 8b 15 14 12 3e c0 8d 46 43 bb 00 f0 ff ff c1 e0 0c 29 c3 8d 04 b5 00 00 00 00 29 c2 83 3a 00 74 08 <0f> 0b 2b 00 7f 5b 2f c0 2b 0d 10 03 41 c0 c1 f9 05 c1 e1 0c 0b
Apr 25 10:07:20 localhost kernel: EIP: [<c011559d>] kmap_atomic+0x55/0x79 SS:ESP 0068:dfef2e0c
Apr 25 10:07:20 localhost kernel: <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
Apr 25 10:07:20 localhost kernel: in_atomic():1, irqs_disabled():0
Apr 25 10:07:20 localhost kernel: [<c012b58c>] down_read+0x12/0x1f
Apr 25 10:07:20 localhost kernel: [<c012412f>] blocking_notifier_call_chain+0xe/0x29
Apr 25 10:07:20 localhost kernel: [<c011b88b>] do_exit+0x1c/0x719
Apr 25 10:07:20 localhost kernel: [<c0103d77>] die+0x1b3/0x25b
Apr 25 10:07:20 localhost kernel: [<c0103dfa>] die+0x236/0x25b
Apr 25 10:07:20 localhost kernel: [<c010439e>] do_invalid_op+0x0/0x9d
Apr 25 10:07:20 localhost kernel: [<c010442f>] do_invalid_op+0x91/0x9d
Apr 25 10:07:20 localhost kernel: [<c011559d>] kmap_atomic+0x55/0x79
Apr 25 10:07:20 localhost kernel: [<c020ca3b>] vt_console_print+0x0/0x210
Apr 25 10:07:20 localhost kernel: [<c0119736>] release_console_sem+0x182/0x1bc
Apr 25 10:07:20 localhost kernel: [<c01035d1>] error_code+0x39/0x40
Apr 25 10:07:20 localhost kernel: [<c011559d>] kmap_atomic+0x55/0x79
Apr 25 10:07:20 localhost kernel: [<f8bcbe07>] pns_xfer_bio+0x20a/0x334
Apr 25 10:07:20 localhost kernel: [<f8bcbf59>] pns_make_request+0x28/0x45
Apr 25 10:07:20 localhost kernel: [<c01b4b5a>] generic_make_request+0x176/0x186
Apr 25 10:07:20 localhost kernel: [<c01034fa>] common_interrupt+0x1a/0x20
Apr 25 10:07:20 localhost kernel: [<c013ebe6>] mempool_alloc+0x37/0xd3
Apr 25 10:07:20 localhost kernel: [<c01b5f1f>] submit_bio+0x9e/0xa3
Apr 25 10:07:20 localhost kernel: [<c01597d6>] bio_alloc_bioset+0x9b/0xf3
Apr 25 10:07:20 localhost kernel: [<c015699a>] submit_bh+0xe1/0xff
Apr 25 10:07:20 localhost kernel: [<c015789a>] ll_rw_block+0x88/0xa4
Apr 25 10:07:20 localhost kernel: [<f8870303>] journal_commit_transaction+0x372/0xe8f
Apr 25 10:07:20 localhost kernel: [<c02d7b63>] _spin_unlock_irq+0x5/0x7
Apr 25 10:07:20 localhost kernel: [<c02d628d>] schedule+0x4a7/0x503
Apr 25 10:07:20 localhost kernel: [<c0128e5f>] autoremove_wake_function+0x0/0x2d
Apr 25 10:07:20 localhost kernel: [<f8873dfb>] kjournald+0xb5/0x205
Apr 25 10:07:20 localhost kernel: [<c0128e5f>] autoremove_wake_function+0x0/0x2d
Apr 25 10:07:20 localhost kernel: [<f8873d46>] kjournald+0x0/0x205
Apr 25 10:07:20 localhost kernel: [<c0128dad>] kthread+0xad/0xd8
Apr 25 10:07:20 localhost kernel: [<c0128d00>] kthread+0x0/0xd8
Apr 25 10:07:20 localhost kernel: [<c0101005>] kernel_thread_helper+0x5/0xb
Apr 25 10:07:20 localhost kernel: note: kjournald exited with preempt_count 1
Could it be that highmem is being used, and used incorrectly somewhere? Look at what the check on line 43 of highmem.c is and you will know.
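To make the suggestion above concrete, here is a minimal user-space sketch of what such a check could look like. It assumes that highmem.c:43 is the debug check in kmap_atomic that the fixmap slot being mapped is still empty; the trapping bytes in the Code: line (83 3a 00 / 74 08 / 0f 0b, i.e. compare a word with 0, skip if equal, else ud2) are consistent with that, but the thread does not confirm it. Every name below (model_pte, model_kmap_atomic, model_kunmap_atomic, model_xfer, MODEL_SLOT) is hypothetical and only models the idea; none of it is kernel code or the pns driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one "PTE" per atomic kmap slot, 0 means free. */
#define MODEL_NR_SLOTS 16
#define MODEL_SLOT     7   /* arbitrary slot index used by the demo */

static unsigned long model_pte[MODEL_NR_SLOTS];

/* Returns true if the slot was free and is now mapped.
 * Returns false where the modelled debug check would hit BUG(). */
static bool model_kmap_atomic(unsigned long page, size_t slot)
{
    assert(slot < MODEL_NR_SLOTS);
    assert(page != 0);
    if (model_pte[slot] != 0)
        return false;          /* slot still holds an earlier mapping */
    model_pte[slot] = page;
    return true;
}

/* Releases the slot so that the next mapping can reuse it. */
static void model_kunmap_atomic(size_t slot)
{
    assert(slot < MODEL_NR_SLOTS);
    model_pte[slot] = 0;
}

/* Models a per-segment copy loop. When unmap_on_every_path is false,
 * the unmap is skipped, so the second segment finds the slot occupied. */
static bool model_xfer(int nsegs, bool unmap_on_every_path)
{
    for (int i = 0; i < nsegs; i++) {
        unsigned long page = 0x1000ul * (unsigned long)(i + 1);
        if (!model_kmap_atomic(page, MODEL_SLOT))
            return false;      /* would be "kernel BUG at highmem.c" */
        /* ... the data copy would happen here ... */
        if (unmap_on_every_path)
            model_kunmap_atomic(MODEL_SLOT);
    }
    return true;
}
```

Calling model_xfer(3, true) succeeds, while model_xfer(2, false) fails on the second segment. If the assumption holds, the things to look for in pns_xfer_bio would be a path that returns or continues without the matching kunmap_atomic, or two users of the same slot nesting. It would also fit the report that the buffers handed to the driver in a bio need not come from the driver's own kmalloc, so they can be highmem pages even when the driver only allocates low memory.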
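While I'm here, let me borrow the OP's thread for a job posting: :lol
Our company, Ramaxel (记忆科技), is developing a new line of business and needs to hire low-level software developers; Linux or Windows background are both fine.
The department has only just been set up, so there is a lot of room to grow, and the company takes this new business seriously. For anyone willing to work hard and build something new,
this is a good opportunity. You can look up the company online.
The requirements are below. If you are interested, send me your résumé. Contact details: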
Ramaxel Group (记忆集团)
Tel: 0755-86016186 ext. 8080
Fax: 0755-86200763
Email:jinming@ramaxel.com
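Low-level software development:
Requirements:
1. Bachelor's degree or above in computer science, communications, electronics, automation, automatic control, or a related field;
2. Three or more years of Linux or Windows driver development experience (two years or more for candidates with a master's degree; this can be relaxed for exceptional candidates);
3. Proficient in C;
4. Proficient in the techniques commonly used in kernel-mode programming;
5. Experience developing block device, PCIe device, or SCSI device drivers preferred;
6. Experience with NAND flash, NOR flash, SSD drivers, or firmware programming preferred;
7. Experience in storage system R&D preferred;
8. FPGA programming experience preferred;
9. Ability to read English technical material fluently is a plus;
10. Strong ability and enthusiasm for learning;
11. Good team spirit, strong drive, and a sense of responsibility.
Responsibilities:
1. Design and develop drivers for Linux and Windows;
2. Design and develop production and factory-test tools for specific drivers and hardware;
3. Design and implement related algorithms;
4. Work with the hardware and logic teams to analyze and locate problems.
Compensation:
Negotiable
Location:
Shenzhen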
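Reply to 2# IbrahimKing
Memory allocated with kmalloc should come from low memory by default, and even if low memory runs out, I don't think it falls back to highmem. So why would this error be reported? I looked at the code around there and could not really follow it...
Use a tool to resolve this entry to the exact source line, then see what the code does there and whether anything looks wrong: pns_xfer_bio+0x20a
Reply to 3# ggmove
Reply to 4# IbrahimKing
There are tools that can resolve this kind of error? :lol: Please teach me. I have been working it out by hand, reading through it bit by bit. Could you tell me the name of the tool? Thanks :em16: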
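The thread does not name the tool. Commonly used options are gdb run on the module object built with debug info, using `list *(pns_xfer_bio+0x20a)`, or addr2line and objdump from binutils; which one was meant here is an assumption. For the by-hand arithmetic, the sketch below parses one call-trace entry of the form shown in the log and computes where the function starts (printed address minus offset). The struct and function names (trace_entry, parse_trace_entry, symbol_start) are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One decoded call-trace entry: "[<addr>] symbol+0xoffset/0xsize". */
struct trace_entry {
    unsigned long addr;     /* address printed inside [< >] */
    char symbol[64];        /* function name */
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total size of the function in bytes */
};

/* Parses a log line containing a trace entry.
 * Returns 0 on success, -1 if the line has no entry in that form. */
static int parse_trace_entry(const char *line, struct trace_entry *out)
{
    const char *p;

    assert(line != NULL && out != NULL);
    p = strstr(line, "[<");          /* skip the syslog prefix */
    if (p == NULL)
        return -1;
    if (sscanf(p, "[<%lx>] %63[^+]+0x%lx/0x%lx",
               &out->addr, out->symbol, &out->offset, &out->size) != 4)
        return -1;
    return 0;
}

/* Start address of the function at the time of the oops. */
static unsigned long symbol_start(const struct trace_entry *e)
{
    return e->addr - e->offset;
}
```

For the entry `[<f8bcbe07>] pns_xfer_bio+0x20a/0x334`, this gives a function start of f8bcbbfd, with the address of interest 0x20a bytes into a function 0x334 bytes long. Note that addresses in a call trace are return addresses, so the call itself sits just before that offset.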