免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 1660 | 回复: 0
打印 上一主题 下一主题

How To Locate An Oops [复制链接]

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2008-05-22 00:23 |只看该作者 |倒序浏览
Linux:  How To Locate An Oops
   
   
   
   
  
  
    August 16, 2004 - 8:32am
  Submitted by
Jeremy
-->
  Submitted by
Jeremy
on August 16, 2004 - 8:32am.
  
  
   

  
  
  
Denis Vlasenko posted an interesting howto document to the
lkml
.  He begins, "okay, so you've got an oops and want to find out what happened?"
He then walks step by step through the process of reading an oops,
tracking it down to the proper source file, dissasembling the c code,
locating the oops in the generated assembly, and finally matching the
assembly back up with the precise line in the c code where the oops
happened.
Several others commented on the document, providing a few of their own tips as well.  Among them, 2.4 stable Linux kernel [
forum
] maintainer Marcelo Tosatti [
interview
] offered a simpler alternative method of matching assembly with the precise line in c code, then went on to comment ,"the
document looks great, but could go deeper into things like like
hardware-flaky bitflips, stack junk (explain why the stack can be
"unreliable"), etc. to be even more useful.
"  Further information on debugging an oops can be found distributed with the Linux kernel in Documentation/
oops-tracing.txt
.
From: Denis Vlasenko [email blocked]
To:  linux-kernel
Subject: [RFC] HOWTO find oops location
Date:         Sat, 14 Aug 2004 11:53:06 +0300
Hi folks,
Is this draft HOWTO useful? Comments?
--- cut here --- --- cut here --- --- cut here --- --- cut here ---
Okay, so you've got an oops and want to find out what happened?
In this HOWTO, I presume you did not delete and did not
tamper with your kernel build tree. Also, I recommend you
to enable these options in the .config:
CONFIG_DEBUG_SLAB=y
CONFIG_FRAME_POINTER=y
First one makes use-after-free bug hunt easy, second gives
you much more reliable stacktraces.
Ok, let's take a look at example OOPS. ^^^^ marks are mine.
Unable to handle kernel NULL pointer dereference at virtual address 00000e14
printing eip:
c0162887
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: eeprom snd_seq_oss snd_seq_midi_event..........
CPU:    0
EIP:    0060:[]    Not tainted
EFLAGS: 00010206   (2.6.7-nf2)
EIP is at prune_dcache+0x147/0x1c0
          ^^^^^^^^^^^^^^^^^^^^^^^^
eax: 00000e00   ebx: d1bde050   ecx: f1b3c050   edx: f1b3ac50
esi: f1b3ac40   edi: c1973000   ebp: 00000036   esp: c1973ef8
ds: 007b   es: 007b   ss: 0068
Process kswapd0 (pid: 65, threadinfo=c1973000 task=c1986050)
Stack: d7721178 c1973ef8 0000007a 00000000 c1973000 f7ffea48 c0162d1f 0000007a
       c0139a2b 0000007a 000000d0 00025528 049dbb00 00000000 000001fa 00000000
       c0364564 00000001 0000000a c0364440 c013add1 00000080 000000d0 00000000
Call Trace:
[] shrink_dcache_memory+0x1f/0x30
[] shrink_slab+0x14b/0x190
[] balance_pgdat+0x1b1/0x200
[] kswapd+0xc7/0xe0
[] autoremove_wake_function+0x0/0x60
[] ret_from_fork+0x6/0x14
[] autoremove_wake_function+0x0/0x60
[] kswapd+0x0/0xe0
[] kernel_thread_helper+0x5/0x14
Code: 8b 50 14 85 d2 75 27 89 34 24 e8 4a 2b 00 00 8b 73 0c 89 1c
Let's try to find out where did that exactly happened.
Grep in your kernel tree for prune_dcache. Aha, it is defined in
fs/dcache.c! Ok, execute these two commands:
# objdump -d fs/dcache.o > fs/dcache.disasm
# make fs/cache.s
Now in fs/ you should have:
dcache.c - source code
dcache.o - compiled object file
dcache.s - assembler output of C compiler ('half-compiled' code)
dcache.disasm - disasembled object file
Open dcache.disasm and find "prune_dcache":
00000540 :
     540:       55                      push   %ebp
We need to find prune_dcache+0x147. Using shell,
# printf "0x%x\n" $((0x540+0x147))
0x687
and in dcache.disasm:
     683:       85 c0                   test   %eax,%eax
     685:       74 07                   je     68e
     687:       8b 50 14                mov    0x14(%eax),%edx   
     68e:       89 34 24                mov    %esi,(%esp)
     691:       e8 fc ff ff ff          call   692
     696:       8b 73 0c                mov    0xc(%ebx),%esi
     699:       89 1c 24                mov    %ebx,(%esp)
     69c:       e8 9f f9 ff ff          call   40
Comparing with "Code: 8b 50 14 85 d2 75 27 " - match!
We need to find matching line in dcache.s and, eventually, in dcache.c.
It's easy to find prune_dcache in dcache.s:
prune_dcache:
        pushl   %ebp
but even though it is not too hard to find matching instruction:
        movl    8(%edi), %eax
        decl    20(%edi)
        testb   $8, %al
        jne     .L593
.L517:
        movl    68(%ebx), %eax
        testl   %eax, %eax
        je      .L532
        movl    20(%eax), %edx  d_lock);
                /*
                 * We found an inuse dentry which was not removed from
                 * dentry_unused because of laziness during lookup.  Do not free
                 * it - just keep it off the dentry_unused list.
                 */
                if (atomic_read(&dentry->d_count)) {
                        spin_unlock(&dentry->d_lock);
                        continue;
                }
                /* If the dentry was recently referenced, don't free it. */
                if (dentry->d_flags & DCACHE_REFERENCED) {
                        dentry->d_flags &= ~DCACHE_REFERENCED;
                        list_add(&dentry->d_lru, &dentry_unused);
                        dentry_stat.nr_unused++;
                        spin_unlock(&dentry->d_lock);
                        continue;
                }
                prune_one_dentry(dentry);
        }
        spin_unlock(&dcache_lock);
}
What now?! Well, I have a silly method which helps to find
C code line corresponding to that asm one. Edit your
prune_dcache in dcache.c like this:
static void prune_dcache(int count)
{
        spin_lock(&dcache_lock);
        for (; count ; count--) {
                struct dentry *dentry;
                struct list_head *tmp;
asm("#1");
                tmp = dentry_unused.prev;
asm("#2");
                if (tmp == &dentry_unused)
                        break;
asm("#3");
                list_del_init(tmp);
asm("#4");
                prefetch(dentry_unused.prev);
asm("#5");
                dentry_stat.nr_unused--;
asm("#6");
...
...
asm("#e");
                prune_one_dentry(dentry);
        }
asm("#f");
        spin_unlock(&dcache_lock);
}
and do "make fs/dcache.s" again. Look into new dcache.s.
Nasty surprize:
APP
        #e
#NO_APP
        testb   $16, %al
        jne     .L495
        orl     $16, %eax
        leal    72(%ecx), %esi
        movl    %eax, 4(%ebx)
        movl    4(%esi), %edx
        movl    72(%ecx), %eax
        testl   %eax, %eax
        movl    %eax, (%edx)
        je      .L493
        movl    %edx, 4(%eax)
.L493:
        movl    $2097664, 4(%esi)
.L495:
        leal    40(%ebx), %ecx
        movl    40(%ebx), %eax
        movl    4(%ecx), %edx
        movl    %edx, 4(%eax)
        movl    %eax, (%edx)
        movl    $2097664, 4(%ecx)
        movl    $1048832, 40(%ebx)
        decl    dentry_stat
        movl    8(%ebx), %esi
        testl   %esi, %esi
        je      .L536
        leal    56(%ebx), %eax
        movl    $0, 8(%ebx)
        movl    56(%ebx), %edx
        movl    4(%eax), %ecx
        movl    %ecx, 4(%edx)
        movl    %edx, (%ecx)
        movl    %eax, 4(%eax)
        movl    %eax, 56(%ebx)
        movl    8(%edi), %eax
        decl    20(%edi)
        testb   $8, %al
        jne     .L592
.L518:
        movl    8(%edi), %eax
        decl    20(%edi)
        testb   $8, %al
        jne     .L593
.L517:
        movl    68(%ebx), %eax
        testl   %eax, %eax
        je      .L532
        movl    20(%eax), %edx    d_child);
        dentry_stat.nr_dentry--;        /* For d_free, below */
        dentry_iput(dentry);
        parent = dentry->d_parent;
        d_free(dentry);
        if (parent != dentry)
                dput(parent);
        spin_lock(&dcache_lock);
}
Argh! An inline function. Do asm trick to it too:
static inline void prune_one_dentry(struct dentry * dentry)
{
        struct dentry * parent;
asm("#A");
        __d_drop(dentry);
asm("#B");
        list_del(&dentry->d_child);
asm("#C");
        dentry_stat.nr_dentry--;        /* For d_free, below */
asm("#D");
        dentry_iput(dentry);
asm("#E");
...
...
}
"make fs/dcache.s", rinse, repeat. You will discover that OOPS
happened after #D mark, inside dentry_iput wich is an inline too.
Will this ever end? Lickily, yes. After yet another round of asm
insertion, we arrive at:
static inline void dentry_iput(struct dentry * dentry)
{
        struct inode *inode = dentry->d_inode;
        if (inode) {
asm("#K");
                dentry->d_inode = NULL;
asm("#L");
                list_del_init(&dentry->d_alias);
asm("#M");
                spin_unlock(&dentry->d_lock);
asm("#N");
                spin_unlock(&dcache_lock);
asm("#O");
                if (dentry->d_op && dentry->d_op->d_iput)
{
asm("#P");
                        dentry->d_op->d_iput(dentry, inode);
}
                else
...
Which corresponds to this part of new dcache.s:
.L517:
#APP
        #O
#NO_APP
        movl    68(%ebx), %eax
        testl   %eax, %eax
        je      .L532
        movl    20(%eax), %edx   d_op && dentry->d_op->d_iput)" condition
check, and it is oopsing trying to do second check. dentry->d_op
contains bogus pointer value 0x00000e00.
--
vda
               
               
               

本文来自ChinaUnix博客,如果查看原文请点:http://blog.chinaunix.net/u1/33173/showart_698092.html
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP