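[Kernel module] I ran into a deadlock. Could someone check whether my analysis is on the right track? The more I analyze it, the more confused I get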

#1
Posted 2013-10-31 19:00
Last edited by FlankerSky on 2013-10-31 19:00

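Newbie here. I have never analyzed a deadlock in depth. I have run into nested-lock deadlocks before and could spot those, but this one I simply cannot figure out. My analysis is below; please point me in the right direction.

1. Basic description:
The part of the trace highlighted in red (the kks_tx frames) is a kernel thread, kks_tx. Its job is to process packets coming down from the upper layers (I add them to a queue in the netfilter local_out hook). For business reasons, my kernel module may answer some of these packets directly, so further down it calls netif_receive_skb to send a packet up to the application. The TCP entry point, tcp_v4_rcv, takes a spinlock. kks_tx acquires the spinlock and carries on, but I do not understand what happens next (the frames above the red part): is it interrupted by the e1000e interrupt? And does execution then reach tcp_v4_rcv again and deadlock?
2. However, my code runs inside VMware, and this happens whether I configure the VM as single-core or multi-core. According to LDD3, don't spinlocks do nothing at all on non-SMP systems?
LDD3 also says that interrupts on the local CPU are disabled while a spinlock is held, so this does not add up.

The more I analyze it, the more confused I get. Could you check whether my reasoning is right, or whether I am heading in the wrong direction? Thanks!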




[ 3182.546342] BUG: soft lockup - CPU#0 stuck for 22s! [KksTx:2742]
[ 3182.547942] Modules linked in: kksfilter(O) e1000e(O) vmhgfs(O) vsock(O) acpiphp vmwgfx ttm drm snd_ens1371 gameport snd_ac97_codec ac97_bus snd_pcm vmw_balloon snd_seq_midi snd_rawmidi snd_seq_midi_event psmouse snd_seq serio_raw snd_timer snd_seq_device snd joydev soundcore snd_page_alloc bnep rfcomm bluetooth parport_pc ppdev vmci(O) shpchp mac_hid i2c_piix4 lp parport usbhid hid floppy mptspi mptscsih mptbase vmxnet(O) vmw_pvscsi vmxnet3 [last unloaded: e1000e]
[ 3182.560267] irq event stamp: 0
[ 3182.561060] hardirqs last  enabled at (0): [<  (null)>]   (null)
[ 3182.562649] hardirqs last disabled at (0): [<c1058564>] copy_process+0x484/0x1170
[ 3182.564637] softirqs last  enabled at (0): [<c1058564>] copy_process+0x484/0x1170
[ 3182.566563] softirqs last disabled at (0): [<  (null)>]   (null)
[ 3182.568115] Modules linked in: kksfilter(O) e1000e(O) vmhgfs(O) vsock(O) acpiphp vmwgfx ttm drm snd_ens1371 gameport snd_ac97_codec ac97_bus snd_pcm vmw_balloon snd_seq_midi snd_rawmidi snd_seq_midi_event psmouse snd_seq serio_raw snd_timer snd_seq_device snd joydev soundcore snd_page_alloc bnep rfcomm bluetooth parport_pc ppdev vmci(O) shpchp mac_hid i2c_piix4 lp parport usbhid hid floppy mptspi mptscsih mptbase vmxnet(O) vmw_pvscsi vmxnet3 [last unloaded: e1000e]
[ 3182.580028]
[ 3182.580431] Pid: 2742, comm: KksTx Tainted: G           O 3.2.6 #18 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 3182.583631] EIP: 0060:[<c12ba6df>] EFLAGS: 00000282 CPU: 0
[ 3182.585043] EIP is at delay_tsc+0x1f/0x70
[ 3182.586068] EAX: 7170b643 EBX: f0a98d34 ECX: f0a98d34 EDX: 000007db
[ 3182.587670] ESI: 00000000 EDI: 00000001 EBP: f680dbb4 ESP: f680dba4
[ 3182.589252]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 3182.590632] Process KksTx (pid: 2742, ti=f680c000 task=e604ea80 task.ti=f4de6000)
[ 3182.592516] Stack:
[ 3182.593046]  7170b5d7 f0a98d34 280c08d0 00000000 f680dbbc c12ba62e f680dbdc c12c150b
[ 3182.595284]  00000001 a087b7e8 00000001 f0a98d34 c1e896c0 f0a98d00 f680dbf8 c159eefd
[ 3182.597516]  00000000 00000002 00000000 c14f6f2a f4c91600 f680dc4c c14f6f2a 00000050
[ 3182.599751] Call Trace:
[ 3182.600390]  [<c12ba62e>] __delay+0xe/0x10
[ 3182.601454]  [<c12c150b>] do_raw_spin_lock+0xab/0xf0
[ 3182.602706]  [<c159eefd>] _raw_spin_lock_nested+0x3d/0x50
[ 3182.604087]  [<c14f6f2a>] ? tcp_v4_rcv+0x76a/0xaa0
[ 3182.605285]  [<c14f6f2a>] tcp_v4_rcv+0x76a/0xaa0
[ 3182.606451]  [<c14d4b8f>] ip_local_deliver_finish+0xdf/0x380
[ 3182.607866]  [<c14d4aec>] ? ip_local_deliver_finish+0x3c/0x380
[ 3182.609318]  [<c14d4fbf>] ip_local_deliver+0x7f/0x90
[ 3182.610578]  [<c14d46be>] ip_rcv_finish+0x16e/0x560
[ 3182.611804]  [<c14d522a>] ip_rcv+0x25a/0x320
[ 3182.612880]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.614193]  [<c14a63d2>] __netif_receive_skb+0x4c2/0x570
[ 3182.615547]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.616918]  [<c14a707b>] netif_receive_skb+0xcb/0xe0
[ 3182.618187]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.619517]  [<f848e33e>] SendSkb2Middle+0x162/0x176 [kksfilter]
[ 3182.621019]  [<f848f7ac>] KksNatHandler+0x612/0x86d [kksfilter]
[ 3182.622516]  [<c12ba291>] ? sscanf+0x11/0x14
[ 3182.623592]  [<f84903a4>] ? kks_inet_addr+0x37/0x4f [kksfilter]
[ 3182.625069]  [<f848fad6>] kks_rx+0x4e/0xb9 [kksfilter]
[ 3182.626362]  [<f848fc52>] hook_local_in+0x111/0x1d7 [kksfilter]
[ 3182.627838]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.629145]  [<c14cd663>] nf_iterate+0x63/0x90
[ 3182.630274]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.631582]  [<c14cd722>] nf_hook_slow+0x92/0x150
[ 3182.632761]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.634069]  [<c14d521a>] ip_rcv+0x24a/0x320
[ 3182.635155]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.636465]  [<c14a63d2>] __netif_receive_skb+0x4c2/0x570
[ 3182.638624]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.639996]  [<c14a707b>] netif_receive_skb+0xcb/0xe0
[ 3182.641250]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.642564]  [<c14a71a7>] napi_skb_finish+0x37/0x50
[ 3182.643781]  [<c14a7701>] napi_gro_receive+0xa1/0xb0
[ 3182.645060]  [<f9d41726>] e1000_receive_skb+0xc6/0x170 [e1000e]
[ 3182.646577]  [<c149a19d>] ? __kfree_skb+0x3d/0x90
[ 3182.647779]  [<f9d43166>] e1000_clean_rx_irq+0x206/0x340 [e1000e]
[ 3182.649419]  [<f9d49b74>] e1000e_poll+0x64/0x2d0 [e1000e]
[ 3182.650768]  [<c14a790d>] net_rx_action+0x12d/0x240
[ 3182.651988]  [<c1060e30>] ? local_bh_enable+0xd0/0xd0
[ 3182.653245]  [<c1060ec9>] __do_softirq+0x99/0x1d0
[ 3182.654429]  [<c1060e30>] ? local_bh_enable+0xd0/0xd0
[ 3182.655688]  <IRQ>
[ 3182.656264]  [<c106123e>] ? irq_exit+0x7e/0xa0
[ 3182.657406]  [<c15a6f2b>] ? do_IRQ+0x4b/0xc0
[ 3182.658496]  [<c15a6d75>] ? common_interrupt+0x35/0x3c
[ 3182.659796]  [<c14f00d8>] ? tcp_connect+0x318/0x490
[ 3182.661011]  [<c14e6841>] ? tcp_validate_incoming+0x71/0x340
[ 3182.662421]  [<c14f8010>] ? tcp_check_req+0x260/0x4b0
[ 3182.663676]  [<c14ecaf7>] ? tcp_rcv_state_process+0x47/0xb80
[ 3182.665078]  [<c14f8093>] ? tcp_check_req+0x2e3/0x4b0
[ 3182.666338]  [<c14f7ced>] ? tcp_child_process+0x8d/0x150
[ 3182.667660]  [<c14f5d8a>] ? tcp_v4_do_rcv+0x2aa/0x3c0
[ 3182.668922]  [<c12c149b>] ? do_raw_spin_lock+0x3b/0xf0
[ 3182.670202]  [<c159eefd>] ? _raw_spin_lock_nested+0x3d/0x50
[ 3182.671603]  [<c14f6f4a>] ? tcp_v4_rcv+0x78a/0xaa0
[ 3182.672800]  [<c14d4b8f>] ? ip_local_deliver_finish+0xdf/0x380
[ 3182.674263]  [<c14d4aec>] ? ip_local_deliver_finish+0x3c/0x380
[ 3182.675713]  [<c14d4fbf>] ? ip_local_deliver+0x7f/0x90
[ 3182.676991]  [<c14d46be>] ? ip_rcv_finish+0x16e/0x560
[ 3182.678259]  [<c14d522a>] ? ip_rcv+0x25a/0x320
[ 3182.679371]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.680670]  [<c14a63d2>] ? __netif_receive_skb+0x4c2/0x570
[ 3182.682057]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.683436]  [<c14a707b>] ? netif_receive_skb+0xcb/0xe0
[ 3182.684730]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.686027]  [<f848e33e>] ? SendSkb2Middle+0x162/0x176 [kksfilter]
[ 3182.687568]  [<f848f867>] ? KksNatHandler+0x6cd/0x86d [kksfilter]
[ 3182.689094]  [<c1095000>] ? futex_wait_requeue_pi+0x1c0/0x380
[ 3182.690526]  [<f848fa67>] ? kks_tx+0x60/0x81 [kksfilter]
[ 3182.691859]  [<f848fa07>] ? KksNatHandler+0x86d/0x86d [kksfilter]
[ 3182.693386]  [<c1079df8>] ? kthread+0x78/0x80
[ 3182.694488]  [<c1079d80>] ? __init_kthread_worker+0x60/0x60
[ 3182.695876]  [<c15a6d82>] ? kernel_thread_helper+0x6/0x10

[ 3182.697220] Code: c3 8d 74 26 00 8d bc 27 00 00 00 00 55 89 e5 57 56 53 83 ec 04 3e 8d 74 26 00 64 8b 35 ec 5e 91 c1 89 c7 8d 76 00 0f ae e8 0f 31 <8d> 74 26 00 89 45 f0 eb 0d f3 90 64 8b 1d ec 5e 91 c1 39 de 75
[ 3182.704103] Call Trace:
[ 3182.704733]  [<c12ba62e>] __delay+0xe/0x10
[ 3182.705761]  [<c12c150b>] do_raw_spin_lock+0xab/0xf0
[ 3182.707016]  [<c159eefd>] _raw_spin_lock_nested+0x3d/0x50
[ 3182.708359]  [<c14f6f2a>] ? tcp_v4_rcv+0x76a/0xaa0
[ 3182.709555]  [<c14f6f2a>] tcp_v4_rcv+0x76a/0xaa0
[ 3182.710715]  [<c14d4b8f>] ip_local_deliver_finish+0xdf/0x380
[ 3182.712119]  [<c14d4aec>] ? ip_local_deliver_finish+0x3c/0x380
[ 3182.713565]  [<c14d4fbf>] ip_local_deliver+0x7f/0x90
[ 3182.714807]  [<c14d46be>] ip_rcv_finish+0x16e/0x560
[ 3182.716023]  [<c14d522a>] ip_rcv+0x25a/0x320
[ 3182.717090]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.718410]  [<c14a63d2>] __netif_receive_skb+0x4c2/0x570
[ 3182.719756]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.721120]  [<c14a707b>] netif_receive_skb+0xcb/0xe0
[ 3182.722391]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.723693]  [<f848e33e>] SendSkb2Middle+0x162/0x176 [kksfilter]
[ 3182.725185]  [<f848f7ac>] KksNatHandler+0x612/0x86d [kksfilter]
[ 3182.726665]  [<c12ba291>] ? sscanf+0x11/0x14
[ 3182.727788]  [<f84903a4>] ? kks_inet_addr+0x37/0x4f [kksfilter]
[ 3182.729257]  [<f848fad6>] kks_rx+0x4e/0xb9 [kksfilter]
[ 3182.730574]  [<f848fc52>] hook_local_in+0x111/0x1d7 [kksfilter]
[ 3182.732044]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.733344]  [<c14cd663>] nf_iterate+0x63/0x90
[ 3182.734457]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.735758]  [<c14cd722>] nf_hook_slow+0x92/0x150
[ 3182.736979]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.738293]  [<c14d521a>] ip_rcv+0x24a/0x320
[ 3182.739365]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.740660]  [<c14a63d2>] __netif_receive_skb+0x4c2/0x570
[ 3182.741997]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.743364]  [<c14a707b>] netif_receive_skb+0xcb/0xe0
[ 3182.744622]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.745921]  [<c14a71a7>] napi_skb_finish+0x37/0x50
[ 3182.747146]  [<c14a7701>] napi_gro_receive+0xa1/0xb0
[ 3182.748385]  [<f9d41726>] e1000_receive_skb+0xc6/0x170 [e1000e]
[ 3182.749856]  [<c149a19d>] ? __kfree_skb+0x3d/0x90
[ 3182.751043]  [<f9d43166>] e1000_clean_rx_irq+0x206/0x340 [e1000e]
[ 3182.752559]  [<f9d49b74>] e1000e_poll+0x64/0x2d0 [e1000e]
[ 3182.753901]  [<c14a790d>] net_rx_action+0x12d/0x240
[ 3182.755126]  [<c1060e30>] ? local_bh_enable+0xd0/0xd0
[ 3182.756384]  [<c1060ec9>] __do_softirq+0x99/0x1d0
[ 3182.757555]  [<c1060e30>] ? local_bh_enable+0xd0/0xd0
[ 3182.758823]  <IRQ>  [<c106123e>] ? irq_exit+0x7e/0xa0
[ 3182.760124]  [<c15a6f2b>] ? do_IRQ+0x4b/0xc0
[ 3182.761192]  [<c15a6d75>] ? common_interrupt+0x35/0x3c
[ 3182.762478]  [<c14f00d8>] ? tcp_connect+0x318/0x490
[ 3182.763694]  [<c14e6841>] ? tcp_validate_incoming+0x71/0x340
[ 3182.765101]  [<c14f8010>] ? tcp_check_req+0x260/0x4b0
[ 3182.766370]  [<c14ecaf7>] ? tcp_rcv_state_process+0x47/0xb80
[ 3182.767775]  [<c14f8093>] ? tcp_check_req+0x2e3/0x4b0
[ 3182.769032]  [<c14f7ced>] ? tcp_child_process+0x8d/0x150
[ 3182.770366]  [<c14f5d8a>] ? tcp_v4_do_rcv+0x2aa/0x3c0
[ 3182.771626]  [<c12c149b>] ? do_raw_spin_lock+0x3b/0xf0
[ 3182.772905]  [<c159eefd>] ? _raw_spin_lock_nested+0x3d/0x50
[ 3182.774297]  [<c14f6f4a>] ? tcp_v4_rcv+0x78a/0xaa0
[ 3182.775494]  [<c14d4b8f>] ? ip_local_deliver_finish+0xdf/0x380
[ 3182.776945]  [<c14d4aec>] ? ip_local_deliver_finish+0x3c/0x380
[ 3182.778409]  [<c14d4fbf>] ? ip_local_deliver+0x7f/0x90
[ 3182.779693]  [<c14d46be>] ? ip_rcv_finish+0x16e/0x560
[ 3182.780952]  [<c14d522a>] ? ip_rcv+0x25a/0x320
[ 3182.782064]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.783379]  [<c14a63d2>] ? __netif_receive_skb+0x4c2/0x570
[ 3182.784762]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.786125]  [<c14a707b>] ? netif_receive_skb+0xcb/0xe0
[ 3182.787536]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.788836]  [<f848e33e>] ? SendSkb2Middle+0x162/0x176 [kksfilter]
[ 3182.790382]  [<f848f867>] ? KksNatHandler+0x6cd/0x86d [kksfilter]
[ 3182.791890]  [<c1095000>] ? futex_wait_requeue_pi+0x1c0/0x380
[ 3182.793313]  [<f848fa67>] ? kks_tx+0x60/0x81 [kksfilter]
[ 3182.794645]  [<f848fa07>] ? KksNatHandler+0x86d/0x86d [kksfilter]
[ 3182.796155]  [<c1079df8>] ? kthread+0x78/0x80
[ 3182.797244]  [<c1079d80>] ? __init_kthread_worker+0x60/0x60
[ 3182.798634]  [<c15a6d82>] ? kernel_thread_helper+0x6/0x10
[ 3182.799988] Kernel panic - not syncing: softlockup: hung tasks
[ 3182.801440] Pid: 2742, comm: KksTx Tainted: G           O 3.2.6 #18
[ 3182.803002] Call Trace:
[ 3182.803629]  [<c1594f9f>] ? printk+0x1d/0x1f
[ 3182.804696]  [<c1594e75>] panic+0x5c/0x169
[ 3182.805746]  [<c10bd4d1>] watchdog_timer_fn+0x151/0x160
[ 3182.807060]  [<c107e28d>] __run_hrtimer+0x6d/0x1b0
[ 3182.808255]  [<c10bd380>] ? __touch_watchdog+0x20/0x20
[ 3182.809533]  [<c107ecd5>] hrtimer_interrupt+0xe5/0x260
[ 3182.810826]  [<c107ed51>] ? hrtimer_interrupt+0x161/0x260
[ 3182.812169]  [<c15a6ff4>] smp_apic_timer_interrupt+0x54/0x88
[ 3182.813575]  [<c12bb8d8>] ? trace_hardirqs_off_thunk+0xc/0x14
[ 3182.815016]  [<c159fe02>] apic_timer_interrupt+0x36/0x3c
[ 3182.816336]  [<c12ba6df>] ? delay_tsc+0x1f/0x70
[ 3182.817465]  [<c12ba62e>] __delay+0xe/0x10
[ 3182.818497]  [<c12c150b>] do_raw_spin_lock+0xab/0xf0
[ 3182.819733]  [<c159eefd>] _raw_spin_lock_nested+0x3d/0x50
[ 3182.821081]  [<c14f6f2a>] ? tcp_v4_rcv+0x76a/0xaa0
[ 3182.822287]  [<c14f6f2a>] tcp_v4_rcv+0x76a/0xaa0
[ 3182.823440]  [<c14d4b8f>] ip_local_deliver_finish+0xdf/0x380
[ 3182.824856]  [<c14d4aec>] ? ip_local_deliver_finish+0x3c/0x380
[ 3182.826314]  [<c14d4fbf>] ip_local_deliver+0x7f/0x90
[ 3182.827549]  [<c14d46be>] ip_rcv_finish+0x16e/0x560
[ 3182.828763]  [<c14d522a>] ip_rcv+0x25a/0x320
[ 3182.829831]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.831143]  [<c14a63d2>] __netif_receive_skb+0x4c2/0x570
[ 3182.832484]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.833848]  [<c14a707b>] netif_receive_skb+0xcb/0xe0
[ 3182.835108]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.836408]  [<f848e33e>] SendSkb2Middle+0x162/0x176 [kksfilter]
[ 3182.837899]  [<f848f7ac>] KksNatHandler+0x612/0x86d [kksfilter]
[ 3182.839384]  [<c12ba291>] ? sscanf+0x11/0x14
[ 3182.840453]  [<f84903a4>] ? kks_inet_addr+0x37/0x4f [kksfilter]
[ 3182.841913]  [<f848fad6>] kks_rx+0x4e/0xb9 [kksfilter]
[ 3182.843194]  [<f848fc52>] hook_local_in+0x111/0x1d7 [kksfilter]
[ 3182.844659]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.845954]  [<c14cd663>] nf_iterate+0x63/0x90
[ 3182.847065]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.848361]  [<c14cd722>] nf_hook_slow+0x92/0x150
[ 3182.849531]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.850840]  [<c14d521a>] ip_rcv+0x24a/0x320
[ 3182.851909]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.853209]  [<c14a63d2>] __netif_receive_skb+0x4c2/0x570
[ 3182.854561]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.855921]  [<c14a707b>] netif_receive_skb+0xcb/0xe0
[ 3182.857175]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.858483]  [<c14a71a7>] napi_skb_finish+0x37/0x50
[ 3182.859698]  [<c14a7701>] napi_gro_receive+0xa1/0xb0
[ 3182.860939]  [<f9d41726>] e1000_receive_skb+0xc6/0x170 [e1000e]
[ 3182.862420]  [<c149a19d>] ? __kfree_skb+0x3d/0x90
[ 3182.863594]  [<f9d43166>] e1000_clean_rx_irq+0x206/0x340 [e1000e]
[ 3182.865114]  [<f9d49b74>] e1000e_poll+0x64/0x2d0 [e1000e]
[ 3182.866474]  [<c14a790d>] net_rx_action+0x12d/0x240
[ 3182.867692]  [<c1060e30>] ? local_bh_enable+0xd0/0xd0
[ 3182.868950]  [<c1060ec9>] __do_softirq+0x99/0x1d0
[ 3182.870123]  [<c1060e30>] ? local_bh_enable+0xd0/0xd0
[ 3182.871397]  <IRQ>  [<c106123e>] ? irq_exit+0x7e/0xa0
[ 3182.872700]  [<c15a6f2b>] ? do_IRQ+0x4b/0xc0
[ 3182.873767]  [<c15a6d75>] ? common_interrupt+0x35/0x3c
[ 3182.875060]  [<c14f00d8>] ? tcp_connect+0x318/0x490
[ 3182.876276]  [<c14e6841>] ? tcp_validate_incoming+0x71/0x340
[ 3182.877682]  [<c14f8010>] ? tcp_check_req+0x260/0x4b0
[ 3182.878945]  [<c14ecaf7>] ? tcp_rcv_state_process+0x47/0xb80
[ 3182.880351]  [<c14f8093>] ? tcp_check_req+0x2e3/0x4b0
[ 3182.881613]  [<c14f7ced>] ? tcp_child_process+0x8d/0x150
[ 3182.882945]  [<c14f5d8a>] ? tcp_v4_do_rcv+0x2aa/0x3c0
[ 3182.884199]  [<c12c149b>] ? do_raw_spin_lock+0x3b/0xf0
[ 3182.885473]  [<c159eefd>] ? _raw_spin_lock_nested+0x3d/0x50
[ 3182.886867]  [<c14f6f4a>] ? tcp_v4_rcv+0x78a/0xaa0
[ 3182.888062]  [<c14d4b8f>] ? ip_local_deliver_finish+0xdf/0x380
[ 3182.889510]  [<c14d4aec>] ? ip_local_deliver_finish+0x3c/0x380
[ 3182.890966]  [<c14d4fbf>] ? ip_local_deliver+0x7f/0x90
[ 3182.892245]  [<c14d46be>] ? ip_rcv_finish+0x16e/0x560
[ 3182.893504]  [<c14d522a>] ? ip_rcv+0x25a/0x320
[ 3182.894626]  [<c14d4550>] ? inet_del_protocol+0x30/0x30
[ 3182.895925]  [<c14a63d2>] ? __netif_receive_skb+0x4c2/0x570
[ 3182.897308]  [<c14a5fec>] ? __netif_receive_skb+0xdc/0x570
[ 3182.898680]  [<c14a707b>] ? netif_receive_skb+0xcb/0xe0
[ 3182.899980]  [<c14a6fcf>] ? netif_receive_skb+0x1f/0xe0
[ 3182.901281]  [<f848e33e>] ? SendSkb2Middle+0x162/0x176 [kksfilter]
[ 3182.902820]  [<f848f867>] ? KksNatHandler+0x6cd/0x86d [kksfilter]
[ 3182.904330]  [<c1095000>] ? futex_wait_requeue_pi+0x1c0/0x380
[ 3182.905777]  [<f848fa67>] ? kks_tx+0x60/0x81 [kksfilter]
[ 3182.907106]  [<f848fa07>] ? KksNatHandler+0x86d/0x86d [kksfilter]
[ 3182.908614]  [<c1079df8>] ? kthread+0x78/0x80
[ 3182.909703]  [<c1079d80>] ? __init_kthread_worker+0x60/0x60
[ 3182.911102]  [<c15a6d82>] ? kernel_thread_helper+0x6/0x10

#21
Posted 2013-11-09 11:08
Reply to #19 jinsdb


I was only nitpicking: netif_receive_skb no longer runs only in interrupt context; it may also run in a kernel thread.
The comment on netif_receive_skb was last changed in 2010, while run_ksoftirqd was last changed in 2012, so the "may only" in that comment is no longer accurate.

#20
Posted 2013-11-08 11:01
jinsdb wrote on 2013-11-07 23:09:
to kmindg & humjb_1983

It seems everyone is hung up on the statement "netif_receive_skb runs in softirq context". Take a close look ...

Heh, I think there is a misunderstanding. I never denied that netif_receive_skb runs in softirq context. My earlier quibble was that ksoftirqd may also end up calling netif_receive_skb, so you cannot strictly say the function "can only run in softirq context".

#19
Posted 2013-11-07 23:09
to kmindg & humjb_1983

It seems everyone is hung up on the statement "netif_receive_skb runs in softirq context". Take a close look at the comment above netif_receive_skb in the source. Several Linux kernel versions state it clearly:
"This function may only be called from softirq context"

/**
 *      netif_receive_skb - process receive buffer from network
 *      @skb: buffer to process
 *
 *      netif_receive_skb() is the main receive data processing function.
 *      It always succeeds. The buffer may be dropped during processing
 *      for congestion control or by the protocol layers.
 *
 *      This function may only be called from softirq context and interrupts
 *      should be enabled.
 *
 *      Return values (usually ignored):
 *      NET_RX_SUCCESS: no congestion
 *      NET_RX_DROP: packet was dropped
 */
int netif_receive_skb(struct sk_buff *skb)
{
        struct packet_type *ptype, *pt_prev;
        struct net_device *orig_dev;
        struct net_device *null_or_orig;
        int ret = NET_RX_DROP;
        __be16 type;

        if (!skb->tstamp.tv64)
                net_timestamp(skb);

        if (skb->vlan_tci && vlan_hwaccel_do_receive(skb))
                return NET_RX_SUCCESS;

        /* if we've gotten here through NAPI, check netpoll */
        if (netpoll_receive_skb(skb))
                return NET_RX_DROP;


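netif_receive_skb is called when the NIC driver has received a packet and hands the skb to the protocol stack, so it is called in the NIC's softirq context. The same holds for the protocol-stack processing that follows (for a TCP packet: ip_rcv, tcp_v4_rcv, tcp_v4_do_rcv): all of it is done in softirq context. When I first saw this call stack I was quite surprised that anyone would call the function this way, and I certainly did not expect some people to think that netif_receive_skb should not run in softirq context.

ksoftirqd is the helper thread for softirq processing. In principle every softirq may end up being executed by ksoftirqd; does that negate the existence of softirq context? If anyone doubts that netif_receive_skb is called in softirq context, add a sleep to netif_receive_skb, build a kernel, and see what happens. The kernel provides the in_softirq() macro, which makes it easy to check which context the current function is running in. Why must a function be explicit about its calling context? Because it bears on mutually exclusive access to resources, and mutual exclusion requires locks. Anyone who cannot tell the calling contexts apart will not be able to use locks correctly either.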
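
To make the in_softirq() point concrete, here is a minimal userspace sketch (not kernel code). It assumes, as I recall from the 3.x headers, that the check is a test on a softirq field inside a per-task counter, that serving a softirq adds one unit to that field, and that local_bh_disable() adds two. The bit positions and every `ctx_` name are assumptions made for illustration; check include/linux/hardirq.h in your own tree for the real definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: the softirq field occupies bits 8..15 of the counter. */
#define CTX_SOFTIRQ_SHIFT          8
#define CTX_SOFTIRQ_OFFSET         (1u << CTX_SOFTIRQ_SHIFT)
#define CTX_SOFTIRQ_MASK           (0xffu << CTX_SOFTIRQ_SHIFT)
#define CTX_SOFTIRQ_DISABLE_OFFSET (2u * CTX_SOFTIRQ_OFFSET)

/* Hypothetical stand-in for the per-task preempt counter. */
static unsigned int ctx_count;

/* Model of disabling and re-enabling bottom halves in process context. */
static void ctx_local_bh_disable(void) { ctx_count += CTX_SOFTIRQ_DISABLE_OFFSET; }
static void ctx_local_bh_enable(void)  { ctx_count -= CTX_SOFTIRQ_DISABLE_OFFSET; }

/* Model of entering and leaving an actual softirq handler. */
static void ctx_softirq_enter(void) { ctx_count += CTX_SOFTIRQ_OFFSET; }
static void ctx_softirq_exit(void)  { ctx_count -= CTX_SOFTIRQ_OFFSET; }

/* True whenever the softirq field is non-zero, whoever raised it. */
static bool ctx_in_softirq(void)
{
    return (ctx_count & CTX_SOFTIRQ_MASK) != 0;
}

/* True only while a handler is running (lowest bit of the field is set). */
static bool ctx_in_serving_softirq(void)
{
    return (ctx_count & CTX_SOFTIRQ_OFFSET) != 0;
}
```

If the model matches the real kernel, two things follow. A handler started from ksoftirqd still sees in_softirq() as true, because the counter is raised around the handler regardless of who started it. And process-context code between local_bh_disable() and local_bh_enable() also sees in_softirq() as true, even though no softirq is being served.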

#18
Posted 2013-11-07 11:03
Reply to #6 humjb_1983

In the stock do_softirq path, netif_receive_skb is never re-entered, whether it runs from ksoftirqd or on interrupt exit. So if you disabled interrupts before calling netif_receive_skb, this should not be an AA deadlock.
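
For readers who have not met the term, an "AA deadlock" is a context trying to take a non-recursive lock that is already held on the same CPU, so the holder can never run again to release it. The sketch below is a userspace model of that pattern only. It does not establish which lock is involved in the trace at the top of the thread. All `aa_` names are hypothetical, and the bounded spin exists only so the example terminates; a real spin_lock() would spin forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spinlock on a single CPU. */
struct aa_lock {
    bool held;
};

/* Try to take the lock, giving up after max_spins attempts. */
static bool aa_spin_lock_bounded(struct aa_lock *l, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (!l->held) {
            l->held = true;
            return true;
        }
    }
    return false;   /* the holder never got a chance to release the lock */
}

static void aa_spin_unlock(struct aa_lock *l)
{
    l->held = false;
}

/*
 * Model of a receive path that takes a per-socket lock. When `nested` is
 * true, a second pass over the same path runs while the lock is still held,
 * playing the role of a softirq that interrupts the holder on the same CPU.
 * Returns true if every acquisition on the path succeeded.
 */
static bool aa_rcv(struct aa_lock *sk_lock, bool nested)
{
    if (!aa_spin_lock_bounded(sk_lock, 1000))
        return false;

    bool ok = true;
    if (nested)
        ok = aa_rcv(sk_lock, false);   /* re-entry while the lock is held */

    aa_spin_unlock(sk_lock);
    return ok;
}
```

Calling `aa_rcv(&lock, false)` succeeds, while `aa_rcv(&lock, true)` reports failure from the inner pass. In the kernel the same situation shows up as a CPU spinning until the soft-lockup watchdog fires.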


   

#17
Posted 2013-11-04 16:46
Reply to #16 FlankerSky
Heh, no need to thank me. Let's keep exchanging ideas and learning from each other.

   

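#16
Posted 2013-11-04 15:06
Reply to #14 humjb_1983

netif_rx is working so far and I have not found any problems yet, though of course it still needs long-running tests. Next I will try implementing the approach you described. Many thanks for the pointers; you have been a great help!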

   

#15
Posted 2013-11-04 10:37
kmindg wrote on 2013-11-02 16:01:
Reply to #12 jinsdb

Indeed, that statement is not accurate: netif_receive_skb can run in the ksoftirqd kernel thread.
For the problem under discussion, though, ksoftirqd also goes through the do_softirq entry point, so the guarantee that netif_receive_skb is not re-entered likewise comes from do_softirq.

#14
Posted 2013-11-04 10:34
FlankerSky wrote on 2013-11-01 16:09:
Reply to #10 humjb_1983

I read up on this and agree that disabling all interrupts before netif_receive_skb is not really appropriate. But not directly ...

Calling netif_rx directly is probably not appropriate either; it is no better than netif_receive_skb. I am not sure exactly what functionality you need. If you really must call netif_receive_skb directly, you could try mimicking what do_softirq does: call local_bh_disable() before the call (paired with local_bh_enable() afterwards) to raise the softirq count and keep softirqs from re-entering. That has somewhat smaller side effects than disabling interrupts outright, but the disabled window still must not last long; you need to ensure that.
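
The effect of the local_bh_disable() suggestion can be sketched with a small userspace model (not kernel code). It assumes the behaviour described in this thread: an interrupt that arrives while the softirq count is raised leaves the softirq pending, and the pending work runs once the count drops back to zero. The struct, its fields and all `bh_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; the real kernel keeps this in preempt_count. */
struct bh_cpu {
    int  disable_depth;        /* > 0 while softirqs must not run */
    bool softirq_pending;      /* raised by an interrupt, not yet handled */
    int  softirq_runs;         /* total handler executions */
    int  runs_inside_section;  /* executions that hit the protected call */
    bool in_section;           /* true while the module's direct call runs */
};

static void bh_run_softirq(struct bh_cpu *c)
{
    c->disable_depth++;            /* the handler itself is not re-entered */
    c->softirq_pending = false;
    c->softirq_runs++;
    if (c->in_section)
        c->runs_inside_section++;
    c->disable_depth--;
}

/* An interrupt raises the softirq; on exit it runs only if nothing blocks it. */
static void bh_irq(struct bh_cpu *c)
{
    c->softirq_pending = true;
    if (c->disable_depth == 0)
        bh_run_softirq(c);
}

static void bh_local_bh_disable(struct bh_cpu *c)
{
    c->disable_depth++;
}

static void bh_local_bh_enable(struct bh_cpu *c)
{
    c->disable_depth--;
    if (c->disable_depth == 0 && c->softirq_pending)
        bh_run_softirq(c);         /* deferred work runs here */
}

/* Stand-in for the module's direct call into the receive path. */
static void bh_inject_packet(struct bh_cpu *c, bool guard)
{
    if (guard)
        bh_local_bh_disable(c);
    c->in_section = true;
    bh_irq(c);                     /* NIC interrupt arrives mid-call */
    c->in_section = false;
    if (guard)
        bh_local_bh_enable(c);
}
```

Without the guard, the handler runs in the middle of the protected call, which is the window in which the receive path could be re-entered. With the guard, the same handler runs exactly once, but only after the call has finished. The cost is that softirq work is delayed for the whole section, which is why the post asks for the window to be kept short.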

#13
Posted 2013-11-02 16:01
Reply to #12 jinsdb


"netif_receive_skb runs in softirq context"
Is that statement correct? Doesn't ksoftirqd also process softirqs? And ksoftirqd is a kernel thread, isn't it? Could you explain?
  
