Last edited by Tinnal on 2014-08-27 11:00
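1. Looking at this code, I have a question: using printk_cpu == this_cpu to check for nesting does not seem entirely accurate. If an NMI preempts at the spot marked in red above, where the lock is released, and the NMI handler calls printk, that should deadlock, right? Though the probability should be very small?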
I reread the kernel source, and the current flow does indeed have a problem in how it handles NMI.
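To make the window concrete, here is a minimal single-CPU, user-space sketch in C (not kernel code). It assumes the usual ordering where the owner marker `printk_cpu` is cleared just before the lock is released; `logbuf_locked`, `nested_printk()`, `printk_with_nmi()` and the enum names are made up for illustration, and a plain flag stands in for the real spinlock.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NO_CPU UINT_MAX

/* Hypothetical stand-ins for the real lock and the owner marker. */
static bool logbuf_locked;
static unsigned int printk_cpu = NO_CPU;

enum outcome { PRINTED, RECURSION_DETECTED, WOULD_DEADLOCK };
enum nmi_point { NMI_WHILE_OWNER_SET, NMI_AFTER_OWNER_CLEARED, NMI_AFTER_UNLOCK };

/* What a printk() issued from an NMI on this_cpu would run into right now. */
static enum outcome nested_printk(unsigned int this_cpu)
{
    if (printk_cpu == this_cpu)
        return RECURSION_DETECTED;  /* the check from the question fires */
    if (logbuf_locked)
        return WOULD_DEADLOCK;      /* a real spin_lock would spin forever:
                                       the holder is the context we interrupted */
    return PRINTED;
}

/* One outer printk() on this_cpu, with a simulated NMI at the given point. */
static enum outcome printk_with_nmi(unsigned int this_cpu, enum nmi_point where)
{
    enum outcome nmi_result = PRINTED;

    assert(!logbuf_locked);         /* the lock must be free on entry */

    logbuf_locked = true;           /* take the lock */
    printk_cpu = this_cpu;          /* mark the owner */
    if (where == NMI_WHILE_OWNER_SET)
        nmi_result = nested_printk(this_cpu);

    printk_cpu = NO_CPU;            /* clear the owner ... */
    if (where == NMI_AFTER_OWNER_CLEARED)
        nmi_result = nested_printk(this_cpu);   /* ... lock is still held here */

    logbuf_locked = false;          /* release the lock */
    if (where == NMI_AFTER_UNLOCK)
        nmi_result = nested_printk(this_cpu);

    return nmi_result;
}
```

Calling `printk_with_nmi(0, NMI_AFTER_OWNER_CLEARED)` returns `WOULD_DEADLOCK`: the recursion check no longer sees this CPU as the owner, yet the lock has not been dropped. The other two points return `RECURSION_DETECTED` and `PRINTED` respectively.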
There are related discussions on the mailing list.
On 2012/7/4, Liu, Chuansheng proposed simply not printing at all when NMI context is detected. Details:
https://lkml.org/lkml/2012/7/4/273
On 2014/5/9, Petr Mladek posted a patchset of more than 1000 lines that aims to make printing work from NMI context as well.
https://lkml.org/lkml/2014/5/9/118
But Linus objected strongly to this:
On Tue, Jun 10, 2014 at 9:46 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> There is also a big risk that if we push back this bugfix, nobody will actually do
> that desired rewrite.
>
> Lets be crazy and Cc Linus on that.
Quite frankly, I hate seeing something like this:
kernel/printk/printk.c | 1218 +++++++++++++++++++++++++----------
for something that is stupid and broken. Printing from NMI context
isn't really supposed to work, and we all *know* it's not supposed to
work.
I'd much rather disallow it, and if there is one or two places that
really want to print a warning and know that they are in NMI context,
have a special workaround just for them, with something that does
*not* try to make printk in general work any better.
Dammit, NMI context is special. I absolutely refuse to buy into the
broken concept that we should make more stuff work in NMI context.
Hell no, we should *not* try to make more crap work in NMI. NMI people
should be careful.
Make a trivial "printk_nmi()" wrapper that tries to do a trylock on
logbuf_lock, and *maybe* the existing sequence of
        if (console_trylock_for_printk())
                console_unlock();
then works for actually triggering the printout. But the wrapper
should be 15 lines of code for "if possible, try to print things", and
*not* a thousand lines of changes.
Linus
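The wrapper Linus describes can be sketched roughly as follows. This is a user-space illustration in C11, not his code or anything that was merged: `log_buf`, `log_len`, `logbuf_trylock()` and `logbuf_unlock()` are hypothetical stand-ins, an `atomic_flag` plays the role of logbuf_lock, and the console_trylock_for_printk() / console_unlock() step is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for logbuf_lock and the kernel log buffer. */
static atomic_flag logbuf_lock = ATOMIC_FLAG_INIT;
static char log_buf[256];
static size_t log_len;

static bool logbuf_trylock(void)
{
    /* test_and_set returns the previous value: false means we got the lock */
    return !atomic_flag_test_and_set_explicit(&logbuf_lock, memory_order_acquire);
}

static void logbuf_unlock(void)
{
    atomic_flag_clear_explicit(&logbuf_lock, memory_order_release);
}

/*
 * "If possible, try to print things": never spin on the lock.
 * Returns true if the message was stored, false if it was dropped.
 */
static bool printk_nmi(const char *msg)
{
    size_t n = strlen(msg);
    bool stored = false;

    if (!logbuf_trylock())
        return false;               /* lock busy: drop instead of deadlocking */

    assert(log_len < sizeof(log_buf));
    if (n <= sizeof(log_buf) - 1 - log_len) {
        memcpy(log_buf + log_len, msg, n);
        log_len += n;
        log_buf[log_len] = '\0';
        stored = true;
    }

    logbuf_unlock();
    return stored;
}
```

The trade-off is explicit: a message issued while the lock is held (for example by the very context the NMI interrupted) is lost, which is the price of never spinning in NMI context.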
On Wed, Jun 18, 2014 at 05:58:40AM -1000, Linus Torvalds wrote:
> On Jun 18, 2014 4:36 AM, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> wrote:
> >
> > I could easily add an option to RCU to allow people to tell it not to
> > use NMIs to dump the stack.
>
> I don't think it should be an "option".
>
> We should stop using nmi as if it was something "normal". It isn't. Code
> running in nmi context should be special, and should be very very aware
> that it is special. That goes way beyond "don't use printk". We seem to
> have gone way way too far in using nmi context.
>
> So we should get *rid* of code in nmi context rather than then complain
> about printk being buggy.
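Their biggest point of contention is RCU's stall detection, which uses NMI and also needs to dump stacks.
The thread also brought up a lock-free printk scheme. Below is SUSE's current fix (it gives NMI its own dedicated storage); it is somewhat smaller than Petr Mladek's patchset, adding only a bit over 200 lines: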
http://kernel.suse.com/cgit/kern ... 99f05dbd9c9742b14c9
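I have not reproduced the SUSE patch here. As a rough idea of what "dedicated storage for NMI" can look like, the following user-space C11 sketch gives each CPU its own small buffer that the NMI side fills without taking any lock, and that normal context drains later. All names (`nmi_bufs`, `nmi_store()`, `nmi_flush()`, `NR_CPUS`, `NMI_BUF_SIZE`) are invented for illustration, and it assumes the flush runs on the same CPU as the NMI, so an NMI's reserve-and-copy always completes before the flusher resumes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS      4      /* hypothetical sizes for the sketch */
#define NMI_BUF_SIZE 128

struct nmi_buf {
    atomic_size_t len;              /* bytes currently stored */
    char data[NMI_BUF_SIZE];
};

static struct nmi_buf nmi_bufs[NR_CPUS];

/* NMI side: no lock. Returns the number of bytes stored (0 if no room). */
static size_t nmi_store(unsigned int cpu, const char *msg)
{
    assert(cpu < NR_CPUS);
    struct nmi_buf *b = &nmi_bufs[cpu];
    size_t n = strlen(msg);
    size_t old = atomic_load(&b->len);

    do {
        if (n > NMI_BUF_SIZE - old)
            return 0;               /* buffer full: drop the message */
    } while (!atomic_compare_exchange_weak(&b->len, &old, old + n));

    memcpy(b->data + old, msg, n);  /* fill the space we reserved */
    return n;
}

/* Normal context: copy the buffer out and reset it. Returns bytes flushed. */
static size_t nmi_flush(unsigned int cpu, char *out, size_t out_size)
{
    assert(cpu < NR_CPUS);
    struct nmi_buf *b = &nmi_bufs[cpu];
    size_t len = atomic_load(&b->len);

    do {
        if (len == 0 || len >= out_size)
            return 0;               /* nothing to flush, or out is too small */
        memcpy(out, b->data, len);
        out[len] = '\0';
        /* if an NMI appended meanwhile, len is refreshed and we copy again */
    } while (!atomic_compare_exchange_weak(&b->len, &len, 0));

    return len;
}
```

For example, storing "oops " and then "in nmi" on CPU 1 and flushing CPU 1 yields the 11-byte string "oops in nmi"; a second flush returns 0. The flushed text would then be handed to the regular printk path, where taking the lock is safe.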
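Sigh, it looks like this topic is going to be yet another drawn-out battle. As of 3.16, the printk code still does not support being called from NMI context.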