Chinaunix

Title: Linux hangs for no apparent reason

Author: lys0212linux    Time: 2011-06-16 22:30
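The machine came up at around 21:56. At around 22:23 it suddenly stopped answering ping; the data center staff say the screen is black and the keyboard does not respond either. The messages log entries between 21:56 and 22:23 are below. What is going on here?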
Jun 16 21:56:49 localhost kernel: EXT3-fs (sdc1): using internal journal
Jun 16 21:56:49 localhost kernel: EXT3-fs (sdc1): recovery complete
Jun 16 21:56:49 localhost kernel: EXT3-fs (sdc1): mounted filesystem with ordered data mode
Jun 16 21:57:46 localhost postfix/smtpd[1826]: sql_select option missing
Jun 16 21:57:46 localhost postfix/smtpd[1826]: auxpropfunc error no mechanism available
Jun 16 22:00:28 localhost postfix/smtpd[2055]: sql_select option missing
Jun 16 22:00:28 localhost postfix/smtpd[2055]: auxpropfunc error no mechanism available
Jun 16 22:00:41 localhost postfix/smtpd[2062]: sql_select option missing
Jun 16 22:00:41 localhost postfix/smtpd[2062]: auxpropfunc error no mechanism available
Jun 16 22:03:32 localhost kernel: nginx[1562]: segfault at 8 ip 00007f69b1d7bfa7 sp 00007fff76d80380 error 4 in libunwind.so.7.0.0[7f69b1d71000+f000]
Jun 16 22:11:36 localhost kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:
Jun 16 22:11:36 localhost kernel:  TDH                  <d7>
Jun 16 22:11:36 localhost kernel:  TDT                  <a1>
Jun 16 22:11:36 localhost kernel:  next_to_use          <a1>
Jun 16 22:11:36 localhost kernel:  next_to_clean        <1b>
Jun 16 22:11:36 localhost kernel: buffer_info[next_to_clean]:
Jun 16 22:11:36 localhost kernel:  time_stamp           <1000ada99>
Jun 16 22:11:36 localhost kernel:  next_to_watch        <1d>
Jun 16 22:11:36 localhost kernel:  jiffies              <1000adff8>
Jun 16 22:11:36 localhost kernel:  next_to_watch.status <0>
Jun 16 22:11:36 localhost kernel: MAC Status             <2080783>
Jun 16 22:11:36 localhost kernel: PHY Status             <792d>
Jun 16 22:11:36 localhost kernel: PHY 1000BASE-T Status  <7c00>
Jun 16 22:11:36 localhost kernel: PHY Extended Status    <3000>
Jun 16 22:11:36 localhost kernel: PCI Status             <10>
Jun 16 22:11:38 localhost kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:
Jun 16 22:11:38 localhost kernel:  TDH                  <d7>
Jun 16 22:11:38 localhost kernel:  TDT                  <a1>
Jun 16 22:11:38 localhost kernel:  next_to_use          <a1>
Jun 16 22:11:38 localhost kernel:  next_to_clean        <1b>
Jun 16 22:11:38 localhost kernel: buffer_info[next_to_clean]:
Jun 16 22:11:38 localhost kernel:  time_stamp           <1000ada99>
Jun 16 22:11:38 localhost kernel:  next_to_watch        <1d>
Jun 16 22:11:38 localhost kernel:  jiffies              <1000ae7c8>
Jun 16 22:11:38 localhost kernel:  next_to_watch.status <0>
Jun 16 22:11:38 localhost kernel: MAC Status             <2080783>
Jun 16 22:11:38 localhost kernel: PHY Status             <792d>
Jun 16 22:11:38 localhost kernel: PHY 1000BASE-T Status  <7c00>
Jun 16 22:11:38 localhost kernel: PHY Extended Status    <3000>
Jun 16 22:11:38 localhost kernel: PCI Status             <10>
Jun 16 22:11:40 localhost kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:
Jun 16 22:11:40 localhost kernel:  TDH                  <d7>
Jun 16 22:11:40 localhost kernel:  TDT                  <a1>
Jun 16 22:11:40 localhost kernel:  next_to_use          <a1>
Jun 16 22:11:40 localhost kernel:  next_to_clean        <1b>
Jun 16 22:11:40 localhost kernel: buffer_info[next_to_clean]:
Jun 16 22:11:40 localhost kernel:  time_stamp           <1000ada99>
Jun 16 22:11:40 localhost kernel:  next_to_watch        <1d>
Jun 16 22:11:40 localhost kernel:  jiffies              <1000aef98>
Jun 16 22:11:40 localhost kernel:  next_to_watch.status <0>
Jun 16 22:11:40 localhost kernel: MAC Status             <2080783>
Jun 16 22:11:40 localhost kernel: PHY Status             <792d>
Jun 16 22:11:40 localhost kernel: PHY 1000BASE-T Status  <7c00>
Jun 16 22:11:40 localhost kernel: PHY Extended Status    <3000>
Jun 16 22:11:40 localhost kernel: PCI Status             <10>
Jun 16 22:11:42 localhost kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:
Jun 16 22:11:42 localhost kernel:  TDH                  <d7>
Jun 16 22:11:42 localhost kernel:  TDT                  <a1>
Jun 16 22:11:42 localhost kernel:  next_to_use          <a1>
Jun 16 22:11:42 localhost kernel:  next_to_clean        <1b>
Jun 16 22:11:42 localhost kernel: buffer_info[next_to_clean]:
Jun 16 22:11:42 localhost kernel:  time_stamp           <1000ada99>
Jun 16 22:11:42 localhost kernel:  next_to_watch        <1d>
Jun 16 22:11:42 localhost kernel:  jiffies              <1000af768>
Jun 16 22:11:42 localhost kernel:  next_to_watch.status <0>
Jun 16 22:11:42 localhost kernel: MAC Status             <2080783>
Jun 16 22:11:42 localhost kernel: PHY Status             <792d>
Jun 16 22:11:42 localhost kernel: PHY 1000BASE-T Status  <7c00>
Jun 16 22:11:42 localhost kernel: PHY Extended Status    <3000>
Jun 16 22:11:42 localhost kernel: PCI Status             <10>
Jun 16 22:11:44 localhost kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:
Jun 16 22:11:44 localhost kernel:  TDH                  <d7>
Jun 16 22:11:44 localhost kernel:  TDT                  <a1>
Jun 16 22:11:44 localhost kernel:  next_to_use          <a1>
Jun 16 22:11:44 localhost kernel:  next_to_clean        <1b>
Jun 16 22:11:44 localhost kernel: buffer_info[next_to_clean]:
Jun 16 22:11:44 localhost kernel:  time_stamp           <1000ada99>
Jun 16 22:11:44 localhost kernel:  next_to_watch        <1d>
Jun 16 22:11:44 localhost kernel:  jiffies              <1000aff38>
Jun 16 22:11:44 localhost kernel:  next_to_watch.status <0>
Jun 16 22:11:44 localhost kernel: MAC Status             <2080783>
Jun 16 22:11:44 localhost kernel: PHY Status             <792d>
Jun 16 22:11:44 localhost kernel: PHY 1000BASE-T Status  <7c00>
Jun 16 22:11:44 localhost kernel: PHY Extended Status    <3000>
Jun 16 22:11:44 localhost kernel: PCI Status             <10>
Jun 16 22:11:44 localhost kernel: ------------[ cut here ]------------
Jun 16 22:11:44 localhost kernel: WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x278/0x290()
Jun 16 22:11:44 localhost kernel: Hardware name: NF190D
Jun 16 22:11:44 localhost kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Jun 16 22:11:44 localhost kernel: Modules linked in: ext3 jbd xt_multiport xt_iprange xt_state iptable_filter ip_tables nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack nfs fscache nfsd lockd nfs_acl auth_rpcgss sunrpc ipv6 dm_mirror dm_region_hash dm_log dm_mod serio_raw pcspkr i2c_i801 iTCO_wdt iTCO_vendor_support i5k_amb i5000_edac edac_core ioatdma dca sg e1000e shpchp ext4 mbcache jbd2 sr_mod cdrom aic94xx libsas sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas floppy radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Jun 16 22:11:44 localhost kernel: Pid: 0, comm: kworker/0:1 Not tainted 2.6.39 #1
Jun 16 22:11:44 localhost kernel: Call Trace:
Jun 16 22:11:44 localhost kernel: <IRQ>  [<ffffffff8106015f>] warn_slowpath_common+0x7f/0xc0
Jun 16 22:11:44 localhost kernel: [<ffffffff81060256>] warn_slowpath_fmt+0x46/0x50
Jun 16 22:11:44 localhost kernel: [<ffffffff8140e528>] dev_watchdog+0x278/0x290
Jun 16 22:11:44 localhost kernel: [<ffffffff810886e8>] ? sched_clock_cpu+0xb8/0x110
Jun 16 22:11:44 localhost kernel: [<ffffffff8106fc6e>] run_timer_softirq+0x15e/0x3a0
Jun 16 22:11:44 localhost kernel: [<ffffffff8140e2b0>] ? __netdev_watchdog_up+0x80/0x80
Jun 16 22:11:44 localhost kernel: [<ffffffff810281fd>] ? lapic_next_event+0x1d/0x30
Jun 16 22:11:44 localhost kernel: [<ffffffff81066eab>] __do_softirq+0xab/0x200
Jun 16 22:11:44 localhost kernel: [<ffffffff8108629e>] ? hrtimer_interrupt+0x15e/0x240
Jun 16 22:11:44 localhost kernel: [<ffffffff814c10fc>] call_softirq+0x1c/0x30
Jun 16 22:11:44 localhost kernel: [<ffffffff8100d355>] do_softirq+0x65/0xa0
Jun 16 22:11:44 localhost kernel: [<ffffffff81066cd5>] irq_exit+0xb5/0xc0
Jun 16 22:11:44 localhost kernel: [<ffffffff814c1a40>] smp_apic_timer_interrupt+0x70/0x9b
Jun 16 22:11:44 localhost kernel: [<ffffffff814c08b3>] apic_timer_interrupt+0x13/0x20
Jun 16 22:11:44 localhost kernel: <EOI>  [<ffffffff81013bf5>] ? mwait_idle+0xa5/0x1e0
Jun 16 22:11:44 localhost kernel: [<ffffffff814bbd7a>] ? atomic_notifier_call_chain+0x1a/0x20
Jun 16 22:11:44 localhost kernel: [<ffffffff8100b0b7>] cpu_idle+0xb7/0x110
Jun 16 22:11:44 localhost kernel: [<ffffffff814aeba2>] start_secondary+0x234/0x236
Jun 16 22:11:44 localhost kernel: ---[ end trace 20247cd990aa3302 ]---
Jun 16 22:11:44 localhost kernel: e1000e 0000:04:00.0: eth0: Reset adapter
Jun 16 22:11:47 localhost kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jun 16 22:13:29 localhost kernel: uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
Jun 16 22:13:29 localhost kernel: uhci_hcd 0000:00:1d.0: host controller halted, very bad!
Jun 16 22:13:29 localhost kernel: uhci_hcd 0000:00:1d.0: HC died; cleaning up
Jun 16 22:13:29 localhost kernel: usb 2-2: USB disconnect, device number 2
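A note on reading the five "Detected Hardware Unit Hang" dumps above: TDH, TDT and time_stamp are identical in every dump, and only jiffies moves. A rough sketch of the arithmetic follows. It assumes the jiffies counter ticks at a constant rate and that the dumps really are 2 seconds apart (syslog timestamps only have one-second resolution), so the tick rate it derives is an estimate, not something the log states.

```python
# Values copied from the five "Detected Hardware Unit Hang" dumps in the log.
TIME_STAMP = 0x1000ada99  # buffer_info[next_to_clean] time_stamp, same in every dump
JIFFIES = [0x1000adff8, 0x1000ae7c8, 0x1000aef98, 0x1000af768, 0x1000aff38]

# Assumption: dumps are 2 s apart, read off the syslog timestamps
# (22:11:36, :38, :40, :42, :44), which only have 1 s resolution.
DUMP_INTERVAL_S = 2

# Jiffies elapsed between consecutive dumps.
steps = [later - earlier for earlier, later in zip(JIFFIES, JIFFIES[1:])]

# Estimated ticks per second, derived from the first step.
hz_estimate = steps[0] / DUMP_INTERVAL_S

# Age of the stuck transmit buffer at each dump.
ages = [j - TIME_STAMP for j in JIFFIES]

for age in ages:
    print(f"{age} jiffies, about {age / hz_estimate:.3f} s at ~{hz_estimate:.0f} ticks/s")
```

Under those assumptions the same transmit buffer had been pending for roughly 1.4 s at the first dump and roughly 9.4 s at the last one, right before the NETDEV WATCHDOG warning and the adapter reset at 22:11:44.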
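Author: loveradmin    Time: 2011-06-17 10:37
Then there is a lot you need to check: the applications running on the system, CPU problems, and disk I/O all have to be considered. That said, I suggest asking the data center staff to reboot the machine and then checking /var/log/dmesg.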
Author: 300second    Time: 2011-06-17 11:26
Check /var/log/dmesg after the reboot.
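For going through a long log after the reboot, as suggested above, a small script can count the kernel messages that stand out and show when each kind first appeared. This is only a sketch: the labels and regular expressions in PATTERNS are assumptions picked to match the excerpt posted in this thread, not a complete list of things worth checking, and SAMPLE is a handful of lines copied from that excerpt.

```python
import re
from collections import Counter

# Hypothetical pattern list (an assumption): chosen to match the messages
# that appear in the log excerpt posted in this thread.
PATTERNS = {
    "nic_hang": re.compile(r"Detected Hardware Unit Hang"),
    "tx_timeout": re.compile(r"NETDEV WATCHDOG: .* timed out"),
    "segfault": re.compile(r"segfault at "),
    "usb_hc_died": re.compile(r"HC died"),
}

def summarize(lines):
    """Count matches per label and remember the timestamp of the first match."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                # A classic syslog line starts with a 15-character
                # timestamp such as "Jun 16 22:11:36".
                first_seen.setdefault(label, line[:15])
    return counts, first_seen

# A few lines copied from the excerpt in the original post.
SAMPLE = [
    "Jun 16 22:03:32 localhost kernel: nginx[1562]: segfault at 8 ip 00007f69b1d7bfa7 sp 00007fff76d80380 error 4 in libunwind.so.7.0.0[7f69b1d71000+f000]",
    "Jun 16 22:11:36 localhost kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:",
    "Jun 16 22:11:38 localhost kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:",
    "Jun 16 22:11:44 localhost kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out",
    "Jun 16 22:13:29 localhost kernel: uhci_hcd 0000:00:1d.0: HC died; cleaning up",
]

counts, first_seen = summarize(SAMPLE)
for label, n in counts.most_common():
    print(f"{label:12} x{n}  first at {first_seen[label]}")
```

On a real machine the same function can be fed an open file instead of SAMPLE, for example `summarize(open("/var/log/messages", errors="replace"))`. The path is an assumption and depends on the distribution's syslog configuration.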



