道可道非常道 posted on 2009-03-31 12:10

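in.mpathd errors on the system: what is the main cause, and how can they be fixed? Thanks

The system keeps logging the in.mpathd errors below. What is the main cause, and how can I fix them? Thanks.
They occur quite frequently.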


Mar 30 12:02:42 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 12:02:42 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 12:03:09 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 12:03:09 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 12:04:02 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 12:04:02 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 12:05:28 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 12:05:28 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 12:05:57 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 12:05:57 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 12:10:37 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 12:10:37 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 12:11:16 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 12:11:16 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 12:15:41 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 12:15:41 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 12:19:27 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 12:19:27 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 12:20:23 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 12:20:23 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 12:23:27 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 12:23:27 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 12:23:42 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 12:23:42 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 12:27:29 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 12:27:29 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 12:31:18 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 12:31:18 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 13:06:22 sunserver in.mpathd: Cannot meet requested failure detection time of 10000 ms on (inet ce2) new failure detection time for group "ipmp0" is 23786 ms
Mar 30 13:07:23 sunserver in.mpathd: Improved failure detection time 11893 ms on (inet ce2) for group "ipmp0"
Mar 30 13:07:23 sunserver in.mpathd: Improved failure detection time 10000 ms on (inet ce0) for group "ipmp0"
Mar 30 13:54:39 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 13:54:39 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 13:59:33 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 13:59:33 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 14:00:02 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 14:00:02 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 14:04:10 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 14:04:10 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 14:05:12 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 14:05:12 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 14:08:10 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 14:08:10 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 14:08:12 sunserver in.mpathd: Cannot meet requested failure detection time of 10000 ms on (inet ce0) new failure detection time for group "ipmp0" is 86786 ms
Mar 30 14:09:12 sunserver in.mpathd: Improved failure detection time 43393 ms on (inet ce2) for group "ipmp0"
Mar 30 14:09:14 sunserver in.mpathd: Improved failure detection time 21696 ms on (inet ce2) for group "ipmp0"
Mar 30 14:09:14 sunserver in.mpathd: Improved failure detection time 10848 ms on (inet ce0) for group "ipmp0"
Mar 30 14:09:15 sunserver in.mpathd: Improved failure detection time 10000 ms on (inet ce2) for group "ipmp0"
Mar 30 14:09:48 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 14:09:48 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 14:38:01 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 14:38:01 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 14:46:57 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 14:46:57 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 30 14:53:44 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 30 14:53:44 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 30 19:54:23 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 30 19:54:23 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 31 02:55:37 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 31 02:55:37 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 31 08:24:20 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 31 08:24:20 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 31 09:12:10 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 31 09:12:10 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 31 09:13:17 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 31 09:13:17 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 31 10:24:45 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 31 10:24:45 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 31 10:24:49 sunserver in.mpathd: Cannot meet requested failure detection time of 10000 ms on (inet ce0) new failure detection time for group "ipmp0" is 156296 ms
Mar 31 10:26:45 sunserver in.mpathd: Improved failure detection time 78148 ms on (inet ce0) for group "ipmp0"
Mar 31 10:27:17 sunserver in.mpathd: Improved failure detection time 39074 ms on (inet ce0) for group "ipmp0"
Mar 31 10:27:45 sunserver in.mpathd: Improved failure detection time 19537 ms on (inet ce0) for group "ipmp0"
Mar 31 10:27:52 sunserver in.mpathd: Improved failure detection time 10000 ms on (inet ce0) for group "ipmp0"
Mar 31 10:28:14 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 31 10:28:14 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 31 10:29:59 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 31 10:29:59 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 31 10:30:35 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 31 10:30:35 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
Mar 31 10:39:28 sunserver in.mpathd: NIC repair detected on ce0 of group ipmp0
Mar 31 10:39:28 sunserver in.mpathd: Successfully failed back to NIC ce0
Mar 31 10:41:49 sunserver in.mpathd: NIC failure detected on ce0 of group ipmp0
Mar 31 10:41:49 sunserver in.mpathd: Successfully failed over from NIC ce0 to NIC ce2
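
To get a feel for how often ce0 is flapping, the failure messages can be tallied per hour. The following is only a sketch: the function name count_nic_failures is made up for illustration, and it assumes the syslog line format shown above (on Solaris these messages normally land in /var/adm/messages, and nawk or /usr/xpg4/bin/awk may be needed instead of the old /usr/bin/awk).

```shell
# Count in.mpathd "NIC failure detected" events per hour and per interface.
# Reads syslog-format lines on stdin and prints "<count> <Mon> <day> <HH>:00 <nic>".
count_nic_failures() {
    awk '/in\.mpathd: NIC failure detected on/ {
        split($3, t, ":")              # $3 is HH:MM:SS
        for (i = 1; i <= NF; i++)      # interface name follows the word "on"
            if ($i == "on") { nic = $(i + 1); break }
        n[$1 " " $2 " " t[1] ":00 " nic]++
    }
    END { for (k in n) print n[k], k }' | sort -k2,2 -k3,3n -k4,4
}

# Example (hypothetical invocation on the affected host):
#   count_nic_failures < /var/adm/messages
```

A steady count around the clock would point away from load peaks, while bursts at busy hours would fit the overload theory.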

道可道非常道 posted on 2009-03-31 12:11

Some additional information

root@sunserver # netstat -in
Name  Mtu   Net/Dest   Address     Ipkts       Ierrs  Opkts       Oerrs  Collis  Queue
lo0   8232  127.0.0.0  127.0.0.1   135764928   0      135764928   0      0       0
ce0   1500  192.2.1.0  192.2.1.42  1196263598  0      2580404654  0      0       0
ce2   1500  192.2.1.0  192.2.1.10  2203615821  0      4085668554  0      0       0


root@sunserver # ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
      inet 127.0.0.1 netmask ff000000
ce0: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
      inet 192.2.1.42 netmask ffffff00 broadcast 192.2.1.255
      groupname ipmp0
      ether 0:14:4f:47:97:28
ce2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
      inet 192.2.1.10 netmask ffffff00 broadcast 192.2.1.255
      groupname ipmp0
      ether 0:14:4f:47:97:28
ce2:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
      inet 192.2.1.41 netmask ffffff00 broadcast 192.2.1.255
root@sunserver #



root@sunserver # more hostname.ce0
sun1-test-ce1 deprecated -failover netmask + broadcast + group ipmp0 up
root@sunserver # more hostname.ce2
sunserver netmask + broadcast + group ipmp0 up \
addif sun1-test-ce0 deprecated -failover netmask + broadcast + up



root@sunserver # more /etc/hosts
#
# Internet host table
#
127.0.0.1       localhost      
192.2.1.10      sunserver       sunserver.com   loghost
192.2.1.41      sun1-test-ce0
192.2.1.42      sun1-test-ce1
root@sunserver #

道可道非常道 posted on 2009-03-31 12:12

Is the physical NIC faulty?
What are the steps for dealing with this? Thanks

道可道非常道 posted on 2009-03-31 12:24

ping 192.2.1.10 shows packet loss

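道可道非常道 posted on 2009-03-31 12:33

I looked into it.
From what I found, on a system configured with IPMP the cause is excessive network load: the ICMP probe packets do not come back within the configured time. The suggested fix is to increase IPMP's default failure detection time and then restart the IPMP daemon, which should reduce the number of errors.
To do that, edit the /etc/default/mpathd file and increase the value of the FAILURE_DETECTION_TIME variable.

I have already modified the mpathd file.
How do I restart the IPMP daemon?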
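
For reference, the edit described above could be scripted roughly like this. It is only a sketch: the helper name set_detection_time is made up for illustration, the value in the usage comment is just an example, and it assumes the file contains a line starting with FAILURE_DETECTION_TIME= (as the stock /etc/default/mpathd does). Back up the file before touching it.

```shell
# Set FAILURE_DETECTION_TIME (in milliseconds) in an mpathd defaults file.
# Usage: set_detection_time FILE MILLISECONDS
set_detection_time() {
    file=$1
    ms=$2
    tmp="$file.new.$$"
    # Solaris sed has no -i option, so edit into a temporary copy first,
    # then write it back with cat so the original owner and mode are kept.
    sed "s/^FAILURE_DETECTION_TIME=.*/FAILURE_DETECTION_TIME=$ms/" "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&
        rm -f "$tmp"
}

# Example on the live system (run as root; 30000 is only an example value):
#   cp /etc/default/mpathd /etc/default/mpathd.orig
#   set_detection_time /etc/default/mpathd 30000
```

The change only takes effect once in.mpathd has re-read the file.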

道可道非常道 posted on 2009-03-31 14:04

System log
1. Mar 31 10:24:45 NIC repair detected on ce0 of group ipmp0
2. Mar 31 10:24:45 Successfully failed back to NIC ce0
3. Mar 31 10:24:49 Cannot meet requested failure detection time of 10000 ms on (inet ce0) new failure detection time for group "ipmp0" is 156296 ms
4. Mar 31 10:26:45 Improved failure detection time 78148 ms on (inet ce0) for group "ipmp0"
   Mar 31 10:27:17 Improved failure detection time 39074 ms on (inet ce0) for group "ipmp0"
5. Mar 31 10:28:14 NIC failure detected on ce0 of group ipmp0
6. Mar 31 10:28:14 Successfully failed over from NIC ce0 to NIC ce2
7. Mar 31 10:29:59 NIC repair detected on ce0 of group ipmp0
8. Mar 31 10:29:59 Successfully failed back to NIC ce0

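1. ce0 in group ipmp0 was repaired.
2. Traffic failed back to ce0.
3. in.mpathd could not meet the requested failure detection time of 10000 ms (the IPMP default) and raised it to 156296 ms.
4. in.mpathd then lowered the failure detection time again in steps ("Improved failure detection time ... on (inet ce0) for group "ipmp0"").
5. A failure of ce0 in group ipmp0 was detected.
6. The NIC failed over from ce0 to ce2.
The cycle then starts again:
7. ce0 in group ipmp0 was repaired.
8. Traffic failed back to ce0.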
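
A side note on the "Improved failure detection time" values: in the logs posted in this thread, each value is half of the previous one until the requested 10000 ms is reached again. The sketch below just reproduces that arithmetic. It is an observation about these log lines, not a statement of how in.mpathd is implemented, and the function name halving_steps is made up for illustration.

```shell
# Print the values obtained by repeatedly halving a detection time (ms),
# never going below the requested value.
# Usage: halving_steps CURRENT_MS REQUESTED_MS
halving_steps() {
    current=$1
    requested=$2
    while [ "$current" -gt "$requested" ]; do
        current=$((current / 2))
        if [ "$current" -lt "$requested" ]; then
            current=$requested      # clamp at the requested detection time
        fi
        echo "$current"
    done
}

# Starting point taken from the "Cannot meet requested failure detection time" line
halving_steps 156296 10000
```

The output can be compared line by line with the "Improved failure detection time" messages that follow the 156296 ms entry in the full log.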

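li_hunter posted on 2009-03-31 15:53

Reply to post #5 by 道可道非常道

It looks like you have already found the answer and just don't know how to restart things so the change takes effect? As far as I know there is no way to restart the IPMP service on its own, so you would have to reboot the system.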

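道可道非常道 posted on 2009-03-31 16:05

Originally posted by li_hunter on 2009-3-31 15:53:
It looks like you have already found the answer and just don't know how to restart things so the change takes effect? As far as I know there is no way to restart the IPMP service on its own, so you would have to reboot the system.

No, not yet. Do you have any suggestions? Thanks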

道可道非常道 posted on 2009-03-31 16:06

Restart the in.mpathd daemon. Sending it SIGHUP makes it re-read /etc/default/mpathd, so no reboot is needed:

# pkill -HUP in.mpathd

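道可道非常道 posted on 2009-04-01 09:09

That did not solve it. Even with the value raised to 30000 ms, the errors are still reported.
What is the root cause: a problem with the NIC, or a problem with the network?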