[Original] Sun Cluster: fixing an abnormal IPMP state when the network interface is physically fine

Posted 2007-11-29 10:58
I. Symptoms:

1. The IPMP status reported by Sun Cluster is abnormal:

root@ABCSERVER1 # scstat -i

-- IPMP Groups --

              Node Name    Group   Status   Adapter   Status
              ---------    -----   ------   -------   ------
  IPMP Group: ABCSERVER1   ipmp1   Online   qfe1      Offline
  IPMP Group: ABCSERVER1   ipmp1   Online   ce1       Online

  IPMP Group: ABCSERVER2   ipmp1   Online   qfe1      Offline
  IPMP Group: ABCSERVER2   ipmp1   Online   ce1       Online

2. Pinging the interface that the group shows as Offline reports it alive:

/userhome/abcapp$ ping 10.xx.x.233
10.xx.x.233 is alive

II. Diagnosis:

Check the system messages:

root@ABCSERVER1 # dmesg
Thu Nov 29 09:59:57 CST 2007
…
Nov 29 01:34:43 ABCSERVER1 in.mpathd[436]: [ID 168056 daemon.error] All Interfaces in group ipmp1 have failed
Nov 29 01:34:43 ABCSERVER1 Cluster.PNM: [ID 890413 daemon.notice] ipmp1: state transition from OK to DOWN.
…
Nov 29 01:46:58 ABCSERVER1 genunix: [ID 408789 kern.notice] NOTICE: ce1: fault cleared external to device; service available
Nov 29 01:46:58 ABCSERVER1 genunix: [ID 451854 kern.notice] NOTICE: ce1: xcvr addr:0x01 - link up 100 Mbps full duplex
Nov 29 01:46:58 ABCSERVER1 in.mpathd[436]: [ID 820239 daemon.error] The link has come up on ce1
Nov 29 01:46:59 ABCSERVER1 qfe: [ID 517869 kern.info] SUNW,qfe1: 100 Mbps full duplex link up - internal transceiver
Nov 29 01:47:25 ABCSERVER1 in.mpathd[436]: [ID 620804 daemon.error] Successfully failed back to NIC ce1
Nov 29 01:47:25 ABCSERVER1 in.mpathd[436]: [ID 299542 daemon.error] NIC repair detected on ce1 of group ipmp1
Nov 29 01:47:25 ABCSERVER1 in.mpathd[436]: [ID 237757 daemon.error] At least 1 interface (ce1) of group ipmp1 has repaired
Nov 29 01:47:25 ABCSERVER1 Cluster.PNM: [ID 890413 daemon.notice] ipmp1: state transition from DOWN to OK.
Nov 29 01:47:25 ABCSERVER1 in.mpathd[436]: [ID 832587 daemon.error] Successfully failed over from NIC qfe1 to NIC ce1
…
Nov 29 01:47:25 ABCSERVER1 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource ora-service status msg on node ABCSERVER1 change to <LogicalHostname online.>

Judging from the messages, once the network was disconnected Sun Cluster determined that both interfaces had failed, moved the IPMP group ipmp1 to the DOWN state, and marked server-rs in the resource group as DEGRADED. After its sanity check failed, Sun Cluster decided not to switch over. When the network was reconnected, both interfaces regained link, the IPMP group state returned to OK, and service moved from qfe1 to ce1. server-rs then went back to online, the node went back to online, and Sun Cluster considered its handling complete.

It looks as if Sun Cluster left out one step: apparently, failing the network service back from ce1 to qfe1 would be enough to return the IPMP status to normal.

III. Resolution:

1. Try to force qfe1 up:

root@ABCSERVER1 # ifconfig qfe1:1 up

No effect; the adapter is still shown as Offline in the IPMP group.

2. After consulting the description of the in.mpathd daemon in Chapter 1 of SA299, send it SIGHUP so that it rereads its configuration:

root@ABCSERVER1 # pkill -HUP /sbin/in.mpathd

3. After a short wait, the following appears in dmesg:

Nov 29 09:20:30 ABCSERVER1 last message repeated 1 time
Nov 29 09:36:16 ABCSERVER1 in.mpathd[436]: [ID 111610 daemon.error] SIGHUP: restart and reread config file
Nov 29 09:37:10 ABCSERVER1 in.mpathd[4902]: [ID 620804 daemon.error] Successfully failed back to NIC qfe1
Nov 29 09:37:10 ABCSERVER1 in.mpathd[4902]: [ID 299542 daemon.error] NIC repair detected on qfe1 of group ipmp1

Check the IPMP status again:

root@ABCSERVER1 # scstat -i

-- IPMP Groups --

              Node Name    Group   Status   Adapter   Status
              ---------    -----   ------   -------   ------
  IPMP Group: ABCSERVER1   ipmp1   Online   qfe1      Online
  IPMP Group: ABCSERVER1   ipmp1   Online   ce1       Online

  IPMP Group: ABCSERVER2   ipmp1   Online   qfe1      Offline
  IPMP Group: ABCSERVER2   ipmp1   Online   ce1       Online

Repeating the same steps on the standby node brought every adapter back to Online.

Comments and discussion are welcome.
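To spot this condition on other nodes without reading the table by eye, the scstat -i output can be filtered for adapters whose status is Offline. The following is only a rough sketch: the function name list_offline_adapters is made up for illustration, and it assumes the line layout is exactly the one shown in this post ("IPMP Group:" followed by node, group, group status, adapter, adapter status). Other Sun Cluster releases may print the columns differently.

```shell
#!/bin/sh
# list_offline_adapters (hypothetical helper, not part of Sun Cluster):
# reads `scstat -i` style output on stdin and prints "node group adapter"
# for every adapter whose status column is Offline.
# Assumed field layout, taken from the output shown in this post:
#   $1="IPMP" $2="Group:" $3=node $4=group $5=group status
#   $6=adapter $7=adapter status
list_offline_adapters() {
    awk '$1 == "IPMP" && $2 == "Group:" && $7 == "Offline" { print $3, $4, $6 }'
}

# Typical use on a cluster node (not executed here):
#   scstat -i | list_offline_adapters
```

An adapter listed by this filter that still answers ping matches the symptom described in this post, and is a candidate for the in.mpathd SIGHUP step.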