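IP Network Multipathing (IPMP) failover across multiple network ports: a worked hot-standby example
My working notes from configuring IPMP
SunFire V480 Server
It has two built-in 10/100/1000M auto-negotiating network ports, with device names ce0 and ce1;
A four-port X1034A NIC was added in a PCI slot, with device names qfe0, qfe1, qfe2 and qfe3;
ce0 and qfe0 are used for IPMP (IP MultiPath): ce0 is the primary port, with IP 220.192.216.x, and serves external traffic; qfe0 is configured as the standby port;
ce0 and qfe0 form a NIC group named ipmp_group;
IPs 192.168.0.1 and 192.168.0.2 are test addresses, each used to check the state of the NIC it is configured on;
/sbin/in.mpathd is the system daemon that detects NIC failures and repairs and performs failover and failback between the NICs.
Create the following startup script under /etc/rc2.d and reboot the machine: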
root@AAA1# more /etc/rc2.d/S99ipmp
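# Plumb the standby interface (ce0 is assumed to be configured at boot already)
ifconfig qfe0 plumb
# Put both interfaces into the same IPMP group
ifconfig ce0 group ipmp_group
ifconfig qfe0 group ipmp_group
# Test addresses: -failover pins each address to its own NIC
ifconfig ce0 addif 192.168.0.1 -failover deprecated up
# standby keeps qfe0 idle until another interface in the group fails
ifconfig qfe0 192.168.0.2 -failover deprecated standby up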
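After the reboot, one quick check is to confirm that both interfaces report the group name in `ifconfig -a`. The following is a minimal sketch, assuming the output format shown later in this post (a `groupname` line listed under each physical interface). `group_members` is a hypothetical helper, not a Solaris command, and on Solaris nawk or /usr/xpg4/bin/awk may be a safer choice than the legacy /usr/bin/awk.

```shell
# group_members GROUP: read `ifconfig -a` output on stdin and print the
# physical interfaces that list "groupname GROUP".
group_members() {
    awk -v group="$1" '
        # Interface header, e.g. "ce0: flags=..." or "ce0:1: flags=..."
        $2 ~ /^flags=/ { split($1, parts, ":"); iface = parts[1] }
        # Group membership line printed under the interface
        $1 == "groupname" && $2 == group { print iface }
    '
}

# Typical use on the server (not run here):
#   ifconfig -a | group_members ipmp_group
```

If the script worked, both ce0 and qfe0 should be listed.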
Unplugging the network cable from the primary port ce0 produces the following:
root@AAA1#
Dec 22 15:44:24 AAA1 genunix: WARNING: ce0: fault detected external to device; service degraded
Dec 22 15:44:24 AAA1 genunix: WARNING: ce0: xcvr addr:0x01 - link down
Dec 22 15:44:32 AAA1 in.mpathd[339]: NIC failure detected on ce0 of group ipmp_group
Dec 22 15:44:32 AAA1 in.mpathd[339]: Successfully failed over from NIC ce0 to NIC qfe0
At this point the external service IP has successfully moved from ce0 to qfe0, which can be checked with the ifconfig command:
root@AAA1# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ce0: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 2
        inet 0.0.0.0 netmask 0
        groupname ipmp_group
        ether 0:3:ba:29:87:4b
ce0:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
        inet 192.168.0.1 netmask ffffff00 broadcast 192.168.0.255
qfe0: flags=29040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,STANDBY> mtu 1500 index 3
        inet 192.168.0.2 netmask ffffff00 broadcast 192.168.0.255
        groupname ipmp_group
        ether 0:3:ba:29:87:4b
qfe0:1: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 3
        inet 220.192.216.x netmask fffffff0 broadcast 220.192.216.111
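Rather than reading the flags by eye, the failed interfaces can be picked out of the same output. This is a minimal sketch that only looks for the FAILED flag on header lines. `failed_ifaces` is a hypothetical helper name, and the multi-character field separator may require nawk or /usr/xpg4/bin/awk on Solaris.

```shell
# failed_ifaces: read `ifconfig -a` output on stdin and print every
# interface (physical or logical) whose flag list contains FAILED.
failed_ifaces() {
    # Splitting on ": flags=" leaves the interface name in $1 and the
    # flag list in $2; address lines have no such separator (NF == 1).
    awk -F': flags=' 'NF > 1 && $2 ~ /[<,]FAILED[,>]/ { print $1 }'
}

# Typical use on the server (not run here):
#   ifconfig -a | failed_ifaces
```

For the output above this would list ce0 and ce0:1, while qfe0 and qfe0:1 are not reported.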
Next, test the recovery: plugging the ce0 cable back in produces the following:
root@AAA1#
Dec 22 15:46:18 AAA1 genunix: NOTICE: ce0: fault cleared external to device; service available
Dec 22 15:46:18 AAA1 genunix: NOTICE: ce0: xcvr addr:0x01 - link up 10 Mbps half duplex
Dec 22 15:46:32 AAA1 in.mpathd[339]: NIC repair detected on ce0 of group ipmp_group
Dec 22 15:46:32 AAA1 in.mpathd[339]: Successfully failed back to NIC ce0
Checking with ifconfig again shows that the external service IP has successfully moved back from qfe0 to ce0:
root@AAA1# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 220.192.216.x netmask fffffff0 broadcast 220.192.216.111
        groupname ipmp_group
        ether 0:3:ba:29:87:4b
ce0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
        inet 192.168.0.1 netmask ffffff00 broadcast 192.168.0.255
qfe0: flags=69040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 1500 index 3
        inet 192.168.0.2 netmask ffffff00 broadcast 192.168.0.255
        groupname ipmp_group
        ether 0:3:ba:29:87:4b
root@AAA1#
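The timestamps in the two log excerpts give a rough measure of how long in.mpathd took to react after the link state changed. The following is a minimal sketch, assuming both timestamps fall on the same day. `secs_between` is a hypothetical helper, not part of Solaris.

```shell
# secs_between START END: print the number of seconds from START to END,
# both given as HH:MM:SS on the same day.
secs_between() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, s, ":"); split(b, e, ":")
        print (e[1] * 3600 + e[2] * 60 + e[3]) - (s[1] * 3600 + s[2] * 60 + s[3])
    }'
}

# Link down at 15:44:24, failure reported by in.mpathd at 15:44:32:
#   secs_between 15:44:24 15:44:32    # prints 8
# Link up at 15:46:18, repair reported by in.mpathd at 15:46:32:
#   secs_between 15:46:18 15:46:32    # prints 14
```

In this test in.mpathd reported the failure 8 seconds after the link went down and the repair 14 seconds after the link came back.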
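In the end, however, I found a problem: if the local network segment has no device that supports multicast to act as a probe target for in.mpathd, every interface in the NIC group is marked as failed.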
Dec 23 12:27:36 AAA1 in.mpathd[336]: All Interfaces in group ipmp_group have failed
ipmp_group only recovers once a device that responds to IP multicast is present on the segment being tested:
Dec 23 12:32:14 AAA1 in.mpathd[336]: At least 1 interface (qfe0) of group ipmp_group has repaired
Dec 23 12:32:14 AAA1 in.mpathd[336]: NIC repair detected on qfe0 of group ipmp_group
Dec 23 12:32:15 AAA1 in.mpathd[336]: NIC repair detected on ce0 of group ipmp_group
Dec 23 12:32:15 AAA1 in.mpathd[336]: Successfully failed back to NIC ce0
Dec 23 12:32:15 AAA1 in.mpathd[336]: Improved failure detection time 16660 ms
Dec 23 12:32:15 AAA1 in.mpathd[336]: Improved failure detection time 10000 ms
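The 10000 ms in the last log line appears to correspond to the FAILURE_DETECTION_TIME setting in /etc/default/mpathd, the configuration file read by in.mpathd, whose default value is 10000. The following is a minimal sketch for reading a setting from that file, assuming simple NAME=value lines. `mpathd_setting` is a hypothetical helper name.

```shell
# mpathd_setting NAME [FILE]: print the value of NAME from an mpathd
# configuration file made of NAME=value lines (comment lines are skipped
# because they do not start with NAME=).
mpathd_setting() (
    file=${2:-/etc/default/mpathd}
    sed -n "s/^$1=//p" "$file"
)

# Typical use on the server (not run here):
#   mpathd_setting FAILURE_DETECTION_TIME
```

The function body runs in a subshell, so the `file` variable does not leak into the calling shell.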