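Hi everyone,
My environment:
One Sun Solaris host (SunOS XXX 5.10 Generic_144488-04 sun4u sparc SUNW,SPARC-Enterprise),
one HP Itanium 2660 midrange server running Linux, and one EMC NAS storage array.
Both servers connect to the same space on the EMC over NFS, each mounting it through a different IP address.
The hosts and the storage are connected through a Hua3 gigabit switch.
The Solaris host runs an institutional repository (OLAP) system that manages large files, from a few hundred MB up to 2 GB each, and a large number of clients download files from it.
Problem: the Linux host connects to the EMC without trouble and at normal speed, but the Solaris host's connection to the EMC is extremely slow, to the point that it can barely connect. The symptoms are:
1. When I run df -h on Solaris, the storage volumes that should be mounted take a very long time to appear. Under normal conditions the output looks like this: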
root@digital # df -h
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d0 20G 7.5G 12G 39% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 108G 1.9M 108G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
sharefs 0K 0K 0K 0% /etc/dfs/sharetab
/dev/md/dsk/d20 9.9G 3.9G 5.8G 41% /usr
fd 0K 0K 0K 0% /dev/fd
swap 108G 896K 108G 1% /tmp
swap 108G 88K 108G 1% /var/run
/dev/md/dsk/d40 42G 1.1G 41G 3% /export/home
/dev/dsk/c5t6006016008D028000D9C3D7B9120E111d0s6
1.6T 337M 1.6T 1% /ebook2
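After running df, the local devices show up right away, but the part marked in red (the EMC devices) takes a very, very long time to appear; the df output just hangs there and does not move.
2. Pinging the EMC storage from the Solaris host gives the following: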
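The hang described above could be narrowed down to individual mount points by timing a single file system query per mount instead of waiting on the whole df listing. The following is a minimal sketch, assuming Python 3 is available on the host; the function name `time_statvfs`, the 5-second timeout, and the list of paths are illustrative choices, not part of the original setup.

```python
import os
import threading
import time


def time_statvfs(path, timeout=5.0):
    """Return the seconds os.statvfs(path) took, or None if it did not
    finish within `timeout` seconds (for example, a hung NFS mount)."""
    result = {}

    def worker():
        start = time.monotonic()
        try:
            os.statvfs(path)
        except OSError as exc:
            # A missing or inaccessible path still counts as "answered".
            result["error"] = exc
        result["elapsed"] = time.monotonic() - start

    # A daemon thread lets the script exit even if statvfs never returns.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    return result.get("elapsed")


# Hypothetical list of mount points; replace with the real ones from df.
for mount_point in ["/", "/ebook2"]:
    elapsed = time_statvfs(mount_point)
    if elapsed is None:
        print(f"{mount_point}: no answer within timeout")
    else:
        print(f"{mount_point}: answered in {elapsed:.3f} s")
```

Running this at intervals, while the system is healthy and again once it has slowed down, would show whether every EMC mount stalls or only the ones the application uses heavily.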
64 bytes from 192.168.1.90: icmp_seq=0. time=0.251 ms
64 bytes from 192.168.1.90: icmp_seq=1. time=0.240 ms
64 bytes from 192.168.1.90: icmp_seq=2. time=0.241 ms
64 bytes from 192.168.1.90: icmp_seq=3. time=0.228 ms
64 bytes from 192.168.1.90: icmp_seq=4. time=0.214 ms
64 bytes from 192.168.1.90: icmp_seq=5. time=0.202 ms
64 bytes from 192.168.1.90: icmp_seq=6. time=0.203 ms
64 bytes from 192.168.1.90: icmp_seq=7. time=0.186 ms
64 bytes from 192.168.1.90: icmp_seq=8. time=0.173 ms
64 bytes from 192.168.1.90: icmp_seq=9. time=0.281 ms
64 bytes from 192.168.1.90: icmp_seq=10. time=0.268 ms
64 bytes from 192.168.1.90: icmp_seq=11. time=0.247 ms
64 bytes from 192.168.1.90: icmp_seq=12. time=0.267 ms
64 bytes from 192.168.1.90: icmp_seq=13. time=0.240 ms
64 bytes from 192.168.1.90: icmp_seq=14. time=0.257 ms
3. I checked the host load with mpstat, vmstat, iostat and so on. Everything looks normal and not high at all.
4. Output of netstat -s:
TCP tcpRtoAlgorithm = 4 tcpRtoMin = 400
tcpRtoMax = 60000 tcpMaxConn = -1
tcpActiveOpens =625995 tcpPassiveOpens =653332
tcpAttemptFails =240240 tcpEstabResets =100845
tcpCurrEstab = 141 tcpOutSegs =176425096
tcpOutDataSegs =221848300 tcpOutDataBytes =1164290171
tcpRetransSegs =8454951 tcpRetransBytes =3622524478
tcpOutAck =46609139 tcpOutAckDelayed =1921528
tcpOutUrg = 126 tcpOutWinUpdate = 19405
tcpOutWinProbe = 5762 tcpOutControl =2498101
tcpOutRsts =289327 tcpOutFastRetrans = 112
tcpInSegs =187196237
tcpInAckSegs =108991448 tcpInAckBytes =172433623
tcpInDupAck =9538040 tcpInAckUnsent = 13
tcpInInorderSegs =253961330 tcpInInorderBytes =2151919823
tcpInUnorderSegs =219678 tcpInUnorderBytes =1633190326
tcpInDupSegs = 44673 tcpInDupBytes =4308560
tcpInPartDupSegs = 16267 tcpInPartDupBytes =9297984
tcpInPastWinSegs = 89 tcpInPastWinBytes =2044903847
tcpInWinProbe = 11 tcpInWinUpdate = 3902
tcpInClosed = 2630 tcpRttNoUpdate =77906834
tcpRttUpdate =30544671 tcpTimRetrans =2771191
tcpTimRetransDrop = 2133 tcpTimKeepalive = 18977
tcpTimKeepaliveProbe= 6773 tcpTimKeepaliveDrop = 2
tcpListenDrop = 0 tcpListenDropQ0 = 0
tcpHalfOpenDrop = 0 tcpOutSackRetrans =2817688
IPv4 ipForwarding = 2 ipDefaultTTL = 255
ipInReceives =361884451 ipInHdrErrors = 7
ipInAddrErrors = 0 ipInCksumErrs = 0
ipForwDatagrams = 0 ipForwProhibits = 12960
ipInUnknownProtos = 0 ipInDiscards = 41
ipInDelivers =365653531 ipOutRequests =272316826
ipOutDiscards = 1479 ipOutNoRoutes = 0
ipReasmTimeout = 60 ipReasmReqds = 0
ipReasmOKs = 0 ipReasmFails = 0
ipReasmDuplicates = 0 ipReasmPartDups = 0
ipFragOKs = 0 ipFragFails = 0
ipFragCreates = 0 ipRoutingDiscards = 0
tcpInErrs = 1 udpNoPorts =5161846
udpInCksumErrs = 0 udpInOverflows = 0
rawipInOverflows = 0 ipsecInSucceeded = 426
ipsecInFailed = 0 ipInIPv6 = 0
ipOutIPv6 = 0 ipOutSwitchIPv6 = 0
The ratio tcpRetransBytes / tcpOutDataBytes looks absurdly high.
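To put numbers on the ratio mentioned above, here is a minimal Python sketch that parses `name = value` pairs out of netstat -s text and computes the retransmission ratio by segments and by bytes. The helper names `parse_counters` and `retrans_ratios` are illustrative. One caveat, stated as an assumption: a byte ratio above 1 is not plausible for cumulative counters, so the byte counters may have wrapped since boot, and the segment ratio is probably the more trustworthy figure.

```python
import re


def parse_counters(text):
    """Return {name: int} for every 'name = value' pair in netstat -s text."""
    pairs = re.findall(r"(\w+)\s*=\s*(-?\d+)", text)
    return {name: int(value) for name, value in pairs}


def retrans_ratios(counters):
    """Return (segment ratio, byte ratio) of retransmitted to sent data."""
    seg_ratio = counters["tcpRetransSegs"] / counters["tcpOutDataSegs"]
    byte_ratio = counters["tcpRetransBytes"] / counters["tcpOutDataBytes"]
    return seg_ratio, byte_ratio


# The two relevant lines copied from the netstat -s output in this post.
SAMPLE = """
tcpOutDataSegs =221848300 tcpOutDataBytes =1164290171
tcpRetransSegs =8454951 tcpRetransBytes =3622524478
"""

counters = parse_counters(SAMPLE)
seg_ratio, byte_ratio = retrans_ratios(counters)
print(f"retransmitted segments: {seg_ratio:.2%}")  # 3.81%
print(f"retransmitted bytes / sent bytes: {byte_ratio:.2f}")  # 3.11
```

These counters are cumulative since boot, so the same calculation on the difference between two snapshots taken while the problem is occurring would say more about the current state than the totals do.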
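5. On Solaris, some logical volumes on the EMC mount normally, but the ones the application uses heavily do not; they are very slow.
6. After rebooting Solaris, storage and network are both back to normal, but after a while the problem above comes back.
I have already consulted EMC technical support. They insist the storage is fine and think the switch and cables are at fault. I am not satisfied with that answer: I have replaced the cables and even tried a direct connection, and it still does not work.
I also consulted the vendor support for Sun (now Oracle, blah blah blah), and they insist that neither the operating system nor the NIC has any problem.
My software vendor is also quite high-handed and provides almost no help. Sigh, when your skills fall short you get pushed around.
What I would like to ask is: if I want to tune this system, where should I start? Is it an application problem, with some resource being exhausted?
Or is it a storage or NFS problem? Any pointers are welcome. My knowledge of networking is limited, so please share whatever advice you have.
Thanks in advance.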