Last edited by xklyy01 on 2015-05-12 10:00
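Hi everyone, I'm a newcomer working as an HP-UX operations engineer.
My company doesn't have many machines that need routine inspection, so to improve my skills
I used a test machine to write a shell script for the daily health check.
Please take a look at the script and give me some feedback.
For now it can only check a single machine;
it cannot automatically check other machines or send email.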
#!/sbin/sh
#edition 1.0
echo "You are logged in as `whoami` ";
if [ `whoami` != root ]; then
echo "Must be logged on as root to run this script."
exit 1
fi
CHECK_DATE=`date +%F`
echo "Running script at `date`"
CHECK_REPORT_PATH=/tmp/getinfo
ls -d $CHECK_REPORT_PATH
if [ $? -gt 0 ]
then
mkdir $CHECK_REPORT_PATH
fi
CURRENT_DIR=`pwd`
echo "################check start###################"
#hostname check
Hostname=`hostname`
echo "Hostname Check start,please wait.."
echo "*********Hostname Check**********" > $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "Hostname is $Hostname" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "-------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#cluster status check
echo "cluster status check start,please wait.."
echo "********cluster status check*****" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
cmviewcl >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#cpu status check
echo "cpu status check start,please wait.."
echo "********cpu status check********" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
sar -u 4 5 >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#disk status check
echo "disk status check start,please wait.."
echo "********disk status check********" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
sar -d 4 5 >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#swap status check
echo "swap status check start,please wait.."
echo "********swap status check********" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
swapinfo -atm >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#system uptime check
echo "system uptime check start,please wait.."
echo "********system uptime check********" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
/usr/bin/uptime -s >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#FS exceeded 90% usage check
echo "FS exeeded 90& usage check start,please wait.."
echo "***FS exeeded 90& usage check****" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
bdf | awk '{ h=x[split($5,x,"%")-1];if(h>=90) print $0}' >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#cpu check
#if you use this script in a Linux environment, you should use "top -b -n 1"
#if you use this script in an HP-UX environment, you should use "top -d 1"
echo "cpu check start,please wait.."
echo "*********cpu check******" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
top -d 1 >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#fail syslog check
echo "fail syslog check start,please wait.."
echo "*********fail syslog check******" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
FAIL_LOG=`tail -2000 /var/adm/syslog/syslog.log | grep -i fail`
ls -d $FAIL_LOG
if [ $? -ne 0 ]
then
$FAIL_LOG >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
else
echo "There is not fail syslog!" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
fi
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#error syslog check
echo "error syslog check start,please wait.."
echo "*********error syslog check******" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
ERROR_LOG=`tail -2000 /var/adm/syslog/syslog.log | grep -i err`
ls -d $ERROR_LOG
if [ $? -ne 0 ]
then
$ERROR_LOG >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
else
echo "There is not error syslog!" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
fi
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
#event check
echo "event check start,please wait.."
echo "*********event check******" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
EVENT_LOG=`tail -2000 /var/opt/resmon/log/event.log`
$EVENT_LOG >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "----------------------------\n" >> $CHECK_REPORT_PATH/Report_$CHECK_DATE
echo "##################check end###################"
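Every check in the script above follows the same pattern: print a progress line, append a banner, append the command output, append a separator. One possible way to cut the repetition is a small helper function. This is only a sketch: `run_check` and `REPORT` are hypothetical names that do not appear in the original script, and the default report path is an assumption modelled on the one used above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
# REPORT is an assumed variable name; the default mirrors /tmp/getinfo/Report_<date>.
REPORT=${REPORT:-/tmp/getinfo/Report_$(date +%Y-%m-%d)}

# run_check TITLE COMMAND [ARGS...]
# Prints a progress line on the console, then appends a banner, the command's
# output (stdout and stderr) and a separator to the report file.
run_check() {
    title=$1
    shift
    echo "$title start, please wait.."
    {
        echo "******** $title ********"
        "$@" 2>&1
        echo "----------------------------"
        echo
    } >> "$REPORT"
}
```

With such a helper, a check could be written as `run_check "cpu status check" sar -u 4 5` or `run_check "swap status check" swapinfo -atm`. Because stderr is captured too, a message such as "cmviewcl: not found" would end up in the report instead of only on the console.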
Below is the console output from running the script:
# sh hpcheck1.0.sh
You are logged in as root
Running script at Sun May 10 10:14:26 MDT 2015
/tmp/getinfo
################check start###################
Hostname Check start,please wait..
cluster status check start,please wait..
hpcheck1.0.sh[34]: cmviewcl: not found. (This is a standalone test machine with no cluster, so the "not found" here is expected.)
cpu status check start,please wait..
disk status check start,please wait..
swap status check start,please wait..
system uptime check start,please wait..
FS exeeded 90& usage check start,please wait..
cpu check start,please wait..
fail syslog check start,please wait..
.
error syslog check start,please wait..
.
event check start,please wait..
hpcheck1.0.sh[96]: >------------: not found. (I don't understand why this error occurs.)
##################check end###################
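A likely explanation for the last error: a line of the form `$EVENT_LOG >> file` does not print the variable. The shell expands it and tries to run its first word as a command, and the message suggests that the first word of the event log text was `>------------`. The syslog checks have a related problem. When grep finds nothing, `ls -d $FAIL_LOG` runs as plain `ls -d`, which prints `.` (the two lone dots in the output above) and succeeds, so the "There is not ... syslog!" branch is taken. When grep does find something, the matched text would be handed to ls as file names and then executed as a command. A minimal sketch of an alternative follows; `append_matches` is a hypothetical helper name, not something from the original script.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
# append_matches LOGFILE PATTERN REPORT EMPTY_MESSAGE
# Looks at the last 2000 lines of LOGFILE and appends those matching PATTERN
# (case-insensitive) to REPORT, or EMPTY_MESSAGE when nothing matches.
append_matches() {
    logfile=$1
    pattern=$2
    report=$3
    empty_message=$4

    matches=$(tail -n 2000 "$logfile" 2>/dev/null | grep -i -e "$pattern")
    if [ -n "$matches" ]; then
        # printf writes the text out; it is never executed as a command.
        printf '%s\n' "$matches" >> "$report"
    else
        printf '%s\n' "$empty_message" >> "$report"
    fi
}
```

A call might look like `append_matches /var/adm/syslog/syslog.log fail "$CHECK_REPORT_PATH/Report_$CHECK_DATE" "There is no fail syslog!"`. For the event log, where no filtering is needed, redirecting `tail -2000 /var/opt/resmon/log/event.log` straight into the report file would avoid the intermediate variable entirely.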
Below is the generated report:
# cat Report_2015-05-10
*********Hostname Check**********
Hostname is rp3440
-------------------
********cluster status check*****
----------------------------
********cpu status check********
HP-UX rp3440 B.11.31 U 9000/800 05/10/15
10:14:26 %usr %sys %wio %idle
10:14:30 0 0 0 100
10:14:34 0 0 0 100
10:14:38 0 0 0 100
10:14:42 0 0 2 97
10:14:46 0 0 0 100
Average 0 0 0 99
----------------------------
********disk status check********
HP-UX rp3440 B.11.31 U 9000/800 05/10/15
10:14:46 device %busy avque r+w/s blks/s avwait avserv
10:14:50 disk3 0.50 0.50 0 4 0.00 11.04
10:14:54 disk3 0.50 0.50 1 12 0.00 7.29
10:14:58 disk3 1.25 0.50 2 40 0.00 13.82
10:15:02 disk3 4.25 0.50 8 138 0.00 7.31
disk4 0.25 0.50 0 0 0.00 9.69
10:15:06 disk3 0.50 0.50 1 16 0.00 7.56
Average disk3 1.40 0.50 3 42 0.00 8.68
Average disk4 0.05 0.50 0 0 0.00 9.68
----------------------------
********swap status check********
Mb Mb Mb PCT START/ Mb
TYPE AVAIL USED FREE USED LIMIT RESERVE PRI NAME
dev 8192 0 8192 0% 0 - 1 /dev/vg00/lvol2
reserve - 217 -217
memory 5842 840 5002 14%
total 14034 1057 12977 8% - 0 -
----------------------------
********system uptime check********
10:15am up 1 day, 23:54, 1 user, load average: 0.00, 0.00, 0.00
----------------------------
***FS exeeded 90& usage check****
DevFS 3 3 0 100% /dev/deviceFileSystem
----------------------------
*********cpu check******
System: rp3440 Sun May 10 10:15:07 2015
Load averages: 0.00, 0.00, 0.00
140 processes: 105 sleeping, 34 running, 1 zombie
Cpu states:
CPU LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
0 0.00 0.0% 0.0% 0.0% 100.0% 0.0% 0.0% 0.0% 0.0%
1 0.01 0.0% 0.0% 1.0% 99.0% 0.0% 0.0% 0.0% 0.0%
2 0.00 0.0% 0.0% 0.0% 100.0% 0.0% 0.0% 0.0% 0.0%
3 0.00 0.0% 0.0% 1.0% 99.0% 0.0% 0.0% 0.0% 0.0%
--- ---- ----- ----- ----- ----- ----- ----- ----- -----
avg 0.00 0.0% 0.0% 1.0% 99.0% 0.0% 0.0% 0.0% 0.0%
System Page Size: 4Kbytes
Memory: 150072K (61568K) real, 287700K (132724K) virtual, 4950716K free Page# 1/16
CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
2 ? 58 root 191 20 1044K 1044K run 2:54 0.68 0.68 vxfsd
1 ? 1752 root 168 20 4792K 824K sleep 0:55 0.38 0.38 utild
2 ? 1410 root 152 20 79696K 7904K run 0:17 0.20 0.20 cimprovagt
1 ? 1407 root 152 20 40684K 9032K run 3:51 0.15 0.15 cimserver
2 ? 60 root 191 20 180K 180K run 0:29 0.14 0.13 pm_schedcpu
1 ? 1401 root 152 20 12892K 2392K run 0:00 0.12 0.12 rpcd
----------------------------
*********fail syslog check******
There is not fail syslog!
----------------------------
*********error syslog check******
There is not error syslog!
----------------------------
*********event check******
----------------------------
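In the report above, the file system check only lists DevFS, which shows 100% with 3 of 3 blocks in use, and the awk filter relies on the percentage being in field 5. bdf is known to wrap an entry onto two lines when the device name is long, in which case field 5 would no longer be the percentage. The filter below is a rough sketch of one way to handle both points. `fs_over_threshold` is a hypothetical name, and skipping mount points under /dev/ is an assumption based on the DevFS line in this report, not a general rule.

```shell
#!/bin/sh
# Hypothetical filter, not part of the original script.
# fs_over_threshold [PERCENT]
# Reads bdf-style output on stdin and prints "<usage>% <mount point>" for each
# file system at or above PERCENT (default 90).
fs_over_threshold() {
    awk -v limit="${1:-90}" '
        NR == 1 { next }    # skip the header line
        NF < 2  { next }    # blank line, or a long device name wrapped onto its own line
        {
            # The last two fields are the usage percentage and the mount point,
            # whether or not the device name was wrapped.
            pct = $(NF - 1)
            sub(/%$/, "", pct)
            # Assumption: mount points under /dev/ are pseudo file systems.
            if (pct + 0 >= limit + 0 && $NF !~ /^\/dev\//)
                print pct "% " $NF
        }'
}
```

In the script this could be used as `bdf | fs_over_threshold 90 >> $CHECK_REPORT_PATH/Report_$CHECK_DATE`.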