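Using gawk to Track BO User Activity
Want to know how many BO users are online each day, or how many are online in each hour? BO's auditing feature can answer that, but auditing degrades performance. If Apache sits in front as a load balancer and logging is enabled, we can easily analyze its access log with awk and get the data we want. The code below covers my three requirements:
1. Count how many users came online each day.
2. Count how many users were online in each hour.
3. Measure the average cost of the report-save action.
gawk handles all three with ease.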
1. Apache log format
The log uses the common format, which looks like this:
10.1.1.1 - - [dd/Mon/yyyy:HH:MM:SS +zzzz] "GET /OpenDocument/opendoc/openDocument.jsp?iDocID=144758&boRefresh=Y HTTP/1.1" 200 3382
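As a rough illustration of how awk splits such a line on whitespace, the sketch below prints the fields that matter here: the client IP, the pieces of the timestamp, and the request URL. The timestamp in the sample line is a made-up value used only to show the offsets, and the awk program is wrapped in a shell snippet so it can be pasted and run as-is.

```shell
# Sample line in common log format; the timestamp is a made-up value.
line='10.1.1.1 - - [05/Mar/2010:14:23:07 +0800] "GET /OpenDocument/opendoc/openDocument.jsp?iDocID=144758&boRefresh=Y HTTP/1.1" 200 3382'

fields=$(printf '%s\n' "$line" | awk '{
    # $4 looks like "[05/Mar/2010:14:23:07", so fixed offsets pick out each part
    print "ip=" $1
    print "day=" substr($4, 2, 2)
    print "month=" substr($4, 5, 3)
    print "year=" substr($4, 9, 4)
    print "hour=" substr($4, 14, 2)
    print "url=" $7
}')
printf '%s\n' "$fields"
```

Because the time zone is separated from the timestamp by a space, it lands in $5, the request method in $6, and the URL in $7.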
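2. The gawk program
The IP address and the timestamp are enough for the first two requirements, and the JSP page in the request URL tells us which action the user performed, so all three requirements can be met. The code is as follows: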
#!/usr/bin/gawk -f
# Field layout of a common-format log line:
#   $1  client IP
#   $4  timestamp, e.g. "[dd/Mon/yyyy:HH:MM:SS"
#   $7  request URL
# Offsets inside $4:
#   day    substr($4, 2, 2)
#   month  Mons[substr($4, 5, 3)]
#   year   substr($4, 9, 4)
#   time   substr($4, 14, 8), with ":" replaced by " " for mktime()

# Convert the $4 timestamp to epoch seconds (the extra parameters are locals).
function getTime(date,    year, month, day, ltime) {
    year  = substr(date, 9, 4)
    month = Mons[substr(date, 5, 3)]
    day   = substr(date, 2, 2)
    ltime = substr(date, 14, 8)
    gsub(/:/, " ", ltime)
    return mktime(year " " month " " day " " ltime)
}

BEGIN {
    Mons["Jan"] = 1;  Mons["Feb"] = 2;  Mons["Mar"] = 3
    Mons["Apr"] = 4;  Mons["May"] = 5;  Mons["Jun"] = 6
    Mons["Jul"] = 7;  Mons["Aug"] = 8;  Mons["Sep"] = 9
    Mons["Oct"] = 10; Mons["Nov"] = 11; Mons["Dec"] = 12
}

{
    currIp   = $1
    currDate = getTime($4)
    hour     = substr($4, 14, 2)

    # users online per hour: one entry per (hour, IP) pair
    user[hour, currIp] = 1
    # users online today: one entry per IP
    ip[currIp] = 1

    # average report save time: a save starts with checkProcessSave.jsp and
    # ends with reportSaveAlert.html; start times are kept per IP so that
    # overlapping saves from different clients are not mixed up
    if ($7 ~ /cdz_adv\/checkProcessSave\.jsp/) {
        startTime[currIp] = currDate
    }
    if ($7 ~ /reportSaveAlert\.html\?/ && (currIp in startTime)) {
        totalTime += currDate - startTime[currIp]
        times += 1
        delete startTime[currIp]
    }
}

END {
    print length(ip) " users online today."

    # collapse the (hour, IP) pairs into a user count per hour
    for (i in user) {
        split(i, lists, SUBSEP)
        count[lists[1]] += 1
    }
    for (i in count)
        print i ": " count[i] " users"

    # average report save time
    if (times > 0)
        print totalTime / times "s average report save time."
    else
        print "0 reports were saved."
}

With the raw data in hand, you can build your own dashboard for analysis. Being a consultant is not easy: you have to know a bit of everything.
Good.
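To feed a dashboard, the hourly figures could be written out as CSV. The sketch below is one possible way to do that, assuming the same field layout as above. The sample log lines, the array names `seen` and `users`, and the `hour,users` header are illustrative choices, not part of the original script. It uses only POSIX awk features and is wrapped in a shell snippet so it can be run directly.

```shell
# Four made-up requests: two clients in hour 09, one of them again in hour 10.
log='10.1.1.1 - - [05/Mar/2010:09:01:00 +0800] "GET /a.jsp HTTP/1.1" 200 10
10.1.1.2 - - [05/Mar/2010:09:30:00 +0800] "GET /a.jsp HTTP/1.1" 200 10
10.1.1.1 - - [05/Mar/2010:09:45:00 +0800] "GET /b.jsp HTTP/1.1" 200 10
10.1.1.1 - - [05/Mar/2010:10:02:00 +0800] "GET /a.jsp HTTP/1.1" 200 10'

csv=$(printf '%s\n' "$log" | awk '
{
    hour = substr($4, 14, 2)
    seen[hour, $1] = 1              # composite key: each (hour, ip) pair is stored once
}
END {
    for (pair in seen) {
        split(pair, parts, SUBSEP)  # parts[1] = hour, parts[2] = ip
        users[parts[1]]++
    }
    print "hour,users"
    # walk the hours in order so the output does not depend on array ordering
    for (h = 0; h < 24; h++) {
        key = sprintf("%02d", h)
        if (key in users)
            print key "," users[key]
    }
}')
printf '%s\n' "$csv"
```

Iterating from 00 to 23 instead of using `for (h in users)` keeps the rows sorted, since awk does not guarantee any order when looping over an array.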
One of awk's typical applications.