Chinaunix

Subject: Using AWK to find the top 10 Apache request URLs by longest response time

Author: camwei    Posted: 2012-02-03 20:39
Subject: Using AWK to find the top 10 Apache request URLs by longest response time

## Sample access-log lines:
1.1.1.1"[10/Dec/2011:23:30:00 +0800]"POST /t.do?requestid=apgrade HTTP/1.1"200"710001"A0001"616510220"B00001"
1.1.1.2"[10/Dec/2011:23:30:00 +0800]"GET http://*.*.*/421/10421640/l57411079840.png HTTP/1.1"200"710001"A0002"60916773"B00002"   
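For reference, the `-F\"` split in the command below works because this log delimits its fields with double quotes; on the sample lines, $3 is the request and $7 the response time. A quick sketch against the first sample line:

```shell
# Split the sample log line on double quotes, as `awk -F\"` does below.
# With this layout, $3 is the request line and $7 the response time.
line='1.1.1.1"[10/Dec/2011:23:30:00 +0800]"POST /t.do?requestid=apgrade HTTP/1.1"200"710001"A0001"616510220"B00001"'
printf '%s\n' "$line" | awk -F'"' '{print $7, $3}'
# prints: 616510220 POST /t.do?requestid=apgrade HTTP/1.1
```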

The command below produces the ranking, but it does not remove duplicate URLs. How can I deduplicate the URLs?

/log01>more  access_log_001|grep  " +0800]"|awk -F\" '{print$7,$3}'|sort -nr|head -n 10
61651022 POST /t.do?requestid=apgrade HTTP/1.1
61612162 POST /t.do?requestid=apgrade HTTP/1.1
61268940 POST /t.do?requestid=apgrade HTTP/1.1
61076095 POST /t.do?requestid=apgrade HTTP/1.1
61065643 POST /t.do?requestid=apgrade HTTP/1.1
60993259 POST /t.do?requestid=apgrade HTTP/1.1
60961846 POST /t.do?requestid=apgrade HTTP/1.1
60949501 POST /t.do?requestid=apgrade HTTP/1.1
60932514 POST /t.do?requestid=apgrade HTTP/1.1
60916773 GET http://*.*.*/421/10421640/l57411079840.png HTTP/1.1
Author: yanu    Posted: 2012-02-04 17:32
xindy@NAS ~/tmp $ cat file
61651022 POST /t.do?requestid=apgrade HTTP/1.1
61612162 POST /t.do?requestid=apgrade HTTP/1.1
61268940 POST /t.do?requestid=apgrade HTTP/1.1
61076095 POST /t.do?requestid=apgrade HTTP/1.1
61065643 POST /t.do?requestid=apgrade HTTP/1.1
60993259 POST /t.do?requestid=apgrade HTTP/1.1
60961846 POST /t.do?requestid=apgrade HTTP/1.1
60949501 POST /t.do?requestid=apgrade HTTP/1.1
60932514 POST /t.do?requestid=apgrade HTTP/1.1
60916773 GET http://*.*.*/421/10421640/l57411079840.png HTTP/1.1

xindy@NAS ~/tmp $ cat file  | sort -n -k 3,1 | awk '{a[$3]=$0}END{for(i in a){print a[i]}}'
61651022 POST /t.do?requestid=apgrade HTTP/1.1
60916773 GET http://*.*.*/421/10421640/l57411079840.png HTTP/1.1
xindy@NAS ~/tmp $


I didn't read it very carefully, but is this roughly what you wanted?
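A note on why this works and why the output order is not guaranteed: after an ascending numeric sort, `a[$3]=$0` keeps overwriting the entry for each URL, so the line with the largest time survives; but awk's `for (i in a)` returns keys in unspecified order, so a final re-sort restores the time ordering. A minimal sketch (lines shortened from the thread; `http://x/a.png` is a stand-in for the masked URL):

```shell
# Keep one line per URL: after an ascending sort, a[$3]=$0 overwrites,
# so the largest-time line per URL survives; re-sort to restore order,
# since awk's for-in loop yields array keys in unspecified order.
printf '%s\n' \
  '61651022 POST /t.do?requestid=apgrade HTTP/1.1' \
  '60916773 GET http://x/a.png HTTP/1.1' \
  '61268940 POST /t.do?requestid=apgrade HTTP/1.1' |
  sort -n |
  awk '{a[$3] = $0} END {for (i in a) print a[i]}' |
  sort -nr
```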

Author: camwei    Posted: 2012-02-05 18:30
This post was last edited by camwei at 2012-02-05 18:36

Reply to #2 yanu

You're redirecting to a file first and then processing that, but your method still doesn't quite work. See below:
idt@suse10sp2:~/log> more 003.txt
60006943 POST /t.do?requestid=apgrade HTTP/1.1
60006037 POST /t.do?requestid=apgrade HTTP/1.1
34907266 GET /t.do?requestid=did=2 HTTP/1.1
33763806 GET /t.do?requestid=did=2 HTTP/1.1

31024120 GET /t.do?requestid=dontent%3ame HTTP/1.0
30586241 GET http://*.*.*m/t.do?requestid=dot&xType=nt%3e HTTP/1.1
30473203 GET http://*.*.*/t.do?requestid=dot&xType=nt:ge HTTP/1.1
30412504 GET http://*.*.*/t.do?requestid=dos&id=Cme HTTP/1.1
30397419 GET http://*.*.*/t.do?requestid=dos&id=Cre HTTP/1.1
30300715 GET http://*.*.*/t.do?requestid=dot&xType=nt1HT/1.1
30300705 GET http://*.*.*/t.do?requestid=dot&xType=nt2HT/1.1
30300615 GET http://*.*.*/t.do?requestid=dot&xType=nt3HT/1.1
30300605 GET http://*.*.*/t.do?requestid=dot&xType=nt4HT/1.1


With your method I do get deduplicated results, but the order is scrambled. I want them sorted by response time (the first column), and it also doesn't pick out a top 10. Here is the run:
idt@suse10sp2:~/log> cat 003.txt | sort -n -k 3,1 | awk '{a[$3]=$0}END{for(i in a){print a[i]}}'
30300615 GET http://*.*.*/t.do?requestid=dot&xType=nt3HT/1.1
34907266 GET /t.do?requestid=did=2 HTTP/1.1
30586241 GET http://*.*.*m/t.do?requestid=dot&xType=nt%3e HTTP/1.1
30300705 GET http://*.*.*/t.do?requestid=dot&xType=nt2HT/1.1
30300715 GET http://*.*.*/t.do?requestid=dot&xType=nt1HT/1.1
30473203 GET http://*.*.*/t.do?requestid=dot&xType=nt:ge HTTP/1.1
30412504 GET http://*.*.*/t.do?requestid=dos&id=Cme HTTP/1.1
60006943 POST /t.do?requestid=apgrade HTTP/1.1
31024120 GET /t.do?requestid=dontent%3ame HTTP/1.0
30397419 GET http://*.*.*/t.do?requestid=dos&id=Cre HTTP/1.1
30300605 GET http://*.*.*/t.do?requestid=dot&xType=nt4HT/1.1

What I actually want is both: deduplicated and sorted by response time, top 10:
60006943 POST /t.do?requestid=apgrade HTTP/1.1
34907266 GET /t.do?requestid=did=2 HTTP/1.1
31024120 GET /t.do?requestid=dontent%3ame HTTP/1.0
30586241 GET http://*.*.*m/t.do?requestid=dot&xType=nt%3e HTTP/1.1
30473203 GET http://*.*.*/t.do?requestid=dot&xType=nt:ge HTTP/1.1
30412504 GET http://*.*.*/t.do?requestid=dos&id=Cme HTTP/1.1
30397419 GET http://*.*.*/t.do?requestid=dos&id=Cre HTTP/1.1
30300715 GET http://*.*.*/t.do?requestid=dot&xType=nt1HT/1.1
30300705 GET http://*.*.*/t.do?requestid=dot&xType=nt2HT/1.1
30300615 GET http://*.*.*/t.do?requestid=dot&xType=nt3HT/1.1
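One way to get exactly this output is to sort descending first, then keep only the first line seen for each URL, then cut to ten. A sketch against a small sample built from the lines above (the awk `!seen[$3]++` idiom prints a line only the first time its third column appears):

```shell
# Build a small sample like 003.txt, then: numeric descending sort,
# first-seen-per-URL filter (dedup on the $3 URL column), top 10.
cat > 003.sample <<'EOF'
60006943 POST /t.do?requestid=apgrade HTTP/1.1
60006037 POST /t.do?requestid=apgrade HTTP/1.1
34907266 GET /t.do?requestid=did=2 HTTP/1.1
33763806 GET /t.do?requestid=did=2 HTTP/1.1
31024120 GET /t.do?requestid=dontent%3ame HTTP/1.0
EOF
sort -nr 003.sample | awk '!seen[$3]++' | head -n 10
```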
   
Author: chenyx    Posted: 2012-02-05 18:34
sort -r -k 1 003.txt  | head
Author: camwei    Posted: 2012-02-05 18:43
Reply to #4 chenyx

That brings us back to the original problem: duplicates aren't removed.
idt@suse10sp2:~/log> sort -r -k 1 003.txt  | head -10
60006943 POST /t.do?requestid=apgrade HTTP/1.1
60006037 POST /t.do?requestid=apgrade HTTP/1.1
34907266 GET /t.do?requestid=did=2 HTTP/1.1
33763806 GET /t.do?requestid=did=2 HTTP/1.1
31024120 GET /t.do?requestid=dontent%3ame HTTP/1.0
30586241 GET http://*.*.*m/t.do?requestid=dot&xType=nt%3e HTTP/1.1
30473203 GET http://*.*.*/t.do?requestid=dot&xType=nt:ge HTTP/1.1
30412504 GET http://*.*.*/t.do?requestid=dos&id=Cme HTTP/1.1
30397419 GET http://*.*.*/t.do?requestid=dos&id=Cre HTTP/1.1
30300715 GET http://*.*.*/t.do?requestid=dot&xType=nt1HT/1.1


   
Author: chenyx    Posted: 2012-02-05 18:51
Post #2 already deduplicates for you; just combine your command with the approach there.
Author: yanu    Posted: 2012-02-05 19:42
awk 'a[$3]{next}{a[$3]=1;print}'

This removes every line whose third column has already been seen.
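Combining this filter with the extraction from the first post, the whole job fits in one pipeline. A sketch against a self-contained sample log (field values abbreviated; `/img/a.png` is a stand-in URL):

```shell
# End-to-end sketch: extract (time, request) from quote-delimited log
# lines, sort by time descending, keep the first line per URL, top 10.
cat > access.sample <<'EOF'
1.1.1.1"[10/Dec/2011:23:30:00 +0800]"POST /t.do?requestid=apgrade HTTP/1.1"200"710001"A0001"61651022"B00001"
1.1.1.1"[10/Dec/2011:23:30:01 +0800]"POST /t.do?requestid=apgrade HTTP/1.1"200"710001"A0001"61612162"B00001"
1.1.1.2"[10/Dec/2011:23:30:00 +0800]"GET /img/a.png HTTP/1.1"200"710001"A0002"60916773"B00002"
EOF
grep ' +0800]' access.sample |
  awk -F'"' '{print $7, $3}' |          # time, then "METHOD URL PROTO"
  sort -nr |                            # largest response time first
  awk 'a[$3]{next} {a[$3]=1; print}' |  # the filter above: first line per URL
  head -n 10
```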
Author: camwei    Posted: 2012-02-05 21:10
Looks like I still need two redirections and three separate steps to handle it.
Author: janlzy20    Posted: 2012-02-06 18:47
Use uniq to deduplicate.
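A caveat: `uniq` only removes adjacent duplicates, so the lines have to be grouped by URL first (largest time first within each group) and then re-sorted by time. A sketch with `uniq -f1`, which skips the first field when comparing:

```shell
# Group by URL (key from field 2 to end), breaking ties so the largest
# time comes first; uniq -f1 then keeps that line per URL; finally
# re-sort numerically descending to restore the time ordering.
cat > 003b.sample <<'EOF'
60006943 POST /t.do?requestid=apgrade HTTP/1.1
34907266 GET /t.do?requestid=did=2 HTTP/1.1
60006037 POST /t.do?requestid=apgrade HTTP/1.1
EOF
sort -k2 -k1,1nr 003b.sample | uniq -f1 | sort -nr | head -n 10
```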




Welcome to Chinaunix (http://bbs.chinaunix.net/) Powered by Discuz! X3.2