Last edited by mingming_song on 2014-05-26 18:58
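Hi all, I need some help with the well-worn topic of nginx log processing: for a given day, find the top 10 most-requested URLs, along with the maximum, minimum, and average request time for each URL.
The approach is the usual one: extract the request field and the request_time field. If I extract a single field and then sort, deduplicate, and count, the result is correct. The trouble is that when I extract both fields together and then sort and deduplicate on a specified field, the result does not match what I get from the single field.
Here is the log_format: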
log_format main '$remote_addr $host $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" "$http_user_agent" '
'"$gzip_ratio" $http_x_forwarded_proto $http_x_real_ip '
'$request_time $upstream_response_time $pipe';
cat log | awk -F"[./]" '{print $9}' | sort | uniq -c | sort -r -k1 | head -n 20
The command above only counts the URLs.
cat log | awk -F"[? ]" '{print $7,"|",$(NF-2)}' | sort -t"|" -k1 | uniq -c | head -n 20
The URL ranking and the times produced by this command both differ from the previous result. What is going wrong here?
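A likely explanation for the mismatch, offered as a sketch rather than a confirmed diagnosis: `uniq -c` counts identical whole lines, so once the time is printed next to the URL, every distinct URL-and-time pair gets its own count. The second pipeline also has no final sort by count before `head`. In addition, the two pipelines key on different things: with `-F"[./]"`, `$9` is the path fragment up to the first `.` or `/` (for example `EditText_view`), while `-F"[? ]"` with `$7` gives the path without its query string. The sample lines, the temporary file, and the function names `count_by_pair` and `count_by_url` below are made up for illustration, and each log line is assumed to start at `$remote_addr` as in the log_format.

```shell
# Hypothetical sample: three hits on one URL with different request times,
# plus one hit on another URL (same field layout as the log_format above).
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?textId=1 HTTP/1.1" 200 7388 "-" "Mozilla/5.0" "-" http 10.0.0.1 0.149 0.148 .
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?textId=2 HTTP/1.1" 200 22448 "-" "Mozilla/5.0" "-" http 10.0.0.2 0.194 0.194 .
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?cf=false&textId=3 HTTP/1.1" 200 28187 "-" "Mozilla/5.0" "-" http 10.0.0.3 0.531 0.531 .
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /media/movie HTTP/1.1" 200 147899 "-" "Mozilla/5.0" "-" http 10.0.0.4 5.288 0.226 .
EOF

# Same shape as the second pipeline in the post: uniq -c sees "URL | time",
# so each distinct pair is counted separately.
count_by_pair() {
    awk -F'[? ]' '{ print $7, "|", $(NF-2) }' "$1" | sort | uniq -c | sort -rn
}

# Count on the URL alone, then sort numerically by count before taking the top 10.
count_by_url() {
    awk -F'[? ]' '{ print $7 }' "$1" | sort | uniq -c | sort -rn | head -n 10
}

count_by_pair "$log"
count_by_url "$log"
```

On this sample, `count_by_pair` reports four lines with a count of 1 each, while `count_by_url` reports 3 hits for /EditText_view.action and 1 for /media/movie. Keeping the time out of the text that `uniq -c` sees is what makes the ranking comparable across runs.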
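Also, for the timing calculation, the total I get by summing in awk does not seem to match the real time either. I don't know where the problem is; I'm not very familiar with awk.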
P.S. Part of the log is attached below; sensitive information has been masked.
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?textId=857548 HTTP/1.1" 200 7388 "http://www.so.com/s?q=manbang\xE6\x9B\xBC\xE9\x82\xA6\xE6\x97\x97\xE8\x88\xB0\xE5\xBA\x97&ie=utf-8&src=360sou_home" "Mozilla/5.0 (Windows; U; Windows NT 5.2) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.27 Safari/525.13" "3.22" http 42.120.64.7 0.149 0.148 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:01 +0800] "POST /imgcenter/upload HTTP/1.1" 200 99 "-" "Java/1.6.0_25" "-" http 210.14.140.203 0.013 0.013 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?textId=1470909&client=key:LiFangWangForiPhone&mobile=1&sourcetype=weixin HTTP/1.1" 200 22448 "-" "Mozilla/5.0 (Linux; U; Android 4.3; zh-cn; SM-N9005 Build/JSS15J) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30 MicroMessenger/5.2.1.400" "-" http 124.239.208.135 0.194 0.194 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /view.action?postId=15471449&from=m_media HTTP/1.1" 302 0 "http://www.xxxx.com/1626925" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_1_3 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Mobile/10B329 MicroMessenger/5.1" "-" http 14.158.241.211 0.006 0.005 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?cf=false&textId=1396577 HTTP/1.1" 200 28187 "-" "Mozilla/5.0" "-" http 183.60.198.134 0.531 0.531 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /media/movie HTTP/1.1" 200 147899 "-" "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-" http 101.28.10.10 5.288 0.226 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /do_not_delete/noc.gif HTTP/1.1" 200 2358 "-" "ChinaCache" "-" http 61.178.248.109 0.002 0.001 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:02 +0800] "GET /EditPicture_photoView.action?pictureId=5706511 HTTP/1.1" 200 5376 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "3.38" http 66.249.73.216 0.223 0.222 .
- 192.168.xx.xx www.xxxx.com - [15/May/2014:23:59:02 +0800] "GET /timeline_reblog.action?postId=1511970 HTTP/1.1" 302 0 "http://www.xxxx.com/xxxx.action?postId=1511970" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)" "-" http 221.204.7.62 0.006 0.006 .
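For the per-URL timings, one option is to let awk do all the aggregation in a single pass, so the counts and the times always come from the same grouping key. The following is a minimal sketch under stated assumptions, not a verified fix for the commands above: the function name `url_stats`, the temporary file, and the shortened sample lines are hypothetical, and each log line is assumed to start at `$remote_addr`. Per the log_format, `$(NF-2)` is `$request_time` and `$(NF-1)` is `$upstream_response_time`; the two can differ considerably (5.288 versus 0.226 in the /media/movie line), so summing the wrong column is one possible source of a total that looks off.

```shell
# Hypothetical sample in the same layout as the log lines above.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?textId=1 HTTP/1.1" 200 7388 "-" "Mozilla/5.0" "-" http 10.0.0.1 0.149 0.148 .
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?textId=2 HTTP/1.1" 200 22448 "-" "Mozilla/5.0" "-" http 10.0.0.2 0.194 0.194 .
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /EditText_view.action?cf=false&textId=3 HTTP/1.1" 200 28187 "-" "Mozilla/5.0" "-" http 10.0.0.3 0.531 0.531 .
192.168.0.1 www.xxxx.com - [15/May/2014:23:59:01 +0800] "GET /media/movie HTTP/1.1" 200 147899 "-" "Mozilla/5.0 (compatible; MSIE 6.0)" "-" http 10.0.0.4 5.288 0.226 .
EOF

# Prints: count url min max avg, for the 10 most-requested URLs.
url_stats() {
    LC_ALL=C awk -F'[? ]' '
    {
        url = $7              # request path without the query string
        t = $(NF-2) + 0       # $request_time, forced to a number
        if (!(url in n)) { min[url] = t; max[url] = t }
        n[url]++
        sum[url] += t
        if (t < min[url]) min[url] = t
        if (t > max[url]) max[url] = t
    }
    END {
        for (u in n)
            printf "%d %s %.3f %.3f %.3f\n", n[u], u, min[u], max[u], sum[u] / n[u]
    }' "$1" | sort -k1,1nr | head -n 10
}

url_stats "$log"
```

Counting fields from the end of the line for the time (`NF-2`) avoids depending on how many spaces or `?` characters appear inside the referer and user-agent strings, which vary from request to request.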