Last edited by fortunegzhou on 2013-08-28 14:34
The log format is as follows:
2013/08/16 01:10:11.111 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:11:22.222 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:12:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:13:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:14:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:15:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:16:11.111 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:17:22.222 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:18:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:19:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:20:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:21:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:22:11.111 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:25:22.222 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:30:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:32:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:36:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:40:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
Question: how can I use awk to find the log entries whose timestamps fall between 2013/08/16 01:15:00.000 and 2013/08/16 01:30:00.000?
Also: the log file is large (over 5 GB) — what is the fastest way to search it?
----------------------------------------Update 2013/08/28 (morning)-------------------------------------------------------
Thanks for everyone's help. Here are the timing results for the various approaches.
Test file: 2 GB; the matching result set (about 15 lines) sits at the end of the file.
awk '{t=$1$2; if (t>="2013/08/2001:15:00.000" && t<"2013/08/2001:20:00.000") print}' $filename
awk '$1$2 >= "2013/08/2001:15:00.000" && $1$2 <= "2013/08/2001:20:00.000"' $filename
awk -vs="2013/08/20 01:15:00.000" -ve="2013/08/20 01:20:00.000" '{t=$1" "$2}s<=t&&t<=e' $filename
awk 'BEGIN{s="2013/08/20 01:15:00.000";e="2013/08/20 01:20:00.000"}{t=$1" "$2}s<=t&&t<=e' $filename
These variants were the fastest, each finishing in 25–26 s.
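The string-comparison trick above works because the timestamp fields are fixed-width and zero-padded, so lexicographic order equals chronological order. A minimal self-contained sketch (the sample lines and the name `sample.log` are made up for illustration):

```shell
# Build a tiny sample log (made-up data, same fixed-width timestamp layout).
cat > sample.log <<'EOF'
2013/08/16 01:14:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:16:11.111 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:31:00.000 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
EOF

# Zero-padded timestamps sort lexicographically in chronological order,
# so plain string comparison is enough -- no per-line mktime() needed.
awk -v s="2013/08/16 01:15:00.000" -v e="2013/08/16 01:30:00.000" \
    '{t=$1" "$2} s<=t && t<=e' sample.log
# prints only the 01:16:11.111 line
```

On a time-sorted log one could additionally add `t>e{exit}` so awk stops reading once it is past the range, which helps most when the matches sit near the start of a huge file.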
awk 'BEGIN{start=mktime("2013 08 20 01 15 00"); end=mktime("2013 08 20 01 20 00")} {split($1" "$2, a, "[/:. ]"); now=mktime(a[1]" "a[2]" "a[3]" "a[4]" "a[5]" "a[6]); if (now>=start && (now<end || (now==end && a[7]==0))) print $0}' $filename
This version is very slow: no result after at least 5 minutes, presumably because mktime() is expensive when called on every line. The logic itself is fine — I tested it on a small file and it returns the correct results.
The Perl scripts are also relatively slow, both over 200 s:
perl -lane 'BEGIN{$s="2013/08/20 01:15:00.000";$e="2013/08/20 01:20:00.000"}{$t="$F[0] $F[1]";if($s lt $t && $t lt $e){print}}' $filename
Elapsed: 244 s
perl -lane 'print if "2013/08/20 01:15:00.000" lt "@F[0..1]" and "@F[0..1]" lt "2013/08/20 01:20:00.000"' $filename
Elapsed: 259 s
"seesea2517"'s approach did not return the correct results — probably because of the \t separator; my log file may not be tab-delimited. Either way, the commands took the following times to finish:
cat $filename | sed -n '/2013\/08\/20\t01:15/, /2013\/08\/20\t01:20/ p' 28s
cat $filename | sed -n '/2013\/08\/20\t01:15/, /2013\/08\/20\t01:20/ p; /2013\/08\/20\t01:20/q;' 46s
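For reference, here is seesea2517's sed range rewritten under the assumption that the fields are space-separated rather than tab-separated (`range.log` is a made-up sample). Note the range is coarse to the minute — the first line matching 01:30 is still printed before `q` quits:

```shell
cat > range.log <<'EOF'
2013/08/16 01:14:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:15:10.000 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:22:11.111 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:30:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:40:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
EOF

# Print from the first line matching 01:15 through the first line matching
# 01:30 (inclusive), then quit immediately -- on a sorted log the q command
# avoids scanning the rest of the file.
sed -n '/2013\/08\/16 01:15/,/2013\/08\/16 01:30/p; /2013\/08\/16 01:30/q' range.log
```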
----------------------------------------Update 2013/08/28 (afternoon)-------------------------------------------------------
Following cao627's suggestion, I tried Shell_HAT's grep approach from post #2 and found it even faster — only 2 s:
grep -E "2013/08/20\s*01:1[5-9]" ./2G.log
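The grep is fast largely because it does a raw byte scan with no per-line field splitting. When exact millisecond boundaries matter, one option is to keep grep as a coarse prefilter and let awk apply the precise range check on the few surviving lines — a sketch (`app.log` and its contents are made up; the minute pattern follows the same idea as the grep above):

```shell
cat > app.log <<'EOF'
2013/08/16 01:14:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:15:10.000 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:22:11.111 E12345678900-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:30:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
2013/08/16 01:40:33.046 E68C93001100-0 6513 123 0 IN OK() 0 0 0 0 0 0
EOF

# grep: fast coarse filter on the minute (01:15 through 01:30);
# awk: exact string comparison down to the millisecond boundary.
grep -E "2013/08/16 01:(1[5-9]|2[0-9]|30)" app.log \
  | awk -v s="2013/08/16 01:15:00.000" -v e="2013/08/16 01:30:00.000" \
        '{t=$1" "$2} s<=t && t<=e'
# prints the 01:15:10.000 and 01:22:11.111 lines only
```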