[Text Processing] AWK: how to output different fields under different conditions across multiple lines?

#1
Posted 2015-09-02 10:53
Reply to 1# xiaogui_vip


$ cat FILE
2015-08-31 15:53:59,383 INFO  -  进入任务:sendTask ZH05
2015-08-31 15:53:59,384 INFO  -  目标号码:1
2015-08-31 15:53:59,384 INFO  -  FROM:111,TO:222 BEGIN
2015-08-31 15:53:59,384 INFO  -  请求长度为:110
2015-08-31 15:53:59,424 INFO  -  请求状态:200
2015-08-31 15:53:59,424 INFO  -  返回值为:<?xml version="1.0" encoding="utf-8"?>2|14410076429841410</string>
2015-08-31 15:53:59,427 INFO  -  返回值解析后为:2|14410076429841410

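$ awk '
BEGIN{
  OFS=","                   # join output fields with commas
}
# Start of a task record: remember timestamp and task ID, reset the target.
/进入任务/{
  time = $1" "$2
  sub(/,.+$/, "", time)     # drop the ",383" millisecond suffix
  task = $NF
  to = ""
}
# Three-argument match() is a gawk extension; a[1] holds the captured group.
match($0,/TO:([^ ]+)/,a){
  to = a[1]
}
# End of the record: keep only the text after the last "|" and print.
/返回值解析后/{
  sub(/^.*[|]/, "", $NF)
  print time, task, to, $NF
}' FILE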
2015-08-31 15:53:59,ZH05,222,14410076429841410
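The script above relies on the three-argument form of match(), which is a gawk extension. If only a POSIX awk is available, the same TO: value can probably be extracted with the two-argument match() plus RSTART and RLENGTH. A minimal sketch, using one sample line from FILE as inline input (the `out` shell variable is only there for illustration):

```sh
out=$(printf '%s\n' \
  '2015-08-31 15:53:59,384 INFO  -  FROM:111,TO:222 BEGIN' |
awk '
# Two-argument match() is POSIX; it sets RSTART and RLENGTH.
match($0, /TO:[^ ]+/) {
  # Skip the 3-character "TO:" prefix to keep only the value.
  to = substr($0, RSTART + 3, RLENGTH - 3)
  print to
}')
echo "$out"
```

For the sample line this prints 222, the same value the gawk version stores in a[1].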

   

#2
Posted 2015-09-02 11:14
Reply to 5# xiaogui_vip

https://www.gnu.org/software/gawk/manual/gawk.html#String-Functions
9.1.3 String-Manipulation Functions

The functions in this section look at or change the text of one or more strings.

....

sub(regexp, replacement [, target])

    Search target, which is treated as a string, for the leftmost, longest substring matched by the regular expression regexp. Modify the entire string by replacing the matched text with replacement. The modified string becomes the new value of target. Return the number of substitutions made (zero or one).

    The regexp argument may be either a regexp constant (/…/) or a string constant ("…"). In the latter case, the string is treated as a regexp to be matched. See Computed Regexps, for a discussion of the difference between the two forms, and the implications for writing your program correctly.

    This function is peculiar because target is not simply used to compute a value, and not just any expression will do; it must be a variable, field, or array element so that sub() can store a modified value there. If this argument is omitted, then the default is to use and alter $0. For example:

    str = "water, water, everywhere"
    sub(/at/, "ith", str)

    sets str to ‘wither, water, everywhere’, by replacing the leftmost longest occurrence of ‘at’ with ‘ith’.
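To see the return value alongside the modified string, the manual's example can be wrapped in a BEGIN block. This is only an illustrative sketch; the variable `n` and the second, non-matching call are additions, not part of the manual text:

```sh
out=$(awk 'BEGIN {
  str = "water, water, everywhere"
  n = sub(/at/, "ith", str)    # replaces only the first match
  print n ": " str
  n = sub(/xyz/, "ith", str)   # no match: str is left unchanged
  print n ": " str
}')
echo "$out"
```

The first call returns 1 and changes only the first "water"; the second returns 0 and leaves str as it was.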

    If the special character ‘&’ appears in replacement, it stands for the precise substring that was matched by regexp. (If the regexp can match more than one string, then this precise substring may vary.) For example:

    { sub(/candidate/, "& and his wife"); print }

    changes the first occurrence of ‘candidate’ to ‘candidate and his wife’ on each input line. Here is another example:

    $ awk 'BEGIN {
    >         str = "daabaaa"
    >         sub(/a+/, "C&C", str)
    >         print str
    > }'
    -| dCaaCbaaa

    This shows how ‘&’ can represent a nonconstant string and also illustrates the “leftmost, longest” rule in regexp matching (see Leftmost Longest).

    The effect of this special character (‘&’) can be turned off by putting a backslash before it in the string. As usual, to insert one backslash in the string, you must write two backslashes. Therefore, write ‘\\&’ in a string constant to include a literal ‘&’ in the replacement. For example, the following shows how to replace the first ‘|’ on each line with an ‘&’:

    { sub(/\|/, "\\&"); print }
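The two behaviours of ‘&’ can be compared side by side on the same input. A small sketch, where the input line a|b|c and the variable `s` are made up for illustration:

```sh
out=$(printf 'a|b|c\n' | awk '{
  s = $0; sub(/\|/, "[&]", s); print s   # & expands to the matched text
  s = $0; sub(/\|/, "\\&", s); print s   # \\& yields a literal ampersand
}')
echo "$out"
```

Only the first ‘|’ is touched in each case, since sub() performs at most one substitution.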

    As mentioned, the third argument to sub() must be a variable, field, or array element. Some versions of awk allow the third argument to be an expression that is not an lvalue. In such a case, sub() still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away because there is no place to put it. Such versions of awk accept expressions like the following:

    sub(/USA/, "United States", "the USA and Canada")

    For historical compatibility, gawk accepts such erroneous code. However, using any other nonchangeable object as the third parameter causes a fatal error and your program will not run.
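When the original string has to be preserved, one common way to stay within the lvalue rule is to run sub() on a copy. A hedged sketch, with `orig`, `copy`, and `n` as illustrative variable names:

```sh
out=$(awk 'BEGIN {
  orig = "the USA and Canada"
  copy = orig                             # sub() needs an lvalue to store into
  n = sub(/USA/, "United States", copy)
  print n
  print orig
  print copy
}')
echo "$out"
```

Here the result of the substitution lands in copy, while orig keeps its value.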

    Finally, if the regexp is not a regexp constant, it is converted into a string, and then the value of that string is treated as the regexp to match.
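A short sketch of what this means in practice: the regexp can live in an ordinary variable, and a regexp written as a string needs its backslashes doubled. The names `re`, `str`, and `ver` and the sample values are assumptions made for this example:

```sh
out=$(awk 'BEGIN {
  re = "[0-9]+"              # regexp held in a string variable
  str = "id=12345;next=678"
  sub(re, "N", str)          # leftmost, longest digit run is replaced
  print str
  ver = "1.2.3"
  sub("\\.", "-", ver)       # inside a string the backslash is doubled
  print ver
}')
echo "$out"
```

Written as a regexp constant, the second pattern would simply be /\./.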

   