
Post #1

Posted 2015-09-02 10:53

Reply to 1# xiaogui_vip

$ cat FILE
2015-08-31 15:53:59,383 INFO  -  进入任务：sendTask ZH05
2015-08-31 15:53:59,384 INFO  -  目标号码：1
2015-08-31 15:53:59,384 INFO  -  FROM:111，TO:222 BEGIN
2015-08-31 15:53:59,384 INFO  -  请求长度为:110
2015-08-31 15:53:59,424 INFO  -  请求状态:200
2015-08-31 15:53:59,424 INFO  -  返回值为：<?xml version="1.0" encoding="utf-8"?>2|14410076429841410</string>
2015-08-31 15:53:59,427 INFO  -  返回值解析后为：2|14410076429841410

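$ awk '
BEGIN {
  OFS = ","  # join output fields with commas
}
/进入任务/ {  # "entering task" line: start a new record
  time = $1 " " $2
  sub(/,.+$/, "", time)  # drop the milliseconds after the comma
  task = $NF  # task ID, e.g. ZH05
  to = ""
}
match($0, /TO:([^ ]+)/, a) {  # three-argument match() is a gawk extension
  to = a[1]  # captured TO value, e.g. 222
}
/返回值解析后/ {  # "parsed return value" line: print the record
  sub(/^.*[|]/, "", $NF)  # keep only the text after the last "|"
  print time, task, to, $NF
}' FILE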
2015-08-31 15:53:59,ZH05,222,14410076429841410
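
The three-argument match() used above only exists in gawk. If your awk is mawk or another POSIX awk, something along these lines should give the same result. It is only a sketch that assumes the log layout shown in FILE; the wrapper name extract_tasks and the variable result are made up for illustration. It uses the two-argument match() with RSTART/RLENGTH, and copies $NF into a variable so the field itself is left untouched.

```shell
# Hypothetical wrapper: reads the log on stdin, prints time,task,to,result
extract_tasks() {
  awk '
  BEGIN { OFS = "," }
  /进入任务/ {                  # "entering task" line: start a new record
    time = $1 " " $2
    sub(/,.*$/, "", time)       # drop the milliseconds
    task = $NF
    to = ""
  }
  match($0, /TO:[^ ]+/) {       # POSIX two-argument match()
    to = substr($0, RSTART + 3, RLENGTH - 3)   # skip the leading "TO:"
  }
  /返回值解析后/ {              # "parsed return value" line: print the record
    result = $NF
    sub(/^.*[|]/, "", result)   # keep only the text after the last "|"
    print time, task, to, result
  }'
}
```

Usage: extract_tasks < FILE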

jason680


Post #2

Posted 2015-09-02 11:14

Reply to 5# xiaogui_vip

https://www.gnu.org/software/gawk/manual/gawk.html#String-Functions
9.1.3 String-Manipulation Functions

The functions in this section look at or change the text of one or more strings.

....

sub(regexp, replacement [, target])

Search target, which is treated as a string, for the leftmost, longest substring matched by the regular expression regexp. Modify the entire string by replacing the matched text with replacement. The modified string becomes the new value of target. Return the number of substitutions made (zero or one).

The regexp argument may be either a regexp constant (/…/) or a string constant ("…"). In the latter case, the string is treated as a regexp to be matched. See Computed Regexps, for a discussion of the difference between the two forms, and the implications for writing your program correctly.

This function is peculiar because target is not simply used to compute a value, and not just any expression will do—it must be a variable, field, or array element so that sub() can store a modified value there. If this argument is omitted, then the default is to use and alter $0. For example:

str = "water, water, everywhere"
sub(/at/, "ith", str)

sets str to ‘wither, water, everywhere’, by replacing the leftmost longest occurrence of ‘at’ with ‘ith’.
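
To see both effects at once (the return value and the in-place change of the target), a small runnable version of the same example might look like this. The shell variable sub_demo and the awk variable n are just illustrative names.

```shell
sub_demo=$(awk 'BEGIN {
  str = "water, water, everywhere"
  n = sub(/at/, "ith", str)   # n is the number of substitutions made (0 or 1)
  print n ": " str            # str itself has been modified
}')
printf '%s\n' "$sub_demo"
```

Only the first "at" is replaced; the second "water" is left alone because sub() makes at most one substitution.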

If the special character ‘&’ appears in replacement, it stands for the precise substring that was matched by regexp. (If the regexp can match more than one string, then this precise substring may vary.) For example:

{ sub(/candidate/, "& and his wife"); print }

changes the first occurrence of ‘candidate’ to ‘candidate and his wife’ on each input line. Here is another example:

$ awk 'BEGIN {
>       str = "daabaaa"
>       sub(/a+/, "C&C", str)
>       print str
> }'
-| dCaaCbaaa

This shows how ‘&’ can represent a nonconstant string and also illustrates the “leftmost, longest” rule in regexp matching (see Leftmost Longest).

The effect of this special character (‘&’) can be turned off by putting a backslash before it in the string. As usual, to insert one backslash in the string, you must write two backslashes. Therefore, write ‘\\&’ in a string constant to include a literal ‘&’ in the replacement. For example, the following shows how to replace the first ‘|’ on each line with an ‘&’:

{ sub(/\|/, "\\&"); print }
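
A side-by-side sketch of the two forms may help: a plain & in the replacement expands to the matched text, while \\& in the string constant produces a literal ampersand. The input a|b|c and the names amp_demo and line are made up for the illustration.

```shell
amp_demo=$(printf 'a|b|c\n' | awk '{
  line = $0
  sub(/\|/, "[&]", line)   # & stands for the matched text, here "|"
  sub(/\|/, "\\&")         # \\& puts a literal ampersand into $0
  print line
  print
}')
printf '%s\n' "$amp_demo"
```

The first output line wraps the first "|" in brackets; the second replaces it with "&".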

As mentioned, the third argument to sub() must be a variable, field, or array element. Some versions of awk allow the third argument to be an expression that is not an lvalue. In such a case, sub() still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away because there is no place to put it. Such versions of awk accept expressions like the following:

sub(/USA/, "United States", "the USA and Canada")

For historical compatibility, gawk accepts such erroneous code. However, using any other nonchangeable object as the third parameter causes a fatal error and your program will not run.

Finally, if the regexp is not a regexp constant, it is converted into a string, and then the value of that string is treated as the regexp to match.
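
As a rough illustration of a regexp held in a string, the pattern below is kept in a variable and then passed to sub(). Note the extra backslash: the string constant "\\." has the value \. , which is what the regexp matcher sees. The names dyn_demo, s and pat are hypothetical.

```shell
dyn_demo=$(awk 'BEGIN {
  s = "v1.2.3"
  pat = "\\."        # string value is \. and is used as the regexp
  sub(pat, "-", s)   # same effect as sub(/\./, "-", s)
  print s
}')
printf '%s\n' "$dyn_demo"
```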


[Text processing] AWK: with multi-line input, how do I print different fields under different conditions?