ChinaUnix.net

Original poster: yinyuemi

[Learning & Sharing] Common questions for awk beginners

Posted on 2011-11-07 22:22
Reply to #1 yinyuemi


    Thank you very much for sharing!

Posted on 2011-11-07 22:37
Reply to #1 yinyuemi


    Could the OP explain awk -v RS="#" '{$1=$1;print $0}' in detail? How should $1=$1 be understood? I just read the post you linked and could not follow it.

Posted on 2011-11-08 00:52
Last edited by yinyuemi on 2011-11-08 00:54
Reply to yinyuemi


    Could the OP explain awk -v RS="#" '{$1=$1;print $0}' in detail? How should $1=$1 be understood? I just ...
yuloveban, posted on 2011-11-07 22:37

http://www.gnu.org/s/gawk/manual/gawk.html#Fields
Advanced Notes: Understanding $0

It is important to remember that $0 is the full record, exactly as it was read from the input. This includes any leading or trailing whitespace, and the exact whitespace (or other characters) that separate the fields.

It is a not-uncommon error to try to change the field separators in a record simply by setting FS and OFS, and then expecting a plain ‘print’ or ‘print $0’ to print the modified record.

But this does not work, since nothing was done to change the record itself. Instead, you must force the record to be rebuilt, typically with a statement such as ‘$1 = $1’, as described earlier.
$1=$1 makes OFS take effect:

echo '1#1#1
2#2#2' | awk -v RS="#" '{$1=$1; print $0}'
1
1
1 2    # OFS defaults to a single space, and it takes effect here. More generally, any action that modifies a field, such as $2=$2 or NF+=0, makes OFS take effect.
2
2

Two more examples:

echo '1 1 1
2 2 2' | awk 'NR==1{OFS=":";$1=$1;print}NR==2{OFS="#";$1=$1;print}'
1:1:1
2#2#2

echo '1 1 1
2 2 2' | awk 'NR==1{OFS=":";$0=$0;print}NR==2{OFS="#";$0=$0;print}'   # assigning $0=$0 does not make OFS take effect
1 1 1
2 2 2
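To make the contrast concrete, here is a minimal sketch that runs the same input with and without the `$1 = $1` assignment. The sample input and the variable names `input`, `unchanged`, and `rebuilt` are made up for illustration and do not come from the thread.

```shell
# Hypothetical sample input: two colon-separated records.
input='a:b:c
d:e:f'

# Setting OFS alone leaves $0 untouched, so print emits the original text.
unchanged=$(printf '%s\n' "$input" | awk -F: -v OFS='-' '{ print }')

# Assigning a field to itself rebuilds $0 from the fields, joined by OFS.
rebuilt=$(printf '%s\n' "$input" | awk -F: -v OFS='-' '{ $1 = $1; print }')

printf 'without $1=$1:\n%s\n' "$unchanged"
printf 'with $1=$1:\n%s\n' "$rebuilt"
```

The first command prints the records with their original colons; only the second prints them joined by `-`.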

Posted on 2011-11-08 09:28
Reply to #43 yinyuemi


    Thanks!

Posted on 2011-11-08 13:08
A really good summary, thanks.

Posted on 2011-11-17 20:12
About point 4: I found this while reading the manual today, so I am pasting it here:
Finally, there are times when it is convenient to force awk to rebuild the entire record, using the current value of the fields and OFS. To do this, use the seemingly innocuous assignment:

     $1 = $1   # force record to be reconstituted
     print $0  # or whatever else with $0

This forces awk to rebuild the record. It does help to add a comment, as we've shown here.

There is a flip side to the relationship between $0 and the fields. Any assignment to $0 causes the record to be reparsed into fields using the current value of FS. This also applies to any built-in function that updates $0, such as sub() and gsub() (see String Functions).

Advanced Notes: Understanding $0
It is important to remember that $0 is the full record, exactly as it was read from the input. This includes any leading or trailing whitespace, and the exact whitespace (or other characters) that separate the fields.

It is a not-uncommon error to try to change the field separators in a record simply by setting FS and OFS, and then expecting a plain ‘print’ or ‘print $0’ to print the modified record.

But this does not work, since nothing was done to change the record itself. Instead, you must force the record to be rebuilt, typically with a statement such as ‘$1 = $1’, as described earlier.
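The "flip side" quoted above (a change to $0 causes the record to be re-split) can be checked with a small sketch. The input string and the names `line` and `reparsed` are assumptions made for this example.

```shell
# Hypothetical one-line input with no whitespace, so it starts as one field.
line='a-b-c'

# gsub() with no target edits $0, which makes awk re-split it with the current FS.
reparsed=$(printf '%s\n' "$line" | awk '{ before = NF; gsub(/-/, " "); print before, NF, $2 }')

echo "$reparsed"   # prints: 1 3 b
```

NF goes from 1 to 3 because the substituted record is split again on whitespace, and `$2` becomes `b`.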

Posted on 2011-11-17 23:57
Reply to #46 xiaopan3322


    Thanks a lot, Bob

Posted on 2011-11-18 09:03
Nice, a really useful post

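Posted on 2012-05-15 14:08
10. How should awk '!a[$0]++' be understood?

This is a classic awk one-liner for removing duplicate lines. It is short, but it touches on several points, explained one by one below:

<1>: "!" is logical NOT.

<2>: a[$0] creates an array a, using $0 as the subscript.

<3>: a[$0]++ assigns to the array element, equivalent to a[$0]+=1.

<4>: Putting these together, how does awk evaluate !a[$0]++? Here is a concrete example:


cat file
111
222
111
222
333

awk '{print a[$0],!a[$0]++,a[$0],!a[$0],$0}' file
  1 1 0 111
  1 1 0 222
1 0 2 0 111
1 0 2 0 222
  1 1 0 333


As the output shows, a[$0] is empty the first time a line is seen. Because ++ here is a post-increment, !a[$0]++ tests the old value first, giving 1 (the negation of an empty value is true, i.e. 1), and only then increments a[$0]. This is also why the earlier !a[$0]++ does not necessarily equal the later !a[$0].

awk '++a[$0]==1' does the same thing as the code above. Do you see why?


I don't understand.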
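One way to see what the one-liner does is to write it out in long form. The sketch below is an illustration, not the original poster's code; the variable names `data`, `short`, `long`, `pre`, and `seen_before` are invented here. It runs the short form, an expanded equivalent, and the pre-increment variant on the same five lines used in the example above.

```shell
# Hypothetical sample data, matching the five-line file used in the post.
data='111
222
111
222
333'

# Short form: a pattern with no action prints the record when the pattern is true.
short=$(printf '%s\n' "$data" | awk '!a[$0]++')

# Long form of the same logic, spelled out step by step.
long=$(printf '%s\n' "$data" | awk '{
    seen_before = a[$0]   # old count: empty (treated as 0) on first sight
    a[$0]++               # record one more sighting of this line
    if (!seen_before)     # true only the first time the line appears
        print
}')

# Pre-increment variant: the count becomes 1 only on the first sighting.
pre=$(printf '%s\n' "$data" | awk '++a[$0] == 1')

echo "$short"
```

All three commands print 111, 222, 333, each once. The short form tests the count before incrementing it, while the pre-increment variant increments first and then checks whether the new count is exactly 1.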

Posted on 2012-05-29 15:10
A really good summary, thanks for sharing!