gawk 4.0.0 release!!!

#1
Posted on 2011-07-01 04:29
Last edited by yinyuemi on 2011-07-07 01:53

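You can download it from the GNU FTP site and try it out: ftp://ftp.gnu.org/gnu/gawk. From a quick look at the announcement, the new gawk is considerably more powerful!
Below are some of the new features in gawk 4.0.0 (I tested a few of them):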
http://lists.gnu.org/archive/html/info-gnu/2011-06/msg00013.html

   Copyright (C) 2010, 2011 Free Software Foundation, Inc.

   Copying and distribution of this file, with or without modification,
   are permitted in any medium without royalty provided the copyright
   notice and this notice are preserved.

Changes from 3.1.8 to 4.0.0
---------------------------

1. The special files /dev/pid, /dev/ppid, /dev/pgrpid and /dev/user are
   now completely gone. Use PROCINFO instead.

2. The POSIX 2008 behavior for `sub' and `gsub' are now the default.
   THIS CHANGES BEHAVIOR!!!!

    echo '11122211' |awk '{sub(/1{3}/,"")}1'
    22211
3. The \s and \S escape sequences are now recognized in regular expressions.

    echo '111 222  11' |awk '{gsub(/\s/,"")}1'
    11122211
4. The split() function accepts an optional fourth argument which is an array
   to hold the values of the separators.
    echo '111-222|33' |awk '{split($0,a,/[-|]/,seps);print "a[1] = "a[1] RS "a[2] = "a[2] RS "a[3] = "a[3] RS "seps[1] = "seps[1] RS "seps[2] = "seps[2]}'
    a[1] = 111
    a[2] = 222
    a[3] = 33
    seps[1] = -
    seps[2] = |
5. New -b / --characters-as-bytes option that means "hands off my data"; gawk
   won't try to treat input as a multibyte string.

6. New --sandbox option; see the doc.
    --sandbox
        Disable the system() function, input redirections with getline, output redirections with print and printf, and dynamic extensions. This is particularly useful when you want to run awk scripts from questionable sources and need to make sure the scripts can't access your system (other than the specified input data file).
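
A quick way to see the effect, sketched under the assumption that gawk 4.0.0 or later is installed as `gawk`: a script that calls system() is refused under --sandbox, so gawk exits with a non-zero status instead of running the command. The variable name `sandbox_result` is only for illustration, and the exact wording of gawk's error message is not shown here.

```shell
# system() is not allowed under --sandbox: gawk aborts with an error
# and a non-zero exit status rather than executing the command.
if gawk --sandbox 'BEGIN { system("echo should-not-run") }' 2>/dev/null; then
    sandbox_result="allowed"
else
    sandbox_result="blocked"
fi
echo "$sandbox_result"
```
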
7. Indirect function calls are now available.
--With indirect function calls, you tell gawk to use the value of a variable as the name of the function to call.
8. Interval expressions are now part of default regular expressions for
   GNU Awk syntax.

9. --gen-po is now correctly named --gen-pot.

10. switch / case is now enabled by default. There's no longer a need
    for a configure-time option.
--Control flow in the switch statement works as it does in C.

    seq 10 |awk '{switch ($0%2){
    case "0":
        print "even number: "$0;break
    default:
        print "odd number: "$0
    }
    }'
    odd number: 1
    even number: 2
    odd number: 3
    even number: 4
    odd number: 5
    even number: 6
    odd number: 7
    even number: 8
    odd number: 9
    even number: 10
11. Gawk now supports BEGINFILE and ENDFILE. See the doc for details.

--The body of the BEGINFILE rules is executed just before gawk reads the first record from a file. FILENAME is set to the name of the current file, and FNR is set to zero.
--The ENDFILE rule is called when gawk has finished processing the last record in an input file. For the last input file, it will be called before any END rules. (These two features are really cool, especially when processing multiple files, as in the example below:)

    head f1 f2
    ==> f1 <==
    aaa
    bbb
    ccc

    ==> f2 <==
    aaa
    bbb
    ccc

    awk 'BEGIN{print"BEGIN: ---"}BEGINFILE{print "\nBEGINFILE: +++"}{print}ENDFILE{print"ENDFILE: +++\n"}END{print"END: ---"}' f1 f2
    BEGIN: ---

    BEGINFILE: +++
    aaa
    bbb
    ccc
    ENDFILE: +++


    BEGINFILE: +++
    aaa
    bbb
    ccc
    ENDFILE: +++

    END: ---
12. Directories named on the command line now produce a warning, not
    a fatal error, unless --posix or --traditional.

13. The new FPAT variable allows you to specify a regexp that matches
    the fields, instead of matching the field separator. The new patsplit()
    function gives the same capability for splitting.

--The value of FPAT should be a string that provides a regular expression. This regular expression describes the contents of each field.

    echo '111-222|33' |awk -vFS="[-|]" '{print "$1 = "$1 RS "$2 = "$2 RS "$3 = "$3}'
    $1 = 111
    $2 = 222
    $3 = 33

    # What about using FPAT instead?

    echo '111-222|33' |awk -vFPAT="[^-|]+" '{print "$1 = "$1 RS "$2 = "$2 RS "$3 = "$3}'
    $1 = 111
    $2 = 222
    $3 = 33
14. All long options now have short options, for use in `#!' scripts.

15. Support for IPv6 added via /inet6/... special file. /inet4/... forces
    IPv4 and /inet chooses the system default (probably IPv4).

16. Added a warning for /[:space:]/ that should be /[[:space:]]/.

17. Merged with John Haque's byte code internals. Adds dgawk debugger and
    possibly improved performance.

18. `break' and `continue' are no longer valid outside a loop, even with
    --traditional.

19. POSIX character classes work with --traditional (BWK awk supports them).

20. Nuked redundant --compat, --copyleft, and --usage long options.

21. Arrays of arrays added. See the doc. (This one is even more powerful!)

    awk 'BEGIN{arr["a"]["b"]=1;arr["a"]["c"]=2;
    for( i in arr)
        for( j in arr[i])
            print i,j,arr[i][j]
    }'
    a b 1
    a c 2
22. Per the GNU Coding Standards, dynamic extensions must now define
    a global symbol indicating that they are GPL-compatible. See
    the documentation and example extensions.
    THIS CHANGES BEHAVIOR!!!!

23. In POSIX mode, string comparisons use strcoll/wcscoll.
    THIS CHANGES BEHAVIOR!!!!

24. The option for raw sockets was removed, since it was never implemented.

25. If not in POSIX mode, gawk turns ranges of the form [d-h] into
    [defgh] before compiling a regexp.  Maybe this will stop all the
    questions about [a-z] matching uppercase letters.
    THIS CHANGES BEHAVIOR!!!!

26. PROCINFO["strftime"] now holds the default format for strftime().

27. Updated to latest infrastructure: Autoconf 2.68, Automake 1.11.1,
    Gettext 0.18.1, Bison 2.5.

28. Many code cleanups. Removed code for many old, unsupported systems:
        - Atari
        - Amiga
        - BeOS
        - Cray
        - MIPS RiscOS
        - MS-DOS with Microsoft Compiler
        - MS-Windows with Microsoft Compiler
        - NeXT
        - SunOS 3.x, Sun 386 (Road Runner)
        - Tandem (non-POSIX)
        - Prestandard VAX C compiler for VAX/VMS
        - Probably others that I've forgotten

29. If PROCINFO["sorted_in"] exists, for(iggy in foo) loops sort the
    indices before looping over them.  The value of this element
    provides control over how the indices are sorted before the loop
    traversal starts. See the manual.

30. A new isarray() function exists to distinguish if an item is an array
    or not, to make it possible to traverse multidimensional arrays.

31. asort() and asorti() take a third argument specifying how to sort.
    See the doc.
--

#2
Posted on 2011-07-01 11:12
First reply...

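#4
Posted on 2011-07-01 20:24
Lots of really nice new features!
One question, though: when will 4.0 become the standard install? It would be a real problem if, after enjoying the new features on my own machine, the scripts no longer ran once I put them on a server.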

#5
Posted on 2011-07-01 22:47
Bumping this first!

#6
Posted on 2011-07-03 18:31
Compiling on Cygwin...

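#8
Posted on 2011-07-06 14:35
Improvements in gawk 4.0:

1. New command-line options were added
2. Every long option now has a corresponding short option
3. The new "--sandbox" option disables system(), along with redirections and dynamic extensions, so a script cannot reach the rest of the system
4. The POSIX 2008 behavior of "sub" and "gsub" is now the default
5. Regular expression support has been improved
6. Other improvements, bug fixes, and code cleanup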

#9
Posted on 2011-07-07 01:03
Reply to #1 yinyuemi

Is there an exe that can be used directly on Windows?

#10
Posted on 2011-07-07 01:38
Reply to #9 Shell_HAT

gawk 4.0.0 supports the Cygwin environment, but it has to be compiled. (I did not manage to build it myself, and I don't know whether 紫云飞 upthread succeeded. You could give it a try.)