Awk Tutorial

Posted 2007-08-14 11:20

Copyright 2001,2004 Bruce Barnett and General Electric Company
All rights reserved
You are allowed to print copies of this tutorial for your personal use, and link to this page, but you are not allowed to make electronic copies, or redistribute this tutorial in any form without permission.
Last update: Tue Mar 9 11:07:08 EST 2004

Awk is an extremely versatile programming language for working on files. We'll teach you just enough to understand the examples in this book, plus a smidgen.
Why learn AWK?

In the past I have covered grep and sed. This section discusses AWK, another cornerstone of UNIX shell programming. There are three variations of AWK:
AWK - the original from AT&T
NAWK - A newer, improved version from AT&T
GAWK - The Free Software Foundation's version
Originally, I didn't plan to discuss NAWK, but several UNIX vendors have replaced AWK with NAWK, and there are several incompatibilities between the two. It would be cruel of me not to warn you about the differences, so I will highlight them when I come to them. It is important to know that all of AWK's features are in NAWK and GAWK. Most, if not all, of NAWK's features are in GAWK. NAWK ships as part of Solaris. GAWK does not. However, many sites on the Internet have the sources freely available. If you use Linux, you have GAWK.
Why is AWK so important? It is an excellent filter and report writer. Many UNIX utilities generate rows and columns of information. AWK is an excellent tool for processing these rows and columns, and it is easier to use than most conventional programming languages. It can be considered a pseudo-C interpreter, as it understands the same arithmetic operators as C. AWK also has string manipulation functions, so it can search for particular strings and modify the output. AWK also has associative arrays, which are incredibly useful and are a feature most computing languages lack. Associative arrays can make a complex problem a trivial exercise.
I won't exhaustively cover AWK. That is, I will cover the essential parts, and avoid the many variants of AWK. It might be too confusing to discuss three different versions of AWK. I won't cover the GNU version of AWK called "gawk." Similarly, I will not discuss the new AT&T AWK called "nawk." The new AWK comes on the Sun system, and you may find it superior to the old AWK in many ways. In particular, it has better diagnostics, and won't print out the infamous "bailing out near line ..." message the original AWK is prone to print. Instead, "nawk" prints out the line it didn't understand, and highlights the bad parts with arrows. If you find yourself needing a feature that is very difficult or impossible to do in AWK, I suggest you either use NAWK, or convert your AWK script into PERL using the "a2p" conversion program which comes with PERL. PERL is a marvelous language, and I use it all the time, but I do not plan to cover PERL in these tutorials. Having made my intention clear, I can continue with a clear conscience.
Many UNIX utilities have strange names. AWK is one of those utilities. It is not an abbreviation for awkward. In fact, it is an elegant and simple language. The word "AWK" is derived from the initials of the language's three developers: A. Aho, B. W. Kernighan and P. Weinberger.
Basic Structure

The essential organization of an AWK program follows the form:
pattern { action }
The pattern specifies when the action is performed. Like most UNIX utilities, AWK is line oriented. That is, the pattern specifies a test that is performed with each line read as input. If the condition is true, then the action is taken. The default pattern is something that matches every line. This is the blank or null pattern. Two other important patterns are specified by the keywords "BEGIN" and "END." As you might expect, these two words specify actions to be taken before any lines are read, and after the last line is read. The AWK program below:
BEGIN         { print "START" }
                { print }
END           { print "STOP" }
adds one line before and one line after the input file. This isn't very useful, but with a simple change, we can make this into a typical AWK program:
BEGIN { print "File\tOwner" }
{ print $8, "\t", $3}
END { print " - DONE -" }
The characters "\t" indicate a tab character, so the output lines up on even boundaries. The "$8" and "$3" have a meaning similar to the one they have in a shell script. Instead of the eighth and third arguments, they mean the eighth and third fields of the input line. You can think of a field as a column, and the action you specify operates on each line or row read in.
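To see the fields in action, here is a minimal sketch that feeds two made-up lines, each with eight whitespace-separated columns, to a version of the program above. The shell function name "file_owner_report" and the sample file names, owners, and sizes are invented for illustration; they are not part of the original tutorial.

```shell
# Hypothetical wrapper around the three-line program from the text.
file_owner_report() {
  awk '
    BEGIN { print "File\tOwner" }   # runs once, before any input is read
          { print $8, "\t", $3 }    # runs for every line: field 8, a tab, field 3
    END   { print " - DONE -" }     # runs once, after the last line
  '
}

# Two invented input lines with eight whitespace-separated columns each.
printf '%s\n' \
  '-rw-r--r-- 1 barnett 1024 Mar 9 11:07 notes.txt' \
  '-rwxr-xr-x 1 root 2048 Mar 9 11:08 backup.sh' |
  file_owner_report
```

Because the items in the "print" statement are separated by commas, AWK puts a space between them, so each data line comes out as the file name, a space, a tab, a space, and the owner.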
There are two differences between the way AWK and a shell process the characters within double quotes. AWK understands special characters that follow the "\" character. The UNIX shells do not. Also, unlike the shell (and PERL), AWK does not evaluate variables within strings. The second line could not be written like this:
{print "$8\t$3" }
That example would print "$8" and "$3" literally, separated by a tab. Inside the quotes, the dollar sign is not a special character. Outside, it corresponds to a field. What do I mean by the third and eighth fields? Consider the "/usr/bin/ls -l" command, which has eight columns of information. The System V version, "/usr/5bin/ls -l," has 9 columns. The third column is the owner, and the eighth column is the name of the file. This AWK program can be used to process the output of the "ls -l" command, printing out the filename, then the owner, for each file. I'll show you how.
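The difference between a dollar sign inside and outside the quotes can be checked with a short sketch. It assumes one invented line in the eight-column layout described above, standing in for real "ls -l" output; the shell variable names are placeholders chosen for this example.

```shell
# One invented eight-column line standing in for real "ls -l" output.
line='-rw-r--r-- 1 barnett 1024 Mar 9 11:07 notes.txt'

# Inside double quotes, AWK leaves $8 and $3 alone and only expands \t.
quoted=$(printf '%s\n' "$line" | awk '{ print "$8\t$3" }')

# Outside quotes, $8 and $3 are replaced by the eighth and third fields.
fields=$(printf '%s\n' "$line" | awk '{ print $8, "\t", $3 }')

printf '%s\n' "$quoted" "$fields"
```

The first line of output is the literal text "$8", a tab, and "$3"; the second is the file name and the owner. If your "ls -l" prints nine columns, as the System V version does, the file name would be the ninth field rather than the eighth.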