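Hello everyone!
I have a few questions I would like to ask.
I have some text files and want to extract certain information from them.
Here is one of those files.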
N9Received: from cloudband-be.local (localhost [127.0.0.1])N: by cloudband-be.local (Postfix) with ESMTP id 545615405FBN; for <test@test.com>; Fri, 22 Aug 2014 00:54:12 +0000 (UTC)N+Date: Fri, 22 Aug 2014 00:54:12 +0000 (UTC)N
From: foo@cloud-band.comN
To: test@test.comNHMessage-ID: <346866564.3.1408668852344.JavaMail.ncos@cloudband-be.local>N/Subject: Your CloudBand password has been resetN
MIME-Version: 1.0N'Content-Type: text/plain; charset=UTF-8N
Content-Transfer-Encoding: 7bitN
Dear test test, N
Your password has been reset. N
Username: test N
E-mail: test@test.com N
Password: EY3vNTKe N
NuTo protect the security of your account our customer service and support personnel will never ask for your password. N
Sincerely, N
CloudBand Customer ServiceX
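I want to get the username, e-mail, password, and date from each of these texts and print them one record per line, in the following form: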
hrg@linux-xb9s:~/work/shell> cat aaa
N9Received: from cloudband-be.local (localhost [127.0.0.1])N: by cloudband-be.local (Postfix) with ESMTP id 545615405FBN; for <test@test.com>; Fri, 22 Aug 2014 00:54:12 +0000 (UTC)N+Date: Fri, 22 Aug 2014 00:54:12 +0000 (UTC)N
From: foo@cloud-band.comN
To: test@test.comNHMessage-ID: <346866564.3.1408668852344.JavaMail.ncos@cloudband-be.local>N/Subject: Your CloudBand password has been resetN
MIME-Version: 1.0N'Content-Type: text/plain; charset=UTF-8N
Content-Transfer-Encoding: 7bitN
Dear test test, N
Your password has been reset. N
Username: test N
E-mail: test@test.com N
Password: EY3vNTKe N
NuTo protect the security of your account our customer service and support personnel will never ask for your password. N
Sincerely, N
CloudBand Customer ServiceX
hrg@linux-xb9s:~/work/shell> grep -Po 'Username:.*(?=N)|E-mail:.*(?=N)|Password:.*(?=N)|Date:.*(?=N)' aaa | sed ':a N;s/\n/|/g;ta' | awk -F'|' '{print $2"|"$3"|"$4"|"$1}'
Username: test |E-mail: test@test.com |Password: EY3vNTKe |Date: Fri, 22 Aug 2014 00:54:12 +0000 (UTC)
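hrg@linux-xb9s:~/work/shell>
Author: bikkuri  Time: 2014-08-23 13:28  (last edited by bikkuri on 2014-08-23 17:10)
Thank you for your reply. However, when I run it on my files, some of them produce the correct output and others do not.
Here is an example of a text that does not produce the correct output: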
N)Received: by cloudband-be.local (Postfix)N6 id E9C79540610; Fri, 22 Aug 2014 08:21:04 +0000 (UTC)N+Date: Fri, 22 Aug 2014 08:21:
04 +0000 (UTC)N=From: MAILER-DAEMON@cloudband-be.local (Mail Delivery System)N,Subject: Undelivered Mail Returned to SenderN
To: foo@cloud-band.comN
Auto-Submitted: auto-repliedN
MIME-Version: 1.0N<Content-Type: multipart/report; report-type=delivery-status;N5 boundary="571345405FA.1408695664/cloudband-b
e.local"N;Message-Id: <20140822082104.E9C79540610@cloudband-be.local>N
N$This is a MIME-encapsulated message.N
N+--571345405FA.1408695664/cloudband-be.localN!Content-Description: NotificationN*Content-Type: text/plain; charset=us-asciiN
N3This is the mail system at host cloudband-be.local.N
N;I'm sorry to have to inform you that your message could notN<be delivered to one or more recipients. It's attached below.N
N7For further assistance, please send mail to postmaster.N
N9If you do so, please include this problem report. You canN8delete your own text from the attached returned message.N
N" The mail systemN
N#<kiyoshi.amemiya@ibm.com>: hostNB usnavsmail1.ndc.ibm.com[135.3.39.9] said: 550 5.7.1NK <kiyoshi.amemiya@ibm.com>...
Fix reverse DNS for 135.254.61.241 (inN
reply to RCPT TO command)N
N+--571345405FA.1408695664/cloudband-be.localN$Content-Description: Delivery reportN%Content-Type: message/delivery-statusN
N&Reporting-MTA: dns; cloudband-be.localN
X-Postfix-Queue-ID: 571345405FAN,X-Postfix-Sender: rfc822; foo@cloud-band.comN3Arrival-Date: Fri, 22 Aug 2014 08:21:01 +0000 (UTC)N
N4Final-Recipient: rfc822; kiyoshi.amemiya@ibm.comN6Original-Recipient: rfc822;kiyoshi.amemiya@ibm.comN
Action: failedN
Status: 5.7.1N3Remote-MTA: dns; usnavsmail1.ndc.ibm.comNMDiagnostic-Code: smtp; 550 5.7.1 <kiyoshi.amemiya@ibm.com>..
. Fix reverseN
DNS for 135.254.61.241N
N+--571345405FA.1408695664/cloudband-be.localN(Content-Description: Undelivered MessageN
Content-Type: message/rfc822N
N!Return-Path: <foo@cloud-band.com>N9Received: from cloudband-be.local (localhost [127.0.0.1])N: by cloudband-be.local (Postf
ix) with ESMTP id 571345405FANI for <kiyoshi.amemiya@ibm.com>; Fri, 22 Aug 2014 08:21:01 +0000 (UTC)N+Date: Fri, 22 Aug 2014 08:
21:01 +0000 (UTC)N
From: foo@cloud-band.comN
To: kiyoshi.amemiya@ibm.comNIMessage-ID: <1575957485.9.1408695661356.JavaMail.ncos@cloudband-be.local>N/Subject: Your CloudBand
password has been resetN
MIME-Version: 1.0N'Content-Type: text/plain; charset=UTF-8N
Content-Transfer-Encoding: 7bitN
Dear amemiya kiyoshi, N
Your password has been reset. N
Username: kamemiya N$E-mail: kiyoshi.amemiya@ibm.com N
Password: fpMMduNF N
NuTo protect the security of your account our customer service and support personnel will never ask for your password. N
Sincerely, N
CloudBand Customer ServiceN
N---571345405FA.1408695664/cloudband-be.local--X
The result of processing it is:
[root@cloudband-be E]# strings *|grep -Po 'Username:.*(?=N)|E-mail:.*(?=N)|Password:.*(?=N)|Date:.*(?=N)' | sed ':a N;s/\n/|/g;ta' |
awk -F'|' '{print $2"|"$3"|"$4"|"$1}'
Date: Fri, 22 Aug 2014 08:21:01 +0000 (UTC)|Date: Fri, 22 Aug 2014 08:21:01 +0000 (UTC)|Username: kamemiya N$E-mail: kiyoshi.amemiya@ibm.com |Date: Fri, 22 Aug 2014 08:21:04 +0000 (UTC)N=From: MAILER-DAEMON@cloudband-be.local (Mail Delivery System)N,Subject: Undelivered Mail Returned to Sender
I looked into it, and the output goes wrong because the input file contains three occurrences of Date.
[root@cloudband-be E]# strings E9C79540610 |grep -c "Date"
3
Is it possible to process only the Date closest to Username?
Running just the grep part of your command gives me the following result.
Date: Fri, 22 Aug 2014 00:56:13 +0000 (UTC)N=From: MAILER-DAEMON@cloudband-be.local (Mail Delivery System)N,Subject: Undelivered Mail Returned to Sender
Date: Sat, 23 Aug 2014 06:32:49 +0000 (UTC)N=From: MAILER-DAEMON@cloudband-be.local (Mail Delivery System)N,Subject: Undelivered Mail Returned to Sender
Date: Mon, 18 Aug 2014 05:36:18 +0000 (UTC)
Date: Mon, 18 Aug 2014 05:36:18 +0000 (UTC)
Username: demo
E-mail: demo.taro@demo.com
Password: A4qMoHko
Date: Fri, 22 Aug 2014 08:21:04 +0000 (UTC)N=From: MAILER-DAEMON@cloudband-be.local (Mail Delivery System)N,Subject: Undelivered Mail Returned to Sender
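One possible way to keep only the Date nearest to Username is to let awk remember the most recent Date it has seen and take a snapshot of it when Username appears. The following is only a rough sketch, not a tested solution for every file: the function name extract_reset is made up for illustration, and it assumes that every date ends in "(UTC)", that the username, e-mail, and password contain no spaces, and that the Date of a message always comes before its Username, as in the samples above. It matches each value directly instead of relying on the trailing N, so a Username and E-mail that share one line are still separated.

```shell
# Hypothetical helper: prints "Username|E-mail|Password|Date" for each reset mail.
# Reads the files given as arguments, or standard input if there are none.
extract_reset() {
    awk '
    {
        # Walk through every "Date: ... (UTC)" on this line; later ones
        # overwrite earlier ones, so "date" always holds the latest seen.
        rest = $0
        while (match(rest, /Date: [^()]*\(UTC\)/)) {
            date = substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
        }
        # When Username shows up, freeze the Date seen just before it.
        if (match($0, /Username: [^ ]+/)) {
            user = substr($0, RSTART, RLENGTH)
            when = date
        }
        if (match($0, /E-mail: [^ ]+/))
            mail = substr($0, RSTART, RLENGTH)
        # Password is the last field of a record, so print the record here.
        if (match($0, /Password: [^ ]+/)) {
            pass = substr($0, RSTART, RLENGTH)
            print user "|" mail "|" pass "|" when
        }
    }' "$@"
}

# Possible usage, following the commands earlier in the thread:
#   strings * | extract_reset
```

Because "Arrival-Date:" also contains the text "Date: ", it is matched as well, but it is overwritten by the Date header of the original message before Username is reached. Bounce messages that contain no Password line print nothing.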
Perhaps your gawk was compiled without the --enable-switch option, so it does not support the switch statement.
If gawk is configured with the --enable-switch option to the configure command, then it accepts an additional
control-flow statement:
switch (expression) {
case value|regex : statement
...
[ default: statement ]
}
If gawk is configured with the --disable-directories-fatal option, then it will silently skip directories named
on the command line. Otherwise, it will do so only if invoked with the --traditional option.
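If the installed awk does not accept switch, the same branching can be written as an if / else if chain, which any awk understands. The snippet below is just an illustrative sketch: the function name classify_fields and the labels it prints are invented for this example and do not come from any script in this thread.

```shell
# Hypothetical helper: prints one label per input line, using if / else if
# in place of "switch (expression) { case regex: ... default: ... }".
classify_fields() {
    awk '
    {
        if ($0 ~ /Username: /)
            kind = "user"
        else if ($0 ~ /E-mail: /)
            kind = "mail"
        else if ($0 ~ /Password: /)
            kind = "pass"
        else
            kind = "other"    # plays the role of "default:"
        print kind
    }' "$@"
}
```

Each else if corresponds to one case, and the final else corresponds to default, so a script written with switch can usually be converted branch by branch.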