Chinaunix

Title: Another question about counting distinct items

Author: conall    Time: 2013-11-19 10:32
Title: Another question about counting distinct items
I have a file and want to count the distinct occurrences of $5 in two cases:
if(\!a[$1,$2,$3,$4,$5]++)an[$5]++  -> an[$5]
if($7~/      /&&\!a[$1,$2,$3,$4,$5]++)bn[$5]++  -> bn[$5]
nawk 'BEGIN{FS=OFS=";"}NR==FNR{if(\!a[$1,$2,$3,$4,$5]++)an[$5]++;else if($7~/      /&&\!a[$1,$2,$3,$4,$5]++)bn[$5]++;next}{print $0,an[$5],bn[$5]}' 1.txt 1.txt
Only an[$5] gets computed. Could someone more experienced correct this?
>cat 1.txt
BJRNC18;SITE;BJS0978;1970-1-01;1970-1-01 08:03;1970-1-01 08:03:13;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC09;SITE;BJS5780;2012-10-08;2012-10-08 01:31;2012-10-08 01:31:24;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC08;SITE;BJN1050;2012-10-10;2012-10-10 02:03;2012-10-10 02:03:04;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost
BJRNC11;SITE;BJN0585;2012-10-21;2012-10-21 04:53;2012-10-21 04:53:43;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC10;SITE;BJN5890;2012-10-21;2012-10-21 16:02;2012-10-21 16:02:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC09;SITE;BJS6220;2012-11-02;2012-11-02 01:09;2012-11-02 01:09:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC14;SITE;BJN0768;2012-11-04;2012-11-04 16:29;2012-11-04 16:29:56;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC25;SITE;BJS6156;2012-11-09;2012-11-09 09:40;2012-11-09 09:40:02;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:15;2013-11-01 07:15:24;2013-11-1  07:15:32;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost



Author: 关阴月飞    Time: 2013-11-19 10:40
Reply to #1 conall


    Just state the requirement directly; I can't follow your code!
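Author: conall    Time: 2013-11-19 10:41
Reply to #2 关阴月飞

Count these separately:
1. Taking $1,$2,$3,$4,$5 as the key, count how many distinct entries there are for each $5
2. The same count, restricted to rows where $7 contains spaces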

   
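Author: conall    Time: 2013-11-19 10:43
Count these separately:
1. The number of distinct occurrences of $5
2. The number of distinct occurrences of $5 when $7 contains spaces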
Author: yestreenstars    Time: 2013-11-19 11:18
My guess:
[root@localhost ~]# awk -F\; '!a[$1,$2,$3,$4,$5]++{m++;if($7~/^ *$/)n++}END{print m,n}' i
11 9
[root@localhost ~]#

Author: conall    Time: 2013-11-19 11:19
Something like this:
nawk 'BEGIN{FS=OFS=";"}NR==FNR{if(!a[$1,$2,$3,$4,$5]++)an[$5]++;next}{print $0,an[$5]}' 1.txt 1.txt  > 2.txt
nawk 'BEGIN{FS=OFS=";"}NR==FNR{if($7~/      /&&\!b[$1,$2,$3,$4,$5]++)bn[$5]++;next}{print $0,bn[$5]}' 2.txt 2.txt
I'd like to do it in a single command.
Author: yestreenstars    Time: 2013-11-19 11:29
Reply to #6 conall
Like this?
[root@localhost ~]# awk 'BEGIN{FS=OFS=";"}NR==FNR&&!a[$1,$2,$3,$4,$5]++{b[$5]++;if($7~/^ *$/)c[$5]++;next}{print $0,b[$5],c[$5]}' i i
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;1
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;
BJRNC18;SITE;BJS0978;1970-1-01;1970-1-01 08:03;1970-1-01 08:03:13;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC09;SITE;BJS5780;2012-10-08;2012-10-08 01:31;2012-10-08 01:31:24;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC08;SITE;BJN1050;2012-10-10;2012-10-10 02:03;2012-10-10 02:03:04;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost;1;1
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;1
BJRNC11;SITE;BJN0585;2012-10-21;2012-10-21 04:53;2012-10-21 04:53:43;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC10;SITE;BJN5890;2012-10-21;2012-10-21 16:02;2012-10-21 16:02:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC09;SITE;BJS6220;2012-11-02;2012-11-02 01:09;2012-11-02 01:09:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC14;SITE;BJN0768;2012-11-04;2012-11-04 16:29;2012-11-04 16:29:56;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC25;SITE;BJS6156;2012-11-09;2012-11-09 09:40;2012-11-09 09:40:02;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:15;2013-11-01 07:15:24;2013-11-1  07:15:32;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost;1;
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;
[root@localhost ~]#
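A side note on the output above: it has 15 rows, although 1.txt has 13 (assuming `i` is a copy of it). The two extra rows at the top are the rows whose key repeats an earlier one. This seems to happen because `next` lives inside the action guarded by `!a[...]++`: a first-pass row with an already-seen key fails that pattern and falls through to the print block. A minimal sketch of one way around it, with `next` applied to every first-pass row. The three sample rows and the temporary file are made up for illustration, not taken from the thread.

```shell
# Hypothetical input: fields 1-5 form the key; field 7 may be blank.
tmp=$(mktemp)
printf '%s\n' \
  'A;S;X;d1;t1;u; ;r1' \
  'A;S;X;d1;t1;u; ;r2' \
  'B;S;Y;d2;t2;u;z;r3' > "$tmp"

# Pass 1 (NR==FNR): count per $5; "next" is unconditional, so nothing prints.
# Pass 2: append both counters to every row.
out=$(awk 'BEGIN { FS = OFS = ";" }
  NR == FNR {
    if (!a[$1,$2,$3,$4,$5]++) { b[$5]++; if ($7 ~ /^ *$/) c[$5]++ }
    next
  }
  { print $0, b[$5], c[$5] }' "$tmp" "$tmp")
rm -f "$tmp"

printf '%s\n' "$out"
```

With this layout the output has exactly one row per input row, and the last column stays empty for keys whose first row has a non-blank $7.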

Author: 关阴月飞    Time: 2013-11-19 11:36
Reply to #4 conall


    Here's my guess too:
n is the number of distinct $5 values; s is the number of distinct $5 values where $7 is empty.
awk -F\; '!a[$5]++{++n} !(+$7){if(!b[$5]++)++s}END{print n,s}' 3
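For readers unfamiliar with the `!(+$7)` test: unary plus forces a numeric conversion, so a blank field becomes 0 and the negation is true, while a field starting with digits such as 2013 is non-zero. A small sketch of that behaviour follows; the three input lines are made-up stand-ins for possible $7 values. Note that non-numeric text also converts to 0, so this test treats it as "empty" too, and that this command deduplicates on $5 alone rather than on $1..$5.

```shell
# Each input line stands in for a possible $7 value (illustrative only).
flags=$(printf '%s\n' '      ' '2013-11-1  07:15:32' 'abc' |
  awk '{ printf "%s%d", (NR > 1 ? " " : ""), !(+$0) }')
echo "$flags"
```

The `/^ *$/` regex used elsewhere in this thread is stricter, since it matches only fields that are empty or made up entirely of spaces.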

Author: conall    Time: 2013-11-19 11:46
Source file:
>cat 1.txt
BJRNC18;SITE;BJS0978;1970-1-01;1970-1-01 08:03;1970-1-01 08:03:13;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC09;SITE;BJS5780;2012-10-08;2012-10-08 01:31;2012-10-08 01:31:24;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC08;SITE;BJN1050;2012-10-10;2012-10-10 02:03;2012-10-10 02:03:04;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost
BJRNC11;SITE;BJN0585;2012-10-21;2012-10-21 04:53;2012-10-21 04:53:43;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC10;SITE;BJN5890;2012-10-21;2012-10-21 16:02;2012-10-21 16:02:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC09;SITE;BJS6220;2012-11-02;2012-11-02 01:09;2012-11-02 01:09:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC14;SITE;BJN0768;2012-11-04;2012-11-04 16:29;2012-11-04 16:29:56;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC25;SITE;BJS6156;2012-11-09;2012-11-09 09:40;2012-11-09 09:40:02;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:15;2013-11-01 07:15:24;2013-11-1  07:15:32;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost
Result:
BJRNC18;SITE;BJS0978;1970-1-01;1970-1-01 08:03;1970-1-01 08:03:13;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC09;SITE;BJS5780;2012-10-08;2012-10-08 01:31;2012-10-08 01:31:24;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC08;SITE;BJN1050;2012-10-10;2012-10-10 02:03;2012-10-10 02:03:04;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost;1;1
BJRNC25;SITE;BJS0860;2012-10-19;2012-10-19 04:23;2012-10-19 04:23:53;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;1
BJRNC11;SITE;BJN0585;2012-10-21;2012-10-21 04:53;2012-10-21 04:53:43;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC10;SITE;BJN5890;2012-10-21;2012-10-21 16:02;2012-10-21 16:02:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC09;SITE;BJS6220;2012-11-02;2012-11-02 01:09;2012-11-02 01:09:23;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC14;SITE;BJN0768;2012-11-04;2012-11-04 16:29;2012-11-04 16:29:56;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC25;SITE;BJS6156;2012-11-09;2012-11-09 09:40;2012-11-09 09:40:02;                  ;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,EthernetSwitch=1,EthernetSwitchPort=6;3151;Ethernet Switch Port Fault;1;1
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:15;2013-11-01 07:15:24;2013-11-1  07:15:32;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3101;Contact to Default Router 0 Lost;1;
BJRNC30;SITE;BJN5584;2013-11-01;2013-11-01 07:23;2013-11-01 07:23:49;2013-11-1  07:23:56;Subrack=1,Slot=2,PlugInUnit=1,ExchangeTerminalIp=1,InternalEthernetPort=1,IpInterface=1;3102;Contact to Default Router 1 Lost;1;

Author: conall    Time: 2013-11-19 11:49
Reply to #7 yestreenstars


That's the correct answer.

   
Author: conall    Time: 2013-11-19 11:51
Reply to #7 yestreenstars
The correct answer:
awk 'BEGIN{FS=OFS=";"}NR==FNR&&!a[$1,$2,$3,$4,$5]++{b[$5]++;if($7~/^ *$/)c[$5]++;next}{print $0,b[$5],c[$5]}'
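I just don't understand why my version doesn't work.
When $7~/      / holds, the key still has to be checked for uniqueness with !b[$1,$2,$3,$4,$5], doesn't it?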
nawk 'BEGIN{FS=OFS=";"}NR==FNR{if(\!a[$1,$2,$3,$4,$5]++)an[$5]++;else if($7~/      /&&\!b[$1,$2,$3,$4,$5]++)bn[$5]++;next}{print $0,an[$5],bn[$5]}' 1.txt 1.txt

   
Author: conall    Time: 2013-11-19 11:59
I modified it myself:
nawk 'BEGIN{FS=OFS=";"}NR==FNR{if(\!a[$1,$2,$3,$4,$5]++)an[$5]++;if($7~/      /&&\!b[$1,$2,$3,$4,$5]++)bn[$5]++;next}{print $0,an[$5],bn[$5]}' 1.txt 1.txt
Removing the else actually works too.
1. Why does it fail when the else is there?
2. Does if($7~/^ *$/)c[$5]++ already imply the \!b[$1,$2,$3,$4,$5]++ condition? It isn't inside the braces of a first if, though.
awk 'BEGIN{FS=OFS=";"}NR==FNR&&!a[$1,$2,$3,$4,$5]++{b[$5]++;if($7~/^ *$/)c[$5]++;next}{print $0,b[$5],c[$5]}'
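On the `else` question, a likely explanation: `else if` is reached only when the first `if` is false, that is, only for rows whose key has already been seen. The bn counter is therefore never touched on a key's first row and is incremented only for keys that occur more than once. A minimal sketch with made-up data, where $1 plays the role of the five-field key and $2 the role of $7:

```shell
# Hypothetical sample: key;flag;id (a blank flag stands in for a blank $7).
data='k1; ;r1
k2; ;r2
k2; ;r3
k3;x;r4'

# With "else if": the second test is reached only for already-seen keys.
with_else=$(printf '%s\n' "$data" | awk -F';' '
  { if (!a[$1]++) an[$1]++
    else if ($2 ~ /^ *$/ && !b[$1]++) bn[$1]++ }
  END { printf "%d %d %d", bn["k1"], bn["k2"], bn["k3"] }')

# With two independent "if" statements: every row is tested for both counters.
without_else=$(printf '%s\n' "$data" | awk -F';' '
  { if (!a[$1]++) an[$1]++
    if ($2 ~ /^ *$/ && !b[$1]++) bn[$1]++ }
  END { printf "%d %d %d", bn["k1"], bn["k2"], bn["k3"] }')

echo "with else:    $with_else"
echo "without else: $without_else"
```

As for the second question: in the pattern-action form, the whole `{...}` block runs only when `NR==FNR&&!a[...]++` is true, so the $7 test is already limited to the first row of each key and no separate `b` array is needed.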
Author: conall    Time: 2013-11-19 12:03
nawk 'BEGIN{FS=OFS=";"}NR==FNR{if(\!a[$1,$2,$3,$4,$5]++){an[$5]++;if($7~/     /)bn[$5]++};next}{print $0,an[$5],bn[$5]}' 1.txt 1.txt

nawk 'BEGIN{FS=OFS=";"}NR==FNR{if(\!a[$1,$2,$3,$4,$5]++)an[$5]++;if($7~/      /&&\!b[$1,$2,$3,$4,$5]++)bn[$5]++;next}{print $0,an[$5],bn[$5]}' 1.txt 1.txt

nawk 'BEGIN{FS=OFS=";"}NR==FNR&&!a[$1,$2,$3,$4,$5]++{b[$5]++;if($7~/^ *$/)c[$5]++;next}{print $0,b[$5],c[$5]}' 1.txt 1.txt

Oops, all three forms work. Thanks, everyone.
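One caveat, offered as a sketch rather than a claim about the real data: the nested form and the two-array form agree on 1.txt, but they can differ when the first row of a key has a filled $7 and a later row with the same key has a blank one. The two rows below are invented to show that case, and `/^ *$/` is used as the blank test.

```shell
# Hypothetical rows sharing one key; only the second has a blank $7.
rows='K;S;X;d;t;u;filled;r1
K;S;X;d;t;u; ;r2'

# Nested form: $7 is tested only on the first row of each key.
nested=$(printf '%s\n' "$rows" | awk -F';' '
  { if (!a[$1,$2,$3,$4,$5]++) { an[$5]++; if ($7 ~ /^ *$/) bn[$5]++ } }
  END { printf "%d", bn["t"] }')

# Two-array form: the first blank-$7 row of each key is counted.
separate=$(printf '%s\n' "$rows" | awk -F';' '
  { if (!a[$1,$2,$3,$4,$5]++) an[$5]++
    if ($7 ~ /^ *$/ && !b[$1,$2,$3,$4,$5]++) bn[$5]++ }
  END { printf "%d", bn["t"] }')

echo "nested: $nested  separate: $separate"
```

Which behaviour is wanted depends on whether "blank $7" should be judged per key or per row.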




