#1

Posted 2010-06-12 11:38

Last edited by luck_libiao on 2010-06-12 11:43

I have the following data:
update ts_productorder set applytime5='20100612000000' where subscriberkey=50775650 and recid=3 and productorderkey5=999000000049281632;
update ts_productorder set applytime5='20100612000000' where subscriberkey=54016439 and recid=2 and productorderkey5=999000000049375891;
update ts_productorder set applytime1='20100612000000' where subscriberkey=53102121 and recid=3 and productorderkey1=999000000049347243;
update ts_productorder set applytime4='20100612000000' where subscriberkey=54022999 and recid=2 and productorderkey4=999000000049276174;
update ts_productorder set applytime2='20100612000000' where subscriberkey=55917492 and recid=3 and productorderkey2=999000000049426647;
update ts_productorder set applytime4='20100612000000' where subscriberkey=54852113 and recid=2 and productorderkey4=999000000049323308;
update ts_productorder set applytime4='20100612000000' where subscriberkey=53420560 and recid=2 and productorderkey4=999000000049330824;
update ts_productorder set applytime4='20100612000000' where subscriberkey=55434744 and recid=2 and productorderkey4=999000000049280350;

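I want to check whether the values after subscriberkey= contain duplicates, and print a count for each distinct value.

If the lines did not contain an equals sign, this would work:
awk '{ a[$8]++} END{for (b in a) {print b,a} ' file
But now there is an = sign, and I would rather not strip the = out first. Is there a way to do this?
Earlier threads all say you can use [], but when I write it like this:
awk -F "[= ]" '{ a[$8]++} END{for (b in a) {print b,a} ' file
it does not work.
awk -F "[=\ ]" '{ a[$8]++} END{for (b in a) {print b,a} ' file does not work either. I don't know what to do!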

iori809

#2

Posted 2010-06-12 11:43

Reply to #1 luck_libiao

What do you mean by statistics for identical values?
Filtering out duplicates, or something else? Please post the output you want.

99超人

#3

Posted 2010-06-12 11:45

Notice: the author is banned or deleted; the content is automatically hidden.

luck_libiao

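#4

Posted 2010-06-12 11:46

Reply to #2 iori809

The data I need looks like this:
50775650 10
54016439 5
53102121 3

That is, each value in that column and the number of times it occurs.
The real problem is the separator: the lines contain both = and spaces, and I don't know how to handle that.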
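
To see why a bracket expression can handle both characters, here is a minimal sketch that prints every field awk produces when `-F '[= ]'` is applied to one line of the data above. It assumes a POSIX awk, where a multi-character field separator is treated as a regular expression; the variable names `line` and `fields` are only for illustration.

```shell
# One line taken from the sample data in this thread.
line="update ts_productorder set applytime5='20100612000000' where subscriberkey=50775650 and recid=3 and productorderkey5=999000000049281632;"

# -F '[= ]' makes every single '=' or space a field boundary.
# Print each field together with its index to see where the value lands.
fields=$(printf '%s\n' "$line" |
  awk -F '[= ]' '{ for (i = 1; i <= NF; i++) print i, $i }')

printf '%s\n' "$fields"
```

With this separator the name `subscriberkey` becomes field 7 and its value field 8, which is why `$8` is the index used in the commands in this thread.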

lkk2003rty

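#5

Posted 2010-06-12 11:46

awk '{ a[$8]++} END{for (b in a) {print b,a[b]} }' file
This does not count how many times the value after the equals sign (as in subscriberkey=55434744) is repeated.
OP, can you show what a successful result should look like, using the data you posted?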

luck_libiao

#6

Posted 2010-06-12 11:47

Reply to #3 99超人

That works, but I want to know how to do it with a field separator.

aluoyeshi

#7

Posted 2010-06-12 11:47

It works for me:
awk -F "[= ]" '{ a[$8]++} END{for (b in a) {print b,a[b]} }' file

lkk2003rty

#8

Posted 2010-06-12 11:51

awk -F "[= ]" '{a[$8]++}END{for(cnt in a)print cnt,a[cnt]}' file

I seriously doubt that the OP's print followed by a bare array name can work. Doesn't it report an error?
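
For reference, a self-contained sketch of the bracket-expression approach with balanced braces and an indexed array element in `print`. The temporary file stands in for `file`, and the third row is a made-up duplicate so that a count above 1 is visible. On the bare array name: gawk, for one, stops with a fatal error when an array is used in a scalar context, so `print b, a[b]` is the form to use.

```shell
# Temporary file standing in for "file" from the thread.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
update ts_productorder set applytime5='20100612000000' where subscriberkey=50775650 and recid=3 and productorderkey5=999000000049281632;
update ts_productorder set applytime5='20100612000000' where subscriberkey=54016439 and recid=2 and productorderkey5=999000000049375891;
update ts_productorder set applytime1='20100612000000' where subscriberkey=50775650 and recid=2 and productorderkey1=999000000049347243;
EOF

# Split on '=' or space, count field 8, then print "value count".
# sort puts the most frequent values first (ties ordered by value).
counts=$(awk -F '[= ]' '{ a[$8]++ } END { for (b in a) print b, a[b] }' "$tmp" |
  sort -k2,2nr -k1,1)

printf '%s\n' "$counts"
rm -f "$tmp"
```

Values whose count is greater than 1 are the duplicated keys.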

bbgg1983

#9

Posted 2010-06-12 11:51

OP, the braces in the statement you wrote are not balanced.

luck_libiao

#10

Posted 2010-06-12 11:58

Reply to #8 lkk2003rty

I first stripped out the = signs:
awk -F "=" '{ print $1" "$2" "$3" "$4" "$5 }' billtype_Mem.sql > billtype_Mem.sql_1
Can this be written more simply?
It produces the following data:
update ts_productorder set applytime5 '20100612000000' where subscriberkey 50775650 and recid 3 and productorderkey5 999000000049281632;
update ts_productorder set applytime5 '20100612000000' where subscriberkey 54016439 and recid 2 and productorderkey5 999000000049375891;
update ts_productorder set applytime1 '20100612000000' where subscriberkey 53102121 and recid 3 and productorderkey1 999000000049347243;
update ts_productorder set applytime4 '20100612000000' where subscriberkey 54022999 and recid 2 and productorderkey4 999000000049276174;
update ts_productorder set applytime2 '20100612000000' where subscriberkey 55917492 and recid 3 and productorderkey2 999000000049426647;
update ts_productorder set applytime4 '20100612000000' where subscriberkey 54852113 and recid 2 and productorderkey4 999000000049323308;
update ts_productorder set applytime4 '20100612000000' where subscriberkey 53420560 and recid 2 and productorderkey4 999000000049330824;
update ts_productorder set applytime4 '20100612000000' where subscriberkey 55434744 and recid 2 and productorderkey4 999000000049280350;

Then I ran:

awk ' { a[$8]++ } END { for ( b in a ) { print b,a[b] } } ' billtype_Mem.sql_1
The result I need is:
55434744 1
56604049 1
10735231 1
54653178 1
55917492 1
11433573 1
53102121 1
56628142 1
10103998 1
53956299 1
54852113 1
55486244 1
56056673 1
56391452 1
54254460 1
50775650 1
53102118 1
53420560 1
56783193 1
54653202 1
54022999 1
54016439 1
10337832 1
50369194 1
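
On the question of whether the `=`-stripping step can be simpler: a sketch, assuming the standard `tr` utility, that replaces each `=` with a space and pipes straight into the counting step, so the intermediate `billtype_Mem.sql_1` file is not needed. The temporary file here stands in for `billtype_Mem.sql`, and its third row is a made-up duplicate.

```shell
# Temporary file standing in for billtype_Mem.sql.
src=$(mktemp)
cat > "$src" <<'EOF'
update ts_productorder set applytime5='20100612000000' where subscriberkey=50775650 and recid=3 and productorderkey5=999000000049281632;
update ts_productorder set applytime5='20100612000000' where subscriberkey=54016439 and recid=2 and productorderkey5=999000000049375891;
update ts_productorder set applytime1='20100612000000' where subscriberkey=50775650 and recid=2 and productorderkey1=999000000049347243;
EOF

# tr turns every '=' into a space; awk then sees the value as field 8
# under its default whitespace splitting.
counts=$(tr '=' ' ' < "$src" |
  awk '{ a[$8]++ } END { for (b in a) print b, a[b] }' | sort)

printf '%s\n' "$counts"
rm -f "$src"
```

Unlike printing `$1` through `$5`, this does not depend on how many `=` signs each line contains.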

Question about awk with both spaces and = as field separators