Chinaunix

Title: How do I find the distinct values of the first 4 characters and of the last 3 characters in this file?

Author: 奥巴牛 Time: 2010-08-06 15:27
For example, a TXT file contains:

APU1_gl_tab13.dbf
APU1_sys03.rdo
APU1_sys04.ksp
APU1_too02.dbf
GPU1_sys05.dbf
KSU1_sys02.dbf

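The distinct values of the first 4 characters should be shown as: APU1 GPU1 KSU1
The distinct values of the last 3 characters should be shown as: dbf rdo ksp

How should I write the script?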

Author: 好看的附件 Time: 2010-08-06 15:32
Reply to 1# 奥巴牛

I don't quite follow. Do you want the number of occurrences?

Author: welcome008 Time: 2010-08-06 15:33
cut -b1-4 filename|sort -u
awk '{print substr($0,length($0)-2)}' filename|sort -u
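
To check the substring arithmetic on the sample data, here is a minimal sketch. It assumes the six sample lines from the first post; the `sample` function is a made-up helper that only stands in for the real file. The last three characters begin at position `length($0)-2`; starting at `length($0)-3` would also pick up the dot.

```shell
# Hypothetical helper: prints the sample lines from the first post.
sample() {
cat <<'EOF'
APU1_gl_tab13.dbf
APU1_sys03.rdo
APU1_sys04.ksp
APU1_too02.dbf
GPU1_sys05.dbf
KSU1_sys02.dbf
EOF
}

# Distinct values of the first 4 characters, in sorted order.
prefixes=$(sample | cut -c1-4 | sort -u)

# Distinct values of the last 3 characters: substr() with no length
# argument runs to the end of the line, starting 2 before the last character.
suffixes=$(sample | awk '{print substr($0, length($0)-2)}' | sort -u)

printf '%s\n' "$prefixes"
printf '%s\n' "$suffixes"
```

Because of `sort -u`, the values come out in sorted order rather than in order of first appearance.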

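Author: 奥巴牛 Time: 2010-08-06 15:37

Reply to 奥巴牛

I don't quite follow. Do you want the number of occurrences?
Posted by 好看的附件 on 2010-08-06 15:32

No, I want to find out how many distinct values the first 4 characters and the last 3 characters take.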
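
If the number of distinct values is wanted as well as the values themselves, one option is to count the lines that `sort -u` emits. This is a minimal sketch, assuming the six sample lines from the first post; the `sample` function is a made-up stand-in for the real file.

```shell
# Hypothetical helper: prints the sample lines from the first post.
sample() {
cat <<'EOF'
APU1_gl_tab13.dbf
APU1_sys03.rdo
APU1_sys04.ksp
APU1_too02.dbf
GPU1_sys05.dbf
KSU1_sys02.dbf
EOF
}

# sort -u emits one line per distinct value, so wc -l is the count.
# tr strips the padding that some wc implementations add.
prefix_count=$(sample | cut -c1-4 | sort -u | wc -l | tr -d ' ')
suffix_count=$(sample | awk '{print substr($0, length($0)-2)}' | sort -u | wc -l | tr -d ' ')

echo "distinct prefixes: $prefix_count"
echo "distinct suffixes: $suffix_count"
```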

Author: 好看的附件 Time: 2010-08-06 15:42
Reply to 4# 奥巴牛

awk -F\_ '{print $1}' filename|sort -u
awk -F\. '{print $2}' filename|sort -u

Author: nomyself Time: 2010-08-06 15:43
This post was last edited by nomyself on 2010-08-06 15:47

perl -lne 'print "$1\t$2" if /([A-Z0-9]{4}).*\.([a-z]{3})/' txt
while read a; do echo -e "${a:0:4}\t${a:1-4}"; done<txt


Author: Shell_HAT Time: 2010-08-06 16:46
Reply to 6# nomyself

Perhaps you misread the original poster's question?

Author: Shell_HAT Time: 2010-08-06 16:49

awk -F_ '{if(a[$1]==0)print $1;a[$1]++}' urfile


awk -F. '{if(a[$2]==0)print $2;a[$2]++}' urfile


Author: ywlscpl Time: 2010-08-07 18:22
For deduplication, use !a[$x]++; it is the classic idiom.

awk -F_ '!a[$1]++{print $1}' file
awk -F. '!a[$2]++{print $2}' file
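
The idiom works because `a[key]++` yields the old count (0, which is false, the first time a key is seen) and then increments it, so `!a[key]++` is true only on the first occurrence. Below is a minimal sketch on the sample data from the first post. It assumes the prefix is everything before the first underscore and the extension is whatever follows the last dot (hence `$NF` rather than `$2`); the `sample` function and the array name `seen` are made up for illustration.

```shell
# Hypothetical helper: prints the sample lines from the first post.
sample() {
cat <<'EOF'
APU1_gl_tab13.dbf
APU1_sys03.rdo
APU1_sys04.ksp
APU1_too02.dbf
GPU1_sys05.dbf
KSU1_sys02.dbf
EOF
}

# Print each prefix only the first time it appears.
prefixes=$(sample | awk -F_ '!seen[$1]++ {print $1}')

# Print each extension (last dot-separated field) only the first time it appears.
suffixes=$(sample | awk -F. '!seen[$NF]++ {print $NF}')

printf '%s\n' "$prefixes"
printf '%s\n' "$suffixes"
```

Unlike `sort -u`, this keeps the order of first appearance, so the extensions come out as dbf, rdo, ksp, matching the order in the original question.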

Author: shileiadmin Time: 2010-08-08 11:38
Reply to 1# 奥巴牛

Like this?

[root@localhost lx]# sed 's/_.*\./ /' urfile
APU1 dbf
APU1 rdo
APU1 ksp
APU1 dbf
GPU1 dbf
KSU1 dbf


Welcome to Chinaunix (http://bbs.chinaunix.net/)