Chinaunix

Title: Merge lines and insert commas

Author: 紫风8824    Time: 2014-07-09 11:05
Last edited by 紫风8824 on 2014-07-09 11:13

Asking for help. Input:
>Potri.010G1602 969 99996 104 specific cd04300 GT1_Glycogen_Phosphorylase 0 1224.62
>Potri.010G1602 969 263942 104 superfamily cl10013 Glycosyltransferase_GTB_type 0 1224.62
>Potri.010G1602 969 233722 107 multidom TIGR02093 P_ylase 0 1111.18
>Potri.010G1602 973 99996 104 specific cd04300 GT1_Glycogen_Phosphorylase 0 1218.85
>Potri.010G1602 973 263942 104 superfamily cl10013 Glycosyltransferase_GTB_type 0 1218.85
>Potri.010G1602 973 233722 107 multidom TIGR02093 P_ylase 0 1105.79
Desired output:
>Potri.010G1602 969 99996,104,specific,cd04300,,,GT1_Glycogen_Phosphorylase,0,1224.62 263942,104,superfamily,cl10013,,,Glycosyltransferase_GTB_type,0,1224.62 233722,107,multidom,TIGR02093,,,P_ylase,0,1111.18
>Potri.010G1602 973 99996,104,specific,cd04300,,,GT1_Glycogen_Phosphorylase,0,1218.85 263942,104,superfamily,cl10013,,,Glycosyltransferase_GTB_type,0,1218.85 233722,107,multidom,TIGR02093,,,P_ylase,0,1105.79
Any pointers from the experts would be appreciated. Thanks!
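Author: Herowinter    Time: 2014-07-09 11:15
Last edited by Herowinter on 2014-07-09 11:16

Reply to #1 紫风8824
Is the first column fixed? Should rows be merged when both column 1 and column 2 match, or when column 2 alone matches?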

   
Author: Kasiotao    Time: 2014-07-09 11:28
Last edited by Kasiotao on 2014-07-09 11:28
  awk -v flag=0 '{if($2!=flag){if(NR!=1){printf "\n"}printf "%s %s",$1,$2;flag=$2}printf " %s,%s,%s,%s,,,%s,%s,%s",$3,$4,$5,$6,$7,$8,$9}END{if(NR)printf "\n"}' testfile
Try this and see whether it gives the result you want.
Author: Herowinter    Time: 2014-07-09 11:29
Reply to #1 紫风8824
  awk '{s=$3;for(i=4;i<=NF;i++){if(i!=7)s=s","$i;else s=s",,,"$i};a[$1" "$2]=length(a[$1" "$2])?a[$1" "$2]" "s:s} END{for(i in a)print i,a[i]|"sort -nk2,2"}' i
  >Potri.010G1602 969 99996,104,specific,cd04300,,,GT1_Glycogen_Phosphorylase,0,1224.62 263942,104,superfamily,cl10013,,,Glycosyltransferase_GTB_type,0,1224.62 233722,107,multidom,TIGR02093,,,P_ylase,0,1111.18
  >Potri.010G1602 973 99996,104,specific,cd04300,,,GT1_Glycogen_Phosphorylase,0,1218.85 263942,104,superfamily,cl10013,,,Glycosyltransferase_GTB_type,0,1218.85 233722,107,multidom,TIGR02093,,,P_ylase,0,1105.79
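The array-based command above pipes its result through sort because the iteration order of `for (i in a)` is not specified in awk. A minimal sketch of the same idea that keeps groups in the order they first appear, so no sort is needed and the rows do not have to be adjacent. The function name `merge_grouped` and the array names `rows` and `order` are made up for illustration and do not come from the thread.

```shell
#!/bin/sh
# merge_grouped: hypothetical helper, not from the thread.
# Groups rows on columns 1 and 2 and prints the groups in first-seen order.
merge_grouped() {
    awk '
    {
        key = $1 " " $2
        # Build "c3,c4,c5,c6,,,c7,...": the extra commas go before column 7.
        s = $3
        for (i = 4; i <= NF; i++)
            s = s (i == 7 ? ",,," : ",") $i
        if (!(key in rows)) {
            order[++n] = key            # remember when the key first appeared
            rows[key] = s
        } else {
            rows[key] = rows[key] " " s
        }
    }
    END {
        for (j = 1; j <= n; j++)
            print order[j], rows[order[j]]
    }
    ' "$@"
}

# Usage: merge_grouped testfile    (or pipe rows in on stdin)
```

The trade-off is memory: every merged group is held until END, whereas a streaming version can print as it goes.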

Author: huang6894    Time: 2014-07-09 11:29
Reply to #1 紫风8824
  awk -vOFS=',' '{for(i=1;i++<=NF;)a[$1","$2]=a[$1","$2]","$i}END{for(b in a){sub(/,$/,"",a[b]);print b"\n"a[b]}}' hh
  >Potri.010G1602,973
  ,973,99996,104,specific,cd04300,GT1_Glycogen_Phosphorylase,0,1218.85,,973,263942,104,superfamily,cl10013,Glycosyltransferase_GTB_type,0,1218.85,,973,233722,107,multidom,TIGR02093,P_ylase,0,1105.79
  >Potri.010G1602,969
  ,969,99996,104,specific,cd04300,GT1_Glycogen_Phosphorylase,0,1224.62,,969,263942,104,superfamily,cl10013,Glycosyltransferase_GTB_type,0,1224.62,,969,233722,107,multidom,TIGR02093,P_ylase,0,1111.18
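The repeated column 2 and the doubled commas in the output above appear to come from the loop header `for(i=1;i++<=NF;)`: the increment happens inside the condition, so the body first runs with i=2 and runs one last time with i=NF+1, which is an empty field. A small sketch that isolates just that behaviour; the two function names are invented for illustration.

```shell
#!/bin/sh
# Loop form from the reply: i is incremented before the body runs.
loop_as_posted() {
    awk '{ for (i = 1; i++ <= NF;) printf "[%s]", $i; print "" }'
}

# Conventional form: visits $1..$NF exactly once each.
loop_conventional() {
    awk '{ for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }'
}

echo 'a b c' | loop_as_posted       # prints [b][c][]
echo 'a b c' | loop_conventional    # prints [a][b][c]
```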

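Author: 用户名注册后不能更改    Time: 2014-07-09 12:39
Reply to #4 Herowinter

I couldn't quite follow the logic behind the OP's three commas. Then I saw that your result was correct, and then I looked at the command...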
Author: 紫风8824    Time: 2014-07-09 15:18
Reply to #3 Kasiotao
Thanks~ That works.
Author: 紫风8824    Time: 2014-07-09 15:19
Reply to #2 Herowinter
Rows should be merged only when both the first and the second column match.
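With that condition settled, here is a minimal streaming sketch that compares both columns rather than column 2 alone. It assumes rows sharing the same two columns are already adjacent in the file, as they are in the sample, and that every row has nine columns. The function name `merge_adjacent` is made up for illustration.

```shell
#!/bin/sh
# merge_adjacent: hypothetical helper, not from the thread.
# Assumes rows with the same columns 1 and 2 sit next to each other.
merge_adjacent() {
    awk '
    {
        key = $1 " " $2                   # group on both columns
        if (NR == 1 || key != prev) {     # a new group starts here
            if (NR > 1) printf "\n"       # close the previous group
            printf "%s", key
            prev = key
        }
        # Columns 3 to 6, then ",,," before column 7, then columns 7 to 9.
        printf " %s,%s,%s,%s,,,%s,%s,%s", $3, $4, $5, $6, $7, $8, $9
    }
    END { if (NR > 0) printf "\n" }
    ' "$@"
}

# Usage with two sample rows from the question
printf '%s\n' \
    '>Potri.010G1602 969 99996 104 specific cd04300 GT1_Glycogen_Phosphorylase 0 1224.62' \
    '>Potri.010G1602 969 233722 107 multidom TIGR02093 P_ylase 0 1111.18' |
    merge_adjacent
```

Because nothing is stored per group, this runs in constant memory, but unsorted input would produce the same key on several output lines.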


   
Author: yestreenstars    Time: 2014-07-09 15:32
  sed -r ':1;N;s/^(\S+\s+\S+\b)(.*)\n\1(.*)/\1\2\3/;t1;P;D'

Author: zxy877298415    Time: 2014-07-09 17:20
Reply to #9 yestreenstars

Master, the OP also needs the three commas.


   
Author: Herowinter    Time: 2014-07-09 17:22
Reply to #10 zxy877298415
What a nasty requirement. I had to count several times before I was sure it was $7...

   
Author: yestreenstars    Time: 2014-07-09 18:03
Reply to #10 zxy877298415

What do you mean?
   
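Author: 用户名注册后不能更改    Time: 2014-07-09 19:48
Reply to #12 yestreenstars

Look at the text after the desired-output heading in the first post: for no obvious reason there are three consecutive commas...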
Author: yestreenstars    Time: 2014-07-09 20:41
Reply to #13 用户名注册后不能更改

OK, I see it now. Too lazy to change it~
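For anyone who wants the sed route anyway, one possible way to get the commas is to rewrite each row first and then run the merge command from #9 unchanged. This is only a sketch: it assumes GNU sed (for `-r`, `\S`, `\s` and `\b`), exactly nine columns separated by single spaces, and adjacent rows per group. The function name `sed_merge` is invented. Note also that the backreference `\1` is not anchored at its end, so a key ending in `1` would also match a following row whose second column is `10`.

```shell
#!/bin/sh
# sed_merge: hypothetical helper, not from the thread. Requires GNU sed.
sed_merge() {
    # Step 1: turn columns 3 to 9 of every row into the comma form,
    #         with ",,," inserted before column 7.
    # Step 2: the merge command from the earlier reply, unchanged.
    sed -r 's/^(\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+)$/\1 \2 \3,\4,\5,\6,,,\7,\8,\9/' "$@" |
        sed -r ':1;N;s/^(\S+\s+\S+\b)(.*)\n\1(.*)/\1\2\3/;t1;P;D'
}

# Usage: sed_merge testfile    (or pipe rows in on stdin)
```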
   
Author: 丿妖月    Time: 2014-07-10 11:13
tr "\n" ",,,"



