Last edited by iamline on 2014-02-14 13:26
I have a file like this:
- g1.t1 gi|6322029 beta-fructofuranosidase SUC2 100%
- g1.t1 gi|151943004 invertase 100%
- g1.t1 gi|190406374 invertase 2 external form precursor 100%
- g1.t1 gi|259147091 Suc2p 100%
- g1.t1 gi|256270229 Suc2p 100%
- g2.t1 gi|8980428 cytochrome oxidase subunit II 100%
- g2.t1 gi|50812098 cytochrome c oxidase subunit 2 100%
- g2.t1 gi|491103312 putative cytochrome c oxydase, subunit 2 (mitochondrion) 99%
- g2.t1 gi|57790532 putative cytochrome c oxydase, subunit 2 99%
- g3.t1 gi|50304021 hypothetical protein 97%
- g3.t1 gi|367009920 hypothetical protein TDEL_0B01210 98%
- g3.t1 gi|50294508 hypothetical protein 98%
- g3.t1 gi|410074219 hypothetical protein KAFR_0A01180 97%
- g3.t1 gi|365987047 hypothetical protein NDAI_0E02950 97%
- g4.t1 gi|50312033 hypothetical protein 92%
- g4.t1 gi|410080105 hypothetical protein KAFR_0E03460 79%
- g4.t1 gi|254581928 ZYRO0D11858p 77%
- g4.t1 gi|410080103 hypothetical protein KAFR_0E03450 78%
- g4.t1 gi|444319104 hypothetical protein TBLA_0D01820 79%
- g5.t1 gi|1730047 RecName: Full=Hexose transporter 2 permease 76%
- g6.t1 gi|1730047 RecName: Full=Hexose transporter 2 permease 91%
- g6.t1 gi|1730047 RecName: Full=Hexose transporter 2 permease 91%
- g6.t1 gi|255714649 KLTH0E02772p 89%
- g6.t1 gi|50284831 hypothetical protein 89%
- g6.t1 gi|50284831 hypothetical protein 88%
- g7.t1 gi|42521636 beta-D-galactosidase 86%
- g7.t1 gi|50304489 hypothetical protein 86%
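I want to process it as follows. Use $1 as the key (a key usually has several lines). Among the lines that share a key, look at $3 (the description column). If $3 of the first line is not "hypothetical protein", output that first line and discard the rest. If it is "hypothetical protein", move on to the second line, the third line, and so on, and output the first line whose $3 is not "hypothetical protein". If every line for the key is "hypothetical protein", still output the first line.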
The expected result is:
- g1.t1 gi|6322029 beta-fructofuranosidase SUC2 100%
- g2.t1 gi|8980428 cytochrome oxidase subunit II 100%
- g3.t1 gi|50304021 hypothetical protein 97%
- g4.t1 gi|254581928 ZYRO0D11858p 77%
- g5.t1 gi|1730047 RecName: Full=Hexose transporter 2 permease 76%
- g6.t1 gi|1730047 RecName: Full=Hexose transporter 2 permease 91%
- g7.t1 gi|42521636 beta-D-galactosidase 86%
Looking forward to any pointers from the experts!
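One possible way to do this, as a minimal sketch in Python rather than awk. It rests on a few assumptions that the post does not state: the key is the first whitespace-delimited field of each line, the leading "- " of the listing is not part of the file, and a line counts as hypothetical if the text "hypothetical protein" appears anywhere in it. The last assumption is inferred from the expected result, where descriptions such as "hypothetical protein KAFR_0E03460" are skipped for g4.t1. The function name `pick_annotations` and its `placeholder` parameter are made up for illustration.

```python
def pick_annotations(lines, placeholder="hypothetical protein"):
    """Return one line per key (first field), in order of first appearance.

    For each key, the first line that does not contain `placeholder` is kept.
    If every line for the key contains it, the key's first line is kept.
    """
    first = {}  # key -> first line seen for that key
    best = {}   # key -> first line seen that lacks the placeholder
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        key = line.split(None, 1)[0]
        first.setdefault(key, line)
        if key not in best and placeholder not in line:
            best[key] = line
    # dicts keep insertion order, so keys come out in input order
    return [best.get(key, first[key]) for key in first]
```

To use it on a file, iterate over the open file and print each returned line, for example `for line in pick_annotations(open("hits.txt")): print(line)`, where `hits.txt` is a hypothetical file name. The sketch does not require the lines of one key to be adjacent, although they are in the sample.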