file1 is a CSV file:
fam1,chr1,16950470,0.833266667007005
fam1,chr1,76190271,0.684558404931228
fam1,chr1,103427994,0.48393638491905
fam1,chr1,111863039,0.980521031104692
file2 is a VCF file, tab-separated:
chr1 16877405 . G A ...
chr1 16910063 rs3872319 C T ...
chr1 16950470 rs12144467 C T ...
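If a value in the third column of file1 also appears in the second column of file2, I want to print the whole matching line from file2.
For example, 16950470 exists in file2, so the output should include chr1 16950470 rs12144467 C T ...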
The code I wrote is: awk 'NR==FNR{FS=","}{a[$3]=0}NR>FNR{FS="\t"}{if ($2 in a){print $0}}' file1 file2
However, the result always leaves out the first match and prints that line of file1 instead, as shown below:
fam1,chr1,15518588,0.913004409459763
chr1 16950470 rs12144467 C T ...
chr1 76190271 . C T ...
chr1 103427994 rs4353117 A T ...
Could anyone help explain this?
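A likely explanation, offered as a sketch rather than a definitive diagnosis: awk splits a record into fields when it reads the record, so assigning FS inside an action only affects the records that follow. The first line of file1 is therefore still split on whitespace, which leaves both $2 and $3 empty. The rule {a[$3]=0} stores an empty key instead of the real one, and because the {if ($2 in a) ...} rule has no pattern it also runs on file1's lines, where the empty $2 matches that empty key and the line is printed. One way around this is to set FS per file on the command line and to restrict each rule to its own file. The function name filter_vcf below is a hypothetical wrapper introduced only for illustration.

```shell
# filter_vcf is a hypothetical helper name, not something from the original post.
# Usage: filter_vcf file1 file2
filter_vcf() {
    # FS=... operands are applied just before the file that follows them is read,
    # so the first record of each file is already split with the right separator.
    awk '
        NR == FNR { keys[$3]; next }   # first file: remember column 3, skip the rest
        $2 in keys                     # second file: print lines whose column 2 was seen
    ' FS=',' "$1" FS='\t' "$2"
}
```

Called as filter_vcf file1 file2, this should print only lines from file2. Note that the NR == FNR test assumes file1 is not empty.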