Last edited by iamline on 2012-10-30 14:21
I have information for 3 samples, as shown below:
more file1.list
xyn1 deletion NC_001133.9:1-1200 1200 0.0921954
xyn1 deletion NC_001133.9:13701-22400 8700 0.152098
xyn1 deletion NC_001133.9:23101-29500 6400 0.282957
xyn1 deletion NC_001133.9:160301-166200 5900 0.0557053
xyn1 deletion NC_001133.9:190101-221800 31700 0.35545
xyn1 duplication NC_001133.9:223101-230300 7200 0.15544
xyn1 deletion NC_001134.8:1-8400 8400 0.260659
xyn1 duplication NC_001134.8:9001-9600 600 0.296478
xyn1 deletion NC_001134.8:29601-36400 6800 0.0416205
xyn1 deletion NC_001134.8:222601-224900 2300 0.173108
more file2.list
xyn6 deletion NC_001133.9:1-1200 1200 0.155348
xyn6 deletion NC_001133.9:13701-29900 16200 0.249624
xyn6 deletion NC_001133.9:160201-166200 6000 0.0870661
xyn6 deletion NC_001133.9:187101-187800 700 0.261129
xyn6 deletion NC_001133.9:198501-221800 23300 0.209163
xyn6 duplication NC_001133.9:223101-230300 7200 0.17344
xyn6 deletion NC_001134.8:1-6800 6800 0.257799
xyn6 deletion NC_001134.8:29601-36300 6700 0.0334622
xyn6 deletion NC_001134.8:222601-224500 1900 0.263614
xyn6 deletion NC_001134.8:259701-263200 3500 0.246482
xyn6 deletion NC_001134.8:643701-644100 400 0.288723
xyn6 duplication NC_001134.8:801701-804100 2400 0.432693
xyn6 deletion NC_001134.8:808201-813200 5000 0.319271
more file3.list
xyq8 duplication NC_001133.9:2101-12200 10100 2.5883
xyq8 deletion NC_001133.9:12201-27800 15600 0.0382995
xyq8 deletion NC_001133.9:29001-29600 600 0.232672
xyq8 duplication NC_001133.9:29601-67900 38300 1.36872
xyq8 duplication NC_001133.9:78301-107900 29600 1.52689
xyq8 duplication NC_001133.9:116201-127700 11500 1.54793
xyq8 duplication NC_001133.9:133401-160200 26800 1.49694
xyq8 deletion NC_001133.9:160201-166200 6000 0.0959008
xyq8 duplication NC_001133.9:166201-183800 17600 1.55304
xyq8 deletion NC_001133.9:183801-184900 1100 0.265608
xyq8 duplication NC_001133.9:184901-187100 2200 1.78619
xyq8 deletion NC_001133.9:190101-193700 3600 0.682606
xyq8 duplication NC_001133.9:193701-198200 4500 1.41234
xyq8 deletion NC_001133.9:198201-202800 4600 0.0975082
xyq8 deletion NC_001133.9:204001-206400 2400 0.215752
xyq8 deletion NC_001133.9:208001-230300 22300 0.271109
xyq8 deletion NC_001134.8:3801-6900 3100 0.251818
xyq8 deletion NC_001134.8:29601-35600 6000 0.319272
xyq8 deletion NC_001134.8:259701-262600 2900 0.139611
xyq8 deletion NC_001134.8:427801-429800 2000 0.508001
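Columns are separated by \t. $1 is the sample name, $2 is the sample's feature type, and $3 is the position of that feature. $3 has two parts joined by ":": the part before the colon is the ID of the sequence the feature lies on, and the part after the colon is the feature's start and end positions on that sequence. $4 is the length of the feature and $5 is a score.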
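To make the column layout concrete, here is a minimal Python sketch that splits one line into those fields. The names `Record` and `parse_line` are made up for illustration, and it assumes every line has exactly five whitespace-separated columns with $3 in the form `seq:start-end`.

```python
from typing import NamedTuple


class Record(NamedTuple):
    sample: str   # $1, sample name
    kind: str     # $2, feature type such as "deletion" or "duplication"
    seq: str      # $3, part before the colon (sequence ID)
    start: int    # $3, start position after the colon
    end: int      # $3, end position after the colon
    length: int   # $4, feature length
    score: float  # $5, score


def parse_line(line: str) -> Record:
    """Split one line of a *.list file into typed fields."""
    sample, kind, location, length, score = line.split()
    seq, _, span = location.partition(":")
    start, _, end = span.partition("-")
    return Record(sample, kind, seq, int(start), int(end), int(length), float(score))


rec = parse_line("xyn1\tdeletion\tNC_001133.9:1-1200\t1200\t0.0921954")
print(rec)
```

In the sample data $4 equals end - start + 1, which suggests the coordinates are inclusive on both ends.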
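I want to compare these 3 sample files line by line, using $2 and $3 as the join keys. Two records are only comparable when $2 and the part of $3 before the colon are identical. If the position ranges after the colon overlap at all, the samples are considered to share a similar region and the records should be output.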
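The matching rule can be written as a small predicate. This is only a sketch: `Feature`, `comparable` and `overlaps` are hypothetical names, and it assumes start and end are both inclusive, so two ranges overlap when each one starts no later than the other ends.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    kind: str   # $2
    seq: str    # $3 before the colon
    start: int  # $3 after the colon, inclusive
    end: int    # $3 after the colon, inclusive


def comparable(a: Feature, b: Feature) -> bool:
    """Same feature type and same sequence ID."""
    return a.kind == b.kind and a.seq == b.seq


def overlaps(a: Feature, b: Feature) -> bool:
    """Comparable, and the inclusive ranges share at least one position."""
    return comparable(a, b) and a.start <= b.end and b.start <= a.end


xyn1_del = Feature("deletion", "NC_001133.9", 13701, 22400)
xyn6_del = Feature("deletion", "NC_001133.9", 13701, 29900)
xyq8_dup = Feature("duplication", "NC_001133.9", 2101, 12200)
other_seq = Feature("deletion", "NC_001134.8", 13701, 22400)

print(overlaps(xyn1_del, xyn6_del))   # same type, same sequence, ranges intersect
print(overlaps(xyn1_del, xyq8_dup))   # different type
print(overlaps(xyn1_del, other_seq))  # different sequence
```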
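Compare every line of the 3 samples (already sorted by $3) and write the matching records onto the same output line, in the order xyn1, xyn6, xyq8. Only the last 4 columns of each sample need to be output; if a sample has nothing comparable, its position on that line is left empty.
For example, comparing the first 2 lines above gives the following result: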
xyn1 xyn6 xyq8
deletion NC_001133.9:1-1200 1200 0.0921954 deletion NC_001133.9:1-1200 1200 0.155348
duplication NC_001133.9:2101-12200 10100 2.5883
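One possible way to produce this kind of table is sketched below in Python. It is not the only reading of the requirement, and it makes several assumptions that the post does not state: records are grouped by (sequence ID, $2); within a group, records whose ranges chain together by overlap are put on one output line; several records of the same sample on one line are joined with "; "; output cells are separated by tabs. The names `SAMPLES`, `parse` and `merge` are made up for illustration.

```python
from collections import defaultdict

SAMPLES = ["xyn1", "xyn6", "xyq8"]  # output column order


def parse(line):
    """Return (sample, kind, seq, start, end, text of the last 4 columns)."""
    sample, kind, loc, length, score = line.split()
    seq, span = loc.split(":")
    start, end = map(int, span.split("-"))
    return sample, kind, seq, start, end, " ".join([kind, loc, length, score])


def merge(lines, samples=SAMPLES):
    # Group records that are comparable: same sequence ID and same $2.
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        sample, kind, seq, start, end, text = parse(line)
        groups[(seq, kind)].append((start, end, sample, text))

    rows = []  # each row: (seq, start of cluster, {sample: [texts]})
    for key in sorted(groups):
        cluster_end = None
        for start, end, sample, text in sorted(groups[key]):
            if cluster_end is None or start > cluster_end:
                # No overlap with the current cluster: start a new output line.
                rows.append((key[0], start, {s: [] for s in samples}))
                cluster_end = end
            else:
                # Overlaps the cluster (assumption: overlap is chained).
                cluster_end = max(cluster_end, end)
            rows[-1][2][sample].append(text)

    rows.sort(key=lambda r: (r[0], r[1]))  # by sequence, then position
    return ["\t".join("; ".join(cells[s]) for s in samples) for _, _, cells in rows]


demo = [
    "xyn1\tdeletion\tNC_001133.9:1-1200\t1200\t0.0921954",
    "xyn6\tdeletion\tNC_001133.9:1-1200\t1200\t0.155348",
    "xyq8\tduplication\tNC_001133.9:2101-12200\t10100\t2.5883",
]
out = merge(demo)
print("\t".join(SAMPLES))
for row in out:
    print(row)
```

To run it on the real data, the lines of file1.list, file2.list and file3.list can be read and passed to `merge` as a single list. Note that chaining means two records can land on the same line without overlapping each other directly, as long as both overlap a third record; if that is not wanted, the grouping step needs a stricter rule.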
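The description is rather long-winded, and it really is hard to describe. Could someone please take a look? If anything is unclear I will add more details. Thanks!!!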