This post was last edited by aids260 on 2011-03-27 00:59
- BGIBMGA000058-TA nscaf1071 1083932 1085028
- BGIBMGA000059-TA nscaf1071 1144201 1144993
- BGIBMGA000060-TA nscaf1071 1194454 1203962
- BGIBMGA000061-TA nscaf1071 1233907 1234573
- BGIBMGA000062-TA nscaf1071 1239819 1245122
- BGIBMGA000063-TA nscaf1071 1293757 1306270
- BGIBMGA000064-TA nscaf1071 1313232 1316001
- BGIBMGA000065-TA nscaf1087 2451 9117
- BGIBMGA000066-TA nscaf109 1549 4332
- BGIBMGA000067-TA nscaf1108 3401547 3415278
- BGIBMGA000068-TA nscaf1108 2724200 2728350
- BGIBMGA000069-TA nscaf1108 2546311 2574462
- BGIBMGA000070-TA nscaf1108 2533757 2534560
- BGIBMGA000071-TA nscaf1108 2527802 2528488
- BGIBMGA000072-TA nscaf1108 2442934 2452908
- BGIBMGA000073-TA nscaf1108 2382927 2390096
- BGIBMGA000074-TA nscaf1108 2370811 2374956
- BGIBMGA000075-TA nscaf1108 2350982 2353168
- BGIBMGA000076-TA nscaf1108 2305769 2312842
- BGIBMGA000077-TA nscaf1108 2238841 2239515
- BGIBMGA000078-TA nscaf1108 2207199 2212812
- BGIBMGA000079-TA nscaf1108 2140463 2140972

- + BGIBMGA000062-TA FWDP30_FL5_P09.seq nscaf1071 1318628 1239395 1240559
- + BGIBMGA000062-TA MFBP02_F_H18.seq nscaf1071 1318628 1239668 1243526
- + BGIBMGA000064-TA fdpeP10_F_J11.seq nscaf1071 1318628 1313192 1316067
- + BGIBMGA000064-TA MFBP04_F_N06.seq nscaf1071 1318628 1313197 1316106
- + BGIBMGA000128-TA FWDP26_FL5_O06.seq nscaf1108 3459965 1275912 1278501
- + BGIBMGA000129-TA FWDP02_FL5_F17.seq nscaf1108 3459965 1283141 1319375
- + BGIBMGA000129-TA FWDP02_FL5_O19.seq nscaf1108 3459965 1282859 1319327
- + BGIBMGA000129-TA FWDP10_FL5_A23.seq nscaf1108 3459965 1283141 1319395
- + BGIBMGA000129-TA FWDP31_FL5_I15.seq nscaf1108 3459965 1283115 1319315
- + BGIBMGA000129-TA MFBP15_F_I08.seq nscaf1108 3459965 1283141 1319403
- + BGIBMGA000140-TA FWDP04_FL5_M21.seq nscaf1108 3459965 1773552 1776967
- + BGIBMGA000140-TA FWDP25_FL5_J23.seq nscaf1108 3459965 1785487 1791348
- + BGIBMGA000154-TA FWDP03_FL5_B14.seq nscaf1108 3459965 2355076 2357513
- + BGIBMGA000154-TA fdpeP09_F_E13.seq nscaf1108 3459965 2355077 2357478
- + BGIBMGA000155-TA fdpeP13_F_N09.seq nscaf1108 3459965 2429759 2436013
- + BGIBMGA000201-TA MFBP02_F_D11.seq nscaf1299 357319 349886 350726
- + BGIBMGA000202-TA MFBP13_F_J17.seq nscaf1375 11050 7946 8756
- + BGIBMGA000360-TA fdpeP16_F_J16.seq nscaf1681 5888100 45406 46743
- + BGIBMGA000360-TA fdpeP07_F_A13.seq nscaf1681 5888100 45375 46693
- + BGIBMGA000362-TA fdpeP08_F_F04.seq nscaf1681 5888100 90916 94336
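First, look at the values in the second file:
BGIBMGA000062-TA
BGIBMGA000062-TA The second column is the gene name. When rows share the same gene name, compute the mean of their sixth-column values.
That is, group the rows by the gene name in the second column and take the mean of the sixth column within each group.
The first file also holds some information about the same gene: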
BGIBMGA000062-TA nscaf1071 1239819 1245122
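After computing that mean, subtract it from the third-column value in the first file, and output the difference as an extra column.
While doing so, also count how many times each gene name occurs in the second file, and output the count as another extra column.
The output should look like this: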
+ BGIBMGA000062-TA FWDP30_FL5_P09.seq nscaf1071 1318628 1239395 1240559
+ BGIBMGA000062-TA MFBP02_F_H18.seq nscaf1071 1318628 1239668 1243526
+ BGIBMGA000062-TA nscaf1071 [mean of the two values above] [difference] [occurrences of the gene name (2)]
+ BGIBMGA000064-TA fdpeP10_F_J11.seq nscaf1071 1318628 1313192 1316067
+ BGIBMGA000064-TA MFBP04_F_N06.seq nscaf1071 1318628 1313197 1316106
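+ BGIBMGA000064-TA nscaf1071 [mean of the two values above] [difference] [occurrences of the gene name (2)]
Or output only the summary lines directly:
+ BGIBMGA000062-TA nscaf1071 [mean of the two values above] [difference] [occurrences (2)]
+ BGIBMGA000064-TA nscaf1071 [mean of the two values above] [difference] [occurrences (2)]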
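For reference, the calculation described above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: fields are whitespace-separated, the leading "- " in the listings is a forum list marker rather than data, and the names `summarize`, `FILE1` and `FILE2` are made up for the example. It produces the second (summary-only) output format.

```python
from collections import defaultdict

# Sample rows copied from the post (forum list markers removed).
FILE1 = """\
BGIBMGA000062-TA nscaf1071 1239819 1245122
BGIBMGA000064-TA nscaf1071 1313232 1316001
"""

FILE2 = """\
+ BGIBMGA000062-TA FWDP30_FL5_P09.seq nscaf1071 1318628 1239395 1240559
+ BGIBMGA000062-TA MFBP02_F_H18.seq nscaf1071 1318628 1239668 1243526
+ BGIBMGA000064-TA fdpeP10_F_J11.seq nscaf1071 1318628 1313192 1316067
+ BGIBMGA000064-TA MFBP04_F_N06.seq nscaf1071 1318628 1313197 1316106
"""


def summarize(file1_lines, file2_lines):
    """Return (gene, scaffold, mean, difference, count) for each gene."""
    # File 1: gene name (column 1) -> (scaffold, column-3 value).
    gene_info = {}
    for line in file1_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        gene_info[fields[0]] = (fields[1], int(fields[2]))

    # File 2: gene name (column 2) -> list of column-6 values.
    starts = defaultdict(list)
    for line in file2_lines:
        fields = line.split()
        if len(fields) < 7:
            continue
        starts[fields[1]].append(int(fields[5]))

    rows = []
    for gene, values in starts.items():
        if gene not in gene_info:
            continue  # gene has no entry in file 1
        scaffold, ref_start = gene_info[gene]
        mean = sum(values) / len(values)
        # Difference = column 3 of file 1 minus the mean of column 6.
        rows.append((gene, scaffold, mean, ref_start - mean, len(values)))
    return rows


rows = summarize(FILE1.splitlines(), FILE2.splitlines())
for gene, scaffold, mean, diff, count in rows:
    print(f"+ {gene} {scaffold} {mean} {diff} {count}")
```

With real files, the two `splitlines()` calls would be replaced by open file handles; genes that appear only in one of the files are silently skipped in this sketch.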
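One more point:
The sample data above does not show it, but the sixth column of the second file contains duplicate values. If that column is used as the hash key, some of the values will be lost, so I think an array would be a better choice. Does anyone have a good approach?
- + BGIBMGA000648-TA MFBP22_F_C21.seq nscaf1690 7983491 2927332 2942206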
- + BGIBMGA000648-TA MFBP22_F_M19.seq nscaf1690 7983491 2927328 2942181
- + BGIBMGA000648-TA MFBP22_F_O19.seq nscaf1690 7983491 2927328 2942149
- + BGIBMGA000648-TA MFBP21_F_E08.seq nscaf1690 7983491 2927328 2942222
- + BGIBMGA000648-TA MFBP21_F_F16.seq nscaf1690 7983491 2927328 2942207
- + BGIBMGA000648-TA MFBP23_F_M16.seq nscaf1690 7983491 2927328 2942190
- + BGIBMGA000648-TA MFBP26_F_G19.seq nscaf1690 7983491 2927328 2942178
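To illustrate the concern about duplicates, here is a small Python sketch using the seven rows above. It assumes whitespace-separated fields and treats the leading "- " as a forum list marker; the variable names are hypothetical. Keying a dictionary on the sixth column collapses repeated values, while keying on the gene name and appending each sixth-column value to a list keeps every row.

```python
from collections import defaultdict

# The seven BGIBMGA000648-TA rows from the post (forum list markers removed).
ROWS = """\
+ BGIBMGA000648-TA MFBP22_F_C21.seq nscaf1690 7983491 2927332 2942206
+ BGIBMGA000648-TA MFBP22_F_M19.seq nscaf1690 7983491 2927328 2942181
+ BGIBMGA000648-TA MFBP22_F_O19.seq nscaf1690 7983491 2927328 2942149
+ BGIBMGA000648-TA MFBP21_F_E08.seq nscaf1690 7983491 2927328 2942222
+ BGIBMGA000648-TA MFBP21_F_F16.seq nscaf1690 7983491 2927328 2942207
+ BGIBMGA000648-TA MFBP23_F_M16.seq nscaf1690 7983491 2927328 2942190
+ BGIBMGA000648-TA MFBP26_F_G19.seq nscaf1690 7983491 2927328 2942178
"""

by_value = {}                # keyed on column 6: repeated values overwrite
by_gene = defaultdict(list)  # keyed on gene name: every value is kept

for line in ROWS.splitlines():
    fields = line.split()
    by_value[fields[5]] = fields[1]
    by_gene[fields[1]].append(int(fields[5]))

values = by_gene["BGIBMGA000648-TA"]
count = len(values)
mean = sum(values) / count

print(len(by_value), "distinct keys when keyed on column 6")
print(count, "values kept when keyed on gene name; mean =", mean)
```

Keyed on the sixth column, only two entries survive; keyed on the gene name, all seven values remain available for the mean and the count. The same structure in Perl would be a hash of arrays.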
I'll package the files and upload them.