Chinaunix
Original poster: bioinfor

Two problems: I've asked loads of people and got no answer, so I'm asking once more

#21 | Posted 2004-12-14 01:19

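Originally posted by "bioinfor":
This sequence has no repeat longer than 10; if it has one, it can only be the sequence itself. If the matches are not extended, it will have many repeats, starting at length 10 and sliding along in steps of 1.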


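No repeats? I don't know what your definition of "repeat" is.

I am only finding every repeated string for you, moving one character at a time. Under certain conditions, as in your example, the reported positions overlap. How to interpret that is up to the user.

The program has been modified so that every string must contain all four characters A, T, C and G. Try again.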
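
The four-base rule mentioned above could be checked along these lines. This is a minimal Python sketch, not the awk program itself; the function name `has_all_bases` is made up for illustration.

```python
def has_all_bases(piece, bases="ACGT"):
    """Return True only when every base in `bases` occurs in `piece`."""
    return all(base in piece for base in bases)


if __name__ == "__main__":
    # Hypothetical length-10 candidates
    for piece in ("AAAATTTTCC", "ATTTTCCCCG", "AAAAAAAAAA"):
        print(piece, has_all_bases(piece))
```

A candidate such as AAAATTTTCC has no G, so under this rule it would be skipped before any search for a second copy.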

#22 | Posted 2004-12-14 12:51

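Here is the result. Let me explain: the program by 主任 (lightspeed) prints every string of length 10 or more that has a match, so for a sequence like this one it starts at length 10 and slides along in steps of 1, and the repeats it prints overlap one another. Take my data, AAAATTTTCCCCGGGGAAAATTTTCCCCGGGG: the real repeat should be a single sequence, AAAATTTTCCCCGGGG, because it is the longest one and no other sequence duplicates it.
I changed the example myself at first to make it easier for everyone to read. This is the original example with its standard answer; you can try it: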
String:
ACGTGCGATCACAGGCCGTGCAGAGACTGACGATCAGACGACGTGACAGGCCGTGCAGAGACTGACGATCAG
Repeat:
Repeat: ACAGGCCGTGCAGAGACTGACGATCAG, Size: 27, Start Positions: 11, 46

The result for the sequence I entered:
$ ./1 datafile
------------------- Line# 1 --------------------

AAAATTTTCC Length=10 Position=1;17
AAATTTTCCC Length=10 Position=2;18
AATTTTCCCC Length=10 Position=3;19
ATTTTCCCCG Length=10 Position=4;20
TTTTCCCCGG Length=10 Position=5;21
TTTCCCCGGG Length=10 Position=6;22
TTCCCCGGGG Length=10 Position=7;23
AAAATTTTCCC Length=11 Position=1;17
AAATTTTCCCC Length=11 Position=2;18
AATTTTCCCCG Length=11 Position=3;19
ATTTTCCCCGG Length=11 Position=4;20
TTTTCCCCGGG Length=11 Position=5;21
TTTCCCCGGGG Length=11 Position=6;22
AAAATTTTCCCC Length=12 Position=1;17
AAATTTTCCCCG Length=12 Position=2;18
AATTTTCCCCGG Length=12 Position=3;19
ATTTTCCCCGGG Length=12 Position=4;20
TTTTCCCCGGGG Length=12 Position=5;21
AAAATTTTCCCCG Length=13 Position=1;17
AAATTTTCCCCGG Length=13 Position=2;18
AATTTTCCCCGGG Length=13 Position=3;19
ATTTTCCCCGGGG Length=13 Position=4;20
AAAATTTTCCCCGG Length=14 Position=1;17
AAATTTTCCCCGGG Length=14 Position=2;18
AATTTTCCCCGGGG Length=14 Position=3;19
AAAATTTTCCCCGGG Length=15 Position=1;17
AAATTTTCCCCGGGG Length=15 Position=2;18
AAAATTTTCCCCGGGG Length=16 Position=1;17
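
The point of this post, that the nested hits above all collapse into one longest repeat, can be sketched as follows. This is an illustrative Python sketch under my own assumptions, not the program that produced the listing: `sliding_hits` and `maximal_only` are hypothetical names, and no four-base filter is applied.

```python
def sliding_hits(seq, min_len=10):
    """Every (left_start, right_start, size) where the piece at left_start
    occurs again later without overlapping it. Positions are 1-based."""
    hits = []
    for size in range(min_len, len(seq) // 2 + 1):
        for start in range(len(seq) - 2 * size + 1):
            other = seq.find(seq[start:start + size], start + size)
            if other != -1:
                hits.append((start + 1, other + 1, size))
    return hits


def maximal_only(hits):
    """Drop a hit when a longer hit with the same shift fully covers it."""
    kept = []
    for left, right, size in hits:
        covered = any(
            m > size
            and r2 - l2 == right - left          # same distance between copies
            and l2 <= left and left + size <= l2 + m
            for l2, r2, m in hits
        )
        if not covered:
            kept.append((left, right, size))
    return kept


if __name__ == "__main__":
    seq = "AAAATTTTCCCCGGGG" * 2
    hits = sliding_hits(seq)
    print(len(hits), maximal_only(hits))
```

On the 32-character example this yields the same 28 overlapping hits as the listing, and only the 16-character repeat at positions 1 and 17 survives the filter.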

#23 | Posted 2004-12-15 00:54

Overlapping positions sometimes come down to which match you choose to keep:

AAACCCGGGTTTATTAAACCCGGGTGGGCCCGGGTTTA

Two repeats satisfy the conditions:

AAACCCGGGT Length=10 Position=1;16
CCCGGGTTTA Length=10 Position=4;29

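They have the same length, but they overlap. Which one do you pick?

You can modify the program yourself to meet your specific requirements.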
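
To make the conflict concrete, here is a small Python sketch that locates both candidates and tests whether two equal-length copies overlap. The helper names `occurrences` and `overlaps` are assumptions for illustration only.

```python
def occurrences(seq, piece):
    """All 1-based start positions of `piece` in `seq`, overlapping ones included."""
    found = []
    at = seq.find(piece)
    while at != -1:
        found.append(at + 1)
        at = seq.find(piece, at + 1)
    return found


def overlaps(start_a, start_b, size):
    """Two pieces of the same size overlap when their starts are closer than size."""
    return abs(start_a - start_b) < size


if __name__ == "__main__":
    seq = "AAACCCGGGTTTATTAAACCCGGGTGGGCCCGGGTTTA"
    print(occurrences(seq, "AAACCCGGGT"))   # first candidate
    print(occurrences(seq, "CCCGGGTTTA"))   # second candidate
    print(overlaps(1, 4, 10))               # do their left-hand copies collide?
```

The left-hand copies (positions 1 and 4) share seven characters, so a no-overlap rule can keep only one of the two repeats.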

#24 | Posted 2004-12-15 12:01

Rewritten to solve the overlap problem:


#!/bin/awk -f
#
# A script can be used to check any repeat pieces of nucleotide sequences.
#
# Design: lightspeed
# Date: Dec. 14, 2004
#
# Repeat Match Usage::  $0 datafile
# Reverted Repeat Match Usage::  $0 -v r=1 datafile
#

# Return 1 if the piece starting at p with length l overlaps a recorded match.
# s, a and e are extra parameters so that they stay local to the function;
# the loop variable used to be the global i, which clobbered the caller's i.
function is_overlap(p, l,    s, a, e) {
  e = p + l - 1
  for (s in record) {
    a = s + record[s] - 1
    # s + 0 forces a numeric comparison, since array subscripts are strings
    if ( s + 0 <= e && p <= a )
      return 1
  }
  return 0
}

{
  L=length($0)
  STR_MIN=10
#  STR_MAX=int(L / 2)
  STR_MAX=30
  split("", record)    # forget the matches recorded for the previous input line
  if ( r == 1 )
    print "---------------Reverted Repeat Match Line# "NR" -----------------\n"
  else
    print "------------------Repeat Match Line# "NR" --------------------\n"

  # Try the longest pieces first so that shorter ones cannot claim their positions
  for ( Str_Len=STR_MAX; Str_Len >= STR_MIN; Str_Len -- ) {

    for ( Position=1; Position <= L - 2 * Str_Len + 1; Position ++ ) {
      if ( is_overlap(Position,Str_Len) == 1 )
        continue

      count=0
      pos=Position
      offset=Position + Str_Len - 1
      left=substr($0,Position,Str_Len)

      # Every piece must contain all four bases
      if (index(left,"A")==0 || index(left,"C")==0 || index(left,"G")==0 || index(left,"T")==0 )
        continue

      right=substr($0, Position + Str_Len)

      # With -v r=1, search for the reversed piece instead
      if ( r == 1 ) {
        old_left=left
        rev_left=""
        for ( i=length(left); i>=1; i-- )
          rev_left=rev_left""substr(left,i,1)
        left=rev_left
      }

      # Scan the rest of the line; offset + i is the absolute position of a hit
      while ( Str_Len <= length(right) ) {
        i=index(right,left)
        if ( i > 0 ) {
          j=offset + i
          if ( is_overlap(j,Str_Len) == 0 ) {
            count ++
            record[Position]=Str_Len
            record[j]=Str_Len
            pos=pos","j
          }
          right=substr(right, i + Str_Len)
          offset+=(i + Str_Len - 1)
        }
        else
          break
      }

      if (count > 0) {
        match_number[Str_Len] ++
        if (r == 1) {
          left=old_left
          print  "Reverted Repeat: " left",", "Size: "Str_Len",", "Start Positions: "pos
        }
        else
          print  "Repeat: " left",", "Size: "Str_Len",", "Start Positions: "pos
      }

    }
  }
}
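
The longest-first, no-overlap strategy of the script above can also be sketched in Python. This is an illustrative port under my own assumptions, not lightspeed's code: `find_repeats` is a hypothetical name, a boolean list `used` stands in for the awk `record` array, and the reversed mode is left out.

```python
def find_repeats(seq, min_len=10, max_len=30, bases="ACGT"):
    """Longest-first search for exact repeats whose copies never overlap.

    Returns (piece, size, [1-based start positions]) tuples.
    """
    used = [False] * len(seq)      # True once a position belongs to a reported copy
    results = []
    for size in range(min(max_len, len(seq) // 2), min_len - 1, -1):
        for start in range(len(seq) - 2 * size + 1):
            if any(used[start:start + size]):
                continue           # the left copy would overlap an earlier result
            piece = seq[start:start + size]
            if any(base not in piece for base in bases):
                continue           # the piece must contain A, C, G and T
            starts = [start]
            at = seq.find(piece, start + size)
            while at != -1:
                if not any(used[at:at + size]):
                    starts.append(at)
                at = seq.find(piece, at + size)   # continue after this hit
            if len(starts) > 1:
                for s in starts:
                    used[s:s + size] = [True] * size
                results.append((piece, size, [s + 1 for s in starts]))
    return results


if __name__ == "__main__":
    data1 = ("ACGTGCGATCACAGGCCGTGCAGAGACTGACGATCAG"
             "ACGACGTGACAGGCCGTGCAGAGACTGACGATCAG")
    for piece, size, starts in find_repeats(data1):
        print(f"Repeat: {piece}, Size: {size}, Start Positions: "
              + ",".join(map(str, starts)))
```

For the 72-character test string this should report the 27-character repeat at positions 11 and 46, the same answer as the standard one quoted earlier in the thread.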


Testing with your file:



# cat data1
ACGTGCGATCACAGGCCGTGCAGAGACTGACGATCAGACGACGTGACAGGCCGTGCAGAGACTGACGATCAG

# ./1 data1
------------------Repeat Match Line# 1 --------------------

Repeat: ACAGGCCGTGCAGAGACTGACGATCAG, Size: 27, Start Positions: 11,46


Repeat test (the 10000-character file from earlier, STR_MIN=10, STR_MAX=30)



#  time ./1 datafile > report1

real    1m45.46s
user    1m24.95s
sys     0m0.03s

# cat report1
------------------Repeat Match Line# 1 --------------------

Repeat: TTGGCTGGGCACAGTGGCTCACGCCTGTAA, Size: 30, Start Positions: 1086,5893
Repeat: GGAGTTCAAGACCAGCCTGGCCAACATGGT, Size: 30, Start Positions: 1161,2687
Repeat: TGGCCAACATGGTGAAACCCCGTCTCTA, Size: 28, Start Positions: 5983,8614
Repeat: CCTGTAATCCCAGCACTTTGGGAGGC, Size: 26, Start Positions: 1613,2948
Repeat: CGGGCATGGTGGCTCACGCTTGTAAT, Size: 26, Start Positions: 2617,8526
Repeat: CCAGCACTTTGGGAGGCTGAGGCAGG, Size: 26, Start Positions: 5925,8553
Repeat: GAACTCCTGACCTCAGGTGATCC, Size: 23, Start Positions: 3913,9062
Repeat: CCTAGCACTTTGGGAGGCTGAG, Size: 22, Start Positions: 1117,2643
Repeat: CGTGCCTGTAATCCCAGCTACT, Size: 22, Start Positions: 1241,8671
Repeat: TGAGGCAGGAGAATTGCTTGAA, Size: 22, Start Positions: 1271,6075
Repeat: GAGGTTGTAGTGAGCCGAGAT, Size: 21, Start Positions: 1805,2832
Repeat: GGAGGTGGAGGTTGCAGTGA, Size: 20, Start Positions: 502,8727
Repeat: ACTCCAGCCTGGGCGACAGA, Size: 20, Start Positions: 541,1336
Repeat: GTGCCACTGCACTCCAGCCT, Size: 20, Start Positions: 2854,6130
Repeat: CTAAAAATACAAAAATTAG, Size: 19, Start Positions: 1708,8642
Repeat: AGCTACTTGGGAGGCTGAG, Size: 19, Start Positions: 2784,3093
Repeat: AAAAATACAAAAATTAGCC, Size: 19, Start Positions: 3046,6013
Repeat: AGGAGAATCACTTGAACC, Size: 18, Start Positions: 1778,8707
Repeat: CCCAGGCTGGAGTGCAAT, Size: 18, Start Positions: 3732,8883
Repeat: AAAGTGCTGGGATTACAG, Size: 18, Start Positions: 4558,9101
Repeat: ACTGCACTCCAGCCTGG, Size: 17, Start Positions: 1832,8753
Repeat: TGGATCACTTGAGGTCA, Size: 17, Start Positions: 2670,8579
Repeat: TCGCTTGAACCCGGGAG, Size: 17, Start Positions: 2812,3121
Repeat: TGGAGTTTTGCTCTTGT, Size: 17, Start Positions: 3713,8864
Repeat: GCCTTGGCCTCCCAAA, Size: 16, Start Positions: 1460,3940
Repeat: TATTTTTAGTAGAGAC, Size: 16, Start Positions: 4473,9015
Repeat: CCACCTCGCCTGGCT, Size: 15, Start Positions: 208,8993
Repeat: TGGGGAGGCTGAGGT, Size: 15, Start Positions: 327,467
Repeat: TAAACAAGGACTTTT, Size: 15, Start Positions: 1510,1556
Repeat: GGGTTTCTCCATGTT, Size: 15, Start Positions: 4490,9032
Repeat: GAAACCCCGTCTCT, Size: 14, Start Positions: 1693,2717
Repeat: AGACTCCATCTCAA, Size: 14, Start Positions: 2887,8781
Repeat: CTGCCTCAGCCTCC, Size: 14, Start Positions: 3802,8953
Repeat: GATTACAGGCATGC, Size: 14, Start Positions: 3827,8978
Repeat: TGTGGTGGTGCA, Size: 12, Start Positions: 436,1732
Repeat: ACAATGCTGTAA, Size: 12, Start Positions: 847,9825
Repeat: ACCCTGTCTCTA, Size: 12, Start Positions: 1194,5426
Repeat: TGAGGTCAGGAG, Size: 12, Start Positions: 2992,5958
Repeat: GCCTGTAATCC, Size: 11, Start Positions: 309,3081
Repeat: AGGCTGGTCTC, Size: 11, Start Positions: 1436,9051
Repeat: GTGTTTCTAAC, Size: 11, Start Positions: 2256,7173
Repeat: ATGAACAAGGG, Size: 11, Start Positions: 7604,9373
Repeat: AAGCAATTCTC, Size: 11, Start Positions: 8435,8942
Repeat: TTCTTTTTGA, Size: 10, Start Positions: 65,4308
Repeat: CTGTGAATAT, Size: 10, Start Positions: 260,6206
Repeat: GATTTTCTAT, Size: 10, Start Positions: 633,9283
Repeat: GCTGTCATTT, Size: 10, Start Positions: 652,5261
Repeat: ATTAGTTTTC, Size: 10, Start Positions: 738,7084
Repeat: AAGTTTCAAG, Size: 10, Start Positions: 928,5153
Repeat: TTAGTTCTCA, Size: 10, Start Positions: 1955,6839
Repeat: TCAGCCAGAT, Size: 10, Start Positions: 1988,9888
Repeat: ATTTGCTTTT, Size: 10, Start Positions: 2246,9636
Repeat: TGAGCTCTTA, Size: 10, Start Positions: 3565,9590
Repeat: GCCCACATTA, Size: 10, Start Positions: 7007,8318


Reverted Repeat test (the 10000-character file from earlier, STR_MIN=10, STR_MAX=30)

Note: the syntax is  ./1 -v r=1  datafile



# time ./1 -v r=1 datafile > report2

real    1m7.45s
user    1m2.28s
sys     0m0.10s

# cat report2

---------------Reverted Repeat Match Line# 1 -----------------

Reverted Repeat: AGTTTTTCTTTTTTT, Size: 15, Start Positions: 674,4303
Reverted Repeat: TCATTCATGGTA, Size: 12, Start Positions: 4999,9542
Reverted Repeat: TTCTCAGACTAA, Size: 12, Start Positions: 6843,7896
Reverted Repeat: AGGTGGGCGGA, Size: 11, Start Positions: 1641,2976
Reverted Repeat: GTCTCTTAAAA, Size: 11, Start Positions: 2591,8498
Reverted Repeat: TTGAGGTGACA, Size: 11, Start Positions: 5358,5856
Reverted Repeat: ACAGAATAAAA, Size: 11, Start Positions: 6222,6567
Reverted Repeat: ACTAGAGCTTG, Size: 11, Start Positions: 7340,8599
Reverted Repeat: GTTTTCTTAA, Size: 10, Start Positions: 230,8448
Reverted Repeat: TTTACTTTAG, Size: 10, Start Positions: 611,1360
Reverted Repeat: TACAAAGAAC, Size: 10, Start Positions: 871,2576
Reverted Repeat: GTTCTCAACT, Size: 10, Start Positions: 882,2497
Reverted Repeat: GTGAAACCCT, Size: 10, Start Positions: 1189,1469,4554
Reverted Repeat: AAGGACTTTT, Size: 10, Start Positions: 1515,2225
Reverted Repeat: CTTTTTCTGA, Size: 10, Start Positions: 1545,2382
Reverted Repeat: CAATTTGATC, Size: 10, Start Positions: 2094,4775
Reverted Repeat: TCAAAAAAGA, Size: 10, Start Positions: 2897,5675
Reverted Repeat: GTGACAACAG, Size: 10, Start Positions: 3172,4730
Reverted Repeat: ATTTAATCGT, Size: 10, Start Positions: 3597,9181
Reverted Repeat: AATATCTTTG, Size: 10, Start Positions: 5550,7952
Reverted Repeat: CCTGGGAAGG, Size: 10, Start Positions: 5563,9524
Reverted Repeat: TCTCAAATAG, Size: 10, Start Positions: 5620,9293
Reverted Repeat: AGTATTATCA, Size: 10, Start Positions: 5650,7393
Reverted Repeat: CAATAAATGG, Size: 10, Start Positions: 8800,9810
Reverted Repeat: TTGTACGTAT, Size: 10, Start Positions: 9563,9578

#25 | Posted 2004-12-15 12:18

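I have an idea too.
The string is made up of A, C, G and T (which means many other single characters are unused).
The string is at most 10000 characters long, and the shortest repeated piece is 10 characters.

Starting from 10 characters, use substitution: take the 10 characters that begin at the first character as the first variable (name the variables with single characters other than A, C, G and T first, then with two-character and three-character names, and so on), replace every sequence it can replace, and add one to its counter for each replacement.
Loop like this, substituting in turn.
Finally, merge and print each variable (from) and each variable's counter.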

#26 | Posted 2004-12-15 18:08

Originally posted by "lightspeed":
Overlapping positions sometimes come down to which match you choose to keep:

AAACCCGGGTTTATTAAACCCGGGTGGGCCCGGGTTTA

Two repeats satisfy the conditions:

AAACCCGGGT Length=10 Position=1;16
CCCGGGTTTA Length=10 Position=4;29

They have the same length, but they overlap. Which..........



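I think this case should be first come, first matched, taking the longest piece. I'm not sure whether that is what the original poster means.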

#27 | Posted 2004-12-15 18:15

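Original poster, please clarify the ambiguous points below. I have written a program, but I don't know what you actually want for these questions.
1. How should the overlapping positions above be handled? Is it first come, first matched, with the rest dealt with afterwards?
2. Can matches be reused? Once a piece has been matched earlier, can it still be matched as part of a new piece?
3. That seems to be all for now; I'll add more when I think of it. I find many of the definitions vague, so to write a correct program I first need to understand what you mean. Thanks, I'll wait for your reply.

This is a good problem, thanks.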

#28 | Posted 2004-12-16 04:09

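Originally posted by "bitbull":
Original poster, please clarify the ambiguous points below. I have written a program, but I don't know what you actually want for these questions.
1. How should the overlapping positions above be handled? Is it first come, first matched, with the rest dealt with afterwards?
2. Can matches be reused? Once a piece has been matched earlier, can it still be matched as part of a new..........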


1. Y
2. N
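3. Every match must contain all four characters A, C, G and T, so AAAAAAAAAAAAAAAAAAAA does not count as a repeat.

The conditions of this problem were not stated strictly, and the original poster bears some of the responsibility.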

#29 | Posted 2004-12-16 04:24

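The longest possible match has length length_of_string / 2. For example, if the string is 10000 characters long, the longest possible match is 5000 characters.
In practice the longest match is much shorter, and if the program started its loop at L/2 it would take an extremely long time.

So the longest repeat has to be found by hand and assigned to STR_MAX.
For the example above, a manual binary search (for instance, fix STR_MAX=STR_MIN, start from 200, about 7 to 8 tries) gives STR_MAX=46.
The program could also do this automatically, but that would mean adding more code.

Here is the result: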
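
The automatic version hinted at above could look roughly like this. It is a Python sketch under stated assumptions, not part of the awk script: `has_repeat` and `longest_repeat_length` are hypothetical names, and the four-base rule is ignored, so the result is only an upper bound for STR_MAX. The binary search is valid because any non-overlapping repeat of length k contains one of length k - 1.

```python
def has_repeat(seq, size):
    """True if some piece of this length occurs twice without overlapping."""
    first_seen = {}                      # piece -> earliest 0-based start
    for start in range(len(seq) - size + 1):
        piece = seq[start:start + size]
        if piece not in first_seen:
            first_seen[piece] = start
        elif start - first_seen[piece] >= size:
            return True
    return False


def longest_repeat_length(seq):
    """Binary search for the largest size that still has a repeat (0 if none)."""
    lo, hi = 0, len(seq) // 2            # a repeat cannot exceed half the string
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_repeat(seq, mid):
            lo = mid                     # a repeat of length mid exists; try longer
        else:
            hi = mid - 1                 # none of length mid, so none longer either
    return lo


if __name__ == "__main__":
    data1 = ("ACGTGCGATCACAGGCCGTGCAGAGACTGACGATCAG"
             "ACGACGTGACAGGCCGTGCAGAGACTGACGATCAG")
    print(longest_repeat_length(data1))
```

Each probe is one pass over the string with a dictionary of pieces, so about a dozen probes would replace the 7 to 8 manual runs for a 10000-character input.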

#  time ./1 datafile > report3

real    3m6.75s
user    2m23.36s
sys     0m0.10s
   




# cat report3
------------------Repeat Match Line# 1 --------------------

Repeat: TGAGGTCAGGAGTTCAAGACCAGCCTGGCCAACATGGTGAAACCCC, Size: 46, Start Positions: 2679,2992
Repeat: TTGGCTGGGCACAGTGGCTCACGCCTGTAATCC, Size: 33, Start Positions: 1086,5893
Repeat: TGGCCAACATGGTGAAACCCCGTCTCTA, Size: 28, Start Positions: 5983,8614
Repeat: CCTGTAATCCCAGCACTTTGGGAGGC, Size: 26, Start Positions: 1613,2948
Repeat: CGGGCATGGTGGCTCACGCTTGTAAT, Size: 26, Start Positions: 2617,8526
Repeat: CAGCACTTTGGGAGGCTGAGGCAGG, Size: 25, Start Positions: 5926,8554
Repeat: GAACTCCTGACCTCAGGTGATCC, Size: 23, Start Positions: 3913,9062
Repeat: CGTGCCTGTAATCCCAGCTACT, Size: 22, Start Positions: 1241,8671
Repeat: TGAGGCAGGAGAATTGCTTGAA, Size: 22, Start Positions: 1271,6075
Repeat: GGGAGGCTGAGGTGGGTGGAT, Size: 21, Start Positions: 329,2654
Repeat: GAGGTTGTAGTGAGCCGAGAT, Size: 21, Start Positions: 1805,2832
Repeat: GGAGGTGGAGGTTGCAGTGA, Size: 20, Start Positions: 502,8727
Repeat: ACTCCAGCCTGGGCGACAGA, Size: 20, Start Positions: 541,1336
Repeat: GTGCCACTGCACTCCAGCCT, Size: 20, Start Positions: 2854,6130
Repeat: CTAAAAATACAAAAATTAG, Size: 19, Start Positions: 1708,8642
Repeat: AGCTACTTGGGAGGCTGAG, Size: 19, Start Positions: 2784,3093
Repeat: AAAAATACAAAAATTAGCC, Size: 19, Start Positions: 3046,6013
Repeat: AGGAGAATCACTTGAACC, Size: 18, Start Positions: 1778,8707
Repeat: CCCAGGCTGGAGTGCAAT, Size: 18, Start Positions: 3732,8883
Repeat: AAAGTGCTGGGATTACAG, Size: 18, Start Positions: 4558,9101
Repeat: ACTGCACTCCAGCCTGG, Size: 17, Start Positions: 1832,8753
Repeat: TCGCTTGAACCCGGGAG, Size: 17, Start Positions: 2812,3121
Repeat: TGGAGTTTTGCTCTTGT, Size: 17, Start Positions: 3713,8864
Repeat: GCCTTGGCCTCCCAAA, Size: 16, Start Positions: 1460,3940
Repeat: TATTTTTAGTAGAGAC, Size: 16, Start Positions: 4473,9015
Repeat: CCACCTCGCCTGGCT, Size: 15, Start Positions: 208,8993
Repeat: TAAACAAGGACTTTT, Size: 15, Start Positions: 1510,1556
Repeat: GGGTTTCTCCATGTT, Size: 15, Start Positions: 4490,9032
Repeat: AGACTCCATCTCAA, Size: 14, Start Positions: 2887,8781
Repeat: CTGCCTCAGCCTCC, Size: 14, Start Positions: 3802,8953
Repeat: GATTACAGGCATGC, Size: 14, Start Positions: 3827,8978
Repeat: TGTGGTGGTGCA, Size: 12, Start Positions: 436,1732
Repeat: ACAATGCTGTAA, Size: 12, Start Positions: 847,9825
Repeat: ACCCTGTCTCTA, Size: 12, Start Positions: 1194,5426
Repeat: TGAGGTCAGGAG, Size: 12, Start Positions: 5958,8588
Repeat: GCCTGTAATCC, Size: 11, Start Positions: 309,3081
Repeat: CAGCCTGGCCA, Size: 11, Start Positions: 374,1173
Repeat: GGGAGGCTGAG, Size: 11, Start Positions: 469,1128
Repeat: AGGCTGGTCTC, Size: 11, Start Positions: 1436,9051
Repeat: GTGTTTCTAAC, Size: 11, Start Positions: 2256,7173
Repeat: ATGAACAAGGG, Size: 11, Start Positions: 7604,9373
Repeat: AAGCAATTCTC, Size: 11, Start Positions: 8435,8942
Repeat: TTCTTTTTGA, Size: 10, Start Positions: 65,4308
Repeat: CTGTGAATAT, Size: 10, Start Positions: 260,6206
Repeat: CCCAGCTACT, Size: 10, Start Positions: 458,6057
Repeat: GATTTTCTAT, Size: 10, Start Positions: 633,9283
Repeat: GCTGTCATTT, Size: 10, Start Positions: 652,5261
Repeat: ATTAGTTTTC, Size: 10, Start Positions: 738,7084
Repeat: AAGTTTCAAG, Size: 10, Start Positions: 928,5153
Repeat: TTAGTTCTCA, Size: 10, Start Positions: 1955,6839
Repeat: TCAGCCAGAT, Size: 10, Start Positions: 1988,9888
Repeat: ATTTGCTTTT, Size: 10, Start Positions: 2246,9636
Repeat: TGAGCTCTTA, Size: 10, Start Positions: 3565,9590
Repeat: GCCCACATTA, Size: 10, Start Positions: 7007,8318

#30 | Posted 2004-12-16 22:45

Originally posted by "lightspeed":


1. Y
2. N
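3. Every match must contain all four characters A, C, G and T, so AAAAAAAAAAAAAAAAAAAA does not count as a repeat.

The conditions of this problem were not stated strictly, and the original poster bears some of the responsibility.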



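Sorry about that. For anyone who studies biology this goes without saying, so I didn't think to mention it, heh. Everyone here works in IT, so it may not come to mind naturally. 主任 (lightspeed) answered it correctly, and I don't think there is anything more to add.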