Reply to 12# 咏咏672418539
There is a solution now. Thumbs up!
By the way, a heads-up: Python will soon become your favorite programming language!
File mydata:

    >1
    ATGTCTAAANGTTCCTACTATTTTGAACCCTACTGANANGANAGACCTTCAACNNTCTTATT
    >2
    GCCAAGAACCTCAACACTTGATTAACCTTGG
    >3
    CAAGACGTGGGAAAAGCTCATCTTTGCTGCTATTGTGGTTGTC
    >4
    TCTGCTCGTCCCTACGGCCACCGTGCCGCCTT
    >5
    CTTCACNGCNCAGGTACGTTTACCAATTACATCANNN
Python 2 code:

    #!/usr/bin/python2
    # coding: utf-8
    from collections import defaultdict

    DATA = 'mydata'  # input file name
    U = 6            # window length (hexamers)
    # V[n] is the weight given to each expansion of a window with n 'N's
    V = [0.25 ** n for n in xrange(U + 1)]
    N = 'AGCT'       # bases that an 'N' may stand for

    F = open(DATA)
    for line in F:
        print line,                 # header line, e.g. ">1"
        seq = F.next().strip()      # sequence line without the trailing newline
        dic = defaultdict(int)
        for i in xrange(len(seq) - U + 1):
            sub = seq[i:i + U]
            num = sub.count('N')
            if num == 0:
                dic[sub] += 1
                continue
            # Expand every 'N' into A, G, C and T: 4**num candidate hexamers
            L = ['']
            for c in sub:
                if c == 'N':
                    L = [e + C for e in L for C in N]
                else:
                    L = [e + c for e in L]
            for e in L:
                dic[e] += V[num]
        for k, v in sorted(dic.items()):
            print k, v
    F.close()
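
To make the handling of N easier to follow, here is a minimal Python 3 sketch of the same idea written as a function. The names `expand` and `count_kmers` and the `k` parameter are my own for illustration, not part of the script above. The assumption is the one the script already makes through `V[num]`: a window containing n copies of N is expanded into 4**n concrete k-mers, each counted with weight 0.25**n, so every window still adds exactly 1 to the total.

```python
from collections import defaultdict
from itertools import product

BASES = 'AGCT'  # bases that an 'N' may stand for


def expand(sub):
    """Return every concrete k-mer that a window containing 'N' could be."""
    choices = [BASES if c == 'N' else c for c in sub]
    return [''.join(p) for p in product(*choices)]


def count_kmers(seq, k=6):
    """Count k-mers in seq, splitting each ambiguous window evenly."""
    counts = defaultdict(float)
    for i in range(len(seq) - k + 1):
        candidates = expand(seq[i:i + k])
        weight = 1.0 / len(candidates)  # equals 0.25 ** (number of N)
        for kmer in candidates:
            counts[kmer] += weight
    return dict(counts)


if __name__ == '__main__':
    # Tiny example with k=2 so the result is easy to check by hand.
    for kmer, value in sorted(count_kmers('ANT', k=2).items()):
        print(kmer, value)
```

With the sequence `ANT` and `k=2` there are two windows, `AN` and `NT`. Each expands into four 2-mers weighted 0.25, and `AT` appears in both expansions, so it ends up with 0.5 while the other six 2-mers get 0.25 each.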