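The program 大叔 posted showed no problems on small samples, but with a large sample something seems off. With the data in the attachment, when $yangbengeshu = 3 the results are all 5 or 6, although in theory they should be <= 3. Where does the problem come from?

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.010;
    my $yangbengeshu = 3; # number of samples to draw
    # read the gene pool, one name per line, skipping blank lines
    my @A;
    open A,"<human_gene.txt";
    while(<A>){
        chomp;
        s/^\s+$//;
        s/^\s+//;
        s/\s+$//;
        push @A,$_ if $_ ne '';
    }
    close A;
    if($yangbengeshu > scalar @A){
        die "ERROR: sample size exceeds the total number of entries\n";
    }
    # read the target genes; a line may hold several names separated by '/'
    my @B;
    open B,"<targetgene.txt";
    while(<B>){
        chomp;
        s/^\s+$//;
        s/^\s+//;
        s/\s+$//;
        push @B,(split /\s*\/\s*/,$_) if $_ ne '';
    }
    close B;
    # repeat the random draw 1000 times
    for(1 .. 1000){
        # pick $yangbengeshu distinct random indices into @A
        my %rand = ();
        while(scalar keys %rand < $yangbengeshu){
            $rand{int(rand scalar @A)} = 0;
        }
        my @A_choose = map{ $A[$_] } keys %rand;
        # count names that occur more than once across the sample and @B
        my (%m,%n);
        for(@A_choose,@B){
            $m{$_}++ && $n{$_}++;
        }
        say scalar keys %n;
    }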

The result is:
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
6
6
5
6
5
6
5
5
5
5
6
6
5
5
5
5
5
5
6
6
6
5
5
6
5
5
....
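
One possible explanation, offered as a guess because the attached data is not reproduced here: the loop over (@A_choose, @B) counts every name that appears more than once in the combined list, so a name repeated inside targetgene.txt itself is counted even when it was never drawn from @A. If @B holds five or so such repeats, that alone could produce the values above. A minimal sketch with made-up gene names (the data and the helper names `count_repeats` and `count_overlap` are hypothetical, not part of the original program) compares the original counting with a version that only counts sampled names found in the target list:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

# Same counting as the original loop: names seen more than once in the
# combined list. Repeats inside @$target are counted on their own.
sub count_repeats {
    my ($sample, $target) = @_;
    my (%seen, %repeated);
    for (@$sample, @$target) {
        $seen{$_}++ && $repeated{$_}++;
    }
    return scalar keys %repeated;
}

# Alternative: count distinct sampled names that also occur in the target list.
sub count_overlap {
    my ($sample, $target) = @_;
    my %in_target = map { $_ => 1 } @$target;
    my %hit;
    $hit{$_} = 1 for grep { $in_target{$_} } @$sample;
    return scalar keys %hit;
}

# Hypothetical data: GENE_X and GENE_Y are each listed twice in the target list.
my @sample = qw(GENE_A GENE_B GENE_X);
my @target = qw(GENE_X GENE_X GENE_Y GENE_Y GENE_Z);

say count_repeats(\@sample, \@target);   # counts GENE_X and GENE_Y
say count_overlap(\@sample, \@target);   # counts only GENE_X
```

With this data the first call reports more hits than there are sampled names in the target list. Checking targetgene.txt for repeated entries would confirm or rule out this cause.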