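Help: how do I output the row whose value in a given column is longest?
I have a genome file in which the first column is the chromosome name and the tenth column is the corresponding sequence. For rows that share the same name in the first column, I want to output the row whose tenth column is longest. The file looks roughly like this: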
EAA1-10 0 scaffold1478 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATAATTGGAACGAAAT
EAA1-10 0 scaffold1478 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAAT
EAA1-11 0 scaffold147 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATAATTGGAACGAAAT
EAA1-12 0 scaffold100 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATTGGAACGAAATAATGGAACGAAA
...
Any advice would be much appreciated.
Like this?
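#!perl6
use v6.c;
sub MAIN {
    my $input_file = 'a.txt';
    my %chromosome_hash = %();
    for $input_file.IO.open(:chomp).lines -> $line {
        my @parts = $line.split(/\s+/);
        # First column is the name; tenth column (index 9) is the sequence,
        # as described in the question. Use @parts[*-1] if it is simply the last field.
        my $name = @parts[0];
        my $current_seq_length = @parts[9].chars;
        if %chromosome_hash{$name}<length>:exists {
            my $seq_length = %chromosome_hash{$name}<length>;
            # Replace the stored line only when this sequence is longer.
            if $current_seq_length > $seq_length {
                %chromosome_hash{$name}<length> = $current_seq_length;
                %chromosome_hash{$name}<line> = $line;
            }
        }
        else {
            %chromosome_hash{$name}<length> = $current_seq_length;
            %chromosome_hash{$name}<line> = $line;
        }
    }
    # Print the kept lines, sorted by name.
    for %chromosome_hash.keys.sort -> $name {
        %chromosome_hash{$name}<line>.say;
    }
}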
Reply to 2# stanley_tam
Thank you very much!!
I think I roughly follow what you mean, but running it produced these errors:
Scalar value @parts better written as $parts at delete.pl line 10.
Scalar value @parts better written as $parts at delete.pl line 11.
"my" variable %chromosome_hash masks earlier declaration in same scope at delete.pl line 16.
"my" variable $name masks earlier declaration in same scope at delete.pl line 16.
"my" variable $current_seq_length masks earlier declaration in same scope at delete.pl line 16.
"my" variable %chromosome_hash masks earlier declaration in same scope at delete.pl line 20.
"my" variable $name masks earlier declaration in same scope at delete.pl line 20.
"my" variable $current_seq_length masks earlier declaration in same scope at delete.pl line 20.
"my" variable %chromosome_hash masks earlier declaration in same scope at delete.pl line 22.
"my" variable $name masks earlier declaration in same scope at delete.pl line 22.
"my" variable %chromosome_hash masks earlier declaration in same scope at delete.pl line 25.
Warning: Use of "values" without parentheses is ambiguous at delete.pl line 25.
syntax error at delete.pl line 7, near "%() "
syntax error at delete.pl line 8, near "$input_file."
Global symbol "%line" requires explicit package name at delete.pl line 8.
syntax error at delete.pl line 13, near "if %chromosome_hash"
syntax error at delete.pl line 13, near "<length>:"
Execution of delete.pl aborted due to compilation errors.
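You need to install Perl 6.
https://perl6.org/downloads/
Reply to 4# stanley_tam
The installation failed for me; the final step, make install, keeps running into problems. If I use Perl 5 instead, do I need to change the lines that have dots in the middle, such as this one?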
for $input_file.IO.open(:chomp).lines -> $line {
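In Perl 5 that line roughly corresponds to opening a filehandle, reading it in a while loop, and chomping each line by hand. A minimal sketch of the idea; read_lines is a hypothetical helper name, and the in-memory string stands in for a real file such as a.txt:

```perl
use strict;
use warnings;

# Hypothetical helper: return all lines of a file with the newline removed.
# This is roughly what $input_file.IO.open(:chomp).lines yields in Perl 6.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;          # Perl 5 does not strip the newline for you
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

# Demo on an in-memory "file" (a reference to a string) so the sketch runs as is.
my $sample = "first line\nsecond line\n";
for my $line (read_lines(\$sample)) {
    print "$line\n";
}
```

With a real file, pass its name instead of the string reference, for example read_lines('a.txt').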
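use strict;
use warnings;

my $Flag = {};   # id => longest sequence length seen so far
my $Data = {};   # id => the line holding that longest sequence
while (my $line = <DATA>) {
    chomp $line;
    # The ID is the first field; the sequence is taken as the last field of
    # each sample row (use index 9 instead of -1 for exactly the tenth column).
    my ($id, $seq) = (split /\s+/, $line)[0, -1];
    my $length = length $seq;
    # Keep the line if it is the first for this ID or longer than the best so far.
    if (!exists $Flag->{$id} || $length > $Flag->{$id}) {
        $Flag->{$id} = $length;
        $Data->{$id} = $line;
    }
}
foreach my $id (sort keys %{$Data}) {
    print "$Data->{$id}\n";
}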
__DATA__
EAA1-10 0 scaffold1478 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATAATTGGAACGAAAT
EAA1-10 0 scaffold1478 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAAT
EAA1-11 0 scaffold147 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATAATTGGAACGAAAT
EAA1-12 0 scaffold100 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATTGGAACGAAATAATGGAACGAAA
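The same idea can be wrapped in a reusable routine that takes the key column and the sequence column as parameters. This is only a sketch under my own assumptions: longest_per_key is a made-up name, column indices are zero-based, ties keep the first row seen, and output follows the order in which names first appear. The sample rows as pasted in this thread show nine fields, so the demo passes index 8; a file where the sequence really is the tenth column would pass 9.

```perl
use strict;
use warnings;

# Hypothetical helper: for each distinct value in column $key_col, keep the
# row whose column $seq_col is longest. Ties keep the first row seen.
sub longest_per_key {
    my ($rows, $key_col, $seq_col) = @_;
    my (%best_len, %best_row, @order);
    for my $row (@$rows) {
        my @fields = split ' ', $row;
        my ($key, $seq) = @fields[$key_col, $seq_col];
        next unless defined $key && defined $seq;   # skip blank or short rows
        push @order, $key unless exists $best_len{$key};
        if (!exists $best_len{$key} || length($seq) > $best_len{$key}) {
            $best_len{$key} = length $seq;
            $best_row{$key} = $row;
        }
    }
    return map { $best_row{$_} } @order;   # first-seen order of the keys
}

# Three of the sample rows from the question.
my @rows = (
    'EAA1-10 0 scaffold1478 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATAATTGGAACGAAAT',
    'EAA1-10 0 scaffold1478 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAAT',
    'EAA1-11 0 scaffold147 49534160 24S4359M126I10900M64I917M65I2319M * 0 0 AATTGGAACGAAATAATTGGAACGAAATAATTGGAACGAAAT',
);
print "$_\n" for longest_per_key(\@rows, 0, 8);
```

Keeping the whole line, rather than copying individual fields, means the routine does not care how many columns the file has beyond the two it is told about.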
#!/usr/bin/perl -w
use strict;
use 5.010;

open my $IN_1, '<', "orange_in_1.file" or die "can not read! $!\n";
open my $OUT, '>', "orange_out.file" or die "can not write! $!\n";

my %hash;   # name => [columns 2 through 10 of the best row so far]
my @line;
while (<$IN_1>) {
    chomp;
    @line = split /\s+/;
    # Assumes at least ten columns: $line[0] is the name, $line[9] the sequence.
    if (exists $hash{$line[0]}) {
        # Replace the stored row only if this sequence is longer.
        if (length($hash{$line[0]}->[8]) < length($line[9])) {
            $hash{$line[0]} = [ @line[1 .. 9] ];
        }
    } else {
        $hash{$line[0]} = [ @line[1 .. 9] ];
    }
}

# Print each name followed by its nine stored columns, sorted by name.
print $OUT "$_ @{ $hash{$_} }\n" for sort keys %hash;
close $OUT or die "can not close! $!\n";
Result: see the attached image.
Reply to 6# b114213903
Learning from this!
Reply to 6# b114213903
:victory: