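This post was last edited by MMMIX on 2015-08-08 21:20
Reply to #18 super_two
I sincerely suggest you find a few books on Perl programming and read them carefully, though I suspect you won't take the advice.
In any case, the script below demonstrates one way to solve your problem. Also, if sequences with the same id appear consecutively in your input file, you can do this without a hash.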
use strict;
use warnings;
use v5.14;

my %record;    # id => [length of longest sequence, longest sequence]
while (<DATA>) {
    my @f = split;
    next unless @f >= 2;           # skip blank or malformed lines
    $f[0] =~ s/(?:\.\d*)?$//;      # strip the trailing ".N" from the id
    # Store the sequence if it is the first for this id or longer than the stored one.
    if (! $record{$f[0]} || $record{$f[0]}->[0] < length($f[1])) {
        $record{$f[0]} = [length $f[1], $f[1]];
    }
}

say "output version 1:";    # hash iteration order, so effectively arbitrary
while (my ($id, $seq) = each %record) {
    say "$id\t", $seq->[1];
}

say "output version 2:";    # sorted by id, with aligned columns
for my $id (sort keys %record) {
    printf "%-12s %s\n", $id, $record{$id}->[1];
}

__DATA__
qwer.1 AFRTYUIFGHJKLVBNM
qwer.2 BVXNVFGSFYEBCSHB
qwer.3 HDFKSHFGSERYFIEURHFSUFDSHVBSJEUABFUHABFCAHFBC
rtyuip00.1 AFHBVSFHUACFKUSHDBAKFHAKUFHSADKFUA
hhjkl.1 JDNVKHFBAKHFAKFAFJNSADFJAS
hhjk.2 HFSDHNFKANFAKFIJI
hhjk.3 JNFAJNFALSDFLMAD
hhjk.4 KJGSEGJOAKFFDSMFAPOKEF
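For the no-hash variant mentioned above, here is a rough sketch, assuming all records with the same id really are adjacent in the input. The helper name `longest_per_run` is made up for this illustration and is not part of the original script. It remembers only the current id and its longest sequence so far, and emits a result whenever the id changes.

```perl
use strict;
use warnings;
use v5.14;

# Hypothetical helper (not from the original script): given lines of
# "id.N sequence" in which all records of one id are adjacent, return
# [id, longest sequence] pairs in input order, without using a hash.
sub longest_per_run {
    my @lines = @_;
    my @result;
    my ($cur_id, $cur_seq);
    for my $line (@lines) {
        my ($id, $seq) = split ' ', $line;
        next unless defined $seq;      # skip blank or malformed lines
        $id =~ s/\.\d*$//;             # strip the trailing ".N" from the id
        if (defined $cur_id && $id eq $cur_id) {
            # Same run: keep the longer sequence (the earlier one wins ties).
            $cur_seq = $seq if length($seq) > length($cur_seq);
        }
        else {
            # The id changed: flush the finished run and start a new one.
            push @result, [$cur_id, $cur_seq] if defined $cur_id;
            ($cur_id, $cur_seq) = ($id, $seq);
        }
    }
    push @result, [$cur_id, $cur_seq] if defined $cur_id;    # flush the last run
    return @result;
}

# A few lines taken from the sample data in the script above.
my @input = (
    "qwer.1 AFRTYUIFGHJKLVBNM",
    "qwer.2 BVXNVFGSFYEBCSHB",
    "qwer.3 HDFKSHFGSERYFIEURHFSUFDSHVBSJEUABFUHABFCAHFBC",
    "rtyuip00.1 AFHBVSFHUACFKUSHDBAKFHAKUFHSADKFUA",
);
say join "\t", @$_ for longest_per_run(@input);
```

This keeps memory use constant and preserves input order, but if the same id shows up again later in the file it is reported a second time, so the hash version is the safer choice for unsorted input.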