Written By: Wwashington AT ChinaUnix
Publish Date: 2006/09/10
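Note: after I posted "Using JIRA/Bugzilla Correctly for Defect Management",
a reader told me that mail subjects still break when Chinese, English, and
brackets are mixed: the subject arrives garbled or with characters missing.
I looked into this in depth this morning and found a satisfactory fix.
Because the problem is fairly specialized, I am giving it a post of its own.
The earlier article is:
http://bbs.chinaunix.net/viewthread.php?tid=823695
[Original] Using JIRA/Bugzilla Correctly for Defect Management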
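1) Recommended code
When the Chinese text is too long, or Chinese, English, and brackets are
mixed, the MIME-Q encoder folds its output across lines. Current mail
systems such as 21cn.com, 163.com, and Mail Direct Pro cannot decode this
folded form correctly, so the Perl program has to be changed to produce
MIME-Q output that these mail systems accept.
Last time I gave the code for u1.pl and u2.pl. u1.pl stays unchanged. u2.pl
was too simple, so I extended it to handle Chinese text that is too long.
For the line folds caused by mixing Chinese, English, and brackets, I wrote
u3.pl and applied it to Bugzilla.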
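To see where the encoder actually folds a subject, it helps to print the encoded string with CR and LF made visible. The following is a minimal sketch, not part of the original scripts; the sample subject is given as UTF-8 bytes so it does not depend on the console code page, and the exact output depends on the installed version of Encode.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Sample subject ("Buffer Overflow" followed by four Chinese characters),
# written as UTF-8 bytes and decoded into a Perl character string.
my $subject = decode('UTF-8',
    "Buffer Overflow \xE9\x97\xAE\xE9\xA2\x98\xE5\xA4\x84\xE7\x90\x86");

my $encoded = encode('MIME-Q', $subject);

# Replace CR and LF with the literal text \r and \n so fold points show up.
(my $visible = $encoded) =~ s/\r/\\r/g;
$visible =~ s/\n/\\n/g;
print "$visible\n";
```

Any `\n` that appears in the printed line marks a fold that the scripts below try to remove.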
type u2.pl {run it as: perl u2.pl "你好"}
<----------
use Encode;
use Unicode::UCD;
$str=decode('GBK',shift);
# The console code page is GB2312, so the Chinese argument on the
# command line arrives as GB2312 bytes. Decode it to Unicode first
# (GBK is a superset of GB2312).
$str = encode('MIME-Q', $str);
print $str."\n";
my $tmp = $str;
my $len = length($str);
my $adr = index($tmp,'?==?');
my $pos = 0;
if ($adr == -1) {
  print "[info] string ?==? not found, adr=$adr len=$len\n";
  $adr = index($tmp,'?=',10); # the prefix =?UTF-8?Q? takes 10 characters
  $pos = $pos + 2; # at a line fold, 2 more characters must be removed
  print "[info] string ?= is found, adr=$adr len=$len\n";
}
if ($adr == $len-2) { # this ?= is already the final suffix
  $adr=-1;
}
if ($adr > 0) { # an extra suffix/prefix pair was found
  $str = substr($tmp,0,$adr).substr($tmp,$adr+12+$pos);
} # the suffix ?= is 2 characters, the prefix =?UTF-8?Q? is 10: 12 in total
print "\n".$str."\n[info] adr=$adr len=$len";
---------->
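The heart of u2.pl is deleting the boundary between two adjacent encoded-words: the 2-character suffix `?=`, any fold characters, and the 10-character prefix `=?UTF-8?Q?`. The same idea can be sketched with one regular expression. The helper name `merge_encoded_words` is hypothetical and is not part of the original scripts, and the sketch assumes the boundary contains only whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: join consecutive =?UTF-8?Q?...?= encoded-words by
# deleting each "?=" + optional whitespace + "=?UTF-8?Q?" boundary.
sub merge_encoded_words {
    my ($header) = @_;
    $header =~ s/\?=\s*=\?UTF-8\?Q\?//g;
    return $header;
}

# Input copied from the second u2.pl run in this post.
my $split = '=?UTF-8?Q?Buffer=20Overflow=20=E9=97=AE?==?UTF-8?Q?=E9=A2=98?=';
print merge_encoded_words($split), "\n";
```

Unlike u2.pl, the `g` flag removes every boundary rather than only the first one. Note that RFC 2047 limits an encoded-word to 75 characters, so a merged word can exceed that limit; this post accepts that in exchange for compatibility with the mail systems named above.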
type u3.pl {run it as: perl u3.pl "你好"}
<----------
use Encode;
use Unicode::UCD;
$str=decode('GBK',shift);
# The console code page is GB2312, so the Chinese argument on the
# command line arrives as GB2312 bytes. Decode it to Unicode first
# (GBK is a superset of GB2312).
$str = encode('MIME-Q', $str);
print $str."\n";
my $tmp = $str;
my $len = length($str);
my $utf = index($tmp,'=?UTF-8?Q?');
my $adr = 0; # initial loop condition
my $pos = 0;
my $flag = "X";
my $plus = "X";
while ($adr != -1) { # pass 1: a line fold right after ?=
  $adr = index($tmp,'?=',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  $flag = substr($tmp,$adr+3,1); # the newline usually sits at offset 3 from ?=
  $plus = substr($tmp,$adr+2,1); # . , : ; = ? @ / < > ( ) [ ] trigger the fold
  if ($flag eq "\n") { # found the "\n" fold marker: remove it
    print "\nFound ?= at right";
    $str = substr($tmp,0,$adr).$plus.substr($tmp,$adr+5);
    $tmp = $str;
  }
  $utf = $adr;
}
print "\n".$str."\n";
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
$plus = "X";
while ($adr != -1) { # pass 2: a line fold shortly before =?
  $adr = index($tmp,'=?',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  $flag = substr($tmp,$adr-3,1); # the newline usually sits 3 characters before =?
  $plus = substr($tmp,$adr-1,1); # . , : ; = ? @ / < > ( ) [ ] trigger the fold
  if ($flag eq "\n") {
    print "\nFound =? at left!";
    $str = substr($tmp,0,$adr-3).$plus.substr($tmp,$adr);
    $tmp = $str;
  }
  $utf = $adr;
}
print "\n".$str;
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
$plus = "X";
while ($adr != -1) { # pass 3: a line fold right before =?UTF-8?Q?
  $adr = index($tmp,'=?UTF-8?Q?',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  $flag = substr($tmp,$adr-2,1); # the newline usually sits 2 characters before =?
  $plus = substr($tmp,$adr-3,1); # . , : ; = ? @ / < > ( ) [ ] trigger the fold
  if ($flag eq "\n") {
    print "\nFound =?UTF-8?Q? at left!";
    $str = substr($tmp,0,$adr-3).$plus.substr($tmp,$adr);
    $tmp = $str;
  }
  $utf = $adr;
}
print "\n".$str;
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
while ($adr != -1) { # pass 4: drop every prefix except the first
  $adr = index($tmp,'=?UTF-8?Q?',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  print "\n[info] string =?UTF-8?Q? is found, utf=$utf adr=$adr len=$len\n";
  if ($adr > 0) { # an extra =?UTF-8?Q? was found: delete it
    $str = substr($tmp,0,$adr).substr($tmp,$adr+10);
    print "\n".$str;
    $utf = $adr;
    $tmp = $str;
    $len = length($str);
  }
}
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
while ($adr != -1) { # pass 5: drop every suffix except the last
  $pos = index($tmp,'?=',$utf+10); # search left to right for the next ?=
  $utf = $pos;
  $adr = index($tmp,'?=',$utf+10); # look for a later ?= to test for the end
  if ($adr != -1) { # $adr was found, so $pos is not the final suffix
    $str = substr($tmp,0,$pos).substr($tmp,$pos+2);
    $tmp = $str;
  }
  $len = length($str);
  print "\n".$str;
  # print "\n".$pos." --- ".$adr;
  print "\n[done] string ?= is found, utf=$utf adr=$adr len=$len\n";
}
$flag = "X";
$utf = index($tmp,'=?UTF-8?Q?');
if ($utf > 1) {
  $pos = $utf-2;
  $flag = substr($tmp,$pos,1);
  if ($flag eq "\n") { # a fold before the first =?UTF-8?Q?, caused by . , : ; = ? @ / < > ( ) [ ]
    $str = substr($tmp,0,$pos).substr($tmp,$utf);
  }
  $len = length($str);
  print "\n".$str;
  # print "\n".$pos." --- ".$flag;
  print "\n[-OK-] string =?UTF-8?Q? is found, utf=$utf adr=$adr len=$len";
}
---------->
2) Run results
H:\Usr\bin>perl u2.pl "Buffer Overflow 问题处理"
=?UTF-8?Q?Buffer=20Overflow=20=E9=97=AE?=
=?UTF-8?Q?=E9=A2=98=E5=A4=84=E7=90=86?=
[info] string ?==? not found, adr=-1 len=82
[info] string ?= is found, adr=39 len=82
=?UTF-8?Q?Buffer=20Overflow=20=E9=97=AE=E9=A2=98=E5=A4=84?=
[info] adr=39 len=71
H:\Usr\bin>perl u2.pl "Buffer Overflow 问题"
=?UTF-8?Q?Buffer=20Overflow=20=E9=97=AE?==?UTF-8?Q?=E9=A2=98?=
=?UTF-8?Q?Buffer=20Overflow=20=E9=97=AE=E9=A2=98?=
[info] adr=39 len=62
--------------------------------------------------------------------------------------
H:\Usr\bin>perl u2.pl "[Bugzilla Mail] 系统更改密码请求 系统更改密码请求"
......
[Bugzilla Mail]
=?UTF-8?QE7=BB=9F=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=81?=
=?UTF-8?Q?=E8=AF=B7=E6=B1=82=20=E7=B3=BB=E7=BB=9F=E6=9B=B4=E6=94=B9?=
=?UTF-8?Q?=E5=AF=86=E7=A0=81=E8=AF=B7=E6=B1=82?=
[info] adr=26 len=207
H:\Usr\bin>perl u3.pl "[Bugzilla Mail] 系统更改密码请求 系统更改密码请求 系统更改密码"
......
[Bugzilla Mail]=?UTF-8?Q?=20=E7=B3=BB=E7=BB=9F=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=
81=E8=AF=B7=E6=B1=82=20=E7=B3=BB=E7=BB=9F=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=81=E8
=AF=B7=E6=B1=82=20=E7=B3=BB=E7=BB=9F=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=81?=
[-OK-] string =?UTF-8?Q? is found, utf=17 adr=234 len=234
--------------------------------------------------------------------------------------
H:\Usr\bin>perl u3.pl "[Bugzilla]更改系统密码<Bugzilla>更改系统密码(Bugzilla)更改密码"
......
[Bugzilla]=?UTF-8?Q?=E6=9B=B4=E6=94=B9=E7=B3=BB=E7=BB=9F=E5=AF=86=E7=A0=81<Bugzi
lla>=E6=9B=B4=E6=94=B9=E7=B3=BB=E7=BB=9F=E5=AF=86=E7=A0=81(Bugzilla)=E6=9B=B4=E6
=94=B9=E5=AF=86=E7=A0=81?=
[-OK-] string =?UTF-8?Q? is found, utf=12 adr=186 len=186
H:\Usr\bin>perl u3.pl "更改密码[Bugzilla]更改密码<Bugzilla>更改密码(Bugzilla)更改密码"
......
=?UTF-8?Q?=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=81[Bugzilla]=E6=9B=B4=E6=94=B9=E5=AF
=86=E7=A0=81<Bugzilla>=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=81(Bugzilla)=E6=9B=B4=E6
=94=B9=E5=AF=86=E7=A0=81?=
[done] string ?= is found, utf=0 adr=138 len=186
--------------------------------------------------------------------------------------
H:\Usr\bin>perl u3.pl "更改密码[Bugzilla]更改密码<Bugzilla>"
......
=?UTF-8?Q?=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=81[Bugzilla]=E6=9B=B4=E6=94=B9=E5=AF
=86=E7=A0=81?=<Bugzilla>
[done] string ?= is found, utf=92 adr=-1 len=104
H:\Usr\bin>perl u3.pl "[Bugzilla]更改密码"
......
[Bugzilla]=?UTF-8?Q?=E6=9B=B4=E6=94=B9=E5=AF=86=E7=A0=81?=
[-OK-] string =?UTF-8?Q? is found, utf=10 adr=-1 len=58
H:\Usr\bin>perl u3.pl "更改[Bugzilla]密码"
......
=?UTF-8?Q?=E6=9B=B4=E6=94=B9[Bugzilla]=E5=AF=86=E7=A0=81?=
[done] string ?= is found, utf=56 adr=-1 len=58
--------------------------------------------------------------------------------------
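One way to check a merged subject is to decode it again and compare it with the original text. This is a sketch rather than part of the original scripts: it relies on the `MIME-Header` decoder that ships with Encode, and the input is the merged line copied from the second u2.pl run above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Merged subject copied from the second u2.pl run above.
my $merged = '=?UTF-8?Q?Buffer=20Overflow=20=E9=97=AE=E9=A2=98?=';

# Decode the RFC 2047 encoded-word back into a Perl character string.
my $text = decode('MIME-Header', $merged);

# Re-encode as UTF-8 bytes so the result prints cleanly on a UTF-8 terminal.
# On a GB2312 console, encode('GBK', $text) would be the counterpart.
print encode('UTF-8', $text), "\n";
```

If the decoded text matches the subject that was passed in, the merge did not lose or corrupt any characters.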
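3) Mail template
The results above show that BugMail.pm in Bugzilla needs further changes before it fully supports Chinese.
To make the change, open the file in UltraEdit, search for the string $substs{"summary"}, and insert the following code below it.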
## Use Encode function in Perl to make an UTF-8 string.
use Encode;
$substs{"summary"} = encode('MIME-Q', $substs{"summary"});
my $str = $substs{"summary"};
my $tmp = $str;
my $len = length($str);
my $utf = index($tmp,'=?UTF-8?Q?');
my $adr = 0; # initial loop condition
my $pos = 0;
my $flag = "X";
my $plus = "X";
while ($adr != -1) { # pass 1: a line fold right after ?=
  $adr = index($tmp,'?=',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  $flag = substr($tmp,$adr+3,1); # the newline usually sits at offset 3 from ?=
  $plus = substr($tmp,$adr+2,1); # . , : ; = ? @ / < > ( ) [ ] trigger the fold
  if ($flag eq "\n") { # found the "\n" fold marker: remove it
    $str = substr($tmp,0,$adr).$plus.substr($tmp,$adr+5);
    $tmp = $str;
  }
  $utf = $adr;
}
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
$plus = "X";
while ($adr != -1) { # pass 2: a line fold shortly before =?
  $adr = index($tmp,'=?',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  $flag = substr($tmp,$adr-3,1); # the newline usually sits 3 characters before =?
  $plus = substr($tmp,$adr-1,1); # . , : ; = ? @ / < > ( ) [ ] trigger the fold
  if ($flag eq "\n") {
    $str = substr($tmp,0,$adr-3).$plus.substr($tmp,$adr);
    $tmp = $str;
  }
  $utf = $adr;
}
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
$plus = "X";
while ($adr != -1) { # pass 3: a line fold right before =?UTF-8?Q?
  $adr = index($tmp,'=?UTF-8?Q?',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  $flag = substr($tmp,$adr-2,1); # the newline usually sits 2 characters before =?
  $plus = substr($tmp,$adr-3,1); # . , : ; = ? @ / < > ( ) [ ] trigger the fold
  if ($flag eq "\n") {
    $str = substr($tmp,0,$adr-3).$plus.substr($tmp,$adr);
    $tmp = $str;
  }
  $utf = $adr;
}
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
while ($adr != -1) { # pass 4: drop every prefix except the first
  $adr = index($tmp,'=?UTF-8?Q?',$utf+10); # the prefix =?UTF-8?Q? takes 10 characters
  if ($adr > 0) { # an extra =?UTF-8?Q? was found: delete it
    $str = substr($tmp,0,$adr).substr($tmp,$adr+10);
    $utf = $adr;
    $tmp = $str;
    $len = length($str);
  }
}
$adr = 0; # initial loop condition
$utf = index($tmp,'=?UTF-8?Q?');
while ($adr != -1) { # pass 5: drop every suffix except the last
  $pos = index($tmp,'?=',$utf+10); # search left to right for the next ?=
  $utf = $pos;
  $adr = index($tmp,'?=',$utf+10); # look for a later ?= to test for the end
  if ($adr != -1) { # $adr was found, so $pos is not the final suffix
    $str = substr($tmp,0,$pos).substr($tmp,$pos+2);
    $tmp = $str;
  }
  $len = length($str);
}
$flag = "X";
$utf = index($tmp,'=?UTF-8?Q?');
if ($utf > 1) {
  $pos = $utf-2;
  $flag = substr($tmp,$pos,1);
  if ($flag eq "\n") { # a fold before the first =?UTF-8?Q?, caused by . , : ; = ? @ / < > ( ) [ ]
    $str = substr($tmp,0,$pos).substr($tmp,$utf);
  }
  $len = length($str);
}
$substs{"summary"} = $str;
## Encode function ends here. Code is adapted from Perl script u3.pl
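If you would rather not make the changes by hand, you can download the attachment. It contains the files
that still need to be modified after installing the Bugzilla Chinese localization pack; copy them over the
corresponding directories and they are ready to use. This resolves garbled or missing characters in mail
subjects. Garbled text in the mail body is simple to fix; the author of the localization pack has already
covered it, so please read his post:
http://blog.donews.com/ymliu888/archive/2005/12/14/658121.aspx
On sending mail from Bugzilla 2.20 and garbled mail text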
[Last edited by wwashington on 2006-9-11 19:22]