Chinaunix
标题:
问个比较菜的问题,大家多包含(那种编码较流行)
[打印本页]
作者:
khandielas
时间:
2009-04-17 12:50
标题:
问个比较菜的问题,大家多包含(那种编码较流行)
我现在想建数据库,可以存中文,然后要用perl写script跟这个数据库交换信息。
不太明白我要用哪个charset才对,也不清楚哪个比较流行,哪个以后更容易处理麻烦最少,读了下mysql的手册,好像下面这3个命令都可以,我要选哪个好?希望高手指教,在线等
CREATE DATABASE my_1st_db CHARACTER SET utf8;
CREATE DATABASE my_1st_db CHARACTER SET gb2312;
CREATE DATABASE my_1st_db CHARACTER SET gbk;
[
本帖最后由 枫影谁用了 于 2009-4-17 13:39 编辑
]
作者:
枫影谁用了
时间:
2009-04-17 12:54
第一个比较流行
作者:
khandielas
时间:
2009-04-17 13:36
多谢,好像这样就可以了。我把命令都写下来:
mysql>
CREATE DATABASE my_1st_db CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)
mysql>
show DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| my_1st_db |
+--------------------+
mysql>
use my_1st_db;
Database changed
mysql>
create table my_1st_table (USR varchar(30), INFO varchar(30));
Query OK, 0 rows affected (0.01 sec)
mysql>
desc my_1st_table;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| USR | varchar(30) | YES | | NULL | |
| INFO | varchar(30) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
mysql>
insert into my_1st_table set USR='光明', INFO='学习中';
Query OK, 1 row affected (0.00 sec)
mysql>
select * from my_1st_table;
+--------+-----------+
| USR | INFO |
+--------+-----------+
| 光明 | 学习中 |
+--------+-----------+
然后我写了个小的perl script来试验:
#!/usr/bin/perl
use strict;
use warnings;
use DBI();
use utf8;
my $data_source = 'my_1st_db';
my $username = 'XXXXXXXX';
my $password = 'XXXXXXXX';
# Connect to the database.
my $dbh = DBI->connect("DBI:mysql:database=$data_source",
"$username", "$password",
{'RaiseError' => 1});
my $sql = "INSERT INTO my_1st_table VALUES('齐秦', '狼')";
$dbh->do($sql);
my $ref = $dbh->selectall_arrayref(
"SELECT USR, INFO FROM my_1st_table",
{ Slice => {} }
);
foreach my $q ( @$ref ) {
print "USER: $q->{USR} => INFO: $q->{INFO}\n";
}
ubuntu>
./test_mysql.pl
USER: 光明 => INFO: 学习中
USER: 齐秦 => INFO: 狼
作者:
khandielas
时间:
2009-04-17 13:41
继续test regex
好像要decode一下才能用regrex。
我加了这个
use Encode;
if (decode('utf8', $usr->{USR}) =~ /秦/) {
print "USER: $usr->{USR} => INFO: $usr->{INFO}\n";
}
这次输出的只有齐秦
ubuntu>.\test_mysql.pl
USER: 齐秦 => INFO: 狼
USER: 齐秦 => INFO: 狼
作者:
网中浪子
时间:
2009-04-17 14:07
提示:
作者被禁止或删除 内容自动屏蔽
作者:
khandielas
时间:
2009-04-17 14:13
query出来的都要decode一下才能作regrex么,有没有更好的办法?
如果程序长的话,这个decode的地方太多了啊。
我又试验了一下读入一个文本文件, 没有问题, code 如下:
open IN, "<", 'test_my_sql_01.input';
while ( <IN> ) {
chomp;
my ($usr, $info) = split /\|/;
print $usr, ' ==>> ', $info, "\n";
my $sql = "INSERT INTO my_1st_table set USR=\'$usr\', INFO=\'$info\'";
$dbh->do($sql);
}
作者:
khandielas
时间:
2009-04-18 12:23
加中文的条件就必须设定几个数据库参数
mysql>
SET character_set_client = utf8;
Query OK, 0 rows affected (0.00 sec)
mysql>
SET character_set_results = utf8;
Query OK, 0 rows affected (0.00 sec)
mysql>
SET character_set_connection = utf8;
然后加中文的条件也可以作query了。
my $ref = $dbh->selectall_arrayref(
"SELECT USR, INFO FROM my_1st_table WHERE USR='光明'",
{ Slice => {} }
);
foreach my $q ( @$ref ){
print "USER: $q->{USR} => INFO: $q->{INFO}\n";
}
$ ./test_mysql_query_with_condition.pl
USER: 光明 => INFO: 学习中
小结:
mySQL用utf8是可行的,如果想方便的话,直接在server level就把default 的charset 变成 utf8 也可以, 应用utf8的时候要注意定义column的时候用varchar, 可以节约空间。
perl 内部支持unicode,但是输出和输入的时候要转换。
作者:
khandielas
时间:
2009-04-18 12:25
最后作个show variables, 可以看到虽然server的charset还是latin1,database level的charset都变成了utf8.
mysql>
show variables;
+---------------------------------+-----------------------------+
| Variable_name | Value |
+---------------------------------+-----------------------------+
| auto_increment_increment | 1 |
| auto_increment_offset | 1 |
| automatic_sp_privileges | ON |
| back_log | 50 |
| basedir | /usr/ |
| binlog_cache_size | 32768 |
| bulk_insert_buffer_size | 8388608 |
|
character_set_client | utf8 |
| character_set_connection | utf8 |
|
character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | latin1_swedish_ci |
......
欢迎光临 Chinaunix (http://bbs.chinaunix.net/)
Powered by Discuz! X3.2