Chinaunix
Title:
Character set problem
Author:
together007
Time:
2012-10-13 08:59
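Hi everyone,
The lines below look like Chinese text that the browser has converted into some other encoding. When my C program reads these strings, how can I restore the original Chinese?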
$%7BPRODUCT_NAME%7D
%E4%B9%A6%E6%97%97%E5%85%8D%E8%B4%B9%E5%B0%8F%E8%AF%B4
%E5%BC%80%E5%BF%83%E6%B0%B4%E6%97%8F%E7%AE%B1
Any advice is appreciated. I'll be waiting online!
Author:
linux_c_py_php
Time:
2012-10-13 10:04
Last edited by linux_c_py_php on 2012-10-13 11:43
[root@vps616 php]# php main.php
${PRODUCT_NAME}
书旗免费小说
开心水族箱[root@vps616 php]# cat main.php
<?php
$content = <<<EOF
$%7BPRODUCT_NAME%7D
%E4%B9%A6%E6%97%97%E5%85%8D%E8%B4%B9%E5%B0%8F%E8%AF%B4
%E5%BC%80%E5%BF%83%E6%B0%B4%E6%97%8F%E7%AE%B1
EOF;
echo urldecode($content);
?>
Er, right, you asked for C code. I wrote a version, though it feels a bit crude.
urldecode=0 out=${PRODUCT_NAME}书旗免费小说开心水族箱
[root@vps616 c]# cat main.c
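#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Decode percent-escapes (and '+' as a space) from `in` into `out`.
 * On return, *in_size holds the number of input bytes not yet consumed
 * and *out_size holds the remaining output capacity.
 * Returns 0 on success (a truncated trailing escape is left unconsumed)
 * and -1 on a bad argument or a non-hex digit after '%'.
 * The output is not NUL-terminated; the caller takes care of that.
 */
int urldecode(const char *in, int *in_size, char *out, int *out_size) {
    if (!in || !out || !in_size || !out_size)
        return -1;

    int indx, ondx;
    for (indx = ondx = 0; *in_size > 0 && *out_size > 0; -- *out_size, ++ ondx) {
        int drain = 0;  /* input bytes consumed in this step */
        if (in[indx] == '+') {
            out[ondx] = ' ';
            drain = 1;
        } else if (in[indx] == '%') {
            if (*in_size < 3)
                return 0;  /* incomplete escape: stop and leave it in the input */
            int i;
            char base_chr;
            unsigned char out_byte = 0;
            for (i = 1; i < 3; ++ i) {
                int index = indx + i;
                if (in[index] >= '0' && in[index] <= '9')
                    base_chr = '0';
                else if (in[index] >= 'a' && in[index] <= 'f')  /* hex digits end at f, not z */
                    base_chr = 'a' - 10;
                else if (in[index] >= 'A' && in[index] <= 'F')
                    base_chr = 'A' - 10;
                else
                    return -1;
                out_byte = (unsigned char)((out_byte << 4) | (in[index] - base_chr));
            }
            out[ondx] = (char)out_byte;
            drain = 3;
        } else {
            out[ondx] = in[indx];
            drain = 1;
        }
        indx += drain;
        *in_size -= drain;
    }
    return 0;
}

int main(int argc, char* const argv[]) {
    const char *in = "$%7BPRODUCT_NAME%7D"
                     "%E4%B9%A6%E6%97%97%E5%85%8D%E8%B4%B9%E5%B0%8F%E8%AF%B4"
                     "%E5%BC%80%E5%BF%83%E6%B0%B4%E6%97%8F%E7%AE%B1";
    int in_size = (int)strlen(in);
    int out_size = in_size;  /* decoded text is never longer than the input */
    char *out = calloc(1, out_size + 1);  /* zero-filled, so the result is NUL-terminated */
    if (!out)
        return 1;
    int ret = urldecode(in, &in_size, out, &out_size);
    printf("urldecode=%d out=%s\n", ret, out);
    free(out);
    return 0;
}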
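For comparison, here is a more compact variant. It is only a sketch and not the code from the post above: the names `url_decode_inplace` and `hex_val` are made up for illustration, and it assumes the input is a writable, NUL-terminated string. Since decoded text is never longer than its encoded form, the string can be decoded in place without a second buffer.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode %XX escapes (and '+' as a space) in place.
 * Returns the decoded length, or -1 on a malformed or truncated escape. */
int url_decode_inplace(char *s) {
    char *r = s;  /* read position  */
    char *w = s;  /* write position */
    while (*r) {
        if (*r == '%') {
            int hi = hex_val(r[1]);
            /* r[2] is only inspected when r[1] was a hex digit, so we
             * never read past the terminating NUL. */
            int lo = (hi < 0) ? -1 : hex_val(r[2]);
            if (lo < 0) return -1;
            *w++ = (char)((hi << 4) | lo);
            r += 3;
        } else {
            *w++ = (*r == '+') ? ' ' : *r;
            r++;
        }
    }
    *w = '\0';
    return (int)(w - s);
}
```

Calling it on a buffer holding `$%7BPRODUCT_NAME%7D` leaves `${PRODUCT_NAME}` in the same buffer. Unlike the version above, a truncated escape such as a trailing `%4` is reported as an error rather than left for a later call.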
Author:
folklore
Time:
2012-10-13 13:34
If it is UTF-8, it is easy to decode...
Author:
together007
Time:
2012-10-16 14:02
Reply to
2#
linux_c_py_php
Thank you very much. Which character set does your operating system use?
Author:
noword2k
Time:
2012-10-16 17:38
It's obviously UTF-8.
Just turn each %XX escape into the corresponding hexadecimal byte, then decode the bytes.
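To illustrate why the escapes are recognisable as UTF-8: `%E4%B9%A6` stands for the bytes E4 B9 A6, whose bit patterns (1110xxxx 10xxxxxx 10xxxxxx) form one three-byte UTF-8 sequence. The sketch below decodes one such sequence into a Unicode code point. The function name `utf8_decode_one` is hypothetical, and the check is structural only: overlong forms and surrogates are not rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence from s (n bytes available) into *cp.
 * Returns the number of bytes consumed, or 0 if the bytes do not form
 * a structurally valid sequence. */
size_t utf8_decode_one(const unsigned char *s, size_t n, uint32_t *cp) {
    size_t len, i;
    uint32_t v;

    if (n == 0) return 0;

    /* The lead byte determines the sequence length and its payload bits. */
    if (s[0] < 0x80)                { len = 1; v = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; v = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; v = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; v = s[0] & 0x07; }
    else return 0;  /* a stray continuation byte or an invalid lead byte */

    if (len > n) return 0;  /* sequence is cut short */

    /* Each continuation byte must look like 10xxxxxx and adds 6 bits. */
    for (i = 1; i < len; ++i) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        v = (v << 6) | (s[i] & 0x3F);
    }
    *cp = v;
    return len;
}
```

Feeding it the bytes E4 B9 A6 yields the code point U+4E66, the first character of the second string in the question.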
Author:
linux_c_py_php
Time:
2012-10-16 21:59
It's UTF-8.
together007 posted on 2012-10-16 14:02
Reply to 2# linux_c_py_php
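The character set matters because the decoded bytes are UTF-8 and only display correctly where the terminal expects UTF-8. As a rough sketch, a C program on a POSIX system can ask which encoding the current locale selects. The function name `current_codeset` is made up for illustration; `setlocale` and `nl_langinfo(CODESET)` are the standard calls.

```c
#include <langinfo.h>
#include <locale.h>

/* Name of the character encoding selected by the environment
 * (LANG, LC_ALL, LC_CTYPE), for example "UTF-8" or "GBK". */
const char *current_codeset(void) {
    setlocale(LC_ALL, "");  /* adopt the user's locale settings */
    return nl_langinfo(CODESET);
}
```

If this reports something other than UTF-8, the decoded bytes would need converting before printing, for example with iconv(3).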
Welcome to Chinaunix (http://bbs.chinaunix.net/)