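1. Xerces-C natively supports only a handful of encodings, and no Chinese legacy encodings (such as GB2312 or GBK) are among them, although Chinese text encoded as UTF-8 or UTF-16 can still be parsed. The original English description reads: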
Xerces-C has intrinsic support for ASCII, UTF-8, UTF-16 (Big/Small
Endian), UCS4 (Big/Small Endian), EBCDIC code pages IBM037, IBM1047 and IBM1140
encodings, ISO-8859-1 (aka Latin1) and Windows-1252. This means that it can
parse input XML files in these above mentioned encodings.
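
As a rough illustration of the point above, the following minimal sketch checks a declared encoding name against the intrinsic list before a file is handed to the parser. It is only an assumption-laden helper: the names `kIntrinsicEncodings`, `to_upper_ascii` and `is_intrinsic_encoding` are hypothetical, the labels are copied from the quoted list rather than from Xerces-C's real alias table, and no Xerces-C API is called.

```cpp
#include <cctype>
#include <string>

// Labels copied from the list quoted above. This is an assumption made for
// illustration; it is NOT Xerces-C's actual alias table, which also accepts
// other spellings of these names.
static const char* const kIntrinsicEncodings[] = {
    "ASCII",  "UTF-8",   "UTF-16",     "UCS4",        "IBM037",
    "IBM1047", "IBM1140", "ISO-8859-1", "WINDOWS-1252"};

// Hypothetical helper: upper-case an ASCII string so the lookup below is
// case-insensitive (encoding names in XML declarations are case-insensitive).
std::string to_upper_ascii(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}

// Hypothetical helper: returns true if `name` matches one of the labels
// listed above, false otherwise.
bool is_intrinsic_encoding(const std::string& name) {
    const std::string upper = to_upper_ascii(name);
    for (const char* known : kIntrinsicEncodings) {
        if (upper == known) {
            return true;
        }
    }
    return false;
}
```

With this sketch, `is_intrinsic_encoding("utf-8")` returns true while `is_intrinsic_encoding("GB2312")` returns false, which would be the cue to convert the file to UTF-8 first or to use a build with a richer transcoding service.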
2. ICU, another open-source project backed by IBM, provides more than 100 character sets.
XML4C -- the version of Xerces-C available from IBM -- combines Xerces-C and
International Components for Unicode (ICU) and extends the encoding support
to over 100 different encodings that are allowed
by ICU. In particular, all the encodings registered with the
Internet Assigned Numbers Authority (IANA) are supported in XML4C.
Some implementations or ports of Xerces-C provide support for
additional encodings. The exact set will depend on the supplier of the parser
and on the character set transcoding services in use.
This article comes from a ChinaUnix blog. To view the original, see: http://blog.chinaunix.net/u2/63150/showart_1084766.html