travelsky2008 posted on 2008-07-17 21:12

Character set support in Xerces


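1. Xerces-C intrinsically supports only a handful of encodings, and no Chinese-specific encoding is among them. The English documentation describes this in detail: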
Xerces-C has intrinsic support for ASCII, UTF-8, UTF-16 (Big/Small Endian),
UCS4 (Big/Small Endian), EBCDIC code pages IBM037, IBM1047 and IBM1140
encodings, ISO-8859-1 (aka Latin1) and Windows-1252. This means that it can
parse input XML files in these above mentioned encodings.
2. ICU, another open-source project backed by IBM, provides more than 100 character sets.
      
XML4C -- the version of Xerces-C available from IBM -- combines Xerces-C and
International Components for Unicode (ICU) and extends the encoding support
to over 100 different encodings that are allowed by ICU. In particular, all
the encodings registered with the Internet Assigned Numbers Authority (IANA)
are supported in XML4C.
      
Some implementations or ports of Xerces-C provide support for additional
encodings. The exact set will depend on the supplier of the parser and on the
character set transcoding services in use.
http://blogimg.chinaunix.net/blog/upfile2/080717213209.gif

This article comes from the ChinaUnix blog. To view the original, see: http://blog.chinaunix.net/u2/63150/showart_1084766.html