How can I read this file?
Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:12:08) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> file="test"
>>> x1=open(file,"r",encoding="gb2132").read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: gb2132
>>> x2=open(file,"r",encoding="gbk").read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa9 in position 69808: illegal multibyte sequence
>>> x3=open(file,"r",encoding="utf-8").read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\python33\lib\codecs.py", line 301, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 0: invalid start byte
>>>
The test file is here; please see the attachment.
$ file 600005.Txt
600005.Txt: ISO-8859 text, with very long lines, with CRLF, CR, LF line terminators
This file is already in an ISO-8859 encoding.
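One possible way to get the text out anyway is sketched below. It is only an illustration under assumptions: the helper name `read_text_lenient` and the list of candidate encodings are made up for this example and do not come from the thread. Two things are worth noting first. `gb2132` in the session above is not a codec name Python knows; the GB 2312 codec is spelled `gb2312`. And the `file` result is a guess based on byte patterns, so the file may well be mostly GBK with a few damaged bytes (such as the 0xa9 at position 69808) rather than real ISO-8859 text.

```python
def read_text_lenient(path, encodings=("gb2312", "gbk", "gb18030", "utf-8")):
    """Return (text, encoding_used) for the file at `path`.

    Hypothetical helper: each candidate encoding is tried strictly first;
    if none decodes the whole file, fall back to a lossy decode.
    """
    # Read raw bytes so that decoding is fully under our control.
    with open(path, "rb") as f:
        raw = f.read()

    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # this candidate does not fit; try the next one

    # No candidate decoded cleanly. Keep everything that can be decoded
    # and mark each undecodable byte with U+FFFD instead of raising.
    # gb18030 is chosen here as an assumption, since it is a superset of GBK.
    return raw.decode("gb18030", errors="replace"), "gb18030 (with replacement)"
```

Calling `text, used = read_text_lenient("test")` would then return the decoded text along with the encoding that was actually used. If the second value reports a replacement, searching `text` for `"\ufffd"` shows where the bad bytes were.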