Why does the web page content I read through a socket contain some strange characters? For example, when I fetch www.google.cn, this is what I read:
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=GB2312
Set-Cookie: PREF=ID=2089898e46137a4a:NW=1:TM=1181662196:LM=1181662196:S=T5zGUwA1MR8QBCoI; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
Server: GWS/2.1
Transfer-Encoding: chunked
X-Google-Backends: prcsat-gfe.l.google.com:80,mctf10:80
X-Google-Service: www
X-Google-Request-Trace: mctf10:80,prcsat-gfe.l.google.com:80,mctf10:80
Date: Tue, 12 Jun 2007 15:29:26 GMT
bcf -----> what are these few characters?
<html><head><meta http-equiv="content-type" content="text/html; charset=GB2312"><title>Google</title><style><!--
body,td,a,p,.h{font-family:""}
.h{font-size:20px}
.h{color:#3366cc}
.q{color:#00c}
--></style>
<script>
<!--
function sf(){document.f.q.focus();}
// -->
</script>
</head><body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onload="sf();if(document.images){new Image().src='/images/nav_logo3.png'}" topmargin=3 marginheight=3><div align=right id=guser style="font-size:84%;padding-bottom:4px" width=100%><nobr><a href="https://www.google.com/accounts/Login?continue=http://203.208.33.101/&hl=zh-CN">登录</a></nobr></div><center><br id=lgpd><table cellpadding=0 cellspacing=0 border=0><tr><td align=right valign=bottom><img src=images/hp0.gif width=158 height=78 alt="Google"></td><td valign=bottom><img src=images/hp1.gif width=50 height=78 alt=""></td><td valign=bottom><img src=images/hp2.gif width=68 height=78 alt=""></td></tr><tr><td class=h align=right valign=top><b></b></td><td valign=top><img src=images/hp3.gif width=50 height=32 alt=""></td><td valign=top class=h><font color=#666666 style=font-size:16px><b>中文(简体)</b></font></td></tr></table><br><form action="/search" name=f><style>#lgpd{display:none}</style><script defer><!--
//-->
</script><table border=0 cellspacing=0 cellpadding=4><tr><td nowrap><font size=-1><b>网页</b> <a class=q href="http://images.google.com/imghp?ie=GB2312&oe=GB2312&hl=zh-CN&tab=wi">图片</a> <a class=q href="http://news.google.com/nwshp?ie=GB2312&oe=GB2312&hl=zh-CN&tab=wn">资 讯</a> <a class=q href="http://groups.google.com/grphp?ie=GB2312&oe=GB2312&hl=zh-CN&tab=wg">论坛</a> <b><a href="/intl/zh-CN/options/" class=q>更多 »</a></b></font></td></tr></table><table cellpadding=0 cellspacing=0><tr valign=top><td width=25%> </td><td align=center nowrap><input name=hl type=hidden value=zh-CN><input type=hidden name=ie value="GB2312"><input maxlength=2048 name=q size=55 title="Google 搜索" value=""><br><input name=btnG type=submit value="Google 搜索"><input name=btnI type=submit value="手气不错"></td><td nowrap width=25%><font size=-1> <a href=/advanced_search?hl=zh-CN>高级搜索</a><br> <a href=/preferences?hl=zh-CN>使用偏好</a><br> <a href=/language_tools?hl=zh-CN>语言工具</a></font></td></tr><tr><td align=center colspan=3><font size=-1><input id=all type=radio name=lr value="" checked><label for=all>所有网页 </label><input id=ch type=radio name=lr value="lang_zh-CN|lang_zh-TW"><label for=ch>中文网页 </label><input id=il type=radio name=lr value="lang_zh-CN"><label for=il>简体 中文网页 </label></font></td></tr></table></form><br><br><font size=-1><a href="/intl/zh-CN/ads/">广告计划</a> - <a href="/intl/zh-CN/about.html">Google 大全</a> - <a href=http://www.google.com/ncr>Google.com in English</a></font><p><font size=-1>©2007 Google</font></p></center></body></
5 ---> and here
html>
0 ---> and here as well
Other pages show a similar problem. What usually causes this? I would appreciate advice from anyone with experience.
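For reference, the response above declares `Transfer-Encoding: chunked`, so the stray `bcf`, `5`, and `0` lines are most likely hexadecimal chunk sizes rather than page content: `bcf` is 3023 bytes, `5` matches the five bytes of `html>`, and `0` marks the final chunk. Below is a minimal sketch of stripping that framing in Python. The function name `decode_chunked` is hypothetical, and it assumes the complete body (everything after the blank line that ends the headers) has already been read from the socket as bytes.

```python
def decode_chunked(body: bytes) -> bytes:
    """Remove HTTP/1.1 chunked framing from a fully received message body.

    Assumption: `body` holds the entire body, starting right after the
    blank line that terminates the response headers.
    """
    out = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a size line: hex digits, optional ";ext", CRLF.
        eol = body.index(b"\r\n", pos)
        size_field = body[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            # A zero-size chunk ends the body; any trailer headers are ignored here.
            break
        out += body[pos:pos + size]
        # Skip the chunk data plus the CRLF that follows it.
        pos += size + 2
    return bytes(out)


if __name__ == "__main__":
    sample = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
    print(decode_chunked(sample))
```

This sketch raises `ValueError` if the data is truncated, so a real client would need to keep reading from the socket until the zero-size chunk arrives. Python's standard `http.client` module handles chunked bodies itself, which may be simpler than parsing raw socket data by hand.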