GET /nwshp?hl=en&tab=in HTTP/1.1^M
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.36 Safari/525.19^M
Referer: http://images.google.com/imghp?hl=en&tab=ni^M
Cache-Control: max-age=0^M
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5^M
Accept-Encoding: gzip,deflate,bzip2,sdch^M
Cookie: PREF=ID=4ce304d1d9e7588d:LD=en:NW=1:CR=2:TM=1230087642:LM=1230168860:S=K2jP3ibjhfeYUt1X; NID=18=YTVaGb63qwTY_LBnSA84EYLK_GhfoF-xgV8LQIbcn7w9rBSOJv3MEUIp56xAP1Cy24pLIfp37my9FlM5fr9KXBtmwxoSwW0C8BPIhCtx772tdZR_2KX4z1A4s8iFdeUx^M
Accept-Language: zh-CN,zh^M
Accept-Charset: gb18030,*,utf-8^M
Host: news.google.com^M
Connection: Keep-Alive^M
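For example, with the request above, my regular expression matches the URL after "Referer:" and works as expected.
But when there is also a full HTTP URL between GET and HTTP/1.1, the match goes wrong: HTTP/1.1 gets pulled into the match as well.
And this case comes up a lot.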
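
The regular expression itself is not shown in the post, so the following is only a guess at the cause and a minimal sketch of one fix. If the pattern lets the URL part match spaces (a greedy `.*` or `.+` after `http://`, for instance), it will run past the URL on the request line and take " HTTP/1.1" with it, whereas on the Referer line the URL is the last thing before the line ending and the problem stays hidden. Limiting the URL to non-whitespace characters avoids that. The names `URL_RE`, `extract_urls` and `sample` are made up for this illustration, and the sample assumes the `^M` in the dump stands for a real carriage return.

```python
import re

# An http(s) URL runs up to, but not including, the first whitespace character.
# \S excludes space, CR and LF, so neither " HTTP/1.1" nor the trailing "\r"
# ends up inside the match.
URL_RE = re.compile(r"https?://\S+")


def extract_urls(raw_request):
    """Return every http(s) URL found in a raw HTTP request string."""
    return URL_RE.findall(raw_request)


# Hypothetical request with an absolute URL between GET and HTTP/1.1,
# built from the headers in the post.
sample = (
    "GET http://news.google.com/nwshp?hl=en&tab=in HTTP/1.1\r\n"
    "Referer: http://images.google.com/imghp?hl=en&tab=ni\r\n"
    "Host: news.google.com\r\n"
    "\r\n"
)

for url in extract_urls(sample):
    print(url)
```

The token "HTTP/1.1" is never matched by itself because it contains no "://". If the capture really holds the two literal characters `^` and `M` rather than a carriage return, the character class would need to exclude `^` as well, for example `https?://[^\s^]+`.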