if RUBY_VERSION =~ /1.9/
  Encoding.default_external = Encoding::UTF_8
  Encoding.default_internal = Encoding::UTF_8
end
require 'iconv'
require 'hpricot'
require 'open-uri'

site=Hash.new

site['url']='http://zu.cq.soufun.com/house/c21000-d22000-g22-s31-kw%bd%f0%c9%bd%c3%fb%b6%bc/'

site['xpath']='//p[@class="housetitle"]/'
file=open(site['url'])
puts file.charset
content=Iconv.conv('UTF-8//IGNORE', file.charset, file.read)
doc = Hpricot(content)
doc.search(site['xpath']).each do |link|
  text= link.inner_text
  puts text
end
Error message: in `<main>': "\x90" on GB2312 (Encoding::InvalidByteSequenceError)
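One possible explanation, offered as a guess and not a confirmed diagnosis: pages that declare gb2312 often contain GBK characters, and a lead byte such as "\x90" is valid in GBK but not in GB2312. The sketch below assumes that is the case here. The helper name `to_utf8` and the sample character U+5803 are hypothetical and only for illustration; the sketch uses Ruby's built-in `String#encode` and needs neither iconv nor network access.

```ruby
# Hypothetical helper, not from the original post: treat a declared
# GB2312 charset as GBK (a superset) and replace any byte sequence that
# still cannot be converted, so no exception is raised.
def to_utf8(bytes, declared_charset)
  charset = declared_charset.to_s.upcase == 'GB2312' ? 'GBK' : declared_charset.to_s
  bytes.dup.force_encoding(charset).encode('UTF-8',
                                           :invalid => :replace,
                                           :undef   => :replace,
                                           :replace => '?')
end

# Sample bytes: U+5803 exists in GBK but not in GB2312 (assumed test input).
sample = "\u5803".encode('GBK').force_encoding('BINARY')

# Strict GB2312 rejects these bytes, which matches the kind of error reported.
strict_ok = sample.dup.force_encoding('GB2312').valid_encoding?

# Decoding as GBK recovers the character.
decoded = to_utf8(sample, 'gb2312')

puts "valid as GB2312: #{strict_ok}"
puts "decoded as GBK equals U+5803: #{decoded == "\u5803"}"
```

In the script above, the same idea would mean passing `file.read` and `file.charset` to such a helper before handing the result to Hpricot, assuming the downloaded bytes have not already been transcoded.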