This post was last edited by hmchzb19 on 2014-10-21 11:16
import urllib
import urllib2
import string
import sys
from bs4 import BeautifulSoup

# Word to look up, taken from the command line
data = sys.argv[1]
# Bug: "data" in quotes is a string literal, so this always fetched the page for the word "data"
#response = urllib2.urlopen("http://iciba.com/"+"data")
# The problem was here; the line below is correct
response = urllib2.urlopen("http://iciba.com/"+data)
the_page = response.read()
pool = BeautifulSoup(the_page)
# Dictionary meanings are <label> elements inside div.group_pos
results1 = pool.find('div',attrs={'class':'group_pos'}).findAll('label')
# Web meanings are <li> elements inside div.net_paraphrase
results2 = pool.find('div',attrs={'class':'net_paraphrase'}).find('ul').findAll('li')
answer=''
net_ans=''
for result1 in results1:
    answer+=result1.getText()
for result2 in results2:
    net_ans+=result2.getText()
print "the meaning of %s is \n %s \n %s" %(sys.argv[1],answer,net_ans)
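The difference between the commented-out line and the working one is only whether `data` is quoted. A minimal sketch of that difference, written for Python 3 rather than the Python 2 of the script above; `BASE_URL` and `build_url` are hypothetical names introduced here for illustration, and the use of `quote` is an added assumption, not part of the original script:

```python
from urllib.parse import quote

BASE_URL = "http://iciba.com/"


def build_url(word):
    """Return the lookup URL for `word` (hypothetical helper, not in the original script)."""
    # quote() percent-encodes spaces and non-ASCII characters in the path segment.
    return BASE_URL + quote(word)


# Quoted: "data" is a string literal, so the URL is the same for every run.
literal_url = BASE_URL + "data"

# Unquoted: data is a variable, so the URL follows whatever word was passed in.
data = "query"
variable_url = build_url(data)

print(literal_url)
print(variable_url)
```

No network access is needed to run this; it only builds the two URL strings so they can be compared.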
./get_iciba.py query
the meaning of query is
资料,材料;datum的复数;[计算机]数据,资料;从科学实验中提取的价值
数据;数据的;资料;资料的
But this is the result I end up with, and I don't quite understand what went wrong. For reference, the page source is as follows:
<div class="group_prons">
<div class="group_pos">
<p>
<strong class="fl">n.</strong>
<span class="label_list">
<label>问题;</label>
<label>疑问;</label>
<label>询问;</label>
<label>问号</label>
</span>
</p>
<p>
<strong class="fl">vt.</strong>
<span class="label_list">
<label>质疑,对…表示疑问</label>
</span>
</p>
<p>
<strong class="fl">vi.</strong>
<span class="label_list">
<label>询问;</label>
<label>表示怀疑</label>
</span>
</p>
</div>
</div>
<div class="net_paraphrase">
<a href="###" id="net_means_label" title="网络释义">网 络</a>
<ul class="clear">
<li>查询;</li>
<li>质问;</li>
<li>搜索请求;</li>
<li>询问</li>
</ul>
</div>
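Given markup shaped like the page source above, the meanings can also be grouped by part of speech instead of being concatenated into one string. This is only a sketch under assumptions: it is written for Python 3 with bs4, `PAGE` is an abbreviated copy of the snippet above rather than a live response from the site, and `parse_meanings` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Abbreviated copy of the page source shown above (not fetched from the site).
PAGE = """
<div class="group_pos">
<p><strong class="fl">n.</strong>
<span class="label_list"><label>问题;</label><label>疑问;</label></span></p>
<p><strong class="fl">vt.</strong>
<span class="label_list"><label>质疑,对…表示疑问</label></span></p>
</div>
<div class="net_paraphrase">
<ul class="clear"><li>查询;</li><li>质问;</li></ul>
</div>
"""


def parse_meanings(html):
    """Return (meanings keyed by part of speech, list of web meanings)."""
    soup = BeautifulSoup(html, "html.parser")
    by_pos = {}
    # Each <p> inside div.group_pos holds one part of speech and its labels.
    for p in soup.find("div", class_="group_pos").find_all("p"):
        pos = p.find("strong").get_text(strip=True)
        labels = p.find_all("label")
        by_pos[pos] = "".join(label.get_text(strip=True) for label in labels)
    # Web meanings are the <li> items inside div.net_paraphrase.
    net_div = soup.find("div", class_="net_paraphrase")
    net = [li.get_text(strip=True) for li in net_div.find_all("li")]
    return by_pos, net


by_pos, net = parse_meanings(PAGE)
for pos, meaning in by_pos.items():
    print(pos, meaning)
print("".join(net))
```

Testing the selectors against a saved snippet like this makes it easier to tell a parsing problem apart from a problem with the URL being requested.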