This post was last edited by jachin89 on 2013-07-22 18:39.
import urllib2
import re
from bs4 import BeautifulSoup

# Fetch the trend page for the last 100 draws.
#req = urllib2.Request("http://zx.caipiao.163.com/trend/ssq_basic.html")
req = urllib2.Request("http://zx.caipiao.163.com/trend/ssq_basic.html?periodNumber=100")
content = urllib2.urlopen(req).read()
soup = BeautifulSoup(content)

# Dump the raw page and the extracted <td> tags for debugging.
htmlcontent = open('conten.html', 'wb')
htmlcontent.write(content)
htmlcontent.close()
test = open('test.html', 'wb')
test.write(str(soup('td')))
test.close()

# Write the numbers to caipiao.txt, 7 per line (6 red balls + 1 blue ball).
htmlFile = open("caipiao.txt", 'w')
i = 0
n = 1
htmlFile.write("%d.\t" % (n))
for elem in soup.find_all('td', class_=re.compile("^chartBall0\d*$"), limit=170):
    htmlFile.write(elem.string.strip() + " ")
    i += 1
    print '%s' % (elem.string.strip()),
    if i == 7:
        # A full draw has been written; start the next numbered line.
        i = 0
        n += 1
        htmlFile.write('\n')
        htmlFile.write("%d.\t" % (n))
        print "\n"
htmlFile.close()
-----------------------------------------------------------------------------------------
Result:
1. 08 19 21 24 28 31 15
2. 14 18 27 30 31 33 15
3. 03 05 08 19 20 27 09
4. 05 18 22 28 29 31 06
5. 07 08 18 25 30 32 06
6. 03 10 12 13 27 30 04
7. 05 20 26 27 28 33 03
8. 01 05 07 13 29 32 13
9. 02 12 15 23 24 32 09
10. 03 06 11 17 21 31 07
11. 01 05 13 25 26 32 13
12. 09 11 17 23 24 26 07
13. 05 14 24 25 26 32 01
14. 10 12 18 22 28 29 07
15. 04 05 11 21 27 28 10
16. 05 07 12 16 28 32 04
17. 06 08 14 15 24 25 06
18. 01 16 18 22 28 30 12
19. 22 23 26 27 28 33 09
20. 06 10 16 20 27 32 08
21. 01 13 14 25 31 32 12
22. 09 10 13 17 22 30 13
23. 02 09 15 22 26 32 01
24. 03 08 17 21 25 32 15
25. 01 04 09 13 16 23 02
26.
-------------------------------------------------------------
find_all(name, attrs, recursive, text, limit, **kwargs) has a `limit` parameter that caps the number of matches. While debugging I raised it to 180, but find_all still only matched 175 tags.
Can any expert here tell me what is going on? I also wrote `content` out to a file, and that file contains all of the records, yet find_all only matches 175 of them.
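To separate the two possible causes, it helps to check `limit` in isolation on a tiny, self-contained snippet: `limit` is only an upper bound, so the real ceiling is how many matching tags the parser actually put into the tree. A minimal sketch (the HTML here is made up for illustration; assumes bs4 is installed):

```python
import re
from bs4 import BeautifulSoup

# Five fake ball cells, mimicking the chartBall0x classes on the real page.
html = "<table><tr>" + "".join(
    '<td class="chartBall01">%02d</td>' % i for i in range(1, 6)
) + "</tr></table>"

soup = BeautifulSoup(html, "html.parser")

# limit=180 does not force 180 results; you get however many tags
# matched in the parsed tree (here, 5).
cells = soup.find_all("td", class_=re.compile(r"^chartBall0\d*$"), limit=180)
print(len(cells))  # 5
```

Since your dumped `conten.html` contains all the records but the soup does not, one thing worth trying is a different underlying parser: BeautifulSoup's default parser can silently stop early on malformed markup, so comparing `BeautifulSoup(content, "lxml")` against the default (and inspecting `test.html` for where the `<td>` list cuts off) may show whether the tree itself is truncated.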