- 论坛徽章:
- 1
|
本帖最后由 yakczh_cu 于 2017-04-04 09:37 编辑
测试页面
<!DOCTYPE HTML>
<html >
<head>
<title></title>
</head>
<body>
<table border="1" cellpadding="3">
<tr>
<td><b>Your copy:</b></td>
<td>&#29256;&#26412;4.2 build 1020</td>
</tr>
<tr>
<td><b>Latest version:</b></td>
<td>Version 4.2 build 1146 *</td>
</tr>
</table>
- import html
- from urllib.request import Request,urlopen
- url = 'http://localhost/test.html'
- req = Request(url, None)
- resp = urlopen(req)
- #print(resp.read())
- print(html.unescape(resp.read().decode()) )
复制代码
用unescape 还是没变化
页面中的&需要重新手工输一下,因为发出来会转义
</body>
</html>
|
|