
Chinaunix


Newbie question: extracting field values from XML

Post #1 | breeze7086 | 2011-06-21 14:15
<lst name="jvm">
<str name="version">20.1-b02</str>
<str name="name">Java HotSpot(TM) 64-Bit Server VM</str>
<int name="processors">16</int>

<lst name="memory">
<str name="free">15.1 GB</str>
<str name="total">23.1 GB</str>
<str name="max">23.1 GB</str>
<str name="used">8 GB (%34.</str>

<lst name="raw">
<long name="free">16172727968</long>
<long name="total">24803016704</long>
<long name="max">24803016704</long>
<long name="used">8630288736</long>
<double name="used%">34.795318807361795</double>
</lst>
</lst>

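This is a piece of XML data returned when I open a web page; it is not a local XML file.
I want to use Python to extract the free, total, max and used values under the memory field.
How should I go about it?
I'm completely lost.
Thanks!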

Post #2 | 2011-06-21 14:19
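Use the xml.etree.ElementTree module: find the <lst> element whose name attribute is "memory", then find its <str> children named "free", "total", "max" and "used" and read the text of each.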
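As a rough illustration of that suggestion, here is a minimal sketch. It assumes Python 2.7 or 3, where ElementTree understands `[@name='...']` attribute predicates (Python 2.5, used elsewhere in this thread, does not). `SAMPLE` is a trimmed, hand-closed copy of the XML from post #1, and `memory_values` is a hypothetical helper name.

```python
from xml.etree import ElementTree

# Trimmed copy of the response from the thread; only the jvm/memory part is kept.
SAMPLE = """<response>
<lst name="jvm">
<str name="version">20.1-b02</str>
<lst name="memory">
<str name="free">15.1 GB</str>
<str name="total">23.1 GB</str>
<str name="max">23.1 GB</str>
<str name="used">8 GB</str>
</lst>
</lst>
</response>"""

def memory_values(xml_text):
    """Return {name: text} for the <str> children of <lst name="memory">."""
    root = ElementTree.fromstring(xml_text)
    # ".//" searches all descendants, so the nesting depth does not matter.
    memory = root.find(".//lst[@name='memory']")
    if memory is None:
        return {}
    return {node.get("name"): node.text for node in memory.findall("str")}

values = memory_values(SAMPLE)
for key in ("free", "total", "max", "used"):
    print("%s: %s" % (key, values.get(key)))
```

Looking the entries up by their name attribute, rather than by position, keeps the code working if the server reorders or adds fields.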

Post #3 | 2011-06-21 14:21
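Reply to #1 breeze7086

I suggest taking a look at the minidom library and its methods for handling XML data.
It offers quite a lot of them.
For example, once you have the node you want, you can loop over its children and read their data (the .data attribute of each text node).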
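A sketch of what the minidom route could look like, under the assumption that only the memory block matters. `SAMPLE` is cut down from the XML in post #1 and `memory_values_dom` is a hypothetical helper name; `parseString`, `getElementsByTagName`, `getAttribute` and the text node's `.data` are standard `xml.dom.minidom` calls.

```python
from xml.dom import minidom

# Only the memory block from the thread's XML, with the "used" value shortened.
SAMPLE = """<lst name="memory">
<str name="free">15.1 GB</str>
<str name="total">23.1 GB</str>
<str name="max">23.1 GB</str>
<str name="used">8 GB</str>
</lst>"""

def memory_values_dom(xml_text):
    """Collect the <str> entries under <lst name="memory"> using minidom."""
    doc = minidom.parseString(xml_text)
    result = {}
    for lst in doc.getElementsByTagName("lst"):
        if lst.getAttribute("name") != "memory":
            continue
        for child in lst.childNodes:
            # Skip whitespace text nodes and anything that is not a <str> element.
            if child.nodeType != child.ELEMENT_NODE or child.tagName != "str":
                continue
            # The element's text lives in a child Text node; .data is its string.
            if child.firstChild is not None:
                result[child.getAttribute("name")] = child.firstChild.data
    return result

dom_values = memory_values_dom(SAMPLE)
print(dom_values["free"])
```

Iterating `childNodes` instead of calling `getElementsByTagName("str")` on the memory node avoids picking up entries from nested blocks such as `raw`.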

Post #4 | breeze7086 | 2011-06-21 16:32
Could you give me some actual code to work from?
Much appreciated.

Post #5 | 2011-06-21 16:55
Quoting breeze7086 (2011-06-21 16:32):
    Could you give me some actual code to work from?
    Much appreciated.

Please upload a usable piece of the XML you want to parse.

Post #6 | breeze7086 | 2011-06-21 16:57 (last edited by breeze7086 on 2011-06-21 16:58)

<response>

<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">5</int>
</lst>

<lst name="core">
<str name="schema">patsnapV5.0</str>
<str name="host">solr-sh-02.patsnap.com</str>
<date name="now">2011-06-21T13:50:59.34Z</date>
<date name="start">2011-06-20T14:10:01.453Z</date>

<lst name="directory">
<str name="cwd">/solr/patents5.0/patsnapIndex</str>
<str name="instance">/solr/patsnapIndex/core0</str>
<str name="data">/solr/patsnapIndex/data</str>
<str name="index">/solr/patents5.0/patsnapIndex/data/index</str>
</lst>
</lst>

<lst name="lucene">
<str name="solr-spec-version">4.0.0.2011.06.15.16.38.04</str>

<str name="solr-impl-version">
4.0-SNAPSHOT ${svnversion} - Administrator - 2011-06-15 16:38:04
</str>
<str name="lucene-spec-version">4.0-SNAPSHOT</str>
<str name="lucene-impl-version">4.0-SNAPSHOT ${svnversion} - 2011-06-15 16:38:57</str>
</lst>

<lst name="jvm">
<str name="version">20.1-b02</str>
<str name="name">Java HotSpot(TM) 64-Bit Server VM</str>
<int name="processors">16</int>

<lst name="memory">
<str name="free">15.1 GB</str>
<str name="total">23.1 GB</str>
<str name="max">23.1 GB</str>
<str name="used">8 GB (%34.</str>

<lst name="raw">
<long name="free">16172727968</long>
<long name="total">24803016704</long>
<long name="max">24803016704</long>
<long name="used">8630288736</long>
<double name="used%">34.795318807361795</double>
</lst>
</lst>

<lst name="jmx">

<str name="bootclasspath">
/usr/java/jdk1.6.0_26/jre/lib/resources.jar:/usr/java/jdk1.6.0_26/jre/lib/rt.jar:/usr/java/jdk1.6.0_26/jre/lib/sunrsasign.jar:/usr/java/jdk1.6.0_26/jre/lib/jsse.jar:/usr/java/jdk1.6.0_26/jre/lib/jce.jar:/usr/java/jdk1.6.0_26/jre/lib/charsets.jar:/usr/java/jdk1.6.0_26/jre/lib/modules/jdk.boot.jar:/usr/java/jdk1.6.0_26/jre/classes
</str>
<str name="classpath">/solr/solr-tomcat/bin/bootstrap.jar</str>

<arr name="commandLineArgs">

<str>
-Djava.util.logging.config.file=/solr/solr-tomcat/conf/logging.properties
</str>
<str>-Xmx26214m</str>
<str>-Xms26214m</str>
<str>-Xmn20g</str>
<str>-XX:MaxPermSize=2g</str>

<str>
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
</str>
<str>-Xmx26214m</str>
<str>-Xms26214m</str>
<str>-Xmn20g</str>
<str>-XX:MaxPermSize=2g</str>
<str>-Djava.endorsed.dirs=/solr/solr-tomcat/endorsed</str>
<str>-Dcatalina.base=/solr/solr-tomcat</str>
<str>-Dcatalina.home=/solr/solr-tomcat</str>
<str>-Djava.io.tmpdir=/solr/solr-tomcat/temp</str>
</arr>
<date name="startTime">2011-06-20T14:09:58.541Z</date>
<long name="upTimeMS">85260800</long>
</lst>
</lst>

<lst name="system">
<str name="name">Linux</str>
<str name="version">2.6.18-238.el5</str>
<str name="arch">amd64</str>
<double name="systemLoadAverage">0.08</double>

</lst>
</response>

Post #7 | 2011-06-21 18:42
I used a small piece of your XML:

    <response>
    <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">5</int>
    </lst>
    <lst name="jvm">
    <str name="version">20.1-b02</str>
    <str name="name">Java HotSpot(TM) 64-Bit Server VM</str>
    <int name="processors">16</int>
    </lst>
    <lst name="memory">
    <str name="free">15.1 GB</str>
    <str name="total">23.1 GB</str>
    <str name="max">23.1 GB</str>
    <str name="used">8 GB</str>
    </lst>
    </response>
Parsing code:

    # -*- encoding: utf-8 -*-
    from xml.etree import ElementTree

    def main():
        # t.xml holds the XML sample shown above
        xml = "t.xml"

        jvm = ElementTree.parse(xml)
        # The top-level <lst> elements are direct children of <response>
        for i in jvm.findall('lst'):
            if i.get("name") == "jvm":
                print("Version: %s" % i.findall("str")[0].text)
                print("Name: %s" % i.findall("str")[1].text)
                print("Processors: %s" % i.find("int").text)

    if __name__ == '__main__':
        main()
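The code above reads the jvm fields, while the original question was about memory. In the full response from post #6 the memory block is nested inside jvm and also carries a raw block with exact byte counts. The following sketch walks that nesting explicitly and converts the numbers; it assumes Python 2.7 or 3 for the attribute predicates, and `SAMPLE` and `raw_memory` are illustrative names, not part of the thread.

```python
from xml.etree import ElementTree

# Cut-down copy of the nested structure from post #6.
SAMPLE = """<response>
<lst name="jvm">
<lst name="memory">
<str name="free">15.1 GB</str>
<lst name="raw">
<long name="free">16172727968</long>
<long name="total">24803016704</long>
<long name="max">24803016704</long>
<long name="used">8630288736</long>
<double name="used%">34.795318807361795</double>
</lst>
</lst>
</lst>
</response>"""

def raw_memory(xml_text):
    """Return the <long> byte counts under jvm/memory/raw as integers."""
    root = ElementTree.fromstring(xml_text)
    # Each step selects a <lst> child by its name attribute.
    raw = root.find("lst[@name='jvm']/lst[@name='memory']/lst[@name='raw']")
    if raw is None:
        return {}
    return {node.get("name"): int(node.text) for node in raw.findall("long")}

raw = raw_memory(SAMPLE)
print("used bytes: %d" % raw["used"])
print("used share: %.1f%%" % (100.0 * raw["used"] / raw["total"]))
```

Working from the raw byte counts sidesteps parsing strings such as "15.1 GB", which is likely the easier route if the values feed into monitoring or arithmetic.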

Post #8 | xiaomayi0323 | 2011-06-22 17:27
The XML returned:

    <user><id>10116</id><email>aa@aa.com</email><nickname>唐僧他妈</nickname><signature>故宫里都是太监吗?</signature><avatar>/users/10116/headIcon/1304911263052.jpg</avatar><avatar_thumb>/users/10116/headIcon/1304911263052_thumb.jpg</avatar_thumb><gender>0</gender><birthday>1985-03-23</birthday><age>26</age><score>2543</score><status>0</status><createTime>1297934782000</createTime><loginTime>1308728311000</loginTime><mediaSize>53</mediaSize><eventsSize>18</eventsSize><followsSize>71</followsSize><friendsSize>159</friendsSize><footprintsSize>25</footprintsSize><location><coordinate><latitude>34.225624</latitude><longitude>108.876226</longitude></coordinate></location><tags>80后</tags><settings><touch>0</touch><footprint>0</footprint><encounter>0</encounter></settings></user>
Python code:

    #! /usr/bin/env python
    #coding=utf-8
    import urllib
    from xml.etree import ElementTree

    def main():
        f = urllib.urlopen("http://172.16.220.166:8080/doudouy/api/v2/users/10116")
        xml = f.read()

        jvm = ElementTree.parse(xml)
        for i in jvm.findall('/user'):
            print i.find("int").text

    if __name__ == '__main__':
        main()
Error output after running it:

> "C:\Python25\pythonw.exe" -u "E:\python\Practise\0622\test xml.py"
Traceback (most recent call last):
  File "E:\python\Practise\0622\test xml.py", line 15, in <module>
    main()
  File "E:\python\Practise\0622\test xml.py", line 10, in main
    jvm = ElementTree.parse(xml)
  File "C:\Python25\lib\xml\etree\ElementTree.py", line 862, in parse
    tree.parse(source, parser)
  File "C:\Python25\lib\xml\etree\ElementTree.py", line 579, in parse
    source = open(source, "rb")
IOError: [Errno 2] No such file or directory: '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><user><id>10116</id><email>aa@aa.com</email><nickname>\xe5\x94\x90\xe5\x83\xa7\xe4\xbb\x96\xe5\xa6\x88</nickname><signature>\xe6\x95\x85\xe5\xae\xab\xe9\x87\x8c\xe9\x83\xbd\xe6\x98\xaf\xe5\xa4\xaa\xe7\x9b\x91\xe5\x90\x97\xef\xbc\x9f</signature><avatar>/users/10116/headIcon/1304911263052.jpg</avatar><avatar_thumb>/users/10116/headIcon/1304911263052_thumb.jpg</avatar_thumb><gender>0</gender><birthday>1985-03-23</birthday><age>26</age><score>2543</score><status>0</status><createTime>1297934782000</createTime><loginTime>1308728311000</loginTime><mediaSize>53</mediaSize><eventsSize>18</eventsSize><followsSize>71</followsSize><friendsSize>159</friendsSize><footprintsSize>25</footprintsSize><location><coordinate><latitude>34.225624</latitude><longitude>108.876226</longitude></coordinate></location><tags>80\xe5\x90\x8e</tags><settings><touch>0</touch><footprint>0</footprint><encounter>0</encounter></settings></user>'

Does the XML have to be saved to a file first?

Post #9 | 2011-06-22 21:19
Reply to #8 xiaomayi0323
http://effbot.org/zone/pythondoc ... Tree.parse-function
parse(source, parser=None)
    Parses an XML document into an element tree.
    source
        A filename or file object containing XML data.
    parser
        An optional parser instance. If not given, the standard XMLTreeBuilder parser is used.
    Returns:
        An ElementTree instance

For a string:

    from xml.etree import ElementTree

    xml = """
        <response>
        <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">5</int>
        </lst>
        </response>
        """
    # XML() parses a string and returns the root element (<response>)
    jvm = ElementTree.XML(xml)
    for i in jvm.findall('lst'):
        print(i.findall('int')[0].text)
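Applied to the traceback in post #8, there appear to be two ways out, sketched below. `payload` is a shortened stand-in for what `f.read()` returned, since the real URL is only reachable on that poster's network. Note also that `<user>` is the root element here, so its children such as `<id>` are looked up directly on the root rather than through a `/user` search.

```python
import io
from xml.etree import ElementTree

# Bytes standing in for the server response read in post #8 (most fields omitted).
payload = (b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
           b'<user><id>10116</id><age>26</age></user>')

# Option 1: parse the text itself; fromstring() returns the root Element.
root = ElementTree.fromstring(payload)
print(root.find("id").text)

# Option 2: give parse() a file object instead of a filename.
# io.BytesIO plays the role of the object returned by urlopen().
tree = ElementTree.parse(io.BytesIO(payload))
print(tree.getroot().find("age").text)
```

With option 2, the response object from `urlopen()` could be passed to `parse()` as-is, without calling `read()` first, because `parse()` accepts any file object.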