免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 869 | 回复: 0
打印 上一主题 下一主题

The Word Is POI [复制链接]

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2007-01-26 18:22 |只看该作者 |倒序浏览

POI stands for Poor Obfuscation Implementation. The POI subproject HSSF (Horrible Spread Sheet Format) can access MS Excel files and HSD (Horrible Document Format), you guessed it, can access Word documents. I am working on a project and I am researching what POI can do for me given a Word document. I didn’t find a lot of example online so I deceived to look into the POI source code. Using the WordDocument class from the org.apache.poi.hdf.extractor package found in the poi-scratchpad 2.5.1.jar file I was able to write out the contents of a Word document as a plain text.
WordDocument doc = new WordDocument("word.doc");
Writer writer = new BufferedWriter(
     new FileWriter("text.txt"));
doc.writeAllText(writer);
writer.flush();
writer.close();
You can also print all the content of the Word document as a XSL-FO file, except that you have to fix a null pointer exception. The code for this is horrible to read with little or no comments. In the end, what I want to do is generate custom XML file given a Word document and I was able to hack what I wanted to do but I had to refactor the hell out of this code.


本文来自ChinaUnix博客,如果查看原文请点:http://blog.chinaunix.net/u/29515/showart_238383.html
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP