I wrote a simple Scrapy program to crawl some content from the web. The code is as follows:
items.py
from scrapy.item import Item, Field


class DmozItem(Item):
    title = Field()
    link = Field()
    comment = Field()
dmoz.py is as follows:
from scrapy.spiders import Spider
from scrapy.selector import Selector
from dmoz.items import DmozItem


class dmoz(Spider):
    name = 'Dmoz'
    allowed_domains = ['dmoztools.net']
    start_urls = ['http://dmoztools.net/Society/Philosophy/Aesthetics/']

    def parse(self, response):
        for sel in response.xpath('//*[@id="site-list-content"]/div[1]/div[3]'):
            item = DmozItem()
            # Title is the anchor's text; link is its href.
            item['title'] = sel.xpath('a/div/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['comment'] = sel.xpath('div/text()').extract()
            yield item
My project and code directory layout is as follows:
dmoz/dmoz/items.py
dmoz/dmoz/spiders/dmoz.py
Error message:
D:\Python27\scrapy\dmoz>scrapy crawl Dmoz
Traceback (most recent call last):
  File "d:\python27\lib\runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "d:\python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "D:\Python27\Scripts\scrapy.exe\__main__.py", line 9, in <module>
  File "d:\python27\lib\site-packages\scrapy\cmdline.py", line 141, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "d:\python27\lib\site-packages\scrapy\crawler.py", line 238, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "d:\python27\lib\site-packages\scrapy\crawler.py", line 129, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "d:\python27\lib\site-packages\scrapy\crawler.py", line 325, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "d:\python27\lib\site-packages\scrapy\spiderloader.py", line 45, in from_settings
    return cls(settings)
  File "d:\python27\lib\site-packages\scrapy\spiderloader.py", line 23, in __init__
    self._load_all_spiders()
  File "d:\python27\lib\site-packages\scrapy\spiderloader.py", line 32, in _load_all_spiders
    for module in walk_modules(name):
  File "d:\python27\lib\site-packages\scrapy\utils\misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "d:\python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "D:\Python27\scrapy\dmoz\dmoz\spiders\dmoz.py", line 4, in <module>
    from dmoz.items import DmozItem
ImportError: No module named items
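
A likely cause, assuming this is Python 2.7's implicit relative import behavior: the spider file spiders/dmoz.py has the same name as the top-level package dmoz, so inside that file `from dmoz.items import DmozItem` may resolve `dmoz` to the spider module itself, which has no `items` submodule. The usual workarounds are renaming the spider file (for example to dmoz_spider.py, a made-up name) or putting `from __future__ import absolute_import` at the top of the spider. The sketch below is not part of Scrapy; `find_shadowing_modules` is a hypothetical helper that only lists files inside a package that share the package's name, so such a clash can be spotted.

```python
import os


def find_shadowing_modules(project_root, package_name):
    """List .py files inside the package that are named like the package itself.

    project_root: directory that contains the top-level package directory.
    package_name: name of the top-level package (e.g. 'dmoz').
    Returns paths relative to project_root, sorted.
    """
    package_dir = os.path.join(project_root, package_name)
    target = package_name + ".py"
    clashes = []
    # Walk every directory below the package and look for <package_name>.py.
    for dirpath, _dirnames, filenames in os.walk(package_dir):
        if target in filenames:
            full_path = os.path.join(dirpath, target)
            clashes.append(os.path.relpath(full_path, project_root))
    return sorted(clashes)
```

With the layout from this post, calling `find_shadowing_modules(r"D:\Python27\scrapy\dmoz", "dmoz")` would be expected to report the spider file under spiders, which is the file to rename.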