AttributeError when running a pyspider script
I'm new to Python. While building a crawler with pyspider, my script raised an AttributeError.
The script is as follows:
#!/usr/bin/python
#coding:utf-8
from pyspider.libs.base_handler import *


class Handler(BaseHandler):
    crawl_config = {}

    def on_start(self):
        self.crawl('http://scrapy.org', callback=self.index_page)

    def index_page(self, response):
        # Collect every absolute link on the page.
        url_list = []
        for each in response.doc('a[href^="http"]').items():
            url_list.append(each.attr.href)
        return url_list

    @config(priority=5)
    def detail_page(self, response):
        return {
            "url": response.url,
            "title": response.doc('title').text(),
        }

    def on_result(self, result):
        if not result:
            return
        assert self.task, "on_result can't outside a callback."
        result['callback'] = self.task['process']['callback']
        print(result)


# Calling the handler directly like this is what raises the AttributeError.
if __name__ == '__main__':
    handler = Handler()
    handler.on_start()
The error message is as follows:
Traceback (most recent call last):
  File "./firstScript.py", line 34, in <module>
    handler.on_start()
  File "./firstScript.py", line 10, in on_start
    self.crawl('http://scrapy.org', callback=self.index_page)
  File "/usr/lib/python2.7/site-packages/pyspider/libs/base_handler.py", line 394, in crawl
    return self._crawl(url, **kwargs)
  File "/usr/lib/python2.7/site-packages/pyspider/libs/base_handler.py", line 338, in _crawl
    if cache_key not in self._follows_keys:
AttributeError: 'Handler' object has no attribute '_follows_keys'
I searched Baidu and Google for a long time without finding a solution. Could someone help me figure out the cause? Thanks!
Comments (1)
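You can't run the crawler like this:
if __name__ == '__main__':
    handler = Handler()
    handler.on_start()
pyspider sets up the handler's per-task state, including _follows_keys, before it invokes a callback. Calling on_start() by hand skips that step, so the attribute does not exist yet. Remove this block and let pyspider run the script instead.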
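To illustrate why the direct call fails, here is a minimal sketch that does not import pyspider at all. MiniBaseHandler, its run_task(method_name) signature, and the helper functions call_directly and call_via_framework are hypothetical stand-ins written for this example; they only mimic the general pattern (state created in a reset step that the framework runs before each callback) and are not pyspider's actual implementation.

```python
class MiniBaseHandler:
    """Simplified stand-in for a framework base class (not pyspider's real code)."""

    def _reset(self):
        # Per-task state is created here, not in __init__.
        self._follows = []
        self._follows_keys = set()

    def crawl(self, url, callback=None):
        # Like the failing line in the traceback, this lookup needs _follows_keys.
        if url not in self._follows_keys:
            self._follows_keys.add(url)
            self._follows.append(
                {"url": url, "callback": getattr(callback, "__name__", None)}
            )

    def run_task(self, method_name):
        # The framework resets state first and only then calls the user's method.
        self._reset()
        getattr(self, method_name)()
        return self._follows


class Handler(MiniBaseHandler):
    def on_start(self):
        self.crawl("http://scrapy.org", callback=self.index_page)

    def index_page(self):
        pass


def call_directly():
    """Call on_start() by hand, as the script in the question does."""
    try:
        Handler().on_start()
    except AttributeError as exc:
        return str(exc)
    return None


def call_via_framework():
    """Let the (mock) framework drive the handler."""
    return Handler().run_task("on_start")


if __name__ == "__main__":
    print(call_directly())       # the AttributeError message
    print(call_via_framework())  # the queued follow-up request
```

In this sketch the direct call fails because nothing has created _follows_keys yet, while the call that goes through run_task succeeds. With the real library, the usual approach is to start pyspider with its own command and run or debug the script from its web UI, rather than instantiating Handler yourself.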