XMLSyntaxError
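I can't tell whether this is an lxml problem or something else. I searched online and found no leads.
http://demo.pyspider.org/debug/test_XMLSyntaxError
I only filled in the first URL, http://www.hibor.com.cn/microns_4.html, in the on_start function and ran it. When I went on to set up the next step in index_page, this error came up: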
[E 151226 11:37:53 base_handler:195] None
Traceback (most recent call last):
  File "/opt/pyspider/pyspider/libs/base_handler.py", line 188, in run_task
    result = self._run_task(task, response)
  File "/opt/pyspider/pyspider/libs/base_handler.py", line 168, in _run_task
    return self._run_func(function, response, task)
  File "/opt/pyspider/pyspider/libs/base_handler.py", line 150, in _run_func
    return function(*arguments[:len(args) - 1])
  File "<test_XMLSyntaxError>", line 19, in index_page
  File "/opt/pyspider/pyspider/libs/response.py", line 152, in doc
    elements = self.etree
  File "/opt/pyspider/pyspider/libs/response.py", line 163, in etree
    self._elements = lxml.html.fromstring(self.content, parser=parser)
  File "/usr/lib/python2.7/dist-packages/lxml/html/__init__.py", line 634, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/usr/lib/python2.7/dist-packages/lxml/html/__init__.py", line 532, in document_fromstring
    value = etree.fromstring(html, parser, **kw)
  File "lxml.etree.pyx", line 2754, in lxml.etree.fromstring (src/lxml/lxml.etree.c:54631)
  File "parser.pxi", line 1578, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:82748)
  File "parser.pxi", line 1450, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:81481)
  File "parser.pxi", line 925, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:77905)
  File "parser.pxi", line 569, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:74472)
  File "parser.pxi", line 650, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:75363)
  File "parser.pxi", line 601, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74863)
XMLSyntaxError: None
Comments (1)
The content of the fetched page is empty.
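If an empty body is indeed the cause, as the reply suggests, one way to avoid the traceback is to check the body before touching response.doc, since the traceback shows that accessing doc is what hands self.content to lxml.html.fromstring. The following is only a minimal sketch: is_blank_body is a hypothetical helper name, not part of pyspider or lxml, and the commented index_page body is an assumed callback shape rather than the original project's code.

```python
def is_blank_body(content):
    """Return True when a fetched body has nothing for lxml to parse.

    `content` may be None, bytes, or text; a whitespace-only body
    is treated the same as an empty one.
    """
    return content is None or not content.strip()


# Assumed usage inside a pyspider handler (hypothetical callback body):
#
#     def index_page(self, response):
#         if is_blank_body(response.content):
#             # Skip parsing so response.doc never sees an empty document.
#             return {"url": response.url, "error": "empty body"}
#         for each in response.doc('a[href^="http"]').items():
#             self.crawl(each.attr.href, callback=self.detail_page)
```

This does not explain why the fetch returned nothing. It only keeps the callback from failing, so the empty response can be logged and looked into separately.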