Import error using lxml on Google App Engine
I use lxml to parse pages. When I run my code with the App Engine SDK it works, but when I deploy the application to the cloud, I get this error:
Traceback (most recent call last):
  File "/base/data/home/apps/s~testparsercyka/1.356245976008257055/handler_info.py", line 2, in <module>
    import lxml.html
  File "/base/data/home/apps/s~testparsercyka/1.356245976008257055/lxml/html/__init__.py", line 12, in <module>
    from lxml import etree
ImportError: cannot import name etree
Code:
app.yaml

application: testparsercyka
version: 1
runtime: python27
api_version: 1
threadsafe: false

handlers:
- url: /stylesheets
  static_dir: stylesheets
- url: /.*
  script: handler_info.py

libraries:
- name: lxml
  version: "2.3"  # I thought this would allow me to use lxml.etree
handler_info.py

import lxml
import lxml.html
import urllib
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.ext.webapp import template
import os
import cgi


class MainPage(webapp.RequestHandler):
    def get(self):
        template_values = {}
        path = os.path.join(os.path.dirname(__file__), 'index.html')
        self.response.out.write(template.render(path, template_values))


class Handlers(webapp.RequestHandler):
    def post(self):
        #url = "http://habrahabr.ru/"
        url = str(self.request.get('url'))
        url_temp = url
        teg = str(self.request.get('teg'))
        attr = str(self.request.get('attr'))
        n0 = str(self.request.get('n0'))
        n = str(self.request.get('n'))
        a = attr.split(':')
        for i in range(int(n0), int(n)):
            url = url.format(str(i))
            self.response.out.write(url)
            html = urllib.urlopen(url).read()
            doc = lxml.html.document_fromstring(html)
            url = url_temp
            self.getn(doc.getroottree().getroot(), teg, a)

    def getn(self, node, teg, a):
        if ((node.tag == teg) and (node.get(a[0]) == a[1])):
            #print node.tag,node.keys()
            self.response.out.write(node.text)
            self.response.out.write('\n')
        for n in node:
            self.getn(n, teg, a)


application = webapp.WSGIApplication([('/', MainPage), ('/sign', Handlers)], debug=True)


def main():
    run_wsgi_app(application)


if __name__ == "__main__":
    main()
Any ideas why this does not work?
Comments (1)
I know this is an old question but here is an answer that I have confirmed to work when deployed to App Engine:
app.yaml
app.py
Comparing the above with your code, some of the following changes might help:
- Change script: handler_info.py to script: handler_info.application.
- Use webapp2 instead of webapp; it is a bit better and newer.
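As a rough illustration of the first change: on the python27 runtime, script: handler_info.application refers to a module-level WSGI callable named application inside handler_info.py, rather than to a CGI-style script file. The sketch below uses a plain WSGI function as a stand-in for the WSGIApplication object in the question; the response text is a placeholder and not part of the original code.

```python
# handler_info.py (sketch): the name `application` is what
# `script: handler_info.application` in app.yaml resolves to.

def application(environ, start_response):
    """Plain WSGI callable standing in for a webapp/webapp2 WSGIApplication."""
    # Placeholder body; a real handler would render the page here.
    body = b"handler_info.application is reachable\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

The question's handler_info.py already defines application at module level, so in principle only the app.yaml line needs to change for this step.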
It is also possible that the issue has simply resolved itself since 2012, when this question was asked.
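One more thing that may be worth checking: the traceback shows lxml being loaded from inside the application directory (.../1.356245976008257055/lxml/html/__init__.py), which suggests a copy of lxml was uploaded with the app and may be picked up in place of the version requested under libraries:. This is only a guess from the paths. A minimal sketch for checking where a package is actually imported from follows; package_location and is_bundled are hypothetical helper names, and app_root is assumed to be the directory containing app.yaml.

```python
import importlib
import os


def package_location(name):
    """Return the directory a module or package was imported from."""
    module = importlib.import_module(name)
    return os.path.dirname(os.path.abspath(module.__file__))


def is_bundled(name, app_root):
    """Return True if `name` resolves to a path inside `app_root`.

    `app_root` is an assumption: the directory that holds app.yaml.
    """
    location = package_location(name)
    root = os.path.abspath(app_root)
    return location == root or location.startswith(root + os.sep)
```

For example, logging package_location("lxml") from a handler on both the SDK and the deployed app would show whether the two environments import the same copy. If is_bundled("lxml", os.path.dirname(__file__)) is true in handler_info.py, removing the uploaded lxml directory might be enough.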