Twisted Python getPage
I tried to get support on this but I am TOTALLY confused.
Here's my code:
from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.web.error import Error
from twisted.internet.defer import DeferredList
from sys import argv

class GrabPage:
    def __init__(self, page):
        self.page = page

    def start(self, *args):
        if args == ():
            # We apparently don't need authentication for this
            d1 = getPage(self.page)
        else:
            if len(args) == 2:
                # We have our login information
                d1 = getPage(self.page, headers={"Authorization": " ".join(args)})
            else:
                raise Exception('Missing parameters')

        d1.addCallback(self.pageCallback)
        dl = DeferredList([d1])
        d1.addErrback(self.errorHandler)
        dl.addCallback(self.listCallback)

    def errorHandler(self,result):
        # Bad thingy!
        pass

    def pageCallback(self, result):
        return result

    def listCallback(self, result):
        print result

a = GrabPage('http://www.google.com')
data = a.start() # Not the HTML
I want to get out the HTML that is passed to pageCallback when start() is called. This has been a PITA for me. Ty! And sorry for my sucky coding.
You're missing the basics of how Twisted operates. It all revolves around the reactor, which you're never even running. Think of the reactor like this:
[Figure: diagram of the reactor; source: krondo.com]
Until you start the reactor, setting up Deferreds only chains their callbacks together; there are no events to fire them.
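To make that concrete without depending on Twisted or the network, here is a toy model in plain Python 3. It is only a sketch of the idea, not how Twisted is implemented: MiniDeferred, MiniReactor, fake_get_page and store_and_stop are hypothetical names invented for this illustration, and the "page" is a hard-coded string. It shows that asking for a page hands back a deferred object immediately, and that the callback attached to it receives the data only once the loop runs.

```python
from collections import deque


class MiniDeferred:
    """Toy stand-in for a Deferred: a result that will arrive later."""

    def __init__(self):
        self.callbacks = []

    def add_callback(self, func):
        self.callbacks.append(func)
        return self

    def fire(self, result):
        # Pass the result down the chain; each callback's return value
        # becomes the input of the next callback.
        for func in self.callbacks:
            result = func(result)


class MiniReactor:
    """Toy event loop: queued events are delivered only inside run()."""

    def __init__(self):
        self.pending = deque()
        self.running = False

    def fake_get_page(self, url):
        # Pretend to start a download: queue a canned "response" and
        # return a deferred right away, before any data has been delivered.
        d = MiniDeferred()
        self.pending.append((d, "<html>fake body for %s</html>" % url))
        return d

    def stop(self):
        self.running = False

    def run(self):
        # Unlike a real reactor, this toy loop also exits by itself
        # when nothing is left in the queue.
        self.running = True
        while self.running and self.pending:
            d, result = self.pending.popleft()
            d.fire(result)
        self.running = False


reactor = MiniReactor()
received = []


def store_and_stop(html):
    # The callback is the only place where the page data is available.
    received.append(html)
    reactor.stop()
    return html


d = reactor.fake_get_page("http://example.com")
d.add_callback(store_and_stop)

fired_before_run = list(received)  # snapshot taken before the loop starts
reactor.run()                      # only now does the callback chain fire
```

The same reasoning explains why data = a.start() in the question never holds the HTML: the page exists only once the running loop delivers it to a callback, so whatever needs the HTML has to live in (or be called from) that callback.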
I recommend you give the Twisted Intro by Dave Peticolas a read. It's quick and it really gives you all the missing information that the Twisted documentation doesn't.
Anyway, here is the most basic usage example of getPage possible:
Since getPage returns a Deferred, I'm adding the callback print_and_stop to the Deferred chain. After that, I start the reactor. The reactor fires getPage, which then fires print_and_stop, which prints the data from aol.com and then stops the reactor.
Edit to show a working example of OP's code: