Chaining deferred tasks with Google App Engine
I have a website I am looking to stay updated with and scrape some content from there every day. I know the site is updated manually at a certain time, and I've set cron schedules to reflect this, but since it is updated manually it could be 10 or even 20 minutes later.
Right now I have a hack-ish cron update every 5 minutes, but I'd like to use the deferred library to do things in a more precise manner. I'm trying to chain deferred tasks: check whether there was an update, defer that same check for a couple of minutes if there was none, and defer again if need be until there is finally an update.
I have some code I thought would work, but it only ever defers once, when instead I need to continue deferring until there is an update:
(I am using Python)
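from google.appengine.ext import deferred

class Ripper(object):
    def rip(self):
        # siteHasNotBeenUpdated and updateMySite() are placeholders from the simplified excerpt.
        if siteHasNotBeenUpdated:
            # Not updated yet: run this same check again in 120 seconds.
            deferred.defer(self.rip, _countdown=120)
        else:
            updateMySite()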
This was just a simplified excerpt obviously.
I thought this was simple enough to work, but maybe I've just got it all wrong?
Comments (1)
The example you give should work just fine. You need to add logging to determine if deferred.defer is being called when you think it is. More information would help, too: How is siteHasNotBeenUpdated set?
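To make the logging suggestion concrete, here is a minimal sketch of the same chain with a log line on each branch. It is not the asker's actual code: site_has_been_updated and update_my_site are hypothetical stand-ins for siteHasNotBeenUpdated and updateMySite(), and the defer argument is injected only so the sketch runs outside App Engine (on App Engine it would presumably be deferred.defer). The check is passed as a callable so it is evaluated again on every run rather than once.

```python
import logging


def rip(site_has_been_updated, update_my_site, defer, countdown=120):
    """Check once, then either update or schedule another check.

    site_has_been_updated and update_my_site are hypothetical callables
    standing in for the question's siteHasNotBeenUpdated/updateMySite().
    defer is injected (an assumption of this sketch) so the function can
    be exercised without App Engine.
    """
    if site_has_been_updated():
        logging.info("Update found; running update_my_site()")
        update_my_site()
        return "updated"
    # Log before deferring so every re-schedule is visible in the logs.
    logging.info("No update yet; deferring rip() for %d seconds", countdown)
    defer(rip, site_has_been_updated, update_my_site, defer,
          _countdown=countdown)
    return "deferred"
```

If the "No update yet" line shows up only once in the logs, the second run is either never executing or is taking the other branch, which points back to how the update check is computed.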