当我们不关心结果时异步 URLfetch? [Python]

发布于 2024-10-26 09:16:42 字数 319 浏览 8 评论 0原文

在我为 GAE 编写的一些代码中,我需要定期对另一个系统上的 URL 执行 GET,本质上是“ping”它,并且我不太关心请求是否失败、超时或成功。

因为我基本上想要“即发即忘”,而不是通过等待请求来减慢我自己的代码速度,所以我使用异步 urlfetch,而不是调用 get_result()。

在我的日志中,我收到一条警告:

发现 1 个 RPC 请求没有匹配的响应(可能是由于超时或其他错误)

我是否缺少一种明显更好的方法来执行此操作?在这种情况下,任务队列或延迟任务(对我来说)似乎有点矫枉过正。

任何意见将不胜感激。

In some code I'm writing for GAE I need to periodically perform a GET on a URL on another system, in essence 'pinging' it and I'm not terribly concerned if the request fails, times out or succeeds.

As I basically want to 'fire and forget' and not slow down my own code by waiting for the request, I'm using an asynchronous urlfetch, and not calling get_result().

In my log I get a warning:

Found 1 RPC request(s) without matching response (presumably due to timeouts or other errors)

Am I missing an obviously better way to do this? A Task Queue or Deferred Task seems (to me) like overkill in this instance.

Any input would appreciated.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

芸娘子的小脾气 2024-11-02 09:16:42

任务队列任务是您的最佳选择。您在日志中看到的消息表明请求正在等待 URLFetch 完成后再返回,因此这没有帮助。你说任务“太过分了”,但实际上,它们非常轻量级,而且绝对是做到这一点的最佳方法。延迟甚至允许您直接延迟获取调用,而不必编写要调用的函数。

A task queue task is your best option here. The message you're seeing in the log indicates that the request is waiting for your URLFetch to complete before returning, so this doesn't help. You say a task is 'overkill', but really, they're very lightweight, and definitely the best way to do this. Deferred will even allow you to just defer the fetch call directly, rather than having to write a function to call.

等风来 2024-11-02 09:16:42

async_url_fetch 需要多长时间才能完成以及提供响应需要多长时间?

这是一种利用 api 在 python 中工作方式的可能方法。

需要考虑的一些要点。

  • 许多网络服务器和反向代理一旦启动就不会取消请求。因此,如果您正在 ping 的远程服务器提示请求但需要很长时间才能为其提供服务,请在 create_rpc(deadline=X) 上使用截止日期,以便 X 将因超时而返回。 ping 可能仍会成功。此技术也适用于 appengine 本身。

  • GAE RPC

    • 通过 make_call/make_fetch_call 提示后的 RPC 实际上仅在等待其中一个后才会被调度。
    • 此外,当当前等待的 rpc 完成时,任何刚刚完成的 rpc 都会调用其回调。
    • 您可以创建一个 async_urlfetch rpc 并在处理您的请求时尽早使用 make_fetch_call 将其排入队列,暂时不要等待。
    • 执行实际的页面服务工作,例如内存缓存/数据存储调用以让工作继续进行。第一次调用其中一个将执行等待,这将调度您的 async_urlfetch。
    • 如果 urlfetch 在其他活动期间完成,则将调用 urlfetch 上的回调,从而允许您处理结果。
    • 如果您确实调用 get_result() ,它将在 wait() 上阻塞,直到截止日期,否则除非结果准备就绪,否则它会返回。

回顾一下。

为长时间运行的 url_fetch 准备一个合理的截止日期和回调。使用 make_fetch_call 将其排队。完成您想要为页面做的工作。无论 url_fetch 是否完成或截止,都返回页面,并且无需等待。

GAE 中的底层 RPC 层都是异步的,似乎有一种更复杂的方法来选择您希望在工作中等待的内容。

这些示例使用 sleep 和 url_fetch 来访问同一应用程序的第二个实例。

wait() 调度 rpc 工作的示例:

class AsyncHandler(RequestHandler):

    def get(self, sleepy=0.0):
        _log.info("create rpc")
        rpc = create_rpc()
        _log.info("make fetch call")
        # url will generate a 404
        make_fetch_call(rpc, url="http://<my_app>.appspot.com/hereiam")
        _log.info("sleep for %r", sleepy)
        sleep(sleepy)
        _log.info("wait")
        rpc.wait()
        _log.info("get_result")
        rpc.get_result()
        _log.info("return")
        return "<BODY><H1>Holla %r</H1></BODY>" % sleepy

休眠 4 秒后调用的 Wait 显示

2011-03-23 17:08:35.673 /delay/4.0 200 4093ms 23cpu_ms 0kb Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16,gzip(gfe)
I 2011-03-23 17:08:31.583 create rpc
I 2011-03-23 17:08:31.583 make fetch call
I 2011-03-23 17:08:31.585 sleep for 4.0
I 2011-03-23 17:08:35.585 wait
I 2011-03-23 17:08:35.663 get_result
I 2011-03-23 17:08:35.663 return
I 2011-03-23 17:08:35.669 Saved; key: __appstats__:011500, part: 48 bytes, full: 4351 bytes, overhead: 0.000 + 0.006; link: http://<myapp>.appspot.com/_ah/stats/details?tim
2011-03-23 17:08:35.636 /hereiam 404 9ms 0cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine; appid: s~<myapp>),gzip(gfe)

异步调度调用的调度。

E 2011-03-23 17:08:35.632 404: Not Found Traceback (most recent call last): File "distlib/tipfy/__init__.py", line 430, in wsgi_app rv = self.dispatch(request) File "di
I 2011-03-23 17:08:35.634 Saved; key: __appstats__:015600, part: 27 bytes, full: 836 bytes, overhead: 0.000 + 0.002; link: http://<myapp>.appspot.com/_ah/stats/details?time

显示使用 memcache RPC 等待开始工作。

class AsyncHandler(RequestHandler):

    def get(self, sleepy=0.0):
        _log.info("create rpc")
        rpc = create_rpc()
        _log.info("make fetch call")
        make_fetch_call(rpc, url="http://<myapp>.appspot.com/hereiam")
        _log.info("sleep for %r", sleepy)
        sleep(sleepy)
        _log.info("memcache's wait")
        memcache.get('foo')
        _log.info("sleep again")
        sleep(sleepy)
        _log.info("return")
        return "<BODY><H1>Holla %r</H1></BODY>" % sleepy

Appengine 产品日志:

2011-03-23 17:27:47.389 /delay/2.0 200 4018ms 23cpu_ms 0kb Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16,gzip(gfe)
I 2011-03-23 17:27:43.374 create rpc
I 2011-03-23 17:27:43.375 make fetch call
I 2011-03-23 17:27:43.377 sleep for 2.0
I 2011-03-23 17:27:45.378 memcache's wait
I 2011-03-23 17:27:45.382 sleep again
I 2011-03-23 17:27:47.382 return
W 2011-03-23 17:27:47.383 Found 1 RPC request(s) without matching response (presumably due to timeouts or other errors)
I 2011-03-23 17:27:47.386 Saved; key: __appstats__:063300, part: 66 bytes, full: 6869 bytes, overhead: 0.000 + 0.003; link: http://<myapp>.appspot.com/_ah/stats/details?tim
2011-03-23 17:27:45.452 /hereiam 404 10ms 0cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine; appid: s~<myapp>),gzip(gfe)

当 memcache.get 调用 wait() 时调度异步 url 获取

E 2011-03-23 17:27:45.446 404: Not Found Traceback (most recent call last): File "distlib/tipfy/__init__.py", line 430, in wsgi_app rv = self.dispatch(request) File "di
I 2011-03-23 17:27:45.449 Saved; key: __appstats__:065400, part: 27 bytes, full: 835 bytes, overhead: 0.000 + 0.002; link: http://<myapp>.appspot.com/_ah/stats/details?time

How long does it take for the async_url_fetch to complete and how long does it take to provide your response?

Here is a possible approach to leverage the way the api works in python.

Some points to consider.

  • Many webservers and reverse proxies will not cancel a request once it has been started. So if your remote server you are pinging cues the request but takes a long time to service it, use a deadline on your create_rpc(deadline=X) such that X will return due to timeout. The ping may still succeed. This technique works against appengine itself as well.

  • GAE RPCs

    • RPCs after being cued via make_call/make_fetch_call are actually only dispatched once one of them is waited on.
    • Also any just finished rpc will have its callback called when the currently waited on one finishes.
    • You can create an async_urlfetch rpc and enqueue it using make_fetch_call as early as possible in handling your request, don't wait on it yet.
    • Do the actual page serving work, like memcache/datastore calls to get the work going. The first call to one of this will perform a wait which will dispatch your async_urlfetch.
    • If the urlfetch completes during this other activity the callback on the urlfetch will be called, allowing you to do handle the result.
    • If you do call get_result() it will block on wait() till the deadline or it returns unless the result is ready.

To recap.

Prepare the long running url_fetch with a reasonable deadline and callback. Enqueue it using make_fetch_call. Do the work you wanted to for the page. Return the page regardless of wether the url_fetch completed or deadlined and without waiting for it.

The underlying RPC layer in GAE is all asynchronous, there seems to be a more sophisticated way to choose what you wish to wait on in the works.

These examples use sleep and a url_fetch to a second instance of the same app.

Example of wait() dispatching rpc work:

class AsyncHandler(RequestHandler):

    def get(self, sleepy=0.0):
        _log.info("create rpc")
        rpc = create_rpc()
        _log.info("make fetch call")
        # url will generate a 404
        make_fetch_call(rpc, url="http://<my_app>.appspot.com/hereiam")
        _log.info("sleep for %r", sleepy)
        sleep(sleepy)
        _log.info("wait")
        rpc.wait()
        _log.info("get_result")
        rpc.get_result()
        _log.info("return")
        return "<BODY><H1>Holla %r</H1></BODY>" % sleepy

Wait called after sleeping for 4 seconds shows dispatch of

2011-03-23 17:08:35.673 /delay/4.0 200 4093ms 23cpu_ms 0kb Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16,gzip(gfe)
I 2011-03-23 17:08:31.583 create rpc
I 2011-03-23 17:08:31.583 make fetch call
I 2011-03-23 17:08:31.585 sleep for 4.0
I 2011-03-23 17:08:35.585 wait
I 2011-03-23 17:08:35.663 get_result
I 2011-03-23 17:08:35.663 return
I 2011-03-23 17:08:35.669 Saved; key: __appstats__:011500, part: 48 bytes, full: 4351 bytes, overhead: 0.000 + 0.006; link: http://<myapp>.appspot.com/_ah/stats/details?tim
2011-03-23 17:08:35.636 /hereiam 404 9ms 0cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine; appid: s~<myapp>),gzip(gfe)

Async dispatched call.

E 2011-03-23 17:08:35.632 404: Not Found Traceback (most recent call last): File "distlib/tipfy/__init__.py", line 430, in wsgi_app rv = self.dispatch(request) File "di
I 2011-03-23 17:08:35.634 Saved; key: __appstats__:015600, part: 27 bytes, full: 836 bytes, overhead: 0.000 + 0.002; link: http://<myapp>.appspot.com/_ah/stats/details?time

Showing using a memcache RPC's wait to kick off the work.

class AsyncHandler(RequestHandler):

    def get(self, sleepy=0.0):
        _log.info("create rpc")
        rpc = create_rpc()
        _log.info("make fetch call")
        make_fetch_call(rpc, url="http://<myapp>.appspot.com/hereiam")
        _log.info("sleep for %r", sleepy)
        sleep(sleepy)
        _log.info("memcache's wait")
        memcache.get('foo')
        _log.info("sleep again")
        sleep(sleepy)
        _log.info("return")
        return "<BODY><H1>Holla %r</H1></BODY>" % sleepy

Appengine Prod Log:

2011-03-23 17:27:47.389 /delay/2.0 200 4018ms 23cpu_ms 0kb Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16,gzip(gfe)
I 2011-03-23 17:27:43.374 create rpc
I 2011-03-23 17:27:43.375 make fetch call
I 2011-03-23 17:27:43.377 sleep for 2.0
I 2011-03-23 17:27:45.378 memcache's wait
I 2011-03-23 17:27:45.382 sleep again
I 2011-03-23 17:27:47.382 return
W 2011-03-23 17:27:47.383 Found 1 RPC request(s) without matching response (presumably due to timeouts or other errors)
I 2011-03-23 17:27:47.386 Saved; key: __appstats__:063300, part: 66 bytes, full: 6869 bytes, overhead: 0.000 + 0.003; link: http://<myapp>.appspot.com/_ah/stats/details?tim
2011-03-23 17:27:45.452 /hereiam 404 10ms 0cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine; appid: s~<myapp>),gzip(gfe)

Async url fetch dispatched when memcache.get calls wait()

E 2011-03-23 17:27:45.446 404: Not Found Traceback (most recent call last): File "distlib/tipfy/__init__.py", line 430, in wsgi_app rv = self.dispatch(request) File "di
I 2011-03-23 17:27:45.449 Saved; key: __appstats__:065400, part: 27 bytes, full: 835 bytes, overhead: 0.000 + 0.002; link: http://<myapp>.appspot.com/_ah/stats/details?time
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文