Asynchronous URLfetch when we don't care about the result? [Python]

Posted on 2024-10-26 09:16:42

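In some code I'm writing for GAE I need to periodically perform a GET on a URL on another system, in essence "pinging" it, and I'm not terribly concerned whether the request fails, times out or succeeds.

As I basically want to "fire and forget" and not slow down my own code by waiting for the request, I'm using an asynchronous urlfetch and not calling get_result().

In my log I'm getting a warning:

Found 1 RPC request(s) without matching response (presumably due to timeouts or other errors)

Am I missing an obviously better way to do this? A Task Queue or Deferred Task seems (to me) overkill in this instance.

Any input would be appreciated.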


芸娘子的小脾气 2024-11-02 09:16:42

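A task queue task is your best option here. The message you're seeing in the log indicates that the request is waiting for the URLFetch to complete before returning, so the asynchronous call isn't helping. You say a task is "overkill", but tasks are really very lightweight, and definitely the best way to do this. Deferred will even let you defer the fetch call directly, rather than having to write a function to call.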
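As a rough illustration of deferring the fetch call directly, here is a minimal sketch. It assumes the legacy App Engine Python runtime, where google.appengine.ext.deferred and google.appengine.api.urlfetch are available. The helper name enqueue_ping, its defer and fetch parameters (there only so the sketch can be exercised outside App Engine) and the 5 second deadline are all hypothetical choices, not part of the original answer.

```python
try:
    # Only importable inside the legacy App Engine Python runtime / SDK.
    from google.appengine.api import urlfetch
    from google.appengine.ext import deferred
except ImportError:  # lets the sketch be imported and tested elsewhere
    urlfetch = deferred = None


def enqueue_ping(url, defer=None, fetch=None):
    """Hand the GET to the task queue so the current request never waits on it.

    `defer` and `fetch` are hypothetical injection points for testing; on
    App Engine they default to deferred.defer and urlfetch.fetch.
    """
    if defer is None:
        defer = deferred.defer
    if fetch is None:
        fetch = urlfetch.fetch
    # deferred.defer(callable, *args, **kwargs) serializes the call and runs
    # it later inside a task; deadline=5 is passed through to the fetch.
    return defer(fetch, url, deadline=5)
```

Inside a handler this would be called as enqueue_ping("http://example.com/ping"). One thing to keep in mind: if the deferred function raises, the task is normally retried, so a ping you truly don't care about may need a small wrapper that swallows errors.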


等风来 2024-11-02 09:16:42

How long does it take for the async_url_fetch to complete and how long does it take to provide your response?

Here is a possible approach that leverages the way the API works in Python.

Some points to consider.

Many web servers and reverse proxies will not cancel a request once it has been started. So if the remote server you are pinging queues the request but takes a long time to service it, set a deadline with create_rpc(deadline=X) so that the call returns after X seconds due to timeout. The ping may still succeed. This technique works against App Engine itself as well.
GAE RPCs
- RPCs queued via make_call/make_fetch_call are actually only dispatched once one of them is waited on.
- Also, any RPC that has just finished will have its callback called when the RPC currently being waited on finishes.
- You can create an async urlfetch RPC and enqueue it with make_fetch_call as early as possible in handling your request; don't wait on it yet.
- Do the actual page-serving work, such as memcache/datastore calls. The first of these calls will perform a wait, which will dispatch your async urlfetch.
- If the urlfetch completes during this other activity, the callback on the urlfetch will be called, allowing you to handle the result.
- If you do call get_result(), it returns immediately when the result is ready; otherwise it blocks in wait() until the fetch returns or the deadline passes.

To recap.

Prepare the long-running url_fetch with a reasonable deadline and a callback. Enqueue it using make_fetch_call. Do the work you wanted to do for the page. Return the page regardless of whether the url_fetch completed or hit its deadline, and without waiting for it.
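The recap above could be sketched roughly as follows. This assumes the legacy App Engine Python urlfetch API (create_rpc, make_fetch_call, and the callback attribute on the RPC object). The function name start_ping, its api parameter (there only so the sketch can be run outside App Engine) and the 2 second default deadline are hypothetical.

```python
import logging

try:
    # Only importable inside the legacy App Engine Python runtime / SDK.
    from google.appengine.api import urlfetch
except ImportError:  # lets the sketch be imported and tested elsewhere
    urlfetch = None


def start_ping(url, deadline=2.0, api=None):
    """Queue a fire-and-forget GET and return the RPC without waiting on it.

    `api` is a hypothetical injection point for testing; on App Engine it
    defaults to the urlfetch module.
    """
    if api is None:
        api = urlfetch
    rpc = api.create_rpc(deadline=deadline)

    def on_done():
        # Only runs if the RPC finishes while some RPC is being waited on.
        try:
            result = rpc.get_result()
            logging.info("ping %s -> %s", url, result.status_code)
        except Exception as exc:  # timeouts and fetch errors are expected
            logging.info("ping %s failed: %r", url, exc)

    rpc.callback = on_done
    api.make_fetch_call(rpc, url)
    return rpc
```

A handler would call start_ping(...) first, then do its normal memcache/datastore work and return the page without ever calling wait() or get_result() on the returned RPC.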

The underlying RPC layer in GAE is entirely asynchronous, and a more sophisticated way to choose which RPCs you wish to wait on seems to be in the works.

These examples use sleep and a url_fetch to a second instance of the same app.

Example of wait() dispatching rpc work:

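import logging
from time import sleep

from google.appengine.api.urlfetch import create_rpc, make_fetch_call
# RequestHandler is assumed to come from tipfy, as the tracebacks below suggest
from tipfy import RequestHandler

_log = logging.getLogger(__name__)


class AsyncHandler(RequestHandler):

    def get(self, sleepy=0.0):
        _log.info("create rpc")
        rpc = create_rpc()
        _log.info("make fetch call")
        # url will generate a 404
        make_fetch_call(rpc, url="http://<myapp>.appspot.com/hereiam")
        _log.info("sleep for %r", sleepy)
        sleep(sleepy)
        _log.info("wait")
        # per the log below, the fetch only reaches the target once wait() runs
        rpc.wait()
        _log.info("get_result")
        rpc.get_result()
        _log.info("return")
        return "<BODY><H1>Holla %r</H1></BODY>" % sleepy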

Log of the request. wait() is called after sleeping for 4 seconds, and the fetch is only dispatched at that point:

2011-03-23 17:08:35.673 /delay/4.0 200 4093ms 23cpu_ms 0kb Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16,gzip(gfe)
I 2011-03-23 17:08:31.583 create rpc
I 2011-03-23 17:08:31.583 make fetch call
I 2011-03-23 17:08:31.585 sleep for 4.0
I 2011-03-23 17:08:35.585 wait
I 2011-03-23 17:08:35.663 get_result
I 2011-03-23 17:08:35.663 return
I 2011-03-23 17:08:35.669 Saved; key: __appstats__:011500, part: 48 bytes, full: 4351 bytes, overhead: 0.000 + 0.006; link: http://<myapp>.appspot.com/_ah/stats/details?tim
2011-03-23 17:08:35.636 /hereiam 404 9ms 0cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine; appid: s~<myapp>),gzip(gfe)

The asynchronously dispatched call, as logged by the receiving instance:

E 2011-03-23 17:08:35.632 404: Not Found Traceback (most recent call last): File "distlib/tipfy/__init__.py", line 430, in wsgi_app rv = self.dispatch(request) File "di
I 2011-03-23 17:08:35.634 Saved; key: __appstats__:015600, part: 27 bytes, full: 836 bytes, overhead: 0.000 + 0.002; link: http://<myapp>.appspot.com/_ah/stats/details?time

The same handler, this time using a memcache RPC's wait to kick off the work:

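import logging
from time import sleep

from google.appengine.api import memcache
from google.appengine.api.urlfetch import create_rpc, make_fetch_call
# RequestHandler is assumed to come from tipfy, as the tracebacks below suggest
from tipfy import RequestHandler

_log = logging.getLogger(__name__)


class AsyncHandler(RequestHandler):

    def get(self, sleepy=0.0):
        _log.info("create rpc")
        rpc = create_rpc()
        _log.info("make fetch call")
        make_fetch_call(rpc, url="http://<myapp>.appspot.com/hereiam")
        _log.info("sleep for %r", sleepy)
        sleep(sleepy)
        _log.info("memcache's wait")
        # the wait inside this memcache call is what dispatches the queued fetch
        memcache.get('foo')
        _log.info("sleep again")
        sleep(sleepy)
        _log.info("return")
        # the urlfetch rpc is never waited on, hence the warning in the log below
        return "<BODY><H1>Holla %r</H1></BODY>" % sleepy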

App Engine production log:

2011-03-23 17:27:47.389 /delay/2.0 200 4018ms 23cpu_ms 0kb Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16,gzip(gfe)
I 2011-03-23 17:27:43.374 create rpc
I 2011-03-23 17:27:43.375 make fetch call
I 2011-03-23 17:27:43.377 sleep for 2.0
I 2011-03-23 17:27:45.378 memcache's wait
I 2011-03-23 17:27:45.382 sleep again
I 2011-03-23 17:27:47.382 return
W 2011-03-23 17:27:47.383 Found 1 RPC request(s) without matching response (presumably due to timeouts or other errors)
I 2011-03-23 17:27:47.386 Saved; key: __appstats__:063300, part: 66 bytes, full: 6869 bytes, overhead: 0.000 + 0.003; link: http://<myapp>.appspot.com/_ah/stats/details?tim
2011-03-23 17:27:45.452 /hereiam 404 10ms 0cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine; appid: s~<myapp>),gzip(gfe)

The async urlfetch, dispatched when memcache.get calls wait(), as logged by the receiving instance:

E 2011-03-23 17:27:45.446 404: Not Found Traceback (most recent call last): File "distlib/tipfy/__init__.py", line 430, in wsgi_app rv = self.dispatch(request) File "di
I 2011-03-23 17:27:45.449 Saved; key: __appstats__:065400, part: 27 bytes, full: 835 bytes, overhead: 0.000 + 0.002; link: http://<myapp>.appspot.com/_ah/stats/details?time
