
What happens if an application calls more than 10 asynchronous URL fetches on Google App Engine?

Posted on 2024-09-18 04:43:51

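Reading the Google App Engine documentation on asynchronous URL Fetch:

The app can have up to 10 simultaneous asynchronous URL Fetch calls

What happens if an application calls more than 10 asynchronous fetches at a time?
Does Google App Engine raise an exception, or does it simply queue the remaining calls and serve them later?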


蓬勃野心 2024-09-25 04:43:52

Umm, Swizec is incorrect. This is easy enough to test:

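from google.appengine.api import urlfetch

# Start 19 asynchronous fetches, well above the documented limit of 10.
rpcs = []
for i in range(1, 20):
    rpc = urlfetch.create_rpc()
    urlfetch.make_fetch_call(rpc, "http://stackoverflow.com/questions/3639855/what-happens-if-i-call-more-than-10-asynchronous-url-fetch")
    rpcs.append(rpc)

# Wait for every fetch to finish and read its HTTP status code.
for rpc in rpcs:
    status = rpc.get_result().status_code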

This does not raise any exceptions. In fact, it works just fine! Note that your results may vary for non-billable applications.
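Since results may vary between applications, it can help to count outcomes rather than discard them. The following is a minimal sketch, not part of the original test: `tally_results` is a hypothetical helper, and it assumes only that each RPC object exposes `get_result()` and that a failed fetch surfaces as an exception from that call.

```python
def tally_results(rpcs):
    """Wait on every RPC-like object and count successes and failures."""
    succeeded = 0
    failed = []
    for index, rpc in enumerate(rpcs):
        try:
            rpc.get_result()
            succeeded += 1
        except Exception as error:
            # Assumption: a fetch that could not be served raises here.
            failed.append((index, type(error).__name__))
    return succeeded, failed
```

Passing the list of RPCs from the test above would return how many fetches completed, plus the position and exception type of each one that did not.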

What Swizec is reporting is a different problem, related to the maximum number of simultaneous connections INTO your application. For billable apps there is no practical limit here, by the way; it just scales out (subject to the 1000 ms rule).

GAE has no way of knowing that your request handler will issue a blocking URL fetch, so the 500 errors he is seeing on incoming connections are not related to what his app is actually doing. (That is an oversimplification: if your average request response time is over 1000 ms, your likelihood of getting 500s increases.)


满栀 2024-09-25 04:43:52

This is an old question, but I believe the accepted answer is incorrect or outdated and may confuse people. It has been a couple of months since I actually tested this, but in my experience Swizec is quite right: GAE will not queue, but rather fail, most asynchronous URL fetches that exceed the limit of around 10 simultaneous ones per request.

See https://developers.google.com/appengine/docs/python/urlfetch/#Python_Making_requests and https://groups.google.com/forum/#!topic/google-appengine/EoYTmnDvg8U for a description of the limit.

David Underhill has come up with a URL Fetch Manager for Python, which queues asynchronous URL fetches that exceed the limit in application code.
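To illustrate the idea of queuing in application code, here is a minimal sketch. It is not David Underhill's implementation and does not call the App Engine API: `FetchQueue` and the `start_fetch(url, on_done)` callable are hypothetical names, with `start_fetch` standing in for whatever starts one asynchronous fetch and later calls `on_done(result)`.

```python
from collections import deque


class FetchQueue(object):
    """Keeps at most `limit` fetches in flight; extra requests wait in a queue."""

    def __init__(self, start_fetch, limit=10):
        # Hypothetical hook: start_fetch(url, on_done) begins one async fetch
        # and calls on_done(result) once that fetch has completed.
        self._start_fetch = start_fetch
        self._limit = limit
        self._pending = deque()
        self._active = 0

    def submit(self, url, callback):
        """Queue a fetch; it starts as soon as a slot is free."""
        self._pending.append((url, callback))
        self._pump()

    def _pump(self):
        # Start queued fetches until the limit is reached or the queue is empty.
        while self._active < self._limit and self._pending:
            url, callback = self._pending.popleft()
            self._active += 1
            self._start_fetch(url, self._make_done(callback))

    def _make_done(self, callback):
        def done(result):
            # A finished fetch frees its slot, then the next queued one starts.
            self._active -= 1
            callback(result)
            self._pump()
        return done
```

With a limit of 10, submitting 25 URLs starts 10 fetches right away, and each completion starts one more until the queue drains.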

I have implemented something similar for Java, which synchronously blocks (due to the lack of a callback function or ListenableFutures) additional requests:

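import java.io.IOException;
import java.net.URL;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.Future;

import com.google.appengine.api.urlfetch.HTTPRequest;
import com.google.appengine.api.urlfetch.HTTPResponse;
import com.google.appengine.api.urlfetch.URLFetchService;
import com.google.appengine.api.urlfetch.URLFetchServiceFactory;

/**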
 * A URLFetchService wrapper that ensures that only 10 simultaneous asynchronous fetch requests are scheduled. If the
 * limit is reached, the fetchAsync operations will block until another request completes.
 */
public class BlockingURLFetchService implements URLFetchService {
    private final static int MAX_SIMULTANEOUS_ASYNC_REQUESTS = 10;

    private final URLFetchService urlFetchService = URLFetchServiceFactory.getURLFetchService();
    private final Queue<Future<HTTPResponse>> activeFetches = new LinkedList<>();

    @Override
    public HTTPResponse fetch(URL url) throws IOException {
        return urlFetchService.fetch(url);
    }

    @Override
    public HTTPResponse fetch(HTTPRequest request) throws IOException {
        return urlFetchService.fetch(request);
    }

    @Override
    public Future<HTTPResponse> fetchAsync(URL url) {
        block();

        Future<HTTPResponse> future = urlFetchService.fetchAsync(url);
        activeFetches.add(future);
        return future;
    }

    @Override
    public Future<HTTPResponse> fetchAsync(HTTPRequest request) {
        block();

        Future<HTTPResponse> future = urlFetchService.fetchAsync(request);
        activeFetches.add(future);
        return future;
    }

    // Busy-waits until fewer than MAX_SIMULTANEOUS_ASYNC_REQUESTS fetches are pending.
    private void block() {
        while (activeFetches.size() >= MAX_SIMULTANEOUS_ASYNC_REQUESTS) {
            // Limit reached: scan for a completed fetch and drop it to free a slot.
            Iterator<Future<HTTPResponse>> it = activeFetches.iterator();
            while (it.hasNext()) {
                if (it.next().isDone()) {
                    it.remove();
                    break;
                }
            }
        }
    }
}
