Request throttling on Google App Engine's cloud infrastructure in compliance with Stack Exchange API limits

Published on 2024-09-27 10:06:54


I have been writing a Google Chrome extension for Stack Exchange. It's a simple extension that allows you to keep track of your reputation and get notified of comments on Stack Exchange sites.

Currently I've encountered some issues that I can't handle myself.
My extension uses Google App Engine as its back-end to make external requests to the Stack Exchange API. A single client request from the extension for new comments on one site can cause plenty of requests to the API endpoint to prepare the response, even for a less active user. The average user has accounts on at least 3 sites in the Stack Exchange network, and some have more than 10!

The Stack Exchange API has request limits:

- A single IP address can only make a certain number of API requests per day (10,000).
- The API will cut my requests off if I make more than 30 requests over 5 seconds from a single IP address.

It's clear that all requests should be throttled to 30 per 5 seconds. Currently I've implemented request-throttling logic based on a distributed lock in memcached: I'm using memcached as a simple lock manager to coordinate the activity of GAE instances and throttle UrlFetch requests.
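For illustration, the per-window limit can also be enforced with a fixed-window counter rather than a lock. This is a sketch, not the asker's implementation: a plain dict stands in for memcached, and in production the counter would be memcache's atomic `incr()` on a key derived from the current window.

```python
import time

class WindowThrottle:
    """At most `limit` requests per `window` seconds, shared by all workers.
    A dict stands in for memcached here; on GAE the same idea maps onto an
    atomic memcache.incr() on a key named after the current window."""

    def __init__(self, limit=30, window=5.0, clock=time.time):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.counts = {}  # window index -> request count (memcached stand-in)

    def try_acquire(self):
        bucket = int(self.clock() // self.window)  # fixed 5-second window
        n = self.counts.get(bucket, 0)
        if n >= self.limit:
            return False  # over the limit: caller should back off and retry
        self.counts[bucket] = n + 1
        return True
```

A denied `try_acquire()` would translate into deferring the UrlFetch (for example, re-queuing a task) rather than dropping the request.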
But I think it's a big waste to limit such a powerful infrastructure to no more than 30 requests per 5 seconds. Such an API request rate doesn't allow me to keep developing new, interesting, and useful features, and one day the app will stop working properly altogether.
My app now has 90 users and is growing, and I need to come up with a solution for maximizing the request rate.
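For context, a back-of-envelope calculation with the figures from the question (90 users, 3 sites each, 10,000 requests/day) suggests that without caching or batching the daily quota, not just the 30-per-5-seconds window, is the binding constraint:

```python
# Back-of-envelope figures taken from the question.
DAILY_QUOTA = 10000      # API requests per day from a single IP
USERS = 90               # current user count
SITES_PER_USER = 3       # average Stack Exchange accounts per user

requests_per_refresh = USERS * SITES_PER_USER            # 270 calls per full poll
refreshes_per_day = DAILY_QUOTA // requests_per_refresh  # 37 full polls per day
minutes_between = 24 * 60 / refreshes_per_day            # roughly 39 minutes

print(requests_per_refresh, refreshes_per_day, round(minutes_between))
```

So even at the current user count, each user's sites can only be polled about every 39 minutes if every poll hits the API directly.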

As is known, App Engine makes external UrlFetch requests through the same pool of different IPs.
My goal is to write request-throttling functionality that ensures compliance with the API terms of usage and utilizes GAE's distributed capabilities.

So my question is: how do I provide the maximum practical API throughput while complying with the API terms of usage and utilizing GAE's distributed capabilities?

Advice to use another platform/host/proxy is, in my mind, just useless.

Comments (2)

花间憩 2024-10-04 10:06:54


If you are searching for a way to programmatically manage Google App Engine's shared pool of IPs, I firmly believe you are out of luck.

Anyway, quoting this advice from the FAQ, I think you have more than a chance to keep your awesome app running:

What should I do if I need more requests per day?

Certain types of applications - services and websites to name two - can legitimately have much higher per-day request requirements than typical applications. If you can demonstrate a need for a higher request quota, contact us.

EDIT:
I was wrong; actually, you don't have any chance.
Google App Engine apps are doomed.

半﹌身腐败 2024-10-04 10:06:54


First off: I'm using your extension and it rocks!

Have you considered using memcached and caching the results?
Instead of taking the results from the API directly, first try to find them in the cache: if they are there, use them; if not, retrieve them, cache them, and let them expire after X minutes.
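The cache-aside pattern this answer describes could be sketched as follows. A dict stands in for memcached; on App Engine the same shape maps onto `memcache.get()` and `memcache.set(key, value, time=ttl)`.

```python
import time

class TTLCache:
    """Cache-aside for API responses: serve from cache while fresh, refetch
    on a miss or after expiry. A dict stands in for memcached here."""

    def __init__(self, ttl=300.0, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self.store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        entry = self.store.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]  # fresh hit: no API call made
        value = fetch()      # miss or expired: hit the API once
        self.store[key] = (self.clock() + self.ttl, value)
        return value
```

Every cache hit is one API request saved, which directly stretches both the daily quota and the 30-per-5-seconds window.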

Second, try to batch up user requests: instead of asking for the reputation of a single user, ask for the reputation of several users together.
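The batching suggestion builds on the Stack Exchange API's vectorized requests, which accept up to 100 semicolon-separated ids in one call. A sketch of turning a user list into batched request URLs (the 2.3-style URL is an assumption; adjust to the API version in use):

```python
API_ROOT = "https://api.stackexchange.com/2.3"  # assumed API version

def batch_user_urls(user_ids, site, batch_size=100):
    """One vectorized /users/{ids} call replaces up to `batch_size`
    single-user calls: ids are joined with ';' as the API expects."""
    ids = [str(u) for u in user_ids]
    return [
        "%s/users/%s?site=%s" % (API_ROOT, ";".join(ids[i:i + batch_size]), site)
        for i in range(0, len(ids), batch_size)
    ]
```

With 100 ids per call, checking every user on one site costs a single request instead of 90, which changes the arithmetic of the daily quota entirely.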
