Need help structuring parallel HTTP requests

Posted 2025-01-03 11:18:08 | 793 words | 1 view | 0 comments

Here's my case. I have three tables: Book, Publisher and Price. I have a management command that loops over each book and, for each book, queries the publisher to get the price, which it then stores in the Price table. It's a very simple HTTP GET or UDP request that I make to get the price. Here is what the skeleton of my code looks like:

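from django.db import transaction

# commit_on_success wraps the whole command in one transaction
# (newer Django versions use transaction.atomic instead).
@transaction.commit_on_success
def handle(self, *args, **options):
    for book in Book.objects.all():
        # publisher_set is already a related manager, so call .all() on it directly
        for publisher in book.publisher_set.all():
            price = check_the_price(publisher.url, book.isbn)
            Price.objects.create(book=book, publisher=publisher, price=price)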

The code is simple, but it gets really slow and time-consuming when I have 10,000 books. I could easily speed this up by making parallel HTTP requests. With 50 parallel requests this would be done in a jiffy, but I don't know how to structure the code.

My site itself is a very small, lightweight site, and I'm trying to stay away from RabbitMQ/Celery. I just feel it's a big thing to take on right now.

Any recommendations on how to do this while maintaining transactional integrity?


Edit #1: This is used as an analogy for what I'm actually doing. In writing this analogy I forgot to mention that I also need to make a few UDP requests.

Comments (1)

风吹短裙飘 2025-01-10 11:18:08

You could use the requests package, which offers quasi-parallel request processing based on gevent's green threads (this functionality has since been moved into the separate grequests package). It lets you build a number of request objects which are then executed in "parallel". See this example.

Green threads do not actually run in parallel, but cooperatively yield execution control. gevent can patch the standard library's I/O functions (e.g. the ones used by urllib2) so that they yield control whenever they would otherwise block on I/O. The requests package wraps that into a single function call which takes a number of requests and returns a number of response objects. It doesn't get much easier than that.
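As a rough illustration of the overall structure, here is a minimal sketch that fans the price lookups out and gathers the results before anything touches the database. Note that it swaps in a plain thread pool from the standard library (concurrent.futures) rather than gevent's green threads, which also makes it usable for the UDP lookups. The check_the_price body, the example URLs and the fetch_prices helper are all hypothetical stand-ins, not part of the question's real code.

```python
from concurrent.futures import ThreadPoolExecutor


def check_the_price(url, isbn):
    """Hypothetical stand-in for the real HTTP GET or UDP price lookup."""
    return 10 + len(url) % 5


def fetch_prices(jobs, max_workers=50):
    """Run check_the_price for every (url, isbn) pair on a thread pool.

    Returns (url, isbn, price) tuples in the same order as jobs.
    """
    def fetch(job):
        url, isbn = job
        return url, isbn, check_the_price(url, isbn)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises worker exceptions here
        return list(pool.map(fetch, jobs))


# Assumed example inputs; in the command these would come from Book/Publisher rows.
jobs = [
    ("http://publisher-a.example/prices", "9780000000011"),
    ("http://publisher-b.example/prices", "9780000000022"),
]
results = fetch_prices(jobs)

# Only after all lookups have finished would the command write the rows,
# in the main thread and inside its single transaction.
for url, isbn, price in results:
    print(url, isbn, price)
```

Because only the network calls run concurrently and all Price rows would be created afterwards from one thread, the database work can stay inside the single transaction that wraps handle(). If any lookup raises, the exception surfaces before any row is written.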
