What are my options for multithreaded/concurrent programming in Python?

Posted on 2024-08-21 09:57:45

I'm writing a simple site spider and I've decided to take this opportunity to learn something new in concurrent programming in Python. Instead of using threads and a queue, I decided to try something else, but I don't know what would suit me.

I have heard about Stackless, Celery, Twisted, Tornado, and others. I would rather not set up a database and all of Celery's other dependencies, but I would if it is a good fit for my purpose.

My question is: what strikes a good balance between suitability for my app and usefulness in general? I have taken a look at the tasklets in Stackless, but I'm not sure whether the urlopen() calls would block or whether they would execute in parallel. I haven't seen that mentioned anywhere.

Can someone give me a few details on my options and what would be best to use?

Thanks.
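A note on the urlopen() concern above: tasklets are scheduled cooperatively within one thread, so a call that blocks the thread holds up every other tasklet until it returns. The sketch below illustrates that general behaviour using the standard library's asyncio as a stand-in for Stackless (an assumption made so the example runs without extra installs). `blocking_fetch` and `cooperative_fetch` are hypothetical helpers that only sleep; they imitate a blocking urlopen() and a non-blocking client, and no network access takes place.

```python
import asyncio
import time


def blocking_fetch(url, delay=0.2):
    # Hypothetical stand-in for urlopen(): holds the whole thread while waiting.
    time.sleep(delay)
    return url


async def cooperative_fetch(url, delay=0.2):
    # Hypothetical stand-in for a non-blocking client: yields to the
    # scheduler while waiting, so other tasks can make progress.
    await asyncio.sleep(delay)
    return url


async def task_with_blocking_call(url):
    # A cooperative task that makes a blocking call never yields mid-call.
    return blocking_fetch(url)


async def timed_gather(make_task, urls):
    # Run one task per URL and report the results with the elapsed time.
    start = time.perf_counter()
    results = await asyncio.gather(*(make_task(u) for u in urls))
    return results, time.perf_counter() - start


def compare(urls):
    # Returns (seconds with blocking calls, seconds with cooperative waits).
    _, blocked = asyncio.run(timed_gather(task_with_blocking_call, urls))
    _, overlapped = asyncio.run(timed_gather(cooperative_fetch, urls))
    return blocked, overlapped


if __name__ == "__main__":
    blocked, overlapped = compare(["a", "b", "c", "d"])
    print(f"blocking calls: {blocked:.2f}s, cooperative waits: {overlapped:.2f}s")
```

With blocking calls the total time grows with the number of URLs, while the cooperative version waits on all of them at once. Neither version runs Python code in parallel; the gain comes only from overlapping the waiting.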

Comments (3)

温柔女人霸气范 2024-08-28 09:57:45

Tornado is a web server, so it wouldn't help you much in writing a spider. Twisted is much more general (and, inevitably, complex), good for all kinds of networking tasks (and with good integration with the event loop of several GUI frameworks). Indeed, there used to be a twisted.web.spider (but it was removed years ago, since it was unmaintained -- so you'll have to roll your own on top of the facilities Twisted does provide).

无声情话 2024-08-28 09:57:45

I must say that Twisted gets my vote.

Performing event-driven tasks is fairly straightforward in Twisted. Integration with other important system components such as GTK+ and DBus is very easy.

The HTTP client support is basic for now but is improving (in versions after 9.0.0); see the related question.

The added bonus is that Twisted is available in the Ubuntu default repository ;-)

水水月牙 2024-08-28 09:57:45

For a quick look at package sizes, see ohloh.net/p/compare. Of course, source size is only a rough metric (what I'd really like is the number of pages of documentation, the number of pages of examples, and the dependencies), but it can help.
