Python GUI Scraper hanging issue

Posted 2024-08-31 04:38:48

I wrote a scraper in Python a while back, and it worked fine from the command line. I have now made a GUI for the application, but I am having trouble with one issue. When I try to update text inside the GUI (e.g. 'fetching URL 12/50'), nothing changes, because the scraper function is busy grabbing 100+ links. Also, when the program goes from one scraping function, to a function that should update the GUI, to another scraping function, the GUI update seems to be skipped while the next scraping function runs. For example:

scrapeLinksA()  # takes 20 seconds
updateInfo("LinksA done")
scrapeLinksB()  # takes another 20 seconds

In the above example, updateInfo never appears to be executed unless I end the program with a KeyboardInterrupt.

I think the solution is threading, but I'm not sure. What can I do to fix this?

I am using:

PyQt4
urllib2
BeautifulSoup

爱殇璃 2024-09-07 04:38:48

Lukáš Lalinský's answer is very good.

Another possibility would be to use PyQt's own threads (QThread).

If the problem is merely the 'updating' part (and not the need for asynchronous processing), try putting this call:

QCoreApplication.processEvents()

between scrapeLinksA and scrapeLinksB, right after the updateInfo call, to see if that helps (it briefly hands control back to the event loop so that any pending events, such as paint requests, get processed).
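To illustrate why the update seems to be skipped, here is a toy model in plain Python. It is not Qt's actual implementation: ToyEventLoop, post, process_events and run_scrape are hypothetical names made up for this sketch. The idea it tries to show is that changing a widget's text only queues a repaint, and the repaint is handled when the event queue gets a chance to run.

```python
from collections import deque


class ToyEventLoop:
    """Toy stand-in for a GUI event queue (hypothetical, not Qt code)."""

    def __init__(self):
        self.pending = deque()  # queued "paint" requests
        self.painted = []       # what the user would actually see

    def post(self, text):
        # Like setting a label's text: the repaint is only queued.
        self.pending.append(text)

    def process_events(self):
        # Plays the role of QCoreApplication.processEvents().
        while self.pending:
            self.painted.append(self.pending.popleft())


def run_scrape(loop, pump_events):
    # scrapeLinksA() would block here for 20 seconds.
    loop.post("LinksA done")  # what updateInfo effectively does
    if pump_events:
        loop.process_events()
    # scrapeLinksB() would block here; return what is visible meanwhile.
    return list(loop.painted)


without_pump = run_scrape(ToyEventLoop(), pump_events=False)
with_pump = run_scrape(ToyEventLoop(), pump_events=True)
print(without_pump)  # []
print(with_pump)     # ['LinksA done']
```

Without the extra call, the message is still sitting in the queue while the second scrape runs, which matches the behaviour described in the question.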

If that doesn't help, please post the source of updateInfo.
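For the threading route, the general shape is a worker that does the scraping and passes progress messages back to the GUI thread. The sketch below uses only the standard library (threading and queue, so it assumes Python 3) so that it runs without Qt. scrape_links, worker and run are hypothetical names, and time.sleep stands in for the slow network fetch. In a PyQt application the worker would more likely be a QThread that emits a signal connected to updateInfo.

```python
import queue
import threading
import time


def scrape_links(name, count, progress):
    """Hypothetical stand-in for scrapeLinksA/B: reports progress per URL."""
    for i in range(1, count + 1):
        time.sleep(0.01)  # placeholder for a slow page fetch
        progress.put("%s: fetching URL %d/%d" % (name, i, count))
    progress.put("%s done" % name)


def worker(progress):
    # Runs in the background thread, so the main thread is never blocked
    # by the scraping itself.
    scrape_links("LinksA", 3, progress)
    scrape_links("LinksB", 2, progress)
    progress.put(None)  # sentinel: all scraping finished


def run():
    progress = queue.Queue()
    thread = threading.Thread(target=worker, args=(progress,))
    thread.start()
    shown = []
    # Main thread: only consumes messages. This loop blocks on get() for
    # simplicity; in a Qt app the messages would instead arrive through a
    # signal/slot, or be polled from a timer callback with get_nowait().
    while True:
        message = progress.get()
        if message is None:
            break
        shown.append(message)  # here updateInfo(message) would be called
    thread.join()
    return shown


messages = run()
print(messages[-1])  # LinksB done
```

The point of the design is that only the main thread touches the GUI; the worker never calls updateInfo directly, it just hands over strings.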
