Making Python requests and BeautifulSoup faster
Hi, I want to scrape some data from a website (users and each user's data) and save it in a SQLite database.
- I want to make this process faster,
- and which is better: committing the data after every field, or committing once after scraping is finished?
I use lxml and cchardet, but they didn't make a noticeable difference.
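On the commit question, a minimal sketch (assuming a hypothetical `users` table with `id` and `name` columns): in SQLite, every commit forces a sync to disk, so batching all inserts into one transaction is usually much faster than committing per row or per field.

```python
import sqlite3

def save_users(rows, db_path=":memory:"):
    """Insert all scraped rows in a single transaction (one commit)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)"
    )
    # The `with conn:` block opens one transaction and commits once at the
    # end, instead of paying a disk sync for every individual INSERT.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)", rows
        )
    return conn

conn = save_users([(1, "alice"), (2, "bob")])
```

The trade-off: committing once at the end risks losing everything if the scraper crashes midway, so committing in chunks (say, every few hundred rows) is a common middle ground.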
1 Answer
If you want to reduce the time spent in the scraping part, I suggest building a multithreaded program to make the requests.

concurrent.futures
is one of the easiest ways to multithread these kinds of requests, in particular using ThreadPoolExecutor. The documentation even includes a simple multithreaded URL-request example. Also, you may want to check out the Python Scrapy framework: it scrapes data concurrently and comes with many features such as AutoThrottle, rotating proxies and user agents, and it integrates easily with databases.
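The ThreadPoolExecutor approach can be sketched as below. Here `fetch_user` is a stand-in that simulates one blocking request; in the real scraper it would call something like `requests.get(...)` and parse the page with BeautifulSoup (the URL and field names are assumptions, not from the original site).

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_user(user_id):
    """Stand-in for one blocking HTTP request + parse.

    In the real scraper this would be e.g.:
        resp = requests.get(f"https://example.com/users/{user_id}")
        soup = BeautifulSoup(resp.text, "lxml")
    """
    time.sleep(0.05)  # simulate network latency
    return {"id": user_id, "name": f"user{user_id}"}

def scrape_all(user_ids, max_workers=8):
    # While one thread waits on the network, others keep working,
    # so the waiting time of many requests overlaps.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_user, user_ids))

if __name__ == "__main__":
    start = time.perf_counter()
    users = scrape_all(range(20))
    print(f"fetched {len(users)} users in {time.perf_counter() - start:.2f}s")
```

Since the work is I/O-bound (waiting on the network), threads help despite Python's GIL; collect the results in memory and write them to SQLite from the main thread, since a sqlite3 connection should not be shared across threads by default.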