Need to speed up my feed parsing and processing in PHP
I'm keeping myself busy working on an app that gets a feed from the Twitter search API and then extracts all the URLs from each status in the feed. Since lots of the URLs are shortened, I check the response headers of each URL to get the real URL it leads to.
For a feed of 100 entries this process can take more than a minute! (I'm still working locally on my PC.)
I initialize the cURL resource once per feed and keep it open until I've finished all the URL expansions. This helped a bit, but I'm still wary that I'll be in trouble when going live.
Any ideas on how to speed things up?
Comments (2)
The issue is, as Asaph points out, that you're doing this in a single-threaded process, so all of the network latency is being serialized.
Does this all have to happen inside an HTTP request, or can you queue the URLs somewhere and have a background process chew through them?
If you can do the latter, that's the way to go.
If you must do the former, you can do the same sort of thing.
Either way, you want to look at ways to chew through the requests in parallel. You could write a command-line PHP script that forks to accomplish this, though you might be better off writing such a beast in a language that supports threading, such as Ruby or Python.
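Another way to run the requests in parallel from a single PHP process, without forking or threads, is the cURL extension's `curl_multi_*` functions. This is not what the answer above proposes, only a minimal sketch of an alternative. The helper names `extract_urls` and `expand_urls`, the URL regex, the redirect limit, and the timeout are all assumptions made for illustration. Some servers treat HEAD requests differently from GET, so take it as a starting point rather than a finished solution.

```php
<?php
// Hypothetical helper: pull http(s) URLs out of one status text.
function extract_urls(string $text): array
{
    preg_match_all('~https?://[^\s<>"\']+~i', $text, $matches);
    return array_values(array_unique($matches[0]));
}

// Hypothetical helper: resolve many short URLs concurrently.
// Returns [original URL => final URL]; failed lookups keep the original URL.
function expand_urls(array $urls, int $timeoutSeconds = 10): array
{
    $multi = curl_multi_init();
    $handles = [];   // keeps the easy handles referenced while transfers run
    $expanded = [];

    foreach ($urls as $url) {
        $expanded[$url] = $url; // fallback if the request fails
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true,  // HEAD request: headers only
            CURLOPT_FOLLOWLOCATION => true,  // follow the redirect chain
            CURLOPT_MAXREDIRS      => 5,     // assumed limit
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => $timeoutSeconds,
            CURLOPT_PRIVATE        => $url,  // remember which URL this handle is for
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for activity instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the result of every finished transfer.
    while (($info = curl_multi_info_read($multi)) !== false) {
        $ch = $info['handle'];
        if ($info['result'] === CURLE_OK) {
            $original = curl_getinfo($ch, CURLINFO_PRIVATE);
            $expanded[$original] = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
        }
        curl_multi_remove_handle($multi, $ch);
    }

    return $expanded;
}
```

With this shape the total wait is roughly that of the slowest request rather than the sum of all of them, e.g. `expand_urls(extract_urls($statusText))` for each status, or one call with every URL in the feed.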
You may be able to get significantly better performance by making your application multithreaded. PHP itself does not support multithreading directly, but you can launch several PHP processes, each working on its own share of the processing concurrently.
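As a rough sketch of the "several PHP processes" idea, the parent script below splits the work into chunks and starts one PHP CLI process per chunk with `proc_open`. The function name `run_in_processes`, the JSON-over-pipes protocol, and passing the worker as a code string are assumptions chosen to keep the example self-contained; a real setup would more likely point at a worker script file. It assumes PHP 7.4+ (for the array form of `proc_open`) and chunks small enough to fit in the pipe buffer.

```php
<?php
// Hypothetical helper: split $items into chunks and hand each chunk to its own
// PHP CLI process. $workerCode is PHP source (no opening tag) that reads a JSON
// array from STDIN and echoes a JSON array of results.
function run_in_processes(array $items, int $processCount, string $workerCode): array
{
    if ($items === []) {
        return [];
    }
    $chunkSize = (int) ceil(count($items) / max(1, $processCount));
    $running = [];

    // Start every worker first so they all run at the same time.
    foreach (array_chunk($items, $chunkSize) as $chunk) {
        $pipes = [];
        $process = proc_open(
            [PHP_BINARY, '-r', $workerCode],            // array form needs PHP 7.4+
            [0 => ['pipe', 'r'], 1 => ['pipe', 'w']],   // child's stdin and stdout
            $pipes
        );
        if ($process === false) {
            throw new RuntimeException('Could not start worker process');
        }
        fwrite($pipes[0], json_encode($chunk)); // send this worker its share
        fclose($pipes[0]);
        $running[] = [$process, $pipes[1]];
    }

    // Then collect each worker's output, in the order the workers were started.
    $results = [];
    foreach ($running as [$process, $stdout]) {
        $decoded = json_decode(stream_get_contents($stdout), true);
        fclose($stdout);
        proc_close($process);
        if (is_array($decoded)) {
            $results = array_merge($results, $decoded);
        }
    }
    return $results;
}
```

For the URL case, each worker would read its list of short URLs, expand them, and echo the expanded list back as JSON.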