AJAX calls checking the status of a background process are timing out
I've looked all over for an answer to this and haven't been able to find one; hopefully someone can point me in the right direction. I think I'm close.
I have two hosts, let's call them host1.mydomain.com and host2.mydomain.com (to get around the two-concurrent-connections-per-host, per-browser issue). They both point to the same content; one is just an alias of the other.
The user goes to host1.mydomain.com, enters some information to register, and clicks Go, which loads an iframe on the same page pointing to a page on host2.mydomain.com. That page calls a PHP script via exec("curl"), sending the request to the background to start a website scraper; the process ID is then stored in the database for the user. After the iframe has successfully loaded (it only takes a second, since it just creates a background process), I set an AJAX request on an interval to periodically check the status of the cURL process (by its process ID in the database) so that I can display the scraper's current step (there are six steps in total). All good so far.
The problem is that the AJAX requests start timing out after step 4 of the scraper (the browser's default timeout is 115/120 seconds), even though they shouldn't, because I'm working with two different hosts. In other words, it's almost as if I'm clogging both connections on host1.mydomain.com, which I can't be, because I initiated the scraper from host2.
The iframe loads this URL: http://host2.mydomain.com/page.php
The PHP script it loads calls:
exec("curl -o /dev/null 'http://host2.mydomain.com/page.php?method=process' > /dev/null & echo $!", $op);
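For context, here's a minimal sketch of what the dispatch side of page.php might look like; the session key, table name, column names, and database credentials below are assumptions for illustration, not from the original post:

    <?php
    // page.php — hypothetical sketch of the dispatch side.
    session_start();
    $db = new PDO('mysql:host=localhost;dbname=scraper', 'user', 'pass'); // illustrative

    if (($_GET['method'] ?? '') !== 'process') {
        // Launch the scraper in the background via curl and capture the shell PID.
        exec("curl -o /dev/null 'http://host2.mydomain.com/page.php?method=process' > /dev/null & echo $!", $op);
        $pid = (int) $op[0];

        // Store the PID against the user so status.php can look it up later.
        $stmt = $db->prepare('UPDATE scrapes SET pid = ?, step = 0 WHERE user_id = ?');
        $stmt->execute([$pid, $_SESSION['user_id']]);
        exit('started');
    }

    // method=process: the actual scraper steps would run here.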
Then my AJAX request polls http://host1.mydomain.com/status.php?pid=x, which looks up the process ID in the database to check the status.
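A sketch of what status.php could look like on the other end; again, the schema and the use of /proc to test whether the process is still alive are assumptions:

    <?php
    // status.php — hypothetical sketch of the polling endpoint.
    header('Content-Type: application/json');

    $pid = (int) ($_GET['pid'] ?? 0);

    // Illustrative PDO lookup; the real schema isn't shown in the post.
    $db = new PDO('mysql:host=localhost;dbname=scraper', 'user', 'pass');
    $stmt = $db->prepare('SELECT step FROM scrapes WHERE pid = ?');
    $stmt->execute([$pid]);
    $step = $stmt->fetchColumn();

    // On Linux, /proc/<pid> exists while the process is running.
    $running = $pid > 0 && file_exists("/proc/$pid");

    echo json_encode(['step' => (int) $step, 'running' => $running]);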
Once the scraper gets to step 4, my AJAX requests start timing out.
I think I confused myself explaining this, but hopefully someone can help me.
2 Answers
Turns out I was successfully getting around the 2-connections-per-server/browser limitation. However, in doing some research I found that the reason my AJAX requests were hanging is that I was trying to access and write to the session data from both requests. Digging a little deeper, I found session_write_close(), which writes the session data and closes the session for reading/writing, releasing the lock. I basically have to call it after each page request of the scraper and then reinitialize the session; this allows my AJAX requests to go through and stops them from being blocked.
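A minimal sketch of the pattern, assuming the scraper runs as one long PHP request; run_scraper_step() and the session keys are hypothetical:

    <?php
    // Inside the long-running scraper request (method=process).
    session_start();
    $userId = $_SESSION['user_id'];      // read what's needed up front (illustrative key)

    // Release the session lock so concurrent requests that also call
    // session_start() (e.g. the AJAX polls) no longer block behind this one.
    session_write_close();

    for ($step = 1; $step <= 6; $step++) {
        run_scraper_step($step);         // hypothetical function for one of the 6 steps

        // To touch the session again, reopen it briefly...
        session_start();
        $_SESSION['current_step'] = $step;
        // ...and immediately release the lock again.
        session_write_close();
    }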
Hopefully someone else finds this useful if they stumble across the same issue.
Cheers!
Jeff
Instead of waiting for the request to finish, you should spawn a new process which runs in the background on the server, and use JavaScript to "check back" every few seconds to see when the execution has finished. Then all you have to do is pick up the result and display it.
Additionally, you might want to make sure that only one PHP process is spawned.
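One way to enforce that (a sketch; the lock-file path and messages are arbitrary) is an exclusive, non-blocking file lock around the scraper:

    <?php
    // Hypothetical guard at the top of the scraper (method=process):
    // only one instance may hold the exclusive lock at a time.
    $lock = fopen('/tmp/scraper.lock', 'c');

    if (!flock($lock, LOCK_EX | LOCK_NB)) {
        // Another scraper run is already in progress; bail out.
        exit('already running');
    }

    // ... run the scraper steps ...

    flock($lock, LOCK_UN);
    fclose($lock);

The LOCK_NB flag makes flock() return immediately instead of waiting, so a second request fails fast rather than piling up behind the running scraper.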