Parallel HTTP requests in PHP using the PECL HTTP classes [answer: the HttpRequestPool class]

Posted on 2024-07-06 07:18:21





The HttpRequestPool class provides a solution. Many thanks to those who pointed this out.

A brief tutorial can be found at: http://www.phptutorial.info/?HttpRequestPool-construct


Problem

I'd like to make concurrent/parallel/simultaneous HTTP requests in PHP. I'd like to avoid consecutive requests as:

  • a set of requests will take too long to complete; the more requests the longer
  • the timeout of one request midway through a set may cause later requests to not be made (if a script has an execution time limit)

I have managed to find details for making simultaneuos [sic] HTTP requests in PHP with cURL, however I'd like to explicitly use PHP's HTTP functions if at all possible.

Specifically, I need to POST data concurrently to a set of URLs. The URLs to which data are posted are beyond my control; they are user-set.

I don't mind if I need to wait for all requests to finish before the responses can be processed. If I set a timeout of 30 seconds on each request and requests are made concurrently, I know I must wait a maximum of 30 seconds (perhaps a little more) for all requests to complete.

I can find no details of how this might be achieved. However, I did recently notice a mention in the PHP manual of PHP5+ being able to handle concurrent HTTP requests - I intended to make a note of it at the time, forgot, and cannot find it again.

Single request example (works fine)

<?php
$request_1 = new HttpRequest($url_1, HTTP_METH_POST);
$request_1->setRawPostData($dataSet_1);
$request_1->send();
?>

Concurrent request example (incomplete, clearly)

<?php
$request_1 = new HttpRequest($url_1, HTTP_METH_POST);
$request_1->setRawPostData($dataSet_1);

$request_2 = new HttpRequest($url_2, HTTP_METH_POST);
$request_2->setRawPostData($dataSet_2);

// ...

$request_N = new HttpRequest($url_N, HTTP_METH_POST);
$request_N->setRawPostData($dataSet_N);

// Do something to send() all requests at the same time
?>

Any thoughts would be most appreciated!

Clarification 1: I'd like to stick to the PECL HTTP functions as:

  • they offer a nice OOP interface
  • they're used extensively in the application in question and sticking to what's already in use should be beneficial from a maintenance perspective
  • I generally have to write fewer lines of code to make an HTTP request using the PECL HTTP functions compared to using cURL - fewer lines of code should also be beneficial from a maintenance perspective

Clarification 2: I realise PHP's HTTP functions aren't built in and perhaps I worded things wrongly there, which I shall correct. I have no concerns about people having to install extra stuff - this is not an application that is to be distributed, it's a web app with a server to itself.

Clarification 3: I'd be perfectly happy if someone authoritatively states that PECL HTTP cannot do this.

飘过的浮云 2024-07-13 07:18:22

I'm pretty sure HttpRequestPool is what you're looking for.

To elaborate a little, you could use forking to achieve what you're looking for, but that seems unnecessarily complex and not very useful in an HTML context. While I haven't tested it, this code should do it:

// let $requests be an array of requests to send
$pool = new HttpRequestPool();
foreach ($requests as $request) {
  $pool->attach($request);
}
$pool->send();
foreach ($pool as $request) {
  // do stuff
}
不离久伴 2024-07-13 07:18:22

Did you try HttpRequestPool (it's part of Http)? It looks like it would pool up the request objects and work them. I know I read somewhere that Http would support simultaneous requests and aside from pool I can't find anything either.

ζ澈沫 2024-07-13 07:18:22

I once had to solve similar problem: doing multiple requests without cumulating the response times.

The solution ended up being a custom-built function which used non-blocking sockets.
It works something like this:

$request_list = array(
  # address => HTTP request string
  #
   '127.0.0.1'   => "GET /index.html HTTP/1.1\r\nHost: website.com\r\nConnection: close\r\n\r\n",
   '192.169.2.3' => "POST /form.dat HTTP/1.1\r\nForm-data: ...",
  );

foreach ($request_list as $addr => $http_request) {
    # first, create a socket and fire a request at every host
    $socklist[$addr] = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
    socket_set_nonblock($socklist[$addr]); # make the operation asynchronous

    if (!socket_connect($socklist[$addr], $addr, 80))
        trigger_error("Cannot connect to remote address");

    # the HTTP request is sent to this host
    socket_send($socklist[$addr], $http_request, strlen($http_request), MSG_EOF);
}

$results = array();

foreach (array_keys($socklist) as $host_ip) {
    # now loop and read every socket until it is exhausted
    while (($str = socket_read($socklist[$host_ip], 512, PHP_NORMAL_READ)) != "")
        # append to what has been read so far
        $results[$host_ip] .= $str;

    # done reading this socket, close it
    socket_close($socklist[$host_ip]);
}
# $results now contains an array with the full response (including HTTP headers)
# of every connected host.

It's much faster since the chunked responses are fetched semi-parallel: socket_read doesn't wait for the full response, but returns immediately with whatever the socket buffer holds.

You can wrap this in appropriate OOP interfaces. You will need to create the HTTP-request string yourself, and process the server response of course.
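As a starting point for building the request string yourself, here is a minimal sketch (build_post_request is a hypothetical helper, not part of the answer above) that assembles a raw HTTP/1.1 POST request from a URL and a data array:

```php
<?php
// Hypothetical helper: builds a raw HTTP/1.1 POST request string that
// could be written to a non-blocking socket as in the answer above.
function build_post_request($url, array $data) {
    $parts = parse_url($url);
    $body  = http_build_query($data);  // application/x-www-form-urlencoded body
    $path  = isset($parts['path']) ? $parts['path'] : '/';
    return "POST $path HTTP/1.1\r\n"
         . "Host: {$parts['host']}\r\n"
         . "Content-Type: application/x-www-form-urlencoded\r\n"
         . "Content-Length: " . strlen($body) . "\r\n"
         . "Connection: close\r\n"
         . "\r\n"
         . $body;
}
```

Parsing the server's response (status line, headers, body) would still have to be written separately.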

偷得浮生 2024-07-13 07:18:22


A friend pointed me to CurlObjects ( http://trac.curlobjects.com/trac ) recently, which I found quite useful for using curl_multi.


$curlbase = new CurlBase;
$curlbase->defaultOptions[ CURLOPT_TIMEOUT ] = 30;
$curlbase->add( new HttpPost($url, array('name'=> 'value', 'a' => 'b')));
$curlbase->add( new HttpPost($url2, array('name'=> 'value', 'a' => 'b')));
$curlbase->add( new HttpPost($url3, array('name'=> 'value', 'a' => 'b')));
$curlbase->perform();

foreach($curlbase->requests as $request) {
...
}

橪书 2024-07-13 07:18:22

PHP's HTTP functions aren't built in, either - they're a PECL extension. If your concern is people having to install extra stuff, both solutions will have the same problem - and cURL is more likely to be installed, I'd imagine, as it comes default with every web host I've ever been on.

亢潮 2024-07-13 07:18:22


You could use pcntl_fork() to create a separate process for each request, then wait for them to end:

http://www.php.net/manual/en/function.pcntl-fork.php
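As a rough sketch of that approach (fan_out and the callback are illustrative names, not pcntl API; untested against real requests):

```php
<?php
// Hedged sketch of the forking approach: one child process per request,
// the parent waits for all children to finish. $do_request is a
// placeholder callback that performs a single HTTP call.
function fan_out(array $urls, callable $do_request) {
    $pids = array();
    foreach ($urls as $url) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            trigger_error("fork failed", E_USER_ERROR);
        } elseif ($pid === 0) {
            $do_request($url);  // child: perform one request...
            exit(0);            // ...then terminate
        }
        $pids[] = $pid;         // parent: remember the child
    }
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);  // block until each child exits
    }
}
```

Note that each child is a separate process, so responses would have to come back through files, pipes, or shared memory rather than ordinary PHP variables, which is part of why this feels unnecessarily complex here.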

Is there any reason you don't want to use cURL? The curl_multi_* functions would allow for multiple requests at the same time.
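For comparison, a minimal curl_multi sketch using only the core cURL functions (multi_post is an illustrative wrapper name, not a cURL API) might look like this:

```php
<?php
// Minimal curl_multi sketch: POST to several URLs concurrently and
// collect the response bodies, keyed like the input arrays.
function multi_post(array $urls, array $payloads) {
    $mh = curl_multi_init();
    $handles = array();
    foreach ($urls as $i => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body, don't print it
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $payloads[$i]);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // per-request timeout
        curl_multi_add_handle($mh, $ch);
        $handles[$i] = $ch;
    }
    // drive all transfers until none are still active
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh);  // wait for activity instead of busy-looping
        }
    } while ($active && $status === CURLM_OK);

    $responses = array();
    foreach ($handles as $i => $ch) {
        $responses[$i] = curl_multi_getcontent($ch);  // "" on a failed transfer
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $responses;
}
```

The whole batch completes in roughly the time of the slowest single request, which matches the 30-second bound discussed in the question.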
