pcntl_fork 然后 pcntl_exec 或 pcntl_exec 然后 pcntl_fork

发布于 2024-09-28 17:05:29 字数 1829 浏览 1 评论 0原文

说明: 我正在创建的应用程序的一部分需要检查数千条记录并及时对它们采取行动。因此,对于每条记录,我都想分叉一个新流程。但是,我需要一个数据库连接来对该记录进行更多检查。据我了解,孩子继承了数据库连接。所以后续的fork会出现DB错误。

我以为我可以 pcntl_exec('php /path/script.php');然后是 pcntl_fork ,这样调用进程就不会被阻塞。

或者我可以在子进程中使用 pcntl_fork 然后 pcntl_exec 。或者也许我应该使用 exec() 而不是 pcntl_exec()。

我的问题:这两种订单都有缺点或优点吗?

注释: 也许我正在想象这个问题,因为我认为调用 php 进程会等待 pcntl_exec 返回。但这不是文档所说的:

出错时返回 FALSE,成功时不返回。

函数如何有时返回值而其他时候不返回值?这听起来像是写得不好的文档。

fahadsadah 的评论指出:

一旦执行的进程结束,控制权返回到网络服务器进程。

如果是这样的话,那么我需要分叉。

编辑:为困惑的人提供的代码 - 包括我;)

<?php

class Process
{

    public function __construct($arg = false)
    {
        if ($arg == "child")
        {
            $this->act();
        }
        else
        {
            $this->run();
        }
    }

    public function run()
    {
        echo "parent before fork:", getmypid(), PHP_EOL;
        $pid = @ pcntl_fork();
        echo $pid, PHP_EOL;

        if ($pid == -1)
        {
            throw new Exception(self::COULD_NOT_FORK);
        }
        if ($pid)
        {
        // parent
            echo "parent after fork:", getmypid(), PHP_EOL;
        }
        elseif ($pid == 0)
        {
        // child
            echo "child after fork:", getmypid(), PHP_EOL;
            //echo exec('php Process.php child');
            echo pcntl_exec('/usr/bin/php', array('Process.php', 'child'));
        }
        return 0;
    }

    private function act()
    {
        sleep(1);
        echo "forked child new process:", getmypid(), PHP_EOL;
        return 0;
    }
}

$proc = new Process($argv[1]);

如果您取消注释 exec 并注释 pcntl_exec,您将看到 pcntl_exec 替换了该进程。我猜这可以节省一些资源。

Explanation:
Part of the app that I'm creating requires checking of thousands of records and acting on them in a timely manner. So for every record, I want to fork a new process. However, I need a DB connection to do some more checking of that record. As I understand it, the child inherits the DB connection. So subsequent forks have DB errors.

I thought I could pcntl_exec('php /path/script.php'); and then pcntl_fork so that the calling process is not held up.

Or I can pcntl_fork and then pcntl_exec in the child process. Or maybe I should be using exec() instead of pcntl_exec().

My question: Are there any drawbacks or advantages to either order?

Notes:
Maybe I'm imagining this issue, as I thought that the calling php process would wait for pcntl_exec to return. But that's not what the docs state:

Returns FALSE on error and does not return on success.

How can a function return a value sometimes and none other times? That sounds like poorly written docs.

fahadsadah's comments state:

Once the executed process ends, control returns to the webserver process.

If that is the case, then I need to fork.

Edit: code for the confused - including me ;)

<?php

class Process
{

    public function __construct($arg = false)
    {
        if ($arg == "child")
        {
            $this->act();
        }
        else
        {
            $this->run();
        }
    }

    public function run()
    {
        echo "parent before fork:", getmypid(), PHP_EOL;
        $pid = @ pcntl_fork();
        echo $pid, PHP_EOL;

        if ($pid == -1)
        {
            throw new Exception(self::COULD_NOT_FORK);
        }
        if ($pid)
        {
        // parent
            echo "parent after fork:", getmypid(), PHP_EOL;
        }
        elseif ($pid == 0)
        {
        // child
            echo "child after fork:", getmypid(), PHP_EOL;
            //echo exec('php Process.php child');
            echo pcntl_exec('/usr/bin/php', array('Process.php', 'child'));
        }
        return 0;
    }

    private function act()
    {
        sleep(1);
        echo "forked child new process:", getmypid(), PHP_EOL;
        return 0;
    }
}

$proc = new Process($argv[1]);

If you uncomment the exec and comment the pcntl_exec, you will see that pcntl_exec replaces the process. Which I'm guessing saves some resources.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

自找没趣 2024-10-05 17:05:29

这确实令人困惑 - 您正在尝试应用非常复杂的技术 - 但您以完全错误的方式应用它们。

fork 创建当前进程的新的运行副本。 Exec 启动一个新进程。您不会同时使用它们来启动单个进程。

但在我开始解释如何正确使用 fork 和 exec 之前,我应该指出它们不是解决这个问题的正确工具。

应尽可能避免批处理。数据通常以有限的速率到达(尽管该速率可能是随机的)——通常避免批处理的正确方法是同步或通过队列处理请求。在批处理不可避免的情况下,并行化和/或流水线处理通常可以提高吞吐量。虽然有许多复杂的方法可以实现这一点(例如映射减少),但简单地对数据进行分片通常就足够了。虽然您的基本想法相当于将数据分片成单个片段,但是:

1) 比处理小批量的效率要低

2) 很难限制系统的资源消耗(如果您生成 500 个进程,而您的 DBMS 仅支持 200 个进程,该怎么办?并发连接?)

假设您无法同步处理处理并且运行具有多个订阅者的队列是不切实际的,我建议将数据分成(有限数量的)较小的批次并生成进程来处理这些。请注意,popen()、proc_open() 和 pcntl_fork() 在生成进程的执行期间不会阻塞。 (提示 - 使用模数运算符

如果你想从 HTTP 请求启动处理(或者有其他原因在单独的会话组中运行它们),然后在 google 上搜索“PHP 长时间运行的进程 setid”。

This is really confused - you're trying to apply very sophisticated techniques - but you are applying them in completely the wrong way.

fork creates a new running copy of the current process. Exec starts a new process. You would not use them both to start a single process.

But before I get into a en explanation of how to use fork and exec correctly, I should point out that they are not the right tools for addressing this problem.

Batch processing should be avoided wherever possible. Data typically arrives at a finite rate (albeit that the rate may be stochastic) - usually the right approach to avoid batching is to deal with requests synchronously or via queueing. Where batch processing is unavoidable, parallelizing and/or pipelining processing usually improves throughput. While there are many sophisticated methods for achieving this (e.g. map-reduce) simply sharding the data is usually adequate. While your basic idea amounts to sharding into single pieces, this:

1) will be less efficient than dealing with small batches

2) makes it very difficult to limit resource consumption by the system (what if you spawn 500 processes and your DBMS only supports 200 concurrent connections?)

Assuming that you can't deal with the processing synchronously and runiing a queue with multiple subscribers is not practical, I'd suggest just splitting the data into (a limited number of) smaller batches and spawning processes to deal with those. Note that popen(), proc_open() and pcntl_fork() do not block for the duration of execution of the spawned process. (hint - use the modulus operator)

If you want to to launch the processing from an HTTP request (or have another reason for running them in seperate session groups) then have a google for 'PHP long running processes setsid).

流年里的时光 2024-10-05 17:05:29

这没有道理。一旦你执行了 exec(),你就会运行不同的代码,所以你不能在之后 fork()。 成功后不会返回。

This doesn't make sense. Once you exec() you're running different code so you can't fork() afterwards. Does not return on success.

淡淡離愁欲言轉身 2024-10-05 17:05:29

现在 fork 和 exec 已经被定义了,fork 不会做你想要的事情。它将把当前脚本环境的所有内容复制到新空间,包括文件描述符指针。是的,这是正确的 - 子级与父级共享相同的文件描述符。

想象一下使用维护自己的文件描述符的 mysql 扩展时会造成的严重破坏;-)

Now that fork and exec have been defined, fork won't do what you want. It will copy ALL of the current script environment to new space, including the file descriptor pointers. Yes, that is correct - the children share the same file descriptors as the parents.

Imagine the havoc when using the mysql extension which maintains its own file descriptors ;-)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文