Subprocess and exchanging JSON: how to use read() on stdin without blocking?

Posted 2025-02-10 16:07:10 | 1066 characters | 1 view | 0 comments

I have a main process that sends JSON-structured data to a subprocess. The subprocess works with this data and reports its progress as a percentage back to the main process (which should update a progress bar in the user interface).

The problem is that the main process receives the subprocess's output only after the subprocess has already finished. I suppose it blocks on the read() statement. How can I get the main process to act on the response as soon as the child process posts a line to its stdout?

Here's the minimal working example:

parent.py

from json import dumps
import subprocess
from time import sleep

lines_to_exchange = ["this is line one", "this is line two", "this is line three", "this is line four", "this is line five"]
command = ["python", "-u", "./child.py"]
print("start process")


sub = subprocess.Popen(command, text=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

sub.stdin.write(dumps(lines_to_exchange))
sub.stdin.close()


while True:
    sleep(0.1)
    stdout = sub.stdout.read()
    print(stdout)

    if sub.poll() is not None:
        print("process completed")
        break

child.py

from time import sleep
from json import loads
import sys
lines = loads(input())

for line in lines:
    sleep(1)
    print(line)
    sys.stdout.flush()

I am working on Windows with Python 3.10 and the PyCharm IDE.

执笔绘流年 2025-02-17 16:07:10

tl;dr: Use os.set_blocking(fd, False) for non-blocking reads.

What a great question!
Kudos for offering an MRE in a nice educational way.

I really like the {while True, sleep epsilon} safety measure.
Always a good practice to unconditionally sleep a moment,
so buggy code won't accidentally peg a core.

nit: The explicit .flush() is very nice.
Some folks like to make it part of the print: print(line, flush=True)

The -u unbuffered flag is redundant with the explicit flush.
But sure, I get it, you were throwing everything at it
in hopes of making it work, cool.

In any event, the child is behaving perfectly.

The parent is almost correct, you're very close.
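What's tripping you up is that .read() with no size
argument keeps reading until EOF, so it cannot return
before the child closes its stdout. The child's several
small messages, totaling less than a hundred bytes,
all arrive in that one final read.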

To see this, change .read() to e.g. .read(10).

Here is one pipe-related size constant (Unix only; the value is platform-dependent):

>>> import select
>>> select.PIPE_BUF
512

What you really want is non-blocking I/O in the parent.
Here is some setup that I have tested as working elsewhere:

    import io
    import os
    from subprocess import PIPE, Popen

    with Popen(cmd, stdout=PIPE) as sub:
        stdout = io.TextIOWrapper(sub.stdout)
        fd = stdout.fileno()
        os.set_blocking(fd, False)

Feel free to discard the complexity of
a line-oriented wrapper if you don't need that.
The critical item is to set non-blocking,
as that alters the behavior of .read().
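
To make that change in behavior concrete, here is a minimal sketch on a bare os.pipe() instead of a real child. It assumes a Unix-like system (as far as I know, os.set_blocking() only gained pipe support on Windows in Python 3.12), and read_available is a hypothetical helper name, not a library function.

```python
import os

# An anonymous pipe stands in for the child's stdout:
# r is the read end, w is the write end.
r, w = os.pipe()
os.set_blocking(r, False)  # reads on r no longer wait for data


def read_available(fd, size=4096):
    """Return the bytes that are ready on fd, b"" at EOF, or None if nothing is ready yet."""
    try:
        return os.read(fd, size)
    except BlockingIOError:
        # Non-blocking descriptor, writer still open, no data right now.
        return None


empty = read_available(r)    # nothing has been written yet
os.write(w, b"25%\n")
partial = read_available(r)  # returns only the 4 bytes that are there
os.close(w)
eof = read_available(r)      # writer closed and pipe drained: EOF
os.close(r)
```

The three results show the cases the parent loop has to tell apart: no data yet (None here), some data (fewer bytes than requested), and EOF (an empty bytes object).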

We have an opportunity to replace
the "polling" sleep() with a call to select(),
which will pause for exactly the right amount
of time and wake up immediately upon data being ready.

            select([fd], [], [], timeout.total_seconds())

That last safety-valve parameter can be any value
suitably large,
say 30 seconds if you believe child will always
have something to say within that interval.

Now when you .read(),
or perhaps .read(PIPE_BUF) with explicit buffer size,
it will complete immediately since it is non-blocking.
This means it can return zero bytes,
and often will return fewer bytes than you asked for.
That's ok, just process the bytes and go back
to looping if child is still alive.
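
Putting select() and the non-blocking read together, a parent loop might look roughly like the sketch below. It assumes a Unix-like system, since select() on Windows accepts only sockets. CHILD is a hypothetical stand-in for child.py, and stream_child is an illustrative name, not code from the question.

```python
import os
import select
import subprocess
import sys

# Hypothetical stand-in for child.py: prints three lines, one every 0.2 s.
CHILD = """
import time
for i in range(3):
    time.sleep(0.2)
    print("line", i, flush=True)
"""


def stream_child(cmd, timeout=30.0):
    """Yield chunks of bytes from cmd's stdout as soon as they arrive."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as sub:
        fd = sub.stdout.fileno()
        os.set_blocking(fd, False)
        while True:
            # Sleep until fd is readable or the safety-valve timeout expires.
            ready, _, _ = select.select([fd], [], [], timeout)
            if not ready:
                sub.kill()  # child stayed silent for too long
                raise TimeoutError(f"no output for {timeout} seconds")
            try:
                chunk = os.read(fd, 4096)
            except BlockingIOError:
                continue  # woke up but nothing to read; wait again
            if not chunk:
                break  # empty read means EOF: child closed its stdout
            yield chunk


chunks = list(stream_child([sys.executable, "-c", CHILD]))
output = b"".join(chunks).decode()
```

Because the loop ends on EOF instead of on poll(), it also avoids losing output that the child wrote just before exiting.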

Note that the parent code you posted has a
race condition; it may not read everything
the child said. Do some final reads after child
death to ensure that nothing is lost.
Parent's responsibility is to keep reading
until EOF on that file descriptor.
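
If line-by-line progress is all you need, reading until EOF can also be done with plain blocking reads, which may be the more portable route on Windows with Python 3.10. The sketch below is an alternative to the non-blocking setup, not the same technique. CHILD is a hypothetical stand-in for child.py, and run_and_collect is an illustrative name.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for child.py: prints each item of a JSON list on its own line.
CHILD = """
import json, time
for item in json.loads(input()):
    time.sleep(0.1)
    print(item, flush=True)
"""


def run_and_collect(items, on_line=print):
    """Send items as JSON to the child and handle each output line as it arrives."""
    cmd = [sys.executable, "-c", CHILD]
    received = []
    with subprocess.Popen(cmd, text=True,
                          stdin=subprocess.PIPE, stdout=subprocess.PIPE) as sub:
        sub.stdin.write(json.dumps(items) + "\n")
        sub.stdin.close()
        # Iterating blocks only until the next line and stops at EOF,
        # so nothing the child wrote before exiting is lost.
        for line in sub.stdout:
            line = line.rstrip("\n")
            received.append(line)
            on_line(line)  # e.g. update the progress bar here
    return received, sub.returncode


received, returncode = run_and_collect(["this is line one", "this is line two"])
```

The trade-off is that the parent blocks while waiting for the next line, so in a GUI this loop would typically run in a worker thread.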
