Subprocess Popen invalid argument / broken pipe when communicating with a C program

Posted on 2024-10-17 09:18:31


I have this code

All the needed libraries are imported

class VERTEX(Structure):
 _fields_ = [("index", c_int),
            ("x", c_float),
            ("y", c_float)]

Other stuff

This creates an array from a list of vertices:

def writelist_buf(size, nomeID): 
 Nvert_VERTEX_Array_Type = VERTEX * len(bpy.data.objects[nomeID].data.vertices)
 passarr = Nvert_VERTEX_Array_Type()
 for i in range(len(passarr)):
  vert = bpy.data.objects[nomeID].data.vertices[i]
  passarr[i] = VERTEX(vert.index, vert.co[0], vert.co[1])
 return passarr

bpy.data.objects[nomeID].data.vertices is a list of vertices.

Other stuff

This is inside a function, and it sends the previous array to a C program:

input = writelist_buf(size, nomeID)
c_program_and_args = "here is the program with his arguments(it works)"
cproc = Popen(c_program_and_args, stdin=PIPE, stdout=PIPE)
out, err = cproc.communicate(input)
#the program returns 2 integers separated by a space
return [int(i) for i in out.decode().split()]

size and nomeID are declared before the writelist call.

After a bit of "debugging" I found that the type returned by writelist_buf is "legal" (it's bytes, since it is an array created with ctypes), but I keep getting Errno 32 Broken pipe or Errno 22 Invalid argument. The C program just reads from stdin to retrieve all the vertices (like the C code below).

The strange thing is that before integrating this into the code I was working on, I tried a simpler version, shown here, and it works!

from subprocess import Popen, PIPE
from ctypes import *

class VERTEX(Structure):
 _fields_ = [("index", c_int),
            ("x", c_float),
            ("y", c_float)]

nverts = 5
vlist = [VERTEX(0,1,1), VERTEX(1,2,2), VERTEX(2,3,3), VERTEX(3,4,4), VERTEX(4,5,5)]
array = VERTEX * nverts
input = array()
for i in range(nverts):
 input[i] = vlist[i]
print(type(input))
cproc = Popen("pipeinout.exe random arg", stdin=PIPE, stdout=PIPE)
out, err = cproc.communicate(input)
print(out.decode())

And the C code

#include<stdio.h>
#include<stdlib.h>
typedef struct {
    int index;
    float x;
    float y;
} vertex;

int main(int argc, char* argv[]) {
    int n=5;
    int i;
    printf("%s",argv[1]);
    vertex* VV;
    VV=(vertex*)malloc(sizeof(vertex)*n);
    fread(VV,sizeof(vertex),n,stdin);
    //fread(&VV,sizeof(VV),1,stdin);//would store the passed address in VV itself (not in what it points to)||sizeof(VV) is the size of a pointer
    for(i=0;i<n;i++)
        printf(" %i , %f , %f\n",VV[i].index,VV[i].x,VV[i].y);
}


夏见 2024-10-24 09:18:31


From your comments I understand that you pass millions of items hundreds of times to a C program. The approach below (piping input using subprocess) might be too slow in your case. Possible alternatives are to write a C extension (e.g., using Cython) or to use ctypes to call C functions directly. You could ask a separate question that describes your use case in detail and asks which approach is preferable.

Once you've chosen an approach, make sure that it works correctly before any optimization (write some tests, measure performance, and optimize only afterwards, if needed) -- Make it work, make it right, make it fast.

On the other hand, there is no point in investing too much time in approaches that are known to be thrown away later -- Fail fast.

If the output of the C program is bounded, the .communicate() method from your code works (source):

import struct, sys    
from subprocess import Popen, PIPE

vertex_struct = struct.Struct('i f f')

def pack(vertices, n):    
    yield struct.pack('i', n)
    for v in vertices:
        yield vertex_struct.pack(*v)

def main():
    try: n = int(sys.argv[1])
    except IndexError:
        n = 100
    vertices = ((i,i+1,i+2) for i in range(n))

    p = Popen(["./echo_vertices", "random", "arg"], stdin=PIPE, stdout=PIPE)
    out, _ = p.communicate(b''.join(pack(vertices, n)))

    index, x, y = vertex_struct.unpack(out)
    assert index == (n-1) and int(x) == n and int(y) == (n+1)

if __name__ == '__main__':
    main()
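
As a side check, the struct format 'i f f' used above should produce the same bytes as the ctypes VERTEX array from the question, assuming native alignment and the usual 4-byte int and float. A minimal sketch (the helper names pack_with_struct and pack_with_ctypes are made up for this illustration):

```python
import struct
from ctypes import Structure, c_int, c_float, sizeof

class VERTEX(Structure):
    # Same field order and types as the struct in the question's C code.
    _fields_ = [("index", c_int),
                ("x", c_float),
                ("y", c_float)]

vertex_struct = struct.Struct('i f f')  # native byte order and alignment

def pack_with_struct(vertices):
    # Hypothetical helper: serialize with the struct module.
    return b''.join(vertex_struct.pack(*v) for v in vertices)

def pack_with_ctypes(vertices):
    # Hypothetical helper: build a ctypes array and copy out its raw bytes.
    arr = (VERTEX * len(vertices))(*(VERTEX(*v) for v in vertices))
    return bytes(arr)

vertices = [(0, 1.0, 1.0), (1, 2.0, 2.0), (2, 3.0, 3.0)]
same = pack_with_struct(vertices) == pack_with_ctypes(vertices)
print(sizeof(VERTEX), vertex_struct.size, same)
```

If the two sizes differ on some platform, the C side and the Python side disagree about the record layout, which would be worth ruling out before looking at the pipe itself.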

Here's the code from the comments to the question. It works without errors for large n values on my machine:

import struct, sys
from subprocess import Popen, PIPE
from threading import Thread

def pack(vertices, n):
    yield struct.pack('i', n)
    s = struct.Struct('i f f')
    for v in vertices:
        yield s.pack(*v)

def write(output_file, chunks):
    for chunk in chunks:
        output_file.write(chunk)
    output_file.close()

def main():
    try: n = int(sys.argv[1])
    except IndexError:
        n = 100
    vertices = ((i,i+1,i+2) for i in range(n))

    p = Popen(["./echo_vertices", "random", "arg"], stdin=PIPE, stdout=PIPE)

    Thread(target=write, args=[p.stdin, pack(vertices, n)]).start()

    for line in iter(p.stdout.readline, b''):
        pass
    p.stdout.close()
    sys.stdout.buffer.write(line)
    p.wait()

if __name__ == '__main__':
    main()
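
The ./echo_vertices binary is not shown here, so the following sketch replaces it with a small Python child process that is only an assumption about its behavior: it reads a count followed by that many records and writes the last record back as raw bytes. The run() function and the CHILD script are hypothetical names; the writer-thread pattern is the same as above.

```python
import struct
import subprocess
import sys
from threading import Thread

# Hypothetical stand-in for ./echo_vertices (an assumption, not the real program).
CHILD = r"""
import struct, sys
rec = struct.Struct('i f f')
data = sys.stdin.buffer.read()
(n,) = struct.unpack_from('i', data, 0)
start = struct.calcsize('i') + (n - 1) * rec.size
sys.stdout.buffer.write(data[start:start + rec.size])
"""

def pack(vertices, n):
    yield struct.pack('i', n)
    s = struct.Struct('i f f')
    for v in vertices:
        yield s.pack(*v)

def write(output_file, chunks):
    for chunk in chunks:
        output_file.write(chunk)
    output_file.close()  # signals EOF to the child

def run(n):
    vertices = ((i, i + 1, i + 2) for i in range(n))
    p = subprocess.Popen([sys.executable, "-c", CHILD],
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    writer = Thread(target=write, args=[p.stdin, pack(vertices, n)])
    writer.start()            # feed stdin in the background
    out = p.stdout.read()     # meanwhile, drain stdout in the main thread
    p.stdout.close()
    writer.join()
    p.wait()
    return struct.Struct('i f f').unpack(out)

if __name__ == '__main__':
    print(run(1000))
```

Writing in one thread while reading in another means neither side can block forever on a full pipe, whatever the sizes of the input and output.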

Q&A

Q: I don't really understand the pack function (I know that yield returns an iterable object that can be iterated only once, but in your code you use 2 yields, so I don't get what it returns).

pack() is a generator. Generators do not work the way you've described them, e.g.:

>>> def f():
...     yield 1
...     yield 2
... 
>>> for i in f():
...     print(i)
...     
1
2

Note each yield produces a value.

>>> def g(n):
...     for i in range(n):
...         yield i
... 
>>> list(g(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Here the yield appears in the text only once, but it is executed 10 times, and each time it produces a value (an integer in this case). See Generators in the Python tutorial. "Generator Tricks for Systems Programmers" contains multiple examples of how to use generators, from simple to advanced usage.

Q: In addition, I don't know what (*v) means at line 10.
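
Applied to pack() itself, the same idea can be checked directly. This is a small sketch with two made-up vertices; the first yield produces one chunk (the count) and the yield inside the loop produces one chunk per vertex:

```python
import struct

def pack(vertices, n):
    yield struct.pack('i', n)      # first chunk: the vertex count
    s = struct.Struct('i f f')
    for v in vertices:
        yield s.pack(*v)           # one chunk per vertex

chunks = list(pack([(0, 1, 2), (1, 2, 3)], 2))
sizes = [len(c) for c in chunks]   # one int, then two (int, float, float) records
payload = b''.join(chunks)         # what .communicate() receives in the answer's code
print(len(chunks), sizes, len(payload))
```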

s.pack(*v) calls the pack method using argument unpacking:

>>> def h(a, b):
...     print(a, b)
... 
>>> h(*[1, 'a'])
1 a
>>> h(*range(2))
0 1
>>> h(0, 1)
0 1

Q: I don't get how the Thread in line 25 works,

Thread(target=write, args=[p.stdin, pack(vertices, n)]).start()

This line starts a new thread that calls the write() function with the arguments from the args keyword argument, i.e., output_file=p.stdin and chunks=pack(vertices, n). The write() function in this case is equivalent to:

p.stdin.write(struct.pack('i', n))
p.stdin.write(s.pack(0, 1, 2))
p.stdin.write(s.pack(1, 2, 3))
...
p.stdin.write(s.pack(n-1, n, n+1))
p.stdin.close()

After that the thread exits.

Q: ...and all the read output of the program.. It isn't stored in a variable, is it?

The whole output is not stored anywhere. The code:

for line in iter(p.stdout.readline, b''):
    pass

reads from p.stdout line by line until .readline() returns the empty bytes string b'', storing the current line in the line variable (see iter() docs). So:

sys.stdout.buffer.write(line)

just prints the last line of the output.
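
The same behavior can be seen without a subprocess. In this sketch an in-memory io.BytesIO stream (an assumption standing in for p.stdout) is read with the two-argument form of iter(), and only the last line survives the loop:

```python
import io

stream = io.BytesIO(b"first\nsecond\nlast\n")  # stand-in for p.stdout

line = b''  # defined up front so an empty stream does not raise NameError
for line in iter(stream.readline, b''):
    pass    # each iteration overwrites line; nothing is accumulated

print(line)
```

To keep the whole output instead, the lines could be collected with list(iter(stream.readline, b'')) or read in one go with stream.read().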

Q: 1) After starting the Thread, the Python script waits until it has finished, right?

No, the main thread continues and exits on its own. The started thread is not a daemon thread, so it runs until it completes, i.e., the script (the program) doesn't exit until the thread has finished.
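
A short sketch of that behavior, using two threading.Event objects (names chosen for this illustration) to make the ordering visible: start() returns while the worker is still running, and the thread is non-daemon by default.

```python
import threading

started = threading.Event()
release = threading.Event()

def worker():
    started.set()    # tell the main thread that the worker is running
    release.wait()   # stay alive until the main thread lets it finish

t = threading.Thread(target=worker)
t.start()            # returns immediately; it does not wait for worker()
started.wait()

still_running = t.is_alive()  # the main thread got here while the worker runs
is_daemon = t.daemon          # threads created this way are non-daemon

release.set()
t.join()                      # an explicit wait, if one is wanted
finished = not t.is_alive()
print(still_running, is_daemon, finished)
```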

Q: 2) I understood how you read from the stdout of the C program, but I don't get when you start it. As far as I understood, with the write function we write the data we want into a buffer (or something like a file in RAM), and when we run the C program, it can read the data we wrote from it. But when do we start the C program in your code? :)

The C program is started by p = Popen(...).

p.stdin.write() writes to the stdin of the C program (there are a number of buffers in between, but we can ignore them for the moment). The process is the same as in:

$ echo abc | some_program
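
A rough Python equivalent of that shell pipeline, as a sketch: some_program is replaced by a hypothetical one-line Python child that upper-cases its stdin, and the child is already running by the time Popen() returns.

```python
import subprocess
import sys

# Hypothetical stand-in for some_program: upper-cases whatever arrives on stdin.
CHILD = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"

# The child process is started here, before any data is written.
p = subprocess.Popen([sys.executable, "-c", CHILD],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# communicate() writes the input, closes stdin, reads stdout and waits for exit.
out, _ = p.communicate(b"abc\n")
print(out)
```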

Q: 3) Last thing: why do you use a wait on p? There's a warning: http://docs.python.org/library/subprocess.html?#subprocess.Popen.wait

For the provided C code it is not necessary to write to p.stdin in a separate thread. I use the thread exactly to avoid the situation described in the warning, i.e., the C program produces enough output to fill the pipe before the script finishes writing to its stdin (your C code doesn't write anything before it finishes reading, so the thread is not necessary).

In other words, p.wait() is safe in this case.

Without p.wait(), stderr output from the C program might be lost, though I can reproduce the stderr loss only on Jython with these scripts. Again, for the provided C code it doesn't matter, because it does not write anything to stderr.
