Getting the output file from a Python Popen process?

Posted 2024-10-30 18:34:54

I have written a Python program to interface with a compiled program (call it ProgramX) that has some idiosyncrasies that are proving difficult to deal with. I need to feed many thousands of input files to ProgramX via my Python program. What I would like to do is grab the output file that ProgramX creates with each run and rename it to something sensible, like inputfilename.output.

The problem comes in the output file that is written by ProgramX -- it is named via an unpredictable method, which will write, and "mercilessly overwrite", the output file if it already exists (which is the case the majority of the time). The saving grace probably comes with the fact that there is a standard prefix to the output files: think ProgramX.notQuiteRandomNumber.

The only thing I can think to do is something like this in my bash shell:

PROGRAMXOUTPUT=$(ls -ltr ProgramX* | tail -n -1 | awk '{print $8}')
mv $PROGRAMXOUTPUT input.output

This does 90% of what I need, but before I program all that bash into a series of Popen statements, is there a better way to do this? It feels like a problem for which people might have a much better solution than the one I have in mind.

Side note: I can grab the program's standard output without problems; it's the output file that I need to grab.

Bonus: I was planning on running a bunch of instances of the program in the same directory, so my naive approach above may start to have unforeseen problems. So perhaps something fancy that watches the PID of ProgramX and follows its output.

Comments (3)

演多会厌 2024-11-06 18:34:54

To do what your shell script above does, assuming you've only got one ProgramX* in the current directory:

import glob
import os

# Assumes exactly one file matching ProgramX* exists in the current directory.
programxoutput = glob.glob('ProgramX*')[0]
os.rename(programxoutput, 'input.output')

If you need to sort by time, etc., there are ways to do that too (look at os.stat), but using the most recent modification date is a recipe for nasty race conditions if you'll be running multiple copies of ProgramX concurrently.
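
For reference, a minimal sketch of the sort-by-time variant, roughly what the `ls -ltr | tail` pipeline in the question does. The helper name `newest_match` is made up for illustration, and the race-condition warning above still applies if several copies of ProgramX write to the same directory.

```python
import glob
import os


def newest_match(pattern):
    """Return the most recently modified path matching pattern, or None."""
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    # os.path.getmtime reads st_mtime (via os.stat) for each candidate.
    return max(candidates, key=os.path.getmtime)
```

Calling `newest_match('ProgramX*')` right after a single run would give the file to rename; it returns None when nothing matches, so the caller can detect a run that produced no output.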

I'd suggest instead that you create and change to a new, perhaps temporary directory for each run of ProgramX, so the runs have no possibility of treading on each other. The tempfile module can help with this.
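
A rough sketch of that per-run directory idea follows. It assumes, without knowing ProgramX, that the program takes the input file path as its last argument and writes its output into the current working directory; the function name `run_isolated` and the `prefix` parameter are hypothetical.

```python
import glob
import os
import shutil
import subprocess
import tempfile


def run_isolated(command, input_path, prefix="ProgramX"):
    """Run command in a fresh temp directory and keep its single output file."""
    input_path = os.path.abspath(input_path)
    dest = input_path + ".output"
    with tempfile.TemporaryDirectory() as workdir:
        # cwd=workdir means the child process writes its output there,
        # so concurrent runs cannot overwrite each other's files.
        subprocess.run(command + [input_path], cwd=workdir, check=True)
        outputs = glob.glob(os.path.join(workdir, prefix + "*"))
        if len(outputs) != 1:
            raise RuntimeError("expected one output file, found %d" % len(outputs))
        # shutil.move also works when workdir is on a different filesystem.
        shutil.move(outputs[0], dest)
    return dest
```

Something like `run_isolated(['/path/to/ProgramX'], 'case1.in')` would then leave the result in `case1.in.output` next to the input. Because every run gets its own directory, the unpredictable output name no longer matters, and several runs can proceed in parallel.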

青巷忧颜 2024-11-06 18:34:54

Two options that I see:

  1. You could use lsof to list the files that ProgramX has open, and so find the file it is writing.
  2. A different approach would be to run ProgramX in a temporary directory (see tempfile for an easy way of setting up directories). Between runs of ProgramX, you can clean that directory, or keep requesting new temp directories if you are planning on running multiple copies of ProgramX at the same time.
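
For the lsof option, a hedged sketch: it assumes lsof is installed, that ProgramX still has the file open at the moment of the call, and that `pid` comes from the `Popen` object. The helper names are made up for illustration. It relies on lsof's field output mode (`-F n`), in which lines beginning with `n` carry the file name.

```python
import subprocess


def parse_lsof_names(field_output, prefix="ProgramX"):
    """Extract file paths whose base name starts with prefix from lsof -Fn output."""
    names = []
    for line in field_output.splitlines():
        # In lsof field output, lines starting with "n" hold the file name.
        if line.startswith("n"):
            path = line[1:]
            if path.rsplit("/", 1)[-1].startswith(prefix):
                names.append(path)
    return names


def open_output_files(pid, prefix="ProgramX"):
    """Ask lsof which files matching prefix the process pid currently has open."""
    result = subprocess.run(
        ["lsof", "-p", str(pid), "-Fn"],
        capture_output=True, text=True, check=False,
    )
    return parse_lsof_names(result.stdout, prefix)
```

This only catches the file while it is open, so a program that writes and closes quickly can slip past; the temporary directory approach avoids that timing problem.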

尴尬癌患者 2024-11-06 18:34:54

If there is only one ProgramX* file, then what about just:

mv ProgramX* input.output