Unable to read from STDIN properly
I have a weird problem reading from STDIN in a Python script.
Here is my use case. I have rsyslog configured with an output module so that rsyslog can pipe log messages to my Python script.
My Python script is really trivial:
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import sys
fd = open('/tmp/testrsyslogomoutput.txt', 'a')
fd.write("Receiving log message : \n%s\n" % ('-'.join(sys.stdin.readlines())))
fd.close()
If I run echo "foo" | mypythonscript.py, I get "foo" in the target file /tmp/testrsyslogomoutput.txt. However, when I run it within rsyslog, messages seem to be written only when I stop/restart rsyslog (I believe some buffer is flushed at some point).
I first thought it was a problem with rsyslog, so I replaced my Python program with a shell script, without changing anything in the rsyslog configuration. The shell script works perfectly with rsyslog and, as you can see in the code below, it is really trivial:
#! /bin/sh
cat /dev/stdin >> /tmp/testrsyslogomoutput.txt
Since my shell script works but my Python one does not, I believe I made a mistake somewhere in my Python code, but I cannot find where. If you could point me to my mistake(s), that would be great.
Thanks in advance :)
Comments (3)
readlines will not return until it has finished reading the file. Since the pipe feeding stdin never finishes, readlines never finishes either. Stopping rsyslog closes the pipe and lets it finish.
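To make the point about EOF concrete, here is a minimal sketch that uses an in-process pipe in place of the rsyslog pipe. The function name read_demo is hypothetical, and the pipe from os.pipe() only stands in for the real one. It shows readline() returning while the writer is still open, whereas readlines() is only called once the write end has been closed.

```python
import os


def read_demo():
    """Contrast readline() with readlines() on a pipe (illustrative only)."""
    read_fd, write_fd = os.pipe()
    reader = os.fdopen(read_fd, "r")
    writer = os.fdopen(write_fd, "w")

    writer.write("first\n")
    writer.flush()
    # readline() returns as soon as one complete line is available,
    # even though the writer has not closed the pipe.
    first = reader.readline()

    writer.write("second\nthird\n")
    # Closing the write end is what produces EOF on the read end.
    writer.close()
    # readlines() reads until EOF, so it is only safe to call it here;
    # with the writer still open it would block, as in the question.
    rest = reader.readlines()
    reader.close()
    return first, rest
```

With rsyslog on the other end, the write end stays open for as long as rsyslog runs, which matches the behaviour described in the question.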
I'd also suspect the reason is that rsyslog does not terminate. readlines() should not return until it reaches a real EOF. But why would the shell script act differently? Perhaps the use of /dev/stdin is the reason. Try a version of the shell script that reads its standard input without naming /dev/stdin and see if it still runs without hanging. If this makes a difference, you'll also have a fix: open and read /dev/stdin from Python, instead of sys.stdin.
Edit: So cat somehow reads whatever is waiting at stdin and returns, but Python blocks and waits until stdin is exhausted. Strange. You can also try replacing readlines() with a single read() followed by split("\n"), but at this point I doubt that will help.
So, forget the diagnosis and let's try a work-around: force stdin to do non-blocking I/O. You probably want to use that in combination with python -u. Hope it works!
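One possible way to put a file descriptor into non-blocking mode on a Unix-like system is sketched below. It is not necessarily the code the answer originally had in mind. The helper names make_nonblocking and read_available are hypothetical, and the sketch uses the standard fcntl module, so it assumes a platform where that module is available.

```python
import fcntl
import os


def make_nonblocking(fd):
    """Set O_NONBLOCK on fd, keeping its other status flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def read_available(fd, size=4096):
    """Return the bytes that are ready on fd.

    Returns None when nothing is ready yet, and b"" at EOF
    (that is, once the writer has closed its end).
    """
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return None
```

For the script in the question, the descriptor would be sys.stdin.fileno(). A real consumer would probably wait for data (for example with select) rather than poll in a tight loop, and would also need to flush the output file after each write.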
If you use readline() instead, it will return on \n, though this will only write one line and then quit. If you want to keep writing lines as long as they are there, you can use a simple for loop.
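A minimal sketch of such a loop follows, assuming Python 3. The function name copy_lines is hypothetical, and the message prefix is borrowed from the script in the question. Flushing after every write is an addition of this sketch, made so that each message reaches the file without waiting for the process to exit.

```python
def copy_lines(source, dest):
    """Write each line from source to dest as soon as it has been read."""
    count = 0
    for line in source:
        dest.write("Receiving log message : %s" % line)
        # Flush so the line is written to the file right away instead of
        # sitting in the output buffer until the file is closed.
        dest.flush()
        count += 1
    return count
```

In the question's script this would be called as copy_lines(sys.stdin, fd), with fd opened on /tmp/testrsyslogomoutput.txt in append mode. On Python 2, iterating over sys.stdin uses a read-ahead buffer, so a loop built on sys.stdin.readline() may respond more promptly there.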