subprocess.Popen(..).communicate(..) randomly drops data when used with graphviz!
I am using graphviz's dot to generate some svg graphs for a web application. I call dot using Popen:
p = subprocess.Popen(u'/usr/bin/dot -Kfdp -Tsvg', shell=True,\
stdin=subprocess.PIPE, stdout=subprocess.PIPE)
str = u'long-unicode-string-i-want-to-convert'
(stdout,stderr) = p.communicate(str)
What happens is that the dot program throws errors like:
Error: not well-formed (invalid token) in line 1
... <tr><td cellpadding="4bgcolor="#EEE8AA"> ...
in label of node n260
That obvious error is most certainly NOT in the input string. In particular, if I save it to str.txt with utf-8 encoding and do
/usr/bin/dot -Kfdp -Tsvg < str.txt > myimg.svg
I get the desired output. The only 'special' thing about str is that it contains characters like the Danish øæå.
Right now I have no clue what I should do. The problem may very well be in dot, but it certainly seems to be triggered by Popen behaving differently from using < in the shell, and I have no idea where to begin. Any help, or ideas for calling dot some other way (besides writing all the data to a file and calling dot on that!), would be much appreciated!
Comments (1)
Sounds like you should be encoding the string to UTF-8 yourself before passing it to communicate (and, of course, you shouldn't shadow the builtin str). The unicode type in Python holds Unicode data, not UTF-8. If you want UTF-8, you need to encode it explicitly.
On top of that, there's no reason to use shell=True in that snippet, nor is passing a unicode literal to subprocess.Popen a particularly good idea (it just gets encoded to ASCII anyway). The backslash at the end of the line is also unnecessary: Python knows the line continues because there is an open parenthesis that hasn't been closed yet. So pass the command as a list of arguments without shell=True, and hand communicate the UTF-8-encoded bytes.
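A minimal sketch of such a call, assuming dot lives at /usr/bin/dot as in the question. The function name render_svg and its cmd parameter are hypothetical, introduced here only for illustration, and the u'' literals are used so the snippet stays valid on both Python 2 and Python 3.3+:

```python
import subprocess


def render_svg(dot_source, cmd=('/usr/bin/dot', '-Kfdp', '-Tsvg')):
    """Pipe dot_source (a unicode string) through cmd; return its stdout as bytes.

    render_svg and cmd are illustrative names, not part of any library.
    """
    # Argument list, no shell=True: no shell is involved in starting dot.
    p = subprocess.Popen(list(cmd),
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE)
    # Encode explicitly: the pipe carries bytes, not unicode text.
    stdout, _ = p.communicate(dot_source.encode('utf-8'))
    if p.returncode != 0:
        raise RuntimeError('command exited with status %d' % p.returncode)
    return stdout


# Example call (needs graphviz installed at the assumed path):
# svg = render_svg(u'graph G { n [label="\xf8\xe6\xe5"]; }')
```

The result is returned as bytes; call .decode('utf-8') on it if you need text, on the assumption that dot emits UTF-8 for the SVG output.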