subprocess.Popen(..).communicate(..) randomly drops data when used with graphviz!

Posted 2024-08-21 09:52:48

I am using graphviz's dot to generate some svg graphs for a web application. I call dot using Popen:

    p = subprocess.Popen(u'/usr/bin/dot -Kfdp -Tsvg', shell=True,\
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    str = u'long-unicode-string-i-want-to-convert'
    (stdout,stderr) = p.communicate(str)

What happens is that the dot program throws errors like:

    Error: not well-formed (invalid token) in line 1
     ... <tr><td cellpadding="4bgcolor="#EEE8AA"> ...
    in label of node n260

That obvious error is most certainly NOT in the input string. In particular, if I save it to str.txt with utf-8 encoding and do

    /usr/bin/dot -Kfdp -Tsvg < str.txt > myimg.svg

I get the desired output. The only 'special' thing about str is that it contains characters like the Danish øæå.

Right now I have no clue what I should do. The problem may very well be in dot, but it certainly seems to be triggered by Popen behaving differently from using < in the shell, and I have no idea where to begin. Any help or ideas for calling dot another way (besides writing all the data to a file and calling dot on that!) would be much appreciated!

缘字诀 2024-08-28 09:52:48

Sounds like you should be doing:

    stdout, stderr = p.communicate(str.encode('utf-8'))

(except, of course, that you shouldn't shadow the builtin str.) The unicode type in Python holds Unicode data, not UTF-8. If you want UTF-8, you need to encode it explicitly.
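To make the difference between Unicode text and UTF-8 bytes concrete, here is a minimal sketch. It assumes Python 3, where the text type is called `str` rather than `unicode`; the variable names are only illustrative.

```python
# Three Unicode code points (the Danish letters from the question).
text = "øæå"

# Encoding produces bytes; each of these letters takes two bytes in UTF-8.
encoded = text.encode("utf-8")

print(len(text))     # 3
print(len(encoded))  # 6
print(encoded)       # b'\xc3\xb8\xc3\xa6\xc3\xa5'

# Decoding with the same codec gives back the original text.
assert encoded.decode("utf-8") == text
```

A pipe to a child process carries bytes, so it is the encoded form that has to be handed to `communicate`.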

On top of that, there's no reason to use shell=True in that snippet, nor is the unicode literal passed to subprocess.Popen a particularly good idea (it just gets encoded to ASCII anyway). And the backslash at the end is unnecessary -- Python knows the line is continued, because you have an open parenthesis that hasn't been closed yet. So, use:

    p = subprocess.Popen(['/usr/bin/dot', '-Kfdp', '-Tsvg'],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
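Putting both suggestions together, the whole call might look like the sketch below. It assumes Python 3; `run_with_utf8_input` and `ECHO_ARGV` are hypothetical names introduced for illustration, and the child process here is a small stand-in that copies stdin to stdout so the example runs without Graphviz installed. For the real call, the argument list from the snippet above would be passed instead.

```python
import subprocess
import sys


def run_with_utf8_input(argv, text):
    """Send `text` to the child's stdin as UTF-8 bytes; return decoded stdout."""
    proc = subprocess.Popen(
        argv,  # an argument list, so no shell is involved
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Encode explicitly: the pipe carries bytes, not Unicode code points.
    stdout_bytes, _ = proc.communicate(text.encode("utf-8"))
    if proc.returncode != 0:
        raise RuntimeError("command failed with exit code %d" % proc.returncode)
    return stdout_bytes.decode("utf-8")


# Stand-in child that echoes stdin to stdout unchanged (an assumption made
# only so the sketch is runnable anywhere). For dot, use
# ['/usr/bin/dot', '-Kfdp', '-Tsvg'] as argv.
ECHO_ARGV = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

source = 'digraph { a [label="\u00f8\u00e6\u00e5"]; }'  # label is "øæå"
result = run_with_utf8_input(ECHO_ARGV, source)
print(result == source)
```

Decoding the output as UTF-8 is also an assumption; it fits the echo stand-in, and should be checked against whatever encoding the real command emits.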
