
Python UTF-8 cx-freeze

Problem with cx_freeze and UTF-8 characters not being displayed

Posted on 2024-11-30 15:42:00


I'm trying to compile a Python script which contains Spanish strings.

If I run the .py, the text is displayed correctly. Compilation runs fine, but when I run the resulting .exe, the non-ASCII characters are replaced with error characters, and no error is reported.

I couldn't find anyone asking about the same problem. Am I the only one trying to compile an ñ, or am I missing something in my compilation?

I'm using Python 3.1.2 with cx_freeze 4.2.1 on Windows XP. The problem is consistent using both basic compilation (\Scripts\cxfreeze) and advanced compilation (setup.py).

test code, main.py

# coding=UTF-8
print('mensaje de prueba \u00e1ñ ó \xf1')

running .py

correct output

running .exe

cx_freeze output

EDIT:

frozen Machin test source


甜｀诱少女 2024-12-07 15:42:00


It is not possible to be certain, but assuming that what appears to be in your source file and what appears to be displayed has not been transmogrified in transmission, your problem is this:

You expect to see (a-acute, n-tilde, o-acute), but you actually see "error characters" (no-break space aka NBSP, currency sign, cent sign).

I don't have cxfreeze. My guess is that cxfreeze is doubly encoding your output. This is based on running the following source file using Python 3.2.0 on Windows 7. You will notice that I have used escape sequences for the text characters in order to rule out any noise caused by source encoding problems.

# coding: ascii ... what you see is what you've got.
# expected output: a-acute(e1) n-tilde(f1) o-acute(f3)
import sys
import unicodedata as ucd
text = '\xe1\xf1\xf3'
print("expected output:")
for c in text:
    print(ascii(c), ucd.name(c))
print("seen output[%s]" % text)
sse = sys.stdout.encoding
print(sse)
print("Expected raw bytes output:", text.encode(sse))
whoops = text.encode(sse).decode('latin1')
print("whoops:")
for w in whoops:
    print(ascii(w), ucd.name(w))

and here is its output.

expected output:
'\xe1' LATIN SMALL LETTER A WITH ACUTE
'\xf1' LATIN SMALL LETTER N WITH TILDE
'\xf3' LATIN SMALL LETTER O WITH ACUTE
seen output[áñó]
cp850
Expected raw bytes output: b'\xa0\xa4\xa2'
whoops:
'\xa0' NO-BREAK SPACE
'\xa4' CURRENCY SIGN
'\xa2' CENT SIGN

In the brackets after "seen output", I see a-acute, n-tilde, and o-acute as expected. Please run the script with and without cxfreezing, and report (in words) what you see. If the frozen "seen output" is in fact a space followed by a currency sign and a cent sign, you should report the problem (with a link to this answer) to the cxfreeze maintainer.
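To make the comparison easier, here is a minimal sketch of the same hypothesis as a reusable helper. It assumes the garbling comes from console bytes being re-read as latin-1, which is only a guess; the names `misdecoded` and `describe` are made up for illustration and are not part of cx_freeze or the standard library. It prints what the "error characters" would be for a few common Windows code pages, so the frozen output can be checked against each one.

```python
import unicodedata


def misdecoded(text, console_encoding, wrong_encoding="latin-1"):
    """Encode text for the console, then decode those bytes with the wrong codec.

    Hypothetical helper: models the guessed failure mode, not cx_freeze itself.
    """
    raw = text.encode(console_encoding)
    return raw.decode(wrong_encoding)


def describe(chars):
    """Return (escaped form, Unicode name) pairs for each character."""
    return [(ascii(c), unicodedata.name(c, "<unnamed>")) for c in chars]


if __name__ == "__main__":
    expected = "\xe1\xf1\xf3"  # a-acute, n-tilde, o-acute
    for codec in ("cp850", "cp437", "cp1252"):
        garbled = misdecoded(expected, codec)
        print(codec, describe(garbled))
```

If the characters shown by the frozen .exe match one of the printed rows, that would support the idea that the bytes are being passed through an extra decode step somewhere. If they match none of them, the cause is probably something else.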
