How to change a string's encoding to UTF-8 in C
How can I change the character encoding of a string to UTF-8? I am making some execv calls to a Python program, but Python returns the strings with some characters cut off. I don't know whether this is a Python issue or a C issue, but I thought that if I could change the string's encoding in C and then pass it to Python, it should do the trick. So how can I do that?
Thanks.
Comments (2)
C as a language does not facilitate string encoding. A C string is simply a null-terminated sequence of char values (8-bit integers on most systems; whether plain char is signed or unsigned is implementation-defined).
A wide string (with characters of type wchar_t, typically 16 bits on Windows and 32 bits on most Unix-like systems) can also be used to hold larger character values; however, the core C string functions (strlen, strcpy and so on) still treat a string as an opaque sequence of code units and have no notion of which encoding it uses. The answer to your question is to ensure that the strings you're passing into Python are encoded as UTF-8.
To help you accomplish that in any detail, however, you will have to provide more information about how your strings are currently formed, what they contain, and how you're constructing your argument list for exec.
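As an illustration of what "ensuring the strings are UTF-8" can look like, here is a minimal sketch that re-encodes a string before it is placed in the argument list. It assumes the source bytes are ISO-8859-1 (Latin-1), which the question does not state; latin1_to_utf8 is a hypothetical helper name, not a standard library function. For other source encodings a conversion library would be needed instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * ASSUMPTION: the input really is Latin-1; the question does not say so.
 * Writes at most dst_size bytes (terminator included) into dst.
 * Returns the number of bytes written, not counting the terminator,
 * or (size_t)-1 if dst is too small. */
size_t latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;

    for (const unsigned char *p = (const unsigned char *)src; *p != '\0'; ++p) {
        if (*p < 0x80) {
            /* ASCII maps to itself: one byte plus room for the terminator. */
            if (out + 1 >= dst_size) return (size_t)-1;
            dst[out++] = (char)*p;
        } else {
            /* 0x80..0xFF becomes a two-byte sequence: 110xxxxx 10xxxxxx. */
            if (out + 2 >= dst_size) return (size_t)-1;
            dst[out++] = (char)(0xC0 | (*p >> 6));
            dst[out++] = (char)(0x80 | (*p & 0x3F));
        }
    }

    if (out >= dst_size) return (size_t)-1;
    dst[out] = '\0';
    return out;
}
```

The converted buffer could then be used as one of the argv entries handed to execv. If the strings are already UTF-8, running them through a converter like this would corrupt them, so it is worth checking what the bytes actually are first.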
There is no such thing as character encoding in C. A char* can hold any data; how you interpret the characters is up to you. For instance, printf will typically dump the bytes as they are to the standard output, and if your console interprets those bytes as UTF-8, they will be displayed as UTF-8 text.
If you want to convert between different encodings on the C side, you can have a look at ICU.
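Because a char* is just bytes, it can help to check what those bytes are before blaming either side. The sketch below is a hand-written check for well-formed UTF-8; is_valid_utf8 is a hypothetical name and this is not ICU's API, only a small stand-in for diagnosing the problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if the NUL-terminated string s is
 * well-formed UTF-8 (no overlong forms, no surrogates, max U+10FFFF). */
bool is_valid_utf8(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;

    while (*p != '\0') {
        unsigned int cp;   /* code point being decoded */
        int extra;         /* continuation bytes expected after the lead byte */
        unsigned int min;  /* smallest code point legal for this length */

        if (*p < 0x80) { p++; continue; }  /* plain ASCII */
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; min = 0x80; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; min = 0x800; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; min = 0x10000; }
        else return false;  /* stray continuation byte or invalid lead byte */

        p++;
        for (int i = 0; i < extra; i++, p++) {
            /* Each continuation byte must look like 10xxxxxx.
             * A premature NUL fails this test too, so we never read past it. */
            if ((*p & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}
```

If this returns false for a string about to be passed to execv, the bytes are in some other encoding (or are truncated) and need converting on the C side; if it returns true, the cut-off characters are more likely caused by how the Python program decodes or prints its arguments.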
If you want to convert between encodings on the Python side, look at http://docs.python.org/howto/unicode.html.