Decoding a string with hex characters in Python 2

Posted on 2024-09-06 02:09:06

I have a hex string and I want to convert it to UTF-8 so that I can insert it into MySQL. (My database is UTF-8.)

hex_string = 'kitap ara\xfet\xfdrmas\xfd'
...
result = 'kitap araştırması'

How can I do that?

Comments (5)

九命猫 2024-09-13 02:09:06

Try (Python 3.x):

import codecs
codecs.decode("707974686f6e2d666f72756d2e696f", "hex").decode('utf-8')

From here.
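
Note that this snippet handles a string of hex digits, which is not quite the same situation as the byte escapes in the question. As a minimal Python 3 sketch of the same idea, bytes.fromhex should produce the same raw bytes; the variable names here are only illustrative.

```python
# A string of hex digits (two digits per byte), as in the answer above.
hex_digits = "707974686f6e2d666f72756d2e696f"

# Turn the hex digits into raw bytes.
raw = bytes.fromhex(hex_digits)

# Interpret those bytes as UTF-8 text.
text = raw.decode("utf-8")
print(text)  # python-forum.io
```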

还如梦归 2024-09-13 02:09:06

Assuming Python 2.6:

>>> print('kitap ara\xfet\xfdrmas\xfd'.decode('iso-8859-9'))
kitap araştırması
>>> 'kitap ara\xfet\xfdrmas\xfd'.decode('iso-8859-9').encode('utf-8')
'kitap ara\xc5\x9ft\xc4\xb1rmas\xc4\xb1'
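
For readers on Python 3, a rough equivalent of the session above might look like the sketch below. It assumes the data arrives as a bytes object and that ISO-8859-9 really is the source encoding; the variable names are illustrative.

```python
# Python 3: the raw bytes from the question, written as a bytes literal.
raw = b'kitap ara\xfet\xfdrmas\xfd'

# Decode from the assumed source encoding (ISO-8859-9) into text.
text = raw.decode('iso-8859-9')

# Encode the text as UTF-8 bytes, e.g. before handing it to a database driver.
utf8_bytes = text.encode('utf-8')
print(utf8_bytes)  # b'kitap ara\xc5\x9ft\xc4\xb1rmas\xc4\xb1'
```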

电影里的梦 2024-09-13 02:09:06

"String literals" explains how to use UTF-8 strings in Python source.

中性美 2024-09-13 02:09:06

Try

hex_string.decode("cp1254").encode("utf-8")

(cp1254 and iso-8859-9 are the Turkish code pages; the former is the usual name on Windows platforms. The two differ only in the 0x80-0x9F range, so for this string both work equally well in Python.)
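
As a rough check of how the two code pages relate, the Python 3 sketch below compares them on the bytes from the question and on a byte from the 0x80-0x9F range, where they are known to differ. The variable names are illustrative.

```python
raw = b'kitap ara\xfet\xfdrmas\xfd'

# For the bytes in the question, both codecs give the same text.
same_for_question = raw.decode('cp1254') == raw.decode('iso-8859-9')

# In the 0x80-0x9F range they differ: cp1254 maps 0x80 to the euro sign,
# while iso-8859-9 maps it to a C1 control character.
euro_cp1254 = b'\x80'.decode('cp1254')
ctrl_iso = b'\x80'.decode('iso-8859-9')

print(same_for_question, euro_cp1254 == ctrl_iso)  # True False
```

So the choice between the two only matters if the data can contain bytes in that range.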

愁杀 2024-09-13 02:09:06

First you need to decode it from the encoded bytes you have. The encoding appears to be ISO-8859-9 (Latin-5) or, if you are using Windows, probably code page 1254, which is based on Latin-5.

>>> 'kitap ara\xfet\xfdrmas\xfd'.decode('cp1254')
u'kitap ara\u015ft\u0131rmas\u0131' # u'kitap araştırması'

If you are using Windows, then depending on where those bytes come from, it might be more appropriate to decode them as mbcs, which means "whichever code page the local system is using". If the string is just sitting in a .py file, you would be better off writing u'kitap araştırması' directly in the source and adding a -*- coding -*- declaration so that Python knows how to decode it. See PEP 263.
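
A minimal sketch of the source-file approach, assuming the .py file is actually saved as UTF-8 (the declaration only tells Python what to expect; it does not convert the file). The u'' prefix is accepted by Python 2 and by Python 3.3 and later.

```python
# -*- coding: utf-8 -*-
# The declaration above must be on the first or second line of the file.
text = u'kitap araştırması'

# The literal holds the same characters as the escaped form.
print(text == u'kitap ara\u015ft\u0131rmas\u0131')  # True
```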

As for encoding the unicode string to UTF-8 for the database, you can do it manually if you want to:

>>> u'kitap ara\u015ft\u0131rmas\u0131'.encode('utf-8')
'kitap ara\xc5\x9ft\xc4\xb1rmas\xc4\xb1'

but a good data access layer is likely to do that for you automatically, provided the COLLATION of the tables the data is going into is set correctly.
