Django dumpdata UTF-8 (Unicode)
Is there an easy way to dump UTF-8 data from a database?
I know this command:
manage.py dumpdata > mydata.json
But in the resulting file mydata.json, Unicode data looks like:
"name": "\u4e1c\u6cf0\u9999\u6e2f\u4e94\u91d1\u6709\u9650\u516c\u53f8"
I would like to see a real Unicode string like 全球卫星定位系统 (Chinese).
This solution from @Julian Polard's post worked for me.
Basically, just pass -Xutf8 to py or python (the flag goes right after the interpreter name) when running the dumpdata command.
Please upvote his answer as well if this worked for you ^_^
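As a rough illustration of what the flag changes (this sketch is not part of the original answer): -X utf8 switches on Python's UTF-8 Mode (available since Python 3.7), so the interpreter uses UTF-8 for its standard streams and default file encoding instead of the locale's code page. The helper name utf8_mode_flag is hypothetical; it starts a child interpreter and reports that interpreter's sys.flags.utf8_mode.

```python
import os
import subprocess
import sys


def utf8_mode_flag(extra_args):
    """Start a child interpreter with extra_args and return its sys.flags.utf8_mode."""
    # Drop PYTHONUTF8 so that only the command-line flag decides the mode.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUTF8"}
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", "import sys; print(sys.flags.utf8_mode)"],
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Compare a plain interpreter with one started in UTF-8 Mode.
    print("default   :", utf8_mode_flag([]))
    print("-X utf8   :", utf8_mode_flag(["-X", "utf8"]))
```

The value without the flag depends on the platform and locale, which is why the same dumpdata command can behave differently on different machines.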
After struggling with similar issues, I've just found that the xml formatter handles UTF-8 properly.
I had to transfer data from Django 0.96 to Django 1.3. After numerous attempts to dump and load the data, I finally succeeded using xml. No side effects so far.
I hope this helps someone, as I landed on this thread when looking for a solution.
django-admin.py dumpdata yourapp can dump the data for that purpose.
Or, if you use MySQL, you could use the mysqldump command to dump the whole database.
And this thread has many ways to dump data, including manual methods.
UPDATE: because the OP edited the question.
To convert a JSON-escaped string to a human-readable string, you could use this:
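The snippet from the original answer did not survive on this page. As a minimal sketch of the idea it describes, Python's built-in unicode_escape codec can turn \uXXXX sequences back into real characters. The function name decode_unicode_escapes is hypothetical, and the sketch assumes the input is pure ASCII, as escaped dumpdata output is. Note that the codec also interprets other backslash escapes such as \n and \", so the result is meant for reading and may no longer be valid JSON.

```python
def decode_unicode_escapes(raw: bytes) -> str:
    """Turn ASCII bytes containing \\uXXXX escapes into readable text."""
    # unicode_escape maps each \uXXXX sequence to the character it names.
    return raw.decode("unicode_escape")


if __name__ == "__main__":
    dumped = b'"name": "\\u4e1c\\u6cf0\\u9999\\u6e2f"'
    print(decode_unicode_escapes(dumped))
```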
You need to either find the call to json.dump*() in the Django code, pass the additional option ensure_ascii=False, and then encode the result afterwards, or use json.load*() to load the JSON and then dump it again with that option.
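A minimal sketch of the second route, loading the JSON and dumping it again with ensure_ascii=False, using only the standard library. The function name reencode_json and the file name in the comment are hypothetical. Because the text is parsed and re-serialized, the output stays valid JSON.

```python
import json


def reencode_json(text: str, indent=None) -> str:
    """Parse JSON text and serialize it again without \\uXXXX escaping."""
    data = json.loads(text)
    # ensure_ascii=False leaves non-ASCII characters as they are.
    return json.dumps(data, ensure_ascii=False, indent=indent)


if __name__ == "__main__":
    escaped = '[{"name": "\\u4e1c\\u6cf0"}]'
    print(reencode_json(escaped))
    # When writing to disk, pass the encoding explicitly, for example:
    # open("mydata-utf8.json", "w", encoding="utf-8")
```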
Here I wrote a snippet for that.
Works for me!
You can create your own serializer that passes the ensure_ascii=False argument to the json.dumps function, then register the new serializer (for example in your app's __init__.py file). Then you can run:
manage.py dumpdata --format=json-no-uescape > output.json
As YOU has provided a good answer that was accepted, it should be noted that Python 3 distinguishes between text and binary data, so both files must be opened in binary mode. Otherwise, the error
AttributeError: 'str' object has no attribute 'decode'
will be raised.
I usually add the following lines to my Makefile:
This works for the standard Django project structure; adjust it if your project structure is different.
This problem has been fixed for both JSON and YAML in Django 3.1.
Here's a new solution.
I just shared a repo on GitHub: django-dump-load-utf8.
However, I think this is a bug in Django, and I hope someone can merge my project into Django.
Not a bad solution, but I think fixing the bug in Django would be better.
I encountered the same issue. After reading all the answers, I came up with a mix of Ali's and darthwade's answers:
In Python 3, I had to open the file in binary mode and decode it as unicode-escape. I also encoded the text as utf-8 when writing it back in write (binary) mode.
I hope it helps :)
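The combined snippet is missing from this page, so the following is only a sketch of the steps the answer describes: read the dump in binary mode, decode it with unicode_escape, and write it back as UTF-8. The function name rewrite_dump_as_utf8 and the file names in the comment are hypothetical.

```python
from pathlib import Path


def rewrite_dump_as_utf8(src: Path, dst: Path) -> None:
    """Copy src to dst, replacing \\uXXXX escapes with UTF-8 encoded characters."""
    raw = src.read_bytes()                 # binary mode: bytes have .decode()
    text = raw.decode("unicode_escape")    # \uXXXX -> real characters
    dst.write_bytes(text.encode("utf-8"))  # explicit UTF-8 on the way out


# Example call (paths are placeholders):
# rewrite_dump_as_utf8(Path("mydata.json"), Path("mydata-utf8.json"))
```

Reading bytes rather than str is what avoids the AttributeError mentioned in the thread, since in Python 3 only bytes objects have a decode method.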
Here is the solution from djangoproject.com:
In the Windows Settings, under "Language" - "Administrative Language Settings" - "Change system locale" - "Region Settings", there is a box labeled "Use Unicode UTF-8 for worldwide language support". If we check it and reboot, we get a sensible, modern default encoding from Python.
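To see which default encoding Python currently picks up, before or after changing that setting, a small check like the following can help. This is only a sketch; the function name default_encodings is hypothetical. When the output of manage.py dumpdata is redirected with >, the text is encoded with the stdout encoding reported here.

```python
import locale
import sys


def default_encodings():
    """Collect the encodings Python falls back to when none is given explicitly."""
    return {
        "preferred": locale.getpreferredencoding(False),
        # Some replaced stdout objects have no usable encoding attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
        "utf8_mode": bool(sys.flags.utf8_mode),
    }


if __name__ == "__main__":
    for name, value in default_encodings().items():
        print(f"{name}: {value}")
```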
In 2023, I still had a rough time with this. I had to follow @wertartem's suggestion and then change the file encoding of the output file to get it to work. It seems the "-Xutf8" flag wasn't necessary for me, but someone reading this might need to follow all 3 steps.
I also had a smaller issue, which I solved by excluding admin.logentry from the export (I added the options "-e auth -e contenttypes -e auth.Permission -e admin.logentry").
My full process:
1. Make sure worldwide language support is enabled. To do this (at least on Windows 11), go to "Time & Language" > "Language & Region". Under "Related Settings", click "Administrative Language Settings". Click "Change System Locale". Check the box for "Beta: Use Unicode UTF-8 for worldwide language support". Restart the computer. Once enabled, skip this step for future exports.
2. Run the dumpdata command with these options:
--format=json --natural-foreign --natural-primary -e auth -e contenttypes -e auth.Permission -e admin.logentry > databases/seeds/dump.json
3. Use "Change File Encoding" to save the file with UTF-8 encoding. If VS Code crashes, this can be done in Sublime Text instead by opening the file and saving with encoding from the File menu.
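The last step, re-saving the dump as UTF-8, can also be scripted instead of done in an editor. The sketch below assumes you know which encoding the file is currently in; the post does not say, so the source_encoding argument is an assumption you have to supply yourself (for example "utf-16", which some Windows shells use for redirected output). The function name resave_as_utf8 is hypothetical.

```python
from pathlib import Path


def resave_as_utf8(path: Path, source_encoding: str) -> None:
    """Re-save a text file as UTF-8, given the encoding it is currently in."""
    text = path.read_text(encoding=source_encoding)
    path.write_text(text, encoding="utf-8")


# Example call (the path comes from the post; the encoding is a guess):
# resave_as_utf8(Path("databases/seeds/dump.json"), "utf-16")
```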
Your step 2 command may benefit from (but does not require) slight modification. Check out this: https://docs.djangoproject.com/en/4.2/ref/django-admin/#dumpdata