Django dumpdata UTF-8 (Unicode)

Posted on 2024-08-19 08:09:08

Is there an easy way to dump UTF-8 data from a database?

I know this command:

manage.py dumpdata > mydata.json

But in the resulting mydata.json file, Unicode data looks like this:

"name": "\u4e1c\u6cf0\u9999\u6e2f\u4e94\u91d1\u6709\u9650\u516c\u53f8"

I would like to see a real Unicode string like 全球卫星定位系统 (Chinese).

Comments (14)

最舍不得你 2024-08-26 08:09:08

This solution worked for me from @Julian Polard's post.

Basically just add -Xutf8 in front of py or python when running this command:

python -Xutf8 manage.py dumpdata > data.json

Please upvote his answer as well if this worked for you ^_^
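
To see what the flag changes, the sketch below launches a child interpreter with and without -X utf8 and reports the encoding Python picks for a redirected stdout. The helper name child_stdout_encoding is made up for this illustration and is not part of Django; it assumes Python 3.7 or newer, where UTF-8 mode exists.

```python
import subprocess
import sys


def child_stdout_encoding(utf8_mode):
    """Return sys.stdout.encoding as seen by a child Python whose stdout is a pipe."""
    cmd = [sys.executable]
    if utf8_mode:
        cmd += ["-X", "utf8"]  # same effect as writing -Xutf8
    cmd += ["-c", "import sys; print(sys.stdout.encoding)"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print("default :", child_stdout_encoding(False))
    print("-X utf8 :", child_stdout_encoding(True))
```

On a Windows machine with a legacy code page, the first line will typically show that code page rather than utf-8, which is one plausible reason a redirected dump ends up in the wrong encoding.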

怪异←思 2024-08-26 08:09:08

After struggling with similar issues, I've just found that the xml formatter handles UTF-8 properly.

manage.py dumpdata --format=xml > output.xml

I had to transfer data from Django 0.96 to Django 1.3. After numerous attempts at dumping and loading the data, I finally succeeded using xml. No side effects so far.

Hope this helps someone, as I landed on this thread while looking for a solution.

指尖凝香 2024-08-26 08:09:08

django-admin.py dumpdata yourapp can dump the data for that purpose.

Or if you use MySQL, you could use the mysqldump command to dump the whole database.

And this thread has many ways to dump data, including manual methods.

UPDATE: because OP edited the question.

To convert the JSON-escaped strings into human-readable text, you could use this (Python 2 syntax):

open("mydata-new.json","wb").write(open("mydata.json").read().decode("unicode_escape").encode("utf8"))

轮廓§ 2024-08-26 08:09:08

You need to either find the call to json.dump*() in the Django code, pass it the additional option ensure_ascii=False, and then encode the result, or use json.load*() to load the JSON and dump it again with that option.
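
The second option (load the JSON, then dump it again with ensure_ascii=False) needs only the standard library. A minimal sketch; the function name unescape_json_text is a made-up helper, not something Django provides.

```python
import json


def unescape_json_text(text, indent=2):
    """Re-serialize JSON text so non-ASCII characters are written literally."""
    data = json.loads(text)
    # ensure_ascii=False stops json from emitting \uXXXX for non-ASCII characters.
    return json.dumps(data, ensure_ascii=False, indent=indent)


if __name__ == "__main__":
    escaped = '[{"name": "\\u4e1c\\u6cf0\\u9999\\u6e2f"}]'
    print(unescape_json_text(escaped))
```

When writing the result to a file, open it with an explicit encoding, for example open("mydata-new.json", "w", encoding="utf-8"), so the platform default does not get in the way.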

夜访吸血鬼 2024-08-26 08:09:08

Here I wrote a snippet for that.
Works for me!

陌路黄昏 2024-08-26 08:09:08

You can create your own serializer that passes the ensure_ascii=False argument to json.dumps:

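# serializers/json_no_uescape.py
# Deserializer is re-exported so the registered module can also load data.
from django.core.serializers.json import Deserializer  # noqa: F401
from django.core.serializers.json import Serializer as JSONSerializer


class Serializer(JSONSerializer):
    """JSON serializer that writes non-ASCII characters unescaped."""

    def _init_options(self):
        super()._init_options()
        # Passed through to json.dump, so no \uXXXX escapes are produced.
        self.json_kwargs['ensure_ascii'] = False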

Then register the new serializer (for example, in your app's __init__.py file):

from django.core.serializers import register_serializer

register_serializer('json-no-uescape', 'serializers.json_no_uescape')

Then you can run:

manage.py dumpdata --format=json-no-uescape > output.json

小情绪 2024-08-26 08:09:08

YOU has provided a good answer that was accepted, but note that Python 3 distinguishes between text and binary data, so both files must be opened in binary mode:

open("mydata-new.json","wb").write(open("mydata.json", "rb").read().decode("unicode_escape").encode("utf8"))

Otherwise, the error AttributeError: 'str' object has no attribute 'decode' will be raised.
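
One caveat, shown here as a small standard-library experiment rather than as part of the answer above: unicode_escape also rewrites other backslash sequences such as \", so a value containing an escaped quote may no longer be valid JSON after the conversion. Parsing with json and dumping again avoids that. The sample record is invented for the demonstration.

```python
import json

# A valid JSON document whose value contains an escaped quote and \u escapes.
raw = b'{"name": "5\\" nail \\u4e94\\u91d1"}'

# unicode_escape turns \" into a bare quote, which ends the string early.
converted = raw.decode("unicode_escape")
try:
    json.loads(converted)
    still_valid = True
except json.JSONDecodeError:
    still_valid = False

# Parsing first and re-dumping keeps the quote escaped and the text readable.
roundtrip = json.dumps(json.loads(raw), ensure_ascii=False)

print("valid after unicode_escape:", still_valid)
print("json round trip:", roundtrip)
```

If the dump is known to contain no such sequences, the one-liner is fine; otherwise the round trip is the safer route.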

傲世九天 2024-08-26 08:09:08

I usually add the following lines to my Makefile:

.PHONY: dump

# make APP=core MODEL=Schema dump
dump:
    @python manage.py dumpdata --indent=2 --natural-foreign --natural-primary ${APP}.${MODEL} | \
    python -c "import sys; sys.stdout.write(sys.stdin.read().encode().decode('unicode_escape'))" \
    > ${APP}/fixtures/${MODEL}.json

This works for the standard Django project structure; adjust it if your project structure is different.

柠檬 2024-08-26 08:09:08

This problem has been fixed for both JSON and YAML in Django 3.1.

新一帅帅 2024-08-26 08:09:08

Here's a new solution.

I just shared a repo on GitHub: django-dump-load-utf8.

However, I think this is a bug in Django, and I hope someone can merge my project into Django.

It's not a bad workaround, but I think fixing the bug in Django itself would be better.

manage.py dumpdatautf8 --output data.json
manage.py loaddatautf8 data.json

你与昨日 2024-08-26 08:09:08

# Python 2 only: the 'string-escape' codec does not exist in Python 3.
import codecs
src = "/categories.json"
dst = "/categories-new.json"
source = codecs.open(src, 'r').read().decode('string-escape')
codecs.open(dst, "wb").write(source)

倦话 2024-08-26 08:09:08

I encountered the same issue. After reading all the answers, I came up with a mix of Ali and darthwade's answers:

manage.py dumpdata app.category --indent=2 > categories.json
manage.py shell

import codecs
src = "/categories.json"
dst = "/categories-new.json"
source = codecs.open(src, "rb").read().decode('unicode-escape')
codecs.open(dst, "wb","utf-8").write(source)

In Python 3, I had to open the file in binary mode and decode it as unicode-escape. I also passed utf-8 as the encoding when opening the output file in write (binary) mode.

I hope it helps :)

2024-08-26 08:09:08

Here is the solution from djangoproject.com:
In Windows Settings, under "Language" - "Administrative Language Settings" - "Change system locale" - "Region Settings", there is a box labeled "Use Unicode UTF-8 for worldwide language support". If we apply that and reboot, we get a sensible, modern default encoding from Python.

氛圍 2024-08-26 08:09:08

In 2023, I still had a rough time with this. I had to follow @wertartem's suggestion and then change the file encoding of the output file to get it to work. It seems the "-Xutf8" flag wasn't necessary for me, but someone reading this might need to follow all 3 steps.

I also had a smaller issue, which I solved by excluding admin.logentry from the export (by adding the flags "-e auth -e contenttypes -e auth.Permission -e admin.logentry").

My full process:

  1. For proper encoding, at least on Windows, make sure UTF-8 for
    worldwide language support is enabled. To do this (at least on
    Windows 11), go to "Time & Language" > "Language & Region". Under
    "Related Settings", click "Administrative Language Settings". Click
    "Change System Locale". Check the box for "Beta: Use Unicode UTF-8
    for worldwide language support". Restart the computer. Once enabled,
    skip this step for future exports.
  2. Run this command in the terminal (here, I'm exporting to a
    subdirectory and excluding several apps and models from the export):
    python -Xutf8 manage.py dumpdata --format=json --natural-foreign --natural-primary -e auth -e contenttypes -e auth.Permission -e admin.logentry > databases/seeds/dump.json
  3. Open this "dump.json" file and run the VS Code command
    "Change File Encoding" to save it with UTF-8 encoding. If VS Code
    crashes, this can be done in Sublime Text instead by opening the file
    and saving with encoding from the File menu.
  4. Change the connection to the new database.
  5. python manage.py reset_db
  6. python manage.py migrate
  7. python manage.py loaddata "databases/seeds/dump.json"

Your step 2 command may benefit from (but does not require) slight modification. See: https://docs.djangoproject.com/en/4.2/ref/django-admin/#dumpdata
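
Step 3 can also be scripted instead of done in an editor. The sketch below assumes the dump was written as UTF-16, which is what the > redirection in Windows PowerShell 5.1 produces by default; the post does not say which encoding the file actually had, so source_encoding is a guess to adjust. The function name reencode_to_utf8 and the output path in the usage line are hypothetical.

```python
import sys


def reencode_to_utf8(src_path, dst_path, source_encoding="utf-16"):
    """Read a text file in source_encoding and write the same text as UTF-8."""
    # newline="" on both sides leaves the line endings untouched.
    with open(src_path, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


if __name__ == "__main__":
    # Usage: python reencode.py <source file> <destination file>
    if len(sys.argv) == 3:
        reencode_to_utf8(sys.argv[1], sys.argv[2])
```

For example, reencode_to_utf8("databases/seeds/dump.json", "databases/seeds/dump-utf8.json") would write a UTF-8 copy next to the original, which can then be passed to loaddata.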
