Problem generating a unique JSON file with Python

Posted 2025-01-19 07:06:35


After several attempts and some research into solving the out-of-memory error by using cur.itersize, I am still having problems: the code is not generating what I expect, a single JSON file containing all the rows. Instead it generates the same file over and over, and I am not sure how to append the chunks into one JSON file.
I have tried running it as a single query, without

for row in cur:

and switching between fetchmany() and fetchall(), but the table being queried is huge and PostgreSQL raises out-of-memory errors; I need the whole dataset.

for y in x:
    cur = connection.cursor('test')
    cur.itersize = 2000
    cur.execute(" SELECT * FROM table ")
    print("fetching data for " + y)
    for row in cur:
        rows = cur.fetchmany(2000)
        print("Generating json")
        rowarray_list = []
        print("parsing rows to json file")
        json_data = json.dumps(rows)
        filename = '%s.json' % y
        print('File: ' + filename + ' created')
        with open(filename, "w") as f:
            f.write(json_data)
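
What I am ultimately after is something along these lines (a rough sketch, assuming psycopg2; the connection setup and output file name are illustrative): iterate the named cursor directly, so that itersize batches the fetches behind the scenes, and stream every row into one file.

import json
import psycopg2

connection = psycopg2.connect("dbname=test")  # illustrative connection

with connection.cursor(name='test') as cur:   # named cursor = server-side cursor
    cur.itersize = 2000                       # rows fetched per network round trip
    cur.execute("SELECT * FROM table")
    with open("table.jsonl", "w") as f:
        for row in cur:                       # iterating pulls the next batch transparently
            f.write(json.dumps(row, default=str) + "\n")  # one JSON document per row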


聆听风音 2025-01-26 07:06:35


There are some things you could do to solve the memory problem, fix the bugs, and also improve performance:

  • Limit the result set in the query using LIMIT/OFFSET and use fetchall instead of fetchmany, e.g. "SELECT * FROM table LIMIT 10 OFFSET 0";
  • Set the mode in open to "a" (append) instead of "w" (write).

My suggested code:

import json

limit = 10

for y in x:
    print("fetching data for " + y)
    filename = '%s.json' % y

    page = 0

    while True:
        offset = limit * page

        # Fetch one page at a time so only `limit` rows are held in memory.
        with connection.cursor() as cur:
            cur.execute("SELECT * FROM table LIMIT %s OFFSET %s", (limit, offset))
            rows = cur.fetchall()

        if not rows:
            break

        # Append this page to the file instead of overwriting it.
        with open(filename, "a") as f:
            f.write(json.dumps(rows))

        page += 1
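
One caveat worth noting: because each page is dumped as its own JSON array, the appended file ends up containing several concatenated arrays, which is not one valid JSON document. A possible variant of the write step, emitting one JSON document per row (JSON Lines), keeps the file parseable line by line:

        # Variant of the write step: one JSON document per row (JSON Lines),
        # so the appended file stays parseable across pages.
        with open(filename, "a") as f:
            for row in rows:
                f.write(json.dumps(row, default=str) + "\n")

Note also that LIMIT/OFFSET paging is only deterministic when the query has an ORDER BY clause; without one, PostgreSQL may return rows in a different order on each page.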