Large-scale string concatenation in Python

Posted 2025-02-09 05:06:54 | 508 characters | 1 view | 0 comments

Say that I have a massive list of dictionaries (2 million dictionaries). I need to essentially do a json.dumps() of each dictionary into a massive string (to put in the body of a request to AWS OpenSearch). So far I have this:

import json

json_data = ''
action = {'index': {}}
for item in data:
    # Each document is preceded by its action line; both end with a newline
    json_data += f'{json.dumps(action)}\n'
    json_data += f'{json.dumps(item)}\n'

where data is the large list of dictionaries. This takes on average between 0.9 and 1 second. Is there a more efficient way to do this?

Other SO questions conclude that if this were a simple string addition that only had to be done once, c = a + b would be the fastest way. However, I have to keep appending to what in this case would be c. I have to repeat this operation many times, so speeding it up would be immensely helpful. Is there a way to speed up this code, and if so, what would those optimizations look like?

蓝戈者 2025-02-16 05:06:54

Repeated string concatenation is slow. A better approach would be to build up a list of strings, and then join them at the end. I don't have access to your data, so I can't test this, but you'd be going for something along the lines of:

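import json

json_data = []
action = {'index': {}}
for item in data:
    json_data.append(action)
    json_data.append(item)
# Serialize each object and join once; the final '\n' keeps the last
# line newline-terminated, as in the original loop
result = '\n'.join([json.dumps(blob) for blob in json_data]) + '\n'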
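Since the approach above was posted untested, a quick equivalence check on synthetic data may be worthwhile before switching over. The following is a minimal sketch, not the answerer's code: the function names `with_concat` and `with_join` and the `sample` list are made up for illustration, and the join result gets a final newline appended so that it matches the loop's output exactly.

```python
import json

ACTION = {"index": {}}


def with_concat(items):
    """The question's loop: append two newline-terminated lines per item."""
    body = ""
    for item in items:
        body += f"{json.dumps(ACTION)}\n"
        body += f"{json.dumps(item)}\n"
    return body


def with_join(items):
    """List-and-join: collect the objects, serialize them, and join once."""
    blobs = []
    for item in items:
        blobs.append(ACTION)
        blobs.append(item)
    # For non-empty input, the final "\n" makes this match the loop's output.
    return "\n".join([json.dumps(blob) for blob in blobs]) + "\n"


# Hypothetical stand-in for the real data.
sample = [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]

assert with_concat(sample) == with_join(sample)
```

If the assertion passes on a representative slice of your own data, the two versions are interchangeable as far as the request body is concerned.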

花心好男孩 2025-02-16 05:06:54

Variation...

import json
json_data = []
action = json.dumps({'index': {}}) # dumps is only called on this once
for item in data:
    # json_data will be a list of strings
    json_data.append(action)
    json_data.append(json.dumps(item))
# The final '\n' keeps the last line newline-terminated, as in the original loop
result = '\n'.join(json_data) + '\n'
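Whether any of this beats the original loop on your machine is best settled by measuring. Below is a rough timing sketch using the standard-library `timeit` module; the function names, the `time_variants` helper, and the synthetic `data` are assumptions for illustration, so substitute your real list. CPython can sometimes extend a string in place on `+=`, so the gap may turn out smaller than expected.

```python
import json
import timeit


def concat_loop(items):
    """Repeated += on a single string, as in the question."""
    action = {"index": {}}
    out = ""
    for item in items:
        out += f"{json.dumps(action)}\n"
        out += f"{json.dumps(item)}\n"
    return out


def join_predumped(items):
    """Serialize the action once, collect strings, join at the end."""
    action_line = json.dumps({"index": {}})
    lines = []
    for item in items:
        lines.append(action_line)
        lines.append(json.dumps(item))
    # Trailing newline so the output matches concat_loop for non-empty input.
    return "\n".join(lines) + "\n"


def time_variants(items, repeats=3):
    """Return {function name: average seconds per call}."""
    results = {}
    for fn in (concat_loop, join_predumped):
        total = timeit.timeit(lambda: fn(items), number=repeats)
        results[fn.__name__] = total / repeats
    return results


if __name__ == "__main__":
    # Hypothetical synthetic data; real documents will likely be larger.
    data = [{"id": i, "name": f"item-{i}"} for i in range(100_000)]
    for name, seconds in time_variants(data).items():
        print(f"{name}: {seconds:.4f} s per call")
```

Much of the runtime is likely to be spent inside `json.dumps` itself, so the numbers also show how much room string handling alone leaves for improvement.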
