Python string concatenation at massive scale
Say I have a massive list of dictionaries (2 million of them). I need to json.dumps() each dictionary into one massive string (to use as the body of a request to AWS OpenSearch). So far I have this:

    import json

    json_data = ''
    action = {'index': {}}
    for item in data:
        json_data += f'{json.dumps(action)}\n'
        json_data += f'{json.dumps(item)}\n'

where data is the large list of dictionaries. This takes between 0.9 and 1 second on average. Is there a more efficient way to do this?

Other SO questions conclude that for a simple string addition that only has to be done once, c = a + b is the fastest way. Here, however, I have to keep appending to what would be c. I have to repeat this operation many times, so speeding it up would be immensely helpful. Is there a way to speed up this function, and if so, what would those optimizations look like?
Repeated string concatenation is slow. A better approach would be to build up a list of strings, and then join them at the end. I don't have access to your data, so I can't test this, but you'd be going for something along the lines of:
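A minimal sketch of that list-and-join idea (not the answerer's original snippet), assuming data is a list of JSON-serializable dictionaries. The function name build_bulk_body is hypothetical. Because the action line is the same for every item, the sketch also serializes it once instead of on every iteration:

```python
import json


def build_bulk_body(data):
    """Build a newline-delimited JSON body from a list of dicts.

    Hypothetical helper: produces the same string as the += loop
    in the question, but collects the pieces in a list first.
    """
    # The action line never changes, so serialize it a single time.
    action_line = json.dumps({'index': {}})

    parts = []
    for item in data:
        parts.append(action_line)
        parts.append(json.dumps(item))

    # An empty last element makes join() emit a trailing newline,
    # matching the output of the original loop.
    parts.append('')
    return '\n'.join(parts)
```

Whether this is actually faster depends on the data and the interpreter: CPython can optimize some in-place string += operations, and the json.dumps() calls on the items may dominate the runtime either way, so it is worth timing both versions (for example with timeit) on a realistic sample.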
Variation...