How slow is Python's string concatenation vs. str.join?
As a result of the comments in my answer on this thread, I wanted to know what the speed difference is between the += operator and ''.join().
So what is the speed comparison between the two?
From: Efficient String Concatenation
Method 1:
Method 4:
Now I realise they are not strictly representative, and the 4th method appends to a list before iterating through and joining each item, but it's a fair indication.
String join is significantly faster than concatenation.
Why? Strings are immutable and can't be changed in place. To alter one, a new representation needs to be created (a concatenation of the two).
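As a minimal sketch of the two styles being compared (repeated += versus collecting pieces in a list and joining once), something like the following could be used. The function names are hypothetical and this is not the benchmark code from the linked article.

```python
def build_with_plus(count):
    """Build a string by repeated += concatenation."""
    result = ""
    for number in range(count):
        # Conceptually, each += produces a new string object.
        result += str(number)
    return result


def build_with_join(count):
    """Collect the pieces in a list, then join them once at the end."""
    pieces = []
    for number in range(count):
        pieces.append(str(number))
    return "".join(pieces)


if __name__ == "__main__":
    print(build_with_plus(5))
    print(build_with_join(5))
```

Both functions return the same string; they differ only in how the intermediate pieces are accumulated.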
Note: This benchmark was informal and is due to be redone because it doesn't show a full picture of how these methods will perform with more realistically long strings. As mentioned in the comments by @Mark Amery, += is not reliably as fast as using f-strings, and str.join isn't as dramatically slower in realistic use cases. These metrics are also likely outdated by the significant performance improvements introduced by subsequent CPython versions, most notably 3.11.
The existing answers are very well-written and researched, but here's another answer for the Python 3.6 era, since now we have literal string interpolation (AKA f-strings):
Test performed using CPython 3.6.5 on a 2012 Retina MacBook Pro with an Intel Core i7 at 2.3 GHz.
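A rough sketch of how +=, an f-string, and str.join could be timed against each other with the standard timeit module. The sample strings, function names, and iteration count are illustrative assumptions, not the benchmark from this answer, and the numbers will vary by machine and CPython version.

```python
import timeit

# Hypothetical sample inputs; realistic workloads may use much longer strings.
FIRST, SECOND = "spam", "eggs"


def with_plus_equals():
    text = FIRST
    text += SECOND
    return text


def with_f_string():
    return f"{FIRST}{SECOND}"


def with_join():
    return "".join([FIRST, SECOND])


def time_all(number=100_000):
    """Return {function name: total seconds} for `number` calls of each approach."""
    candidates = (with_plus_equals, with_f_string, with_join)
    return {func.__name__: timeit.timeit(func, number=number) for func in candidates}


if __name__ == "__main__":
    for name, seconds in time_all().items():
        print(f"{name}: {seconds:.4f} s")
```

With only two short strings, the timings are dominated by call overhead, which is one reason a benchmark like this should be repeated with longer and more numerous pieces before drawing conclusions.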
My original code was wrong; it appears that + concatenation is usually faster (especially with newer versions of Python on newer hardware).
The times are as follows:
Python 3.3 on Windows 7, Core i7
Python 2.7 on Windows 7, Core i7
Python 2.7 on Linux Mint, some slower processor
And here is the code:
If my expectation is right, for a list of k strings with n characters in total, the time complexity of join should be O(n log k), while the time complexity of classic concatenation should be O(nk).
That would be the same relative cost as merging k sorted lists (the efficient method is O(n log k), while the simple one, akin to concatenation, is O(nk)).
Algorithmically speaking, if you choose += then each step generates a new object, and the total cost will be O(n**2). But if you use .join, it will be O(n).
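The quadratic versus linear claim can be illustrated with a simple cost model that counts characters copied rather than measuring time. This is a sketch under the assumption that every += allocates and fills a brand-new string; CPython can sometimes extend a string in place when nothing else references it, so real timings are often much better than this model suggests. The function names are hypothetical.

```python
def chars_copied_by_naive_concat(pieces):
    """Characters copied if every += builds a brand-new string."""
    copied = 0
    length_so_far = 0
    for piece in pieces:
        length_so_far += len(piece)
        # The whole new string (old contents plus the new piece) is written out.
        copied += length_so_far
    return copied


def chars_copied_by_join(pieces):
    """Characters copied if the result is allocated once and each piece is copied once."""
    return sum(len(piece) for piece in pieces)


if __name__ == "__main__":
    for k in (10, 100, 1000):
        pieces = ["abc"] * k
        print(k, chars_copied_by_naive_concat(pieces), chars_copied_by_join(pieces))
```

For k pieces of equal length m, the first count grows as m * k * (k + 1) / 2, while the second is simply m * k.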
I rewrote the last answer. Could you please share your opinion on the way I tested?
NOTE: This example is written in Python 3.5, where range() acts like the former xrange()
The output I got:
Personally, I prefer ''.join([]) over the 'Plusser way' because it's cleaner and more readable.
This is what silly programs are designed to test :)
Using plus:
Output:
Now with join:
Output:
So on Python 2.6 on Windows, I would say + is about 18 times faster than join :)