Python .join or string concatenation
I realise that if you have an iterable you should always use .join(iterable) instead of for x in y: str += x. But if there's only a fixed number of variables that aren't already in an iterable, is using .join() still the recommended way?
For example I have
user = 'username'
host = 'host'
should I do
ret = user + '@' + host
or
ret = '@'.join([user, host])
I'm not so much asking from a performance point of view, since both will be pretty trivial. But I've read people on here say always use .join(), and I was wondering if there's any particular reason for that or if it's just generally a good idea to use .join().
Comments (6)
If you're creating a string like that, you normally want to use string formatting:
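A minimal sketch of the %-style formatting this presumably refers to, reusing the question's user and host values (the exact snippet is an assumption, not the answer's original code):

```python
# Values taken from the question.
user = 'username'
host = 'host'

# %-style formatting: each %s is replaced by the matching tuple element.
ret = '%s@%s' % (user, host)
print(ret)  # username@host
```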
Python 2.6 added another form, which doesn't rely on operator overloading and has some extra features:
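The form added in Python 2.6 is str.format(). A short sketch, again assuming the question's user and host values (the variable name `named` is only for illustration):

```python
# Values taken from the question.
user = 'username'
host = 'host'

# str.format() fills the {} placeholders from its arguments,
# either by position or by keyword name.
ret = '{0}@{1}'.format(user, host)
named = '{user}@{host}'.format(user=user, host=host)

print(ret)    # username@host
print(named)  # username@host
```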
As a general guideline, most people will use + on strings only if they're adding two strings right there. For more parts or more complex strings, they either use string formatting, like above, or assemble elements in a list and join them together (especially if there's any form of looping involved). The reason for using str.join() is that adding strings together means creating a new string (and potentially destroying the old ones) for each addition. Python can sometimes optimize this away, but str.join() quickly becomes clearer, more obvious and significantly faster.
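The guideline about assembling elements in a list and joining them when a loop is involved can be sketched roughly as follows. The function names and sample data are hypothetical, chosen only to contrast the two styles:

```python
def concat_in_loop(words):
    """Build the result with repeated +=; each step may create a new string."""
    result = ''
    for word in words:
        if result:
            result += ' '
        result += word
    return result


def join_parts(words):
    """Collect the pieces in a list, then join them once at the end."""
    parts = []
    for word in words:
        parts.append(word)
    return ' '.join(parts)


words = ['assemble', 'then', 'join']
print(concat_in_loop(words))  # assemble then join
print(join_parts(words))      # assemble then join
```

Both functions return the same string; the join version states the separator once and avoids the "first element" special case.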
I take the question to mean: "Is it ok to do this:"
...and the answer is yes. That is perfectly fine.
You should, of course, be aware of the cool formatting stuff you can do in Python, and you should be aware that for long lists, "join" is the way to go, but for a simple situation like this, what you have is exactly right. It's simple and clear, and performance will not be an issue.
(I'm pretty sure all of the people pointing at string formatting are missing the question entirely.)
Creating a string by constructing an array and joining it is for performance reasons only. Unless you need that performance, or unless it happens to be the natural way to implement it anyway, there's no benefit to doing that rather than simple string concatenation.
Saying '@'.join([user, host]) is unintuitive. It makes me wonder: why is he doing this? Are there any subtleties to it; is there any case where there might be more than one '@'? The answer is no, of course, but it takes more time to come to that conclusion than if it was written in a natural way.
Don't contort your code merely to avoid string concatenation; there's nothing inherently wrong with it. Joining arrays is just an optimization.
I'll just note that I had always tended to use in-place concatenation until I reread a portion of PEP 8, the Style Guide for Python Code.
Going by this, I have been switching to joins, so that the habit is automatic when efficiency is extra critical.
So I'll put in my vote for:
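Presumably the vote is for the join form from the question; the sketch below is written under that assumption. The PEP 8 passage in question advises against relying on CPython's efficient in-place concatenation (a += b) and recommends ''.join() in performance-sensitive code.

```python
# Values taken from the question.
user = 'username'
host = 'host'

# The join form: the separator is the string the method is called on.
ret = '@'.join([user, host])
print(ret)  # username@host
```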
I use the following:
I recommend join() over concatenation, based on two aspects:
Regarding the first aspect, here's an example:
The output:
For a huge dataset (millions of lines), an extra 130 milliseconds per line of processing is too much.
As for the second aspect, join() is indeed more elegant.
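A rough sketch of how such a comparison could be run with the standard timeit module. The helper names, the data size and the repeat count are all assumptions for illustration, and no particular timings are claimed: results vary by interpreter and machine, and CPython can sometimes optimize += on strings.

```python
import timeit


def with_concat(parts):
    """Concatenate with += inside a loop."""
    out = ''
    for part in parts:
        out += part
    return out


def with_join(parts):
    """Concatenate with a single str.join() call."""
    return ''.join(parts)


# Hypothetical workload: 10,000 short lines.
parts = ['line %d\n' % i for i in range(10000)]

# Both approaches must produce the same string.
assert with_concat(parts) == with_join(parts)

concat_time = timeit.timeit(lambda: with_concat(parts), number=100)
join_time = timeit.timeit(lambda: with_join(parts), number=100)

print('concat: %.4f s' % concat_time)
print('join:   %.4f s' % join_time)
```

Measuring on your own data is the only reliable way to decide whether the difference matters for a given workload.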