Python .join or string concatenation

Posted on 2024-10-02 07:26:25

I realise that if you have an iterable you should always use .join(iterable) instead of for x in y: str += x. But if there's only a fixed number of variables that aren't already in an iterable, is using .join() still the recommended way?

For example I have

user = 'username'
host = 'host'

should I do

ret = user + '@' + host

or

ret = '@'.join([user, host])

I'm not so much asking from a performance point of view, since both will be pretty trivial. But I've read people on here say always use .join() and I was wondering if there's any particular reason for that or if it's just generally a good idea to use .join().

Comments (6)

雨后咖啡店 2024-10-09 07:26:25

If you're creating a string like that, you normally want to use string formatting:

>>> user = 'username'
>>> host = 'host'
>>> '%s@%s' % (user, host)
'username@host'

Python 2.6 added another form, which doesn't rely on operator overloading and has some extra features:

>>> '{0}@{1}'.format(user, host)
'username@host'

As a general guideline, most people will use + on strings only if they're adding two strings right there. For more parts or more complex strings, they either use string formatting, like above, or assemble elements in a list and join them together (especially if there's any form of looping involved.) The reason for using str.join() is that adding strings together means creating a new string (and potentially destroying the old ones) for each addition. Python can sometimes optimize this away, but str.join() quickly becomes clearer, more obvious and significantly faster.
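
To illustrate the "assemble elements in a list and join them" case, here is a minimal sketch. The function name format_addresses and the sample data are made up for illustration; they are not from the question.

```python
def format_addresses(pairs):
    """Build one comma-separated string from (user, host) pairs."""
    parts = []
    for user, host in pairs:
        # Each address has a fixed shape, so string formatting reads naturally.
        parts.append('{0}@{1}'.format(user, host))
    # A single join at the end avoids building an intermediate string per step.
    return ', '.join(parts)


addresses = format_addresses([('alice', 'example.org'), ('bob', 'example.net')])
print(addresses)
```

Formatting handles the fixed-shape part and join handles the looping part, which is roughly the split the guideline above describes.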

深爱成瘾 2024-10-09 07:26:25

I take the question to mean: "Is it ok to do this:"

ret = user + '@' + host

...and the answer is yes. That is perfectly fine.

You should, of course, be aware of the cool formatting stuff you can do in Python, and you should be aware that for long lists, "join" is the way to go, but for a simple situation like this, what you have is exactly right. It's simple and clear, and performance will not be an issue.

白芷 2024-10-09 07:26:25

(I'm pretty sure all of the people pointing at string formatting are missing the question entirely.)

Creating a string by constructing an array and joining it is for performance reasons only. Unless you need that performance, or unless it happens to be the natural way to implement it anyway, there's no benefit to doing that rather than simple string concatenation.

Saying '@'.join([user, host]) is unintuitive. It makes me wonder: why is he doing this? Are there any subtleties to it; is there any case where there might be more than one '@'? The answer is no, of course, but it takes more time to come to that conclusion than if it was written in a natural way.

Don't contort your code merely to avoid string concatenation; there's nothing inherently wrong with it. Joining arrays is just an optimization.

往事随风而去 2024-10-09 07:26:25

I'll just note that I had always tended to use in-place concatenation until I reread a portion of Python's general style guide, PEP 8 (Style Guide for Python Code):

  • Code should be written in a way that does not disadvantage other
    implementations of Python (PyPy, Jython, IronPython, Pyrex, Psyco,
    and such).
    For example, do not rely on CPython's efficient implementation of
    in-place string concatenation for statements in the form a+=b or a=a+b.
    Those statements run more slowly in Jython. In performance sensitive
    parts of the library, the ''.join() form should be used instead. This
    will ensure that concatenation occurs in linear time across various
    implementations.

Going by this, I have been converting to the practice of using joins so that I may retain the habit as a more automatic practice when efficiency is extra critical.

So I'll put in my vote for:

ret = '@'.join([user, host])
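
The case the PEP 8 passage is concerned with is accumulation in a loop, not a one-off expression with two or three operands. A rough sketch of the two styles follows; the function names and the sample input are assumptions for illustration, and the timings will vary by interpreter and machine.

```python
import timeit

pieces = ['line %d\n' % i for i in range(1000)]  # illustrative input


def concat_in_place(items):
    result = ''
    for item in items:
        # On some implementations this may copy the accumulated string
        # on every step, which makes the loop quadratic overall.
        result += item
    return result


def concat_with_join(items):
    # A single join over all the items, the form PEP 8 recommends.
    return ''.join(items)


# Both styles produce the same string; only the cost model differs.
assert concat_in_place(pieces) == concat_with_join(pieces)

print('+= loop:', timeit.timeit(lambda: concat_in_place(pieces), number=200))
print('join:   ', timeit.timeit(lambda: concat_with_join(pieces), number=200))
```

On CPython the gap may be small because of the in-place optimization the PEP mentions; the point of the join form is that it does not depend on that optimization.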

墟烟 2024-10-09 07:26:25

I use the following:

ret = '%s@%s' % (user, host)

尾戒 2024-10-09 07:26:25

I recommend join() over concatenation, based on two aspects:

  1. Faster.
  2. More elegant.

Regarding the first aspect, here's an example:

import timeit

s1 = "Flowers"
s2 = "of"
s3 = "War"

def join_concat():
    return s1 + " " + s2 + " " + s3

def join_builtin():
    return " ".join((s1, s2, s3))

print("Join Concatenation: ", timeit.timeit(join_concat))
print("Join Builtin:       ", timeit.timeit(join_builtin))

The output:

$ python3 join_test.py
Join Concatenation:  0.40386943198973313
Join Builtin:        0.2666833929979475

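Note that timeit.timeit() runs each function 1,000,000 times by default, so the gap of roughly 0.14 seconds is the total over a million calls, or about 0.14 microseconds per call. That only becomes noticeable when processing a huge dataset (millions of lines).

As for the second aspect, join() is indeed more elegant.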
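
For reading figures like the ones above: timeit.timeit() returns the total time for all runs (number defaults to 1,000,000), so dividing by the run count gives a per-call cost. A small sketch follows; the helper per_call_seconds and the constant RUNS are hypothetical names added for illustration.

```python
import timeit

s1, s2, s3 = "Flowers", "of", "War"
RUNS = 1000000  # same as timeit's default; named here only for clarity


def join_concat():
    return s1 + " " + s2 + " " + s3


def join_builtin():
    return " ".join((s1, s2, s3))


def per_call_seconds(func, runs=RUNS):
    # timeit.timeit returns the total for all runs, so divide by the count.
    return timeit.timeit(func, number=runs) / runs


for name, func in (("concat", join_concat), ("join", join_builtin)):
    print("%-6s %.1f ns per call" % (name, per_call_seconds(func) * 1e9))
```

The absolute numbers depend on the interpreter version and machine, so they are best compared against each other rather than taken at face value.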
