How to remove all spaces from a string

Posted on 2024-09-24 06:39:39 | Word count 220 | Views 9 | Comments 0

How do I strip all the spaces in a Python string? For example, I want a string like strip my spaces to be turned into stripmyspaces, but I cannot seem to accomplish that with strip():

>>> 'strip my spaces'.strip()
'strip my spaces'


Comments (14)

佼人 2024-10-01 06:39:39

Taking advantage of str.split's behavior with no sep parameter:

>>> s = " \t foo \n bar "
>>> "".join(s.split())
'foobar'

If you just want to remove spaces instead of all whitespace:

>>> s.replace(" ", "")
'\tfoo\nbar'
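
As a rough sketch of the difference between the two calls above, the helpers below wrap each approach. The names remove_whitespace and remove_spaces are illustrative only and do not come from the answer:

```python
def remove_whitespace(text: str) -> str:
    """Drop all whitespace (spaces, tabs, newlines, ...)."""
    # split() with no separator splits on runs of whitespace
    # and discards the empty pieces at both ends.
    return "".join(text.split())


def remove_spaces(text: str) -> str:
    """Drop only the ordinary space character; tabs and newlines stay."""
    return text.replace(" ", "")


s = " \t foo \n bar "
print(repr(remove_whitespace(s)))  # 'foobar'
print(repr(remove_spaces(s)))      # '\tfoo\nbar'
```

Which one you want depends on whether tabs and newlines count as "spaces" in your data.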

Premature optimization

Even though efficiency isn't the primary goal—writing clear code is—here are some initial timings:

$ python -m timeit '"".join(" \t foo \n bar ".split())'
1000000 loops, best of 3: 1.38 usec per loop
$ python -m timeit -s 'import re' 're.sub(r"\s+", "", " \t foo \n bar ")'
100000 loops, best of 3: 15.6 usec per loop

Note the regex is cached, so it's not as slow as you'd imagine. Compiling it beforehand helps some, but would only matter in practice if you call this many times:

$ python -m timeit -s 'import re; e = re.compile(r"\s+")' 'e.sub("", " \t foo \n bar ")'
100000 loops, best of 3: 7.76 usec per loop

Even though re.sub is 11.3x slower, remember your bottlenecks are assuredly elsewhere. Most programs would not notice the difference between any of these 3 choices.
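
If you would rather reproduce these timings from a script than from the shell, something like the following sketch should work. The loop count and the dictionary of candidates are arbitrary choices made for illustration, and the lambda wrappers add a little call overhead, so the absolute numbers will differ from the ones above:

```python
import re
import timeit

TEXT = " \t foo \n bar "
WS = re.compile(r"\s+")  # precompiled pattern, as in the third timing

candidates = {
    "split/join": lambda: "".join(TEXT.split()),
    "re.sub": lambda: re.sub(r"\s+", "", TEXT),
    "compiled sub": lambda: WS.sub("", TEXT),
}

LOOPS = 100_000  # arbitrary for this sketch

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=LOOPS)
    print(f"{name:>12}: {seconds / LOOPS * 1e6:.2f} usec per loop")
```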

陌伤ぢ 2024-10-01 06:39:39

For Python 3:

>>> import re
>>> re.sub(r'\s+', '', 'strip my \n\t\r ASCII and \u00A0 \u2003 Unicode spaces')
'stripmyASCIIandUnicodespaces'
>>> # Or, depending on the situation:
>>> re.sub(r'(\s|\u180B|\u200B|\u200C|\u200D|\u2060|\uFEFF)+', '', \
... '\uFEFF\t\t\t strip all \u000A kinds of \u200B whitespace \n')
'stripallkindsofwhitespace'

...handles any whitespace characters that you're not thinking of - and believe us, there are plenty.

\s on its own always covers the ASCII whitespace:

  • (regular) space
  • tab
  • new line (\n)
  • carriage return (\r)
  • form feed
  • vertical tab

Additionally:

  • for Python 2 with re.UNICODE enabled,
  • for Python 3 without any extra actions,

...\s also covers the Unicode whitespace characters, for example:

  • non-breaking space,
  • em space,
  • ideographic space,

...etc. See the full list here, under "Unicode characters with White_Space property".

However, \s does NOT cover characters that act as whitespace in practice but are not classified as whitespace, such as:

  • zero-width joiner,
  • Mongolian vowel separator,
  • zero-width non-breaking space (a.k.a. byte order mark),

...etc. See the full list here, under "Related Unicode characters without White_Space property".

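So these 6 characters are covered by the list in the second regex, \u180B|\u200B|\u200C|\u200D|\u2060|\uFEFF. (Note that the Mongolian vowel separator itself is U+180E; U+180B is the Mongolian free variation selector one.)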
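
A minimal sketch of the same idea as a reusable function follows. The name strip_all_whitespace and the exact contents of EXTRA_INVISIBLE are assumptions for illustration; which invisible characters to treat as whitespace is a judgment call that depends on your data:

```python
import re

# Characters that \s does not match but that often behave like invisible
# whitespace. This particular set is an assumption; adjust it as needed.
EXTRA_INVISIBLE = "\u180e\u200b\u200c\u200d\u2060\ufeff"

# One character class: Unicode whitespace plus the extra characters above.
_PATTERN = re.compile(r"[\s" + re.escape(EXTRA_INVISIBLE) + r"]+")


def strip_all_whitespace(text: str) -> str:
    """Remove Unicode whitespace and the extra invisible characters."""
    return _PATTERN.sub("", text)


sample = "\ufeff\t strip all \n kinds of \u200b whitespace \u00a0"
print(strip_all_whitespace(sample))  # stripallkindsofwhitespace
```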


暮年慕年 2024-10-01 06:39:39

Alternatively, in Python 2:

import string
"strip my spaces".translate(None, string.whitespace)

And here is the Python 3 version:

import string
"strip my spaces".translate(str.maketrans('', '', string.whitespace))
锦爱 2024-10-01 06:39:39

Remove the Starting Spaces in Python

string1 = "    This is Test String to strip leading space"
print(string1)
print(string1.lstrip())

Remove the Trailing or End Spaces in Python

string2 = "This is Test String to strip trailing space     "
print(string2)
print(string2.rstrip())

Remove the Whitespace from the Beginning and End of the String in Python

string3 = "    This is Test String to strip leading and trailing space      "
print(string3)
print(string3.strip())

Remove All the Spaces in Python

string4 = "   This is Test String to test all the spaces        "
print(string4)
print(string4.replace(" ", ""))
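
Because print() makes leading and trailing whitespace hard to see, here is a small sketch of the same four calls using repr(). The sample string is made up for illustration. Note that lstrip(), rstrip() and strip() with no argument trim any whitespace, not just spaces, while replace(" ", "") touches only the space character:

```python
text = "  \tpadded text \n"

print(repr(text.lstrip()))          # 'padded text \n'
print(repr(text.rstrip()))          # '  \tpadded text'
print(repr(text.strip()))           # 'padded text'
print(repr(text.strip(" ")))        # '\tpadded text \n' (only spaces trimmed)
print(repr(text.replace(" ", "")))  # '\tpaddedtext\n'
```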
情深缘浅 2024-10-01 06:39:39

The simplest is to use replace:

"foo bar\t".replace(" ", "").replace("\t", "")

Alternatively, use a regular expression:

import re
re.sub(r"\s", "", "foo bar\t")
冷血 2024-10-01 06:39:39

As mentioned by Roger Pate, the following code worked for me:

>>> s = " \t foo \n bar "
>>> "".join(s.split())
'foobar'

I am using Jupyter Notebook to run the following code:

new_list = [' Plain   Utthapam  ']       # sample input, assumed for illustration
ProductList = []
for item in new_list[::2]:               # the original loop stepped by 2 (i=i+2); kept as-is
    # item.strip() alone would only trim the ends: 'Plain   Utthapam'
    temp = "".join(item.split())         # 'PlainUtthapam'
    ProductList.append(temp.upper())     # 'PLAINUTTHAPAM'
菩提树下叶撕阳。 2024-10-01 06:39:39

Try a regex with re.sub. You can search for all whitespace and replace with an empty string.

\s in your pattern matches any whitespace character (tabs, newlines, etc.), not just a space. You can read more about it in the manual.

表情可笑 2024-10-01 06:39:39
import re
re.sub(' ','','strip my spaces')
因为看清所以看轻 2024-10-01 06:39:39

The standard techniques to filter a list apply, although they are not as efficient as the split/join or translate methods.

We need a set of whitespaces:

>>> import string
>>> ws = set(string.whitespace)

The filter builtin:

>>> "".join(filter(lambda c: c not in ws, "strip my spaces"))
'stripmyspaces'

A list comprehension (yes, use the brackets: see benchmark below):

>>> import string
>>> "".join([c for c in "strip my spaces" if c not in ws])
'stripmyspaces'

A fold:

>>> import functools
>>> "".join(functools.reduce(lambda acc, c: acc if c in ws else acc+c, "strip my spaces"))
'stripmyspaces'
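
One detail worth checking with the fold: when functools.reduce is called without an initial value, it seeds the accumulator with the first character of the string, so that character is never tested against ws. A hedged variant that passes "" as the initial value (the function name fold_remove is illustrative) could look like this:

```python
import functools
import string

ws = set(string.whitespace)


def fold_remove(text: str) -> str:
    # Starting from "" means the first character is filtered
    # like every other character.
    return functools.reduce(
        lambda acc, c: acc if c in ws else acc + c, text, ""
    )


print(fold_remove("strip my spaces"))  # stripmyspaces
print(fold_remove(" leading space"))   # leadingspace
```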

Benchmark:

>>> from timeit import timeit
>>> timeit('"".join("strip my spaces".split())')
0.17734256500003198
>>> timeit('"strip my spaces".translate(ws_dict)', 'import string; ws_dict = {ord(ws):None for ws in string.whitespace}')
0.457635745999994
>>> timeit('re.sub(r"\s+", "", "strip my spaces")', 'import re')
1.017787621000025

>>> SETUP = 'import string, operator, functools, itertools; ws = set(string.whitespace)'
>>> timeit('"".join([c for c in "strip my spaces" if c not in ws])', SETUP)
0.6484303600000203
>>> timeit('"".join(c for c in "strip my spaces" if c not in ws)', SETUP)
0.950212219999969
>>> timeit('"".join(filter(lambda c: c not in ws, "strip my spaces"))', SETUP)
1.3164566040000523
>>> timeit('"".join(functools.reduce(lambda acc, c: acc if c in ws else acc+c, "strip my spaces"))', SETUP)
1.6947649049999995
说不完的你爱 2024-10-01 06:39:39
  1. Parse your string into separate words
  2. Strip whitespace on both sides
  3. Join them with a single space at the end

Final line of code:

' '.join(word.strip() for word in message_text.split())
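
Note that these steps collapse whitespace to single spaces rather than removing it entirely. A minimal sketch, using the hypothetical name collapse_whitespace:

```python
def collapse_whitespace(text: str) -> str:
    # split() with no arguments already discards the whitespace around
    # each word, so a separate strip() per word is not needed.
    return " ".join(text.split())


print(collapse_whitespace("  strip   my \t spaces \n"))  # strip my spaces
```

Joining with "" instead of " " would remove the whitespace altogether.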
唯憾梦倾城 2024-10-01 06:39:39

If optimal performance is not a requirement and you just want something dead simple, you can define a basic function that tests each character with the string class's built-in isspace method:

def remove_space(input_string):
    no_white_space = ''
    for c in input_string:
        if not c.isspace():
            no_white_space += c
    return no_white_space

Building the no_white_space string this way will not have ideal performance, but the solution is easy to understand.

>>> remove_space('strip my spaces')
'stripmyspaces'

If you don't want to define a function, you can convert this into something roughly equivalent with a list comprehension. Borrowing from the top answer's join solution:

>>> "".join([c for c in "strip my spaces" if not c.isspace()])
'stripmyspaces'
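
One property of this approach that may matter in practice: str.isspace() also returns True for Unicode whitespace such as the non-breaking space, which string.whitespace does not contain. A short sketch (remove_space_joined is an illustrative name) that also avoids the repeated += by joining once:

```python
def remove_space_joined(text: str) -> str:
    # Collect the characters to keep, then join them in one step.
    return "".join([c for c in text if not c.isspace()])


# U+00A0 is a non-breaking space, U+2003 is an em space.
print(remove_space_joined("strip\u00a0my\u2003spaces"))  # stripmyspaces
```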
泪眸﹌ 2024-10-01 06:39:39

TL/DR

This solution was tested using Python 3.6

To strip all spaces from a string in Python3 you can use the following function:

def remove_spaces(in_string: str):
    return in_string.translate(str.maketrans({' ': ''}))

To remove any whitespace characters (' \t\n\r\x0b\x0c') you can use the following function:

import string
def remove_whitespace(in_string: str):
    return in_string.translate(str.maketrans(dict.fromkeys(string.whitespace)))

Explanation

Python's str.translate is a built-in method of str. It takes a translation table and returns a copy of the string with each character mapped through that table. See the full documentation for str.translate.

The translation table is created with str.maketrans, a built-in static method of str. Here it is called with a single argument, a dictionary whose keys are the characters to replace and whose values are their replacements. It returns a translation table for use with str.translate. See the full documentation for str.maketrans.

The string module in Python contains some common string operations and constants. string.whitespace is a constant containing all ASCII characters that are considered whitespace: space, tab, linefeed, return, formfeed, and vertical tab. See the full documentation for string.whitespace.

In the second function, dict.fromkeys creates a dictionary whose keys are the characters in string.whitespace, each with the value None; a value of None tells str.translate to delete that character. See the full documentation for dict.fromkeys.
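
To make the dictionary form of str.maketrans a little more concrete, here is a small sketch. The particular mapping is invented for illustration: a value of None or an empty string deletes the character, while any other string replaces it.

```python
# Keys are single characters; maketrans converts them to code points.
table = str.maketrans({" ": None, "\t": "", "-": "_"})

print(table)                         # {32: None, 9: '', 45: '_'}
print("a b\tc-d".translate(table))   # abc_d
```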

倚栏听风 2024-10-01 06:39:39

Here's another way using plain old list comprehension:

''.join([c for c in aString if c not in [' ','\t','\n']])

Example:

>>> aString = 'aaa\nbbb\t\t\tccc  '
>>> print(aString)
aaa
bbb         ccc

>>> ''.join([c for c in aString if c not in [' ','\t','\n']])
'aaabbbccc'
娇柔作态 2024-10-01 06:39:39

This was asked in an interview. If you have to give a solution using only the strip method, here's an approach:

s='string with spaces'
res=''.join((i.strip(' ') for i in s))
print(res)
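
To see why this works, it may help to look at the intermediate pieces: iterating over a string yields one character at a time, and calling strip(' ') on a lone space leaves an empty string. A sketch with the intermediate list spelled out (the variable name pieces is illustrative):

```python
s = "string with spaces"

# Each space becomes "", every other character is returned unchanged.
pieces = [ch.strip(" ") for ch in s]
print(pieces[5:8])  # ['g', '', 'w']

res = "".join(pieces)
print(res)  # stringwithspaces
```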