如何从字符串中删除所有空格?

发布于 2024-11-07 16:58:24 字数 79 浏览 5 评论 0原文

因此“xx yy 11 22 33”将变成“xxyy112233”。我怎样才能实现这个目标?

So " xx yy 11 22 33 " will become "xxyy112233". How can I achieve this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

-小熊_ 2024-11-14 16:58:24

一般来说,我们想要一个矢量化的解决方案,所以这里有一个更好的测试示例:

whitespace <- " \t\n\r\v\f" # space, tab, newline, 
                            # carriage return, vertical tab, form feed
x <- c(
  " x y ",           # spaces before, after and in between
  " \u2190 \u2192 ", # contains unicode chars
  paste0(            # varied whitespace     
    whitespace, 
    "x", 
    whitespace, 
    "y", 
    whitespace, 
    collapse = ""
  ),   
  NA                 # missing
)
## [1] " x y "                           
## [2] " ← → "                           
## [3] " \t\n\r\v\fx \t\n\r\v\fy \t\n\r\v\f"
## [4] NA

基本 R 方法:gsub

gsub 替换字符串 (fixed = TRUE) 或正则表达式 (fixed = FALSE,默认值)用另一根绳子。要删除所有空格,请使用:

gsub(" ", "", x, fixed = TRUE)
## [1] "xy"                            "←→"             
## [3] "\t\n\r\v\fx\t\n\r\v\fy\t\n\r\v\f" NA 

正如 DWin 所指出的,在这种情况下,fixed = TRUE 不是必需的,但可以提供稍好的性能,因为匹配固定字符串比匹配正则表达式更快。

如果要删除所有类型的空格,请使用:

gsub("[[:space:]]", "", x) # note the double square brackets
## [1] "xy" "←→" "xy" NA 

gsub("\\s", "", x)         # same; note the double backslash

library(regex)
gsub(space(), "", x)       # same

"[:space :]" 是匹配所有空格字符的特定于 R 的正则表达式组。 \s 是一个独立于语言的正则表达式,具有相同的功能。


stringr 方法:str_replace_allstr_trim

stringr 围绕基本 R 函数提供了更多人类可读的包装器(尽管从 2014 年 12 月开始,开发版本有一个基于 stringi 构建的分支,如下所述)。使用 [str_replace_all][3] 的上述命令的等效项是:

library(stringr)
str_replace_all(x, fixed(" "), "")
str_replace_all(x, space(), "")

stringr 也有一个 str_trim 函数,仅删除前导和尾随空格。

str_trim(x) 
## [1] "x y"          "← →"          "x \t\n\r\v\fy" NA    
str_trim(x, "left")    
## [1] "x y "                   "← → "    
## [3] "x \t\n\r\v\fy \t\n\r\v\f" NA     
str_trim(x, "right")    
## [1] " x y"                   " ← →"    
## [3] " \t\n\r\v\fx \t\n\r\v\fy" NA      

stringi 方法:stri_replace_all_charclassstri_trim

stringi 构建在独立于平台的 ICU 库,并具有一组广泛的字符串操作函数。上述的等价是:

library(stringi)
stri_replace_all_fixed(x, " ", "")
stri_replace_all_charclass(x, "\\p{WHITE_SPACE}", "")

这里"\\p{WHITE_SPACE}" 是替代方案集合的语法被视为空白的 Unicode 代码点,相当于 "[[:space:]]""\\s"space()。对于更复杂的正则表达式替换,还有 stri_replace_all_regex

stringi 还具有修剪函数

stri_trim(x)
stri_trim_both(x)    # same
stri_trim(x, "left")
stri_trim_left(x)    # same
stri_trim(x, "right")  
stri_trim_right(x)   # same

In general, we want a solution that is vectorised, so here's a better test example:

whitespace <- " \t\n\r\v\f" # space, tab, newline, 
                            # carriage return, vertical tab, form feed
x <- c(
  " x y ",           # spaces before, after and in between
  " \u2190 \u2192 ", # contains unicode chars
  paste0(            # varied whitespace     
    whitespace, 
    "x", 
    whitespace, 
    "y", 
    whitespace, 
    collapse = ""
  ),   
  NA                 # missing
)
## [1] " x y "                           
## [2] " ← → "                           
## [3] " \t\n\r\v\fx \t\n\r\v\fy \t\n\r\v\f"
## [4] NA

The base R approach: gsub

gsub replaces all instances of a string (fixed = TRUE) or regular expression (fixed = FALSE, the default) with another string. To remove all spaces, use:

gsub(" ", "", x, fixed = TRUE)
## [1] "xy"                            "←→"             
## [3] "\t\n\r\v\fx\t\n\r\v\fy\t\n\r\v\f" NA 

As DWin noted, in this case fixed = TRUE isn't necessary but provides slightly better performance since matching a fixed string is faster than matching a regular expression.

If you want to remove all types of whitespace, use:

gsub("[[:space:]]", "", x) # note the double square brackets
## [1] "xy" "←→" "xy" NA 

gsub("\\s", "", x)         # same; note the double backslash

library(regex)
gsub(space(), "", x)       # same

"[:space:]" is an R-specific regular expression group matching all space characters. \s is a language-independent regular-expression that does the same thing.


The stringr approach: str_replace_all and str_trim

stringr provides more human-readable wrappers around the base R functions (though as of Dec 2014, the development version has a branch built on top of stringi, mentioned below). The equivalents of the above commands, using [str_replace_all][3], are:

library(stringr)
str_replace_all(x, fixed(" "), "")
str_replace_all(x, space(), "")

stringr also has a str_trim function which removes only leading and trailing whitespace.

str_trim(x) 
## [1] "x y"          "← →"          "x \t\n\r\v\fy" NA    
str_trim(x, "left")    
## [1] "x y "                   "← → "    
## [3] "x \t\n\r\v\fy \t\n\r\v\f" NA     
str_trim(x, "right")    
## [1] " x y"                   " ← →"    
## [3] " \t\n\r\v\fx \t\n\r\v\fy" NA      

The stringi approach: stri_replace_all_charclass and stri_trim

stringi is built upon the platform-independent ICU library, and has an extensive set of string manipulation functions. The equivalents of the above are:

library(stringi)
stri_replace_all_fixed(x, " ", "")
stri_replace_all_charclass(x, "\\p{WHITE_SPACE}", "")

Here "\\p{WHITE_SPACE}" is an alternate syntax for the set of Unicode code points considered to be whitespace, equivalent to "[[:space:]]", "\\s" and space(). For more complex regular expression replacements, there is also stri_replace_all_regex.

stringi also has trim functions.

stri_trim(x)
stri_trim_both(x)    # same
stri_trim(x, "left")
stri_trim_left(x)    # same
stri_trim(x, "right")  
stri_trim_right(x)   # same
给妤﹃绝世温柔 2024-11-14 16:58:24

我刚刚了解了“stringr”包,它可以使用 str_trim( , side="both") 从字符串的开头和结尾删除空格,但它还有一个替换功能,以便:

a <- " xx yy 11 22 33 " 
str_replace_all(string=a, pattern=" ", repl="")

[1] "xxyy112233"

I just learned about the "stringr" package to remove white space from the beginning and end of a string with str_trim( , side="both") but it also has a replacement function so that:

a <- " xx yy 11 22 33 " 
str_replace_all(string=a, pattern=" ", repl="")

[1] "xxyy112233"
血之狂魔 2024-11-14 16:58:24
x = "xx yy 11 22 33"

gsub(" ", "", x)

> [1] "xxyy112233"
x = "xx yy 11 22 33"

gsub(" ", "", x)

> [1] "xxyy112233"
白云不回头 2024-11-14 16:58:24

使用 [[:blank:]] 匹配任何类型的水平空白字符。

gsub("[[:blank:]]", "", " xx yy 11 22  33 ")
# [1] "xxyy112233"

Use [[:blank:]] to match any kind of horizontal white_space characters.

gsub("[[:blank:]]", "", " xx yy 11 22  33 ")
# [1] "xxyy112233"
心碎无痕… 2024-11-14 16:58:24

请注意,上面写的灵魂仅删除空格。如果您还想删除制表符或换行符,请使用 stringi 包中的 stri_replace_all_charclass

library(stringi)
stri_replace_all_charclass("   ala \t  ma \n kota  ", "\\p{WHITE_SPACE}", "")
## [1] "alamakota"

Please note that soultions written above removes only space. If you want also to remove tab or new line use stri_replace_all_charclass from stringi package.

library(stringi)
stri_replace_all_charclass("   ala \t  ma \n kota  ", "\\p{WHITE_SPACE}", "")
## [1] "alamakota"
紧拥背影 2024-11-14 16:58:24

tidyverse 包 stringr 中的函数 str_squish() 发挥了神奇作用!

library(dplyr)
library(stringr)

df <- data.frame(a = c("  aZe  aze s", "wxc  s     aze   "), 
                 b = c("  12    12 ", "34e e4  "), 
                 stringsAsFactors = FALSE)
df <- df %>%
  rowwise() %>%
  mutate_all(funs(str_squish(.))) %>%
  ungroup()
df

# A tibble: 2 x 2
  a         b     
  <chr>     <chr> 
1 aZe aze s 12 12 
2 wxc s aze 34e e4

The function str_squish() from package stringr of tidyverse does the magic!

library(dplyr)
library(stringr)

df <- data.frame(a = c("  aZe  aze s", "wxc  s     aze   "), 
                 b = c("  12    12 ", "34e e4  "), 
                 stringsAsFactors = FALSE)
df <- df %>%
  rowwise() %>%
  mutate_all(funs(str_squish(.))) %>%
  ungroup()
df

# A tibble: 2 x 2
  a         b     
  <chr>     <chr> 
1 aZe aze s 12 12 
2 wxc s aze 34e e4
深陷 2024-11-14 16:58:24

可以考虑另一种方法

library(stringr)
str_replace_all(" xx yy 11 22  33 ", regex("\\s*"), "")

#[1] "xxyy112233"

\\s:匹配空格、制表符、垂直制表符、换行符、换页符、回车符

*:匹配至少 0 次

Another approach can be taken into account

library(stringr)
str_replace_all(" xx yy 11 22  33 ", regex("\\s*"), "")

#[1] "xxyy112233"

\\s: Matches Space, tab, vertical tab, newline, form feed, carriage return

*: Matches at least 0 times

仙女山的月亮 2024-11-14 16:58:24
income<-c("$98,000.00 ", "$90,000.00 ", "$18,000.00 ", "")

要删除 .00 之后的空格,请使用 trimws() 函数。

income<-trimws(income)
income<-c("$98,000.00 ", "$90,000.00 ", "$18,000.00 ", "")

To remove space after .00 use the trimws() function.

income<-trimws(income)
っ左 2024-11-14 16:58:24

从 stringr 库中,您可以尝试以下操作:

  1. 删除连续填充空白
  2. 删除填充空白

    库(字符串)

    <前><代码> 2. 1.
    | |
    维维

    str_replace_all(str_trim("xx yy 11 22 33"),"","")

From stringr library you could try this:

  1. Remove consecutive fill blanks
  2. Remove fill blank

    library(stringr)

                2.         1.
                |          |
                V          V
    
        str_replace_all(str_trim(" xx yy 11 22  33 "), " ", "")
    
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文