Separate strings by a delimiter that only sometimes occurs
I have a colour column that sometimes contains a single colour reference and sometimes contains multiple colours, which are separated by "||".
library(tidyverse)
id <- c(1:10)
colour <- c("sky-blue","blood-red","lavender-purple",
"sky-blue||blood-red", "midnight-blue", "blood-red||lavender-purple||sky-blue",
"grass-green","sky-blue||blood-red||lavender-purple||midnight-blue",
"grass-green","grass-green||midnight-blue")
df <- tibble("id" = id,
"colour" = colour)
# A tibble: 10 × 2
id colour
<int> <chr>
1 1 sky-blue
2 2 blood-red
3 3 lavender-purple
4 4 sky-blue||blood-red
5 5 midnight-blue
6 6 blood-red||lavender-purple||sky-blue
7 7 grass-green
8 8 sky-blue||blood-red||lavender-purple||midnight-blue
9 9 grass-green
10 10 grass-green||midnight-blue
I would like to separate those colours into individual columns, so that each column contains only one colour, and then stack the colours with duplicate ids (using gather()). The names of the new colour columns are rather irrelevant, so I went for "col_1", "col_2", etc., since I will then stack them again. However, if I run separate(), it does the following:
df %>%
separate(colour, into = c("col_1","col_2","col_3","col_4"), sep = "||")
# A tibble: 10 × 5
id col_1 col_2 col_3 col_4
<int> <chr> <chr> <chr> <chr>
1 1 "" s k y
2 2 "" b l o
3 3 "" l a v
4 4 "" s k y
5 5 "" m i d
6 6 "" b l o
7 7 "" g r a
8 8 "" s k y
9 9 "" g r a
10 10 "" g r a
This also happens if I run it on a single row with exactly the right number of columns in the into = argument.
I have looked at some solutions, but haven't found anything that covers both irregular separator occurrence and irregular expression length. Any solution would be most welcome.
Comments (1)
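You need to escape special symbols with \\. The sep argument of separate() is interpreted as a regular expression, and | is the alternation operator there, so an unescaped "||" does not match a literal double pipe. So try sep = "\\|\\|".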
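A minimal sketch of the escaped separator applied to the question's data, assuming the tidyr and tibble packages are installed. The object names wide, long and long2 and the column name slot are illustrative and not from the original post. Option 2 uses separate_rows(), which is swapped in here as an alternative to the separate-then-stack route described in the question.

```r
library(tidyr)
library(tibble)

# Same data as in the question.
df <- tibble(
  id = 1:10,
  colour = c("sky-blue", "blood-red", "lavender-purple",
             "sky-blue||blood-red", "midnight-blue",
             "blood-red||lavender-purple||sky-blue",
             "grass-green",
             "sky-blue||blood-red||lavender-purple||midnight-blue",
             "grass-green", "grass-green||midnight-blue")
)

# Option 1: split into at most four columns, then stack them.
wide <- separate(df, colour,
                 into = c("col_1", "col_2", "col_3", "col_4"),
                 sep = "\\|\\|",   # escaped, so it matches a literal "||"
                 fill = "right")   # pad rows with fewer colours with NA

long <- pivot_longer(wide, cols = -id,
                     names_to = "slot",      # hypothetical helper column
                     values_to = "colour",
                     values_drop_na = TRUE)  # drop the NA padding

# Option 2: go straight to one row per colour, no column count needed.
long2 <- separate_rows(df, colour, sep = "\\|\\|")

print(long2)
```

Both routes should give one row per id and colour. The second may be more convenient when the maximum number of colours per row is not known in advance, since it does not need the into = vector at all.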