How to extract specific text from a vector of comma-separated strings in R
This is my first post and I am relatively new to the R world, so I hope I am posting my question in an appropriate way. I searched for this but could not come up with anything efficient.
I have a column that has such a structure:
df$col1 <- c("book, pencil,eraser,pen", "book,pen", "music,art,sport", "apple, banana, kiwi, watermelon", "Earth, Mars, Jupiter")
What I would like to do is create a new column built from certain elements of col1.
If the first cell has 2 commas, then I would like to extract the element between the first and the second comma and write it to the first cell in the new column. If the next cell has 3 commas, then I would like to extract the element between the second and third comma and write it to the second cell in the new column and so on.
As can be seen from the example of col1, the cells are not ordered by their number of commas, so a cell with three commas might occur again further down. I need to account for that too.
Could you please help me in this regard?
Your help is much appreciated in advance!
Comments (3)
What about the following?
Here's a straightforward regex solution to extract the penultimate word into a new column:
Data:
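A minimal sketch of one such regex, not necessarily the pattern this answer used. It assumes that "penultimate" means the element just before the last comma, that surrounding spaces should be dropped, and that the new column is called col2 (a name chosen here for illustration). For a cell with a single comma such as "book,pen" this returns the first element, which the question does not specify.

```r
# Example data from the question, wrapped in a data frame.
df <- data.frame(
  col1 = c("book, pencil,eraser,pen", "book,pen", "music,art,sport",
           "apple, banana, kiwi, watermelon", "Earth, Mars, Jupiter"),
  stringsAsFactors = FALSE
)

# Capture the text between the second-to-last and the last comma
# (or from the start of the string when there is only one comma),
# ignoring spaces around it. Strings without any comma are left unchanged.
pattern <- "^(?:.*,)?\\s*([^,]*?)\\s*,[^,]*$"
df$col2 <- sub(pattern, "\\1", df$col1, perl = TRUE)

print(df$col2)
```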
You could use strsplit. In this case, n is 3.
EDIT
If I understand your question correctly, you could do something like this:
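A rough sketch of how both ideas might look with strsplit, under some assumptions: the separator is a comma with optional surrounding spaces, the names parts, nth and col2 are illustrative, and the edited version is read as "take the element before the last comma". This is one possible reading, not necessarily the answer's original code.

```r
df <- data.frame(
  col1 = c("book, pencil,eraser,pen", "book,pen", "music,art,sport",
           "apple, banana, kiwi, watermelon", "Earth, Mars, Jupiter"),
  stringsAsFactors = FALSE
)

# Split every cell on commas, dropping spaces around each comma.
parts <- strsplit(df$col1, "\\s*,\\s*")

# Fixed position: take the n-th element of every cell (NA if the cell is shorter).
n <- 3
nth <- vapply(parts, `[`, character(1), n)

# Position that depends on the number of commas: take the second-to-last element.
df$col2 <- vapply(
  parts,
  function(p) if (length(p) >= 2) p[length(p) - 1] else NA_character_,
  character(1)
)

print(nth)
print(df$col2)
```

Because the position is computed per cell from its own length, the order in which two-comma and three-comma cells appear in the column does not matter.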