Split a string at the vertical bar character "|"
I feel like this question is asked a lot, but none of the solutions I found work for me.
I have a dataframe with a column (called ID) that holds strings of numbers and letters (e.g. Q8A203). In a few rows there are two of those constructs separated by a vertical bar (e.g. Q8AA66|Q8AAT5). For my analysis it doesn't matter which one I keep, so I wanted to make a new column named NewColumn that holds only the first one, by splitting the string at |.
I know that the vertical bar must be treated differently and that I have to put \\ in front. I tried strsplit() and unlist():
df$NewColumn <- strsplit(df$ID,split='\\|',fixed=TRUE)
df$NewColumn <- unlist(strsplit(df$ID, " \\| ", fixed=TRUE))
Both options return the exact same content from column ID to the NewColumn.
I would very much appreciate the help.
Comments (1)
Rather than splitting you can simply substitute the second part with nothing and it will keep the first ID.
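A minimal sketch of that substitution, assuming a small made-up data frame df with a character column ID (the sample values are the ones quoted in the question; the data frame itself is hypothetical). sub() removes the first | and everything after it, so rows without a bar pass through unchanged:

```r
# Hypothetical example data; the IDs are the ones quoted in the question.
df <- data.frame(ID = c("Q8A203", "Q8AA66|Q8AAT5"), stringsAsFactors = FALSE)

# "\\|.*" is a regex: a literal bar followed by the rest of the string.
# Replacing that match with "" leaves only the first ID.
df$NewColumn <- sub("\\|.*", "", df$ID)

print(df)
```

Since only one match per string needs to be removed, sub() is enough here; gsub() would give the same result.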
Next time, please add a minimal reproducible example (your df here) to speed up answers ;)
strsplit can work if you remove the fixed option, but you need to provide an exact regex. Also, you will need to work with a list afterwards, which is more complex.
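For comparison, here is a sketch of the strsplit() route on a small hypothetical data frame. With fixed=TRUE the pattern is taken literally, so '\\|' looks for a backslash followed by a bar, which never occurs in the IDs, and the strings come back unsplit; that would explain the behaviour described in the question. Without fixed=TRUE, '\\|' is a regex for a literal bar, and vapply() then picks the first piece out of each list element:

```r
# Hypothetical example data; the IDs are the ones quoted in the question.
df <- data.frame(ID = c("Q8A203", "Q8AA66|Q8AAT5"), stringsAsFactors = FALSE)

# strsplit() returns a list with one character vector per row.
parts <- strsplit(df$ID, split = "\\|")

# Keep the first piece of each element; vapply() guarantees a character vector.
df$NewColumn <- vapply(parts, function(p) p[1], character(1))

print(df)
```

An equivalent option is split = "|" together with fixed = TRUE, which treats the bar as a plain character instead of a regex operator.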