R: changing elements in a data frame
I'm trying to, as the title says, change elements from my dataframe from one character to another. The dataframe is as follows:
g1=c("CC","DD","GG")
g2=c("AA","BB","EE")
g3=c("HH","II","JJ")
df=data.frame(g1,g2,g3)
I wish to convert the elements from letterletter format to letter/letter format (e.g. CC to C/C or AA to A/A)
I know using "strsplit" would work on a list.
I also know that I would need to somehow incorporate: collapse="/"
How would I be able to apply the strsplit function to the entire dataframe?
I was thinking something along the lines of:
split=function(x)
{
unlist(paste(strsplit(x,""),collapse="/"))
}
j=as.data.frame(apply(df,1,split))
but it doesn't give the desired results.
Update: apparently, the following script works:
split=function(x)
{
paste(unlist(strsplit(x,"")),collapse="/")
}
p=apply(df,c(1,2),split)
If there is a more efficient or convenient way, please feel free to share.
Comments (4)
I can think of two ways to approach this. One is using strsplit like you did. You were only missing the portion where you loop over each element in the list returned from strsplit.
Another approach would be to use gsub and the \\B symbol, which matches the empty string that isn't at the beginning or end of a "word".
What constitutes a "word" depends on locale and implementation, so a third option is gsub with back-references, which does not rely on word boundaries.
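The code from this answer did not survive on this page, so the following is only a sketch of the three ideas it describes, not the answerer's original code. The function names slash_strsplit, slash_boundary and slash_backref are made up for illustration, and perl = TRUE in the \\B variant is an assumption chosen so that empty matches are handled by the PCRE engine.

```r
g1 <- c("CC", "DD", "GG")
g2 <- c("AA", "BB", "EE")
g3 <- c("HH", "II", "JJ")
# stringsAsFactors = FALSE keeps the columns as character in R < 4.0
df <- data.frame(g1, g2, g3, stringsAsFactors = FALSE)

# 1. strsplit returns a list with one character vector per input string,
#    so paste(..., collapse = "/") has to be applied to each list element.
slash_strsplit <- function(x) {
  vapply(strsplit(x, ""), paste, character(1), collapse = "/")
}

# 2. \\B matches the empty string between two "word" characters.
#    (assumption: perl = TRUE, so PCRE decides what a word character is)
slash_boundary <- function(x) gsub("\\B", "/", x, perl = TRUE)

# 3. Back-references: capture two adjacent characters and put a slash
#    between them. This assumes two-character strings such as "CC".
slash_backref <- function(x) gsub("(.)(.)", "\\1/\\2", x)

# Apply one of the functions to every column; df_out[] keeps the
# data frame structure and the column names.
df_out <- df
df_out[] <- lapply(df, slash_backref)
df_out
```

Assigning into df_out[] rather than calling apply() avoids the round trip through a character matrix, which is a design choice here rather than something the answer prescribes.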
Start by defining a function, here called insertslash, that inserts the slash into each element of a character vector.
Convince yourself that it does what it should by calling insertslash(g1).
Then apply it to every column of the data frame.
Obviously, you can roll all of this into one nasty one-liner.
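The body of insertslash is not shown on this page, so the definition below is a hypothetical stand-in that follows the steps just described. Only the name insertslash and the call insertslash(g1) come from the answer; everything else is an assumption.

```r
g1 <- c("CC", "DD", "GG")
g2 <- c("AA", "BB", "EE")
g3 <- c("HH", "II", "JJ")
df <- data.frame(g1, g2, g3)

# Hypothetical definition (the original body is not available).
# as.character() lets it work on factor columns as well.
insertslash <- function(x) {
  sapply(strsplit(as.character(x), ""), paste, collapse = "/")
}

# Check it on a single vector first.
insertslash(g1)

# Apply it to every column and rebuild the data frame.
df2 <- as.data.frame(lapply(df, insertslash), stringsAsFactors = FALSE)

# The same thing as a single (less readable) expression.
df3 <- as.data.frame(
  lapply(df, function(x) sapply(strsplit(as.character(x), ""), paste, collapse = "/")),
  stringsAsFactors = FALSE
)
```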
Here's a bit of a hack using gsub. Someone who knows more about regex ought to be able to improve on this.
The reason your original solution wasn't working is that you were calling unlist at the wrong spot. If you unlist later and use lapply, things work as you might expect.
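As a sketch of the unlist point only (the answer's own code is missing from this page, and the names split_broken and split_fixed are invented here): in the question's version, paste() receives the list returned by strsplit directly, so each list element is deparsed to text before anything is collapsed. Pasting inside lapply and unlisting afterwards avoids that.

```r
g1 <- c("CC", "DD", "GG")
g2 <- c("AA", "BB", "EE")
g3 <- c("HH", "II", "JJ")
df <- data.frame(g1, g2, g3, stringsAsFactors = FALSE)

# The question's version: paste() is handed the whole list, so the
# result is a deparsed vector rather than "C/C".
split_broken <- function(x) {
  unlist(paste(strsplit(x, ""), collapse = "/"))
}

# Paste each list element separately, then unlist the results.
split_fixed <- function(x) {
  unlist(lapply(strsplit(x, ""), paste, collapse = "/"))
}

# One call per column; the column names are preserved by lapply.
j <- as.data.frame(lapply(df, split_fixed), stringsAsFactors = FALSE)
j
```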
Another hack using paste(), definitely not as elegant but it gets the job done.
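The paste() code itself is not preserved here. One way such a hack could look, assuming every element is exactly two characters long, is to cut each string with substr and glue the halves back together. The helper name paste_slash is hypothetical.

```r
g1 <- c("CC", "DD", "GG")
g2 <- c("AA", "BB", "EE")
g3 <- c("HH", "II", "JJ")
df <- data.frame(g1, g2, g3, stringsAsFactors = FALSE)

# Hypothetical helper: assumes two-character strings.
# paste() is vectorised, so no explicit loop over elements is needed.
paste_slash <- function(x) {
  x <- as.character(x)
  paste(substr(x, 1, 1), substr(x, 2, 2), sep = "/")
}

df_paste <- df
df_paste[] <- lapply(df, paste_slash)
df_paste
```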