How to create a column containing a string of stars to indicate the levels of a factor in an R data frame
(second question today - must be a bad day)
I have a data frame with various columns, including a concentration column (numeric), a flag highlighting invalid results (stored as "True"/"False" strings) and a description of the problem (a factor).
df <- structure(list(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), rawconc = c(77.4,
52.6, 86.5, 44.5, 167, 16.2, 59.3, 123, 1.95, 181), reason = structure(c(NA,
NA, 2L, NA, NA, NA, 2L, 1L, NA, NA), .Label = c("Fails Acceptance Criteria",
"Poor Injection"), class = "factor"), flag = c("False", "False",
"True", "False", "False", "False", "True", "True", "False", "False"
)), .Names = c("x", "rawconc", "reason", "flag"), row.names = c(NA,
-10L), class = "data.frame")
I can create a column with the numeric level of the reason column
df$level<-as.numeric(df$reason)
df
    x rawconc                    reason  flag level
1   1   77.40                      <NA> False    NA
2   2   52.60                      <NA> False    NA
3   3   86.50            Poor Injection  True     2
4   4   44.50                      <NA> False    NA
5   5  167.00                      <NA> False    NA
6   6   16.20                      <NA> False    NA
7   7   59.30            Poor Injection  True     2
8   8  123.00 Fails Acceptance Criteria  True     1
9   9    1.95                      <NA> False    NA
10 10  181.00                      <NA> False    NA
And here's what I tried in order to create a column with 'level' many stars, but it fails:
df$stars<-paste(rep("*",df$level)sep="",collapse="")
Error: unexpected symbol in "df$stars<-paste(rep("*",df$level)sep"
df$stars<-paste(rep("*",df$level),sep="",collapse="")
Error in rep("*", df$level) : invalid 'times' argument
rep("*",df$level)
Error in rep("*", df$level) : invalid 'times' argument
df$stars<-paste(rep("*",pmax(df$level,0,na.rm=TRUE)),sep="",collapse="")
Error in rep("*", pmax(df$level, 0, na.rm = TRUE)) :
invalid 'times' argument
It seems that rep needs to be fed one value at a time. I feel that this should be possible (my gut says 'use lapply', but my apply-fu is very poor).
Anyone want to try?
Comments (2)
You could create a vector of star strings, one for each level of the factor, and then index it with df$reason (which works because it's a factor). For a large data.frame this should be much faster than calling paste on each row.
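The code that accompanied this answer did not survive on this page. The following is only a minimal sketch of the lookup idea it describes, not the answerer's original code. The data frame is rebuilt with data.frame() for brevity, and as.integer() is applied to the factor explicitly so the indexing step is visible.

```r
# Cut-down version of the question's data (x, rawconc and reason only)
df <- data.frame(
  x = 1:10,
  rawconc = c(77.4, 52.6, 86.5, 44.5, 167, 16.2, 59.3, 123, 1.95, 181),
  reason = factor(
    c(NA, NA, "Poor Injection", NA, NA, NA,
      "Poor Injection", "Fails Acceptance Criteria", NA, NA),
    levels = c("Fails Acceptance Criteria", "Poor Injection")
  )
)

# One star string per factor level: "*" for level 1, "**" for level 2, ...
stars <- vapply(
  seq_len(nlevels(df$reason)),
  function(i) paste(rep("*", i), collapse = ""),
  character(1)
)

# Look up each row's string by the factor's integer code;
# rows where reason is NA get NA here
df$stars <- stars[as.integer(df$reason)]
```

The strings are built once per level rather than once per row, which is where the speed advantage on a large data frame would come from. If empty strings are preferred over NA, something like df$stars[is.na(df$stars)] <- "" afterwards should do it.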
I think that you will need an apply-type function, so that rep is called on one value at a time.
You would be better off using sapply than lapply in this instance, since it returns a vector instead of a list.
From the help for rep: times can be a single integer, in which case the whole input is repeated that many times, or a vector of the same length as x, in which case x[1] is repeated times[1] times, x[2] is repeated times[2] times, and so on.
One problem with using rep with a vector for the times parameter is that it returns just one flat vector and drops the entries where times = 0. You can see this with rep(rep("*", nrow(df)), times = df$level), once the NA entries of df$level have been replaced by 0 (rep rejects NA in times).
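The answer's own code is missing from this page, so here is a hedged sketch of the sapply approach it describes. It assumes that a missing level should produce an empty string. The names level, n, stars and flat are illustrative, and level simply reproduces as.numeric(df$reason) from the question.

```r
# Numeric factor codes, as produced by as.numeric(df$reason) in the question
level <- c(NA, NA, 2, NA, NA, NA, 2, 1, NA, NA)

# Assumption: a missing level means zero stars
n <- ifelse(is.na(level), 0, level)

# Call rep() on one count at a time and collapse each result to one string;
# sapply() returns a character vector with one entry per row
stars <- sapply(n, function(k) paste(rep("*", k), collapse = ""))

# For contrast: a vector 'times' gives one flat vector, and the
# zero-count rows leave no trace in it
flat <- rep(rep("*", length(n)), times = n)
length(flat)  # 5 stars in total, not 10 entries
```

Because stars has one entry per row, it can be assigned straight to df$stars, whereas flat cannot be lined up with the rows again.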