How to create a column containing a string of stars to indicate the level of a factor in an R data frame

Posted on 2024-08-28 01:08:18

(second question today - must be a bad day)

I have a data frame with various columns, including a concentration column (numeric), a flag highlighting invalid results (stored as the strings "True"/"False") and a description of the problem (a factor)

df <- structure(list(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), rawconc = c(77.4, 
52.6, 86.5, 44.5, 167, 16.2, 59.3, 123, 1.95, 181), reason = structure(c(NA, 
NA, 2L, NA, NA, NA, 2L, 1L, NA, NA), .Label = c("Fails Acceptance Criteria", 
"Poor Injection"), class = "factor"), flag = c("False", "False", 
"True", "False", "False", "False", "True", "True", "False", "False"
)), .Names = c("x", "rawconc", "reason", "flag"), row.names = c(NA, 
-10L), class = "data.frame")

I can create a column with the numeric level of the reason column

df$level<-as.numeric(df$reason)
df
    x rawconc                    reason  flag level
1   1   77.40                      <NA> False    NA
2   2   52.60                      <NA> False    NA
3   3   86.50            Poor Injection  True     2
4   4   44.50                      <NA> False    NA
5   5  167.00                      <NA> False    NA
6   6   16.20                      <NA> False    NA
7   7   59.30            Poor Injection  True     2
8   8  123.00 Fails Acceptance Criteria  True     1
9   9    1.95                      <NA> False    NA
10 10  181.00                      <NA> False    NA

And here's what I tried in order to create a column with 'level' many stars, but it fails:

df$stars<-paste(rep("*",df$level)sep="",collapse="")
Error: unexpected symbol in "df$stars<-paste(rep("*",df$level)sep"

df$stars<-paste(rep("*",df$level),sep="",collapse="")
Error in rep("*", df$level) : invalid 'times' argument

rep("*",df$level)
Error in rep("*", df$level) : invalid 'times' argument

df$stars<-paste(rep("*",pmax(df$level,0,na.rm=TRUE)),sep="",collapse="")
Error in rep("*", pmax(df$level, 0, na.rm = TRUE)) : 
  invalid 'times' argument

It seems that rep needs to be fed one value at a time. I feel that this should be possible (my gut says 'use lapply', but my apply-fu is very poor).

Anyone want to try?

Comments (2)

青春有你 2024-09-04 01:08:18

You could create a vector of stars, one entry per factor level, as

vstars <- sapply(1L:nlevels(df$reason), function(i) paste(rep("*",i),collapse=""))
vstars
# [1] "*"  "**"

And then index it with df$reason (which works because it's a factor, so its integer codes are used as positions):

vstars[df$reason]
# [1] NA   NA   "**" NA   NA   NA   "**" "*"  NA   NA

For a large data.frame this should be much faster than calling paste on each row, because paste runs only once per level.
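
As a hedged aside, not part of the original answer: assuming R 3.3.0 or later, base R's strrep() is vectorised over its times argument, so the same lookup vector can be built without sapply. The sketch below rebuilds only the reason column from the question; the names reason, vstars, stars and stars_blank are illustrative.

```r
# Rebuild just the factor column from the question (NA = valid result)
reason <- factor(c(NA, NA, 2, NA, NA, NA, 2, 1, NA, NA),
                 levels = 1:2,
                 labels = c("Fails Acceptance Criteria", "Poor Injection"))

# strrep() is vectorised over 'times': one string of stars per level
vstars <- strrep("*", seq_len(nlevels(reason)))

# Indexing by the factor uses its integer codes; NA stays NA
stars <- vstars[reason]

# Optional: show an empty string instead of NA for valid rows
stars_blank <- ifelse(is.na(stars), "", stars)
```

Whether NA or an empty string is the better marker for valid rows depends on how the column is displayed later.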

深海不蓝 2024-09-04 01:08:18

I think that you will need an apply-type function. This will work:

df[is.na(df$level),"level"] <- 0
df$level <- sapply(df$level, function(x) paste(rep("*",x),collapse=""))

You would be better using sapply than lapply in this instance since it returns a vector instead of a list.
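
To make the difference concrete, here is a minimal sketch that assumes the NA levels have already been set to 0, as above. The helper name make_stars and the standalone level vector are illustrative, not from the question; vapply is shown as a stricter alternative that checks each result is a single string.

```r
# Levels from the question, with NA already replaced by 0
level <- c(0, 0, 2, 0, 0, 0, 2, 1, 0, 0)

# Hypothetical helper: n stars collapsed into one string ("" when n is 0)
make_stars <- function(n) paste(rep("*", n), collapse = "")

as_list   <- lapply(level, make_stars)  # a list with one element per row
as_vector <- sapply(level, make_stars)  # simplified to a character vector

# vapply() also declares the expected result type and length up front
checked <- vapply(level, make_stars, character(1))
```

The character vector from sapply or vapply can be assigned straight to a data frame column, whereas the list from lapply would first need unlist().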

From the help for rep:

If 'times' consists of a single integer, the result consists of the whole input repeated this many times. If 'times' is a vector of the same length as 'x' (after replication by 'each'), the result consists of 'x[1]' repeated 'times[1]' times, 'x[2]' repeated 'times[2]' times and so on.

One problem with using rep with a vector for the times parameter is that it returns a single flat vector and drops every element for which times=0, so the result no longer lines up with the rows of the data frame. You can see this with this command: rep(rep("*", nrow(df)), times=df$level).
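
A small sketch of that behaviour, using a standalone level vector (an illustrative stand-in for df$level after the NA values have been set to 0):

```r
# Levels from the question, with NA already replaced by 0
level <- c(0, 0, 2, 0, 0, 0, 2, 1, 0, 0)

# One "*" per row, each repeated level[i] times, all in one flat vector
flat <- rep(rep("*", length(level)), times = level)
flat
# [1] "*" "*" "*" "*" "*"

# The result has sum(level) elements, not one element per row
length(flat)
# [1] 5
```

Because the five stars are no longer grouped by row, this output cannot be assigned to a ten-row column, which is why the per-element sapply call is used instead.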
