How to count the number of occurrences of a specific value in a row in R

Posted 2024-11-07 09:10:13 · 7807 characters · 0 views · 0 comments

I am having quite a tricky problem, which I just cannot seem to solve.

I have a large dataset (23277 rows, 151 columns). Each column has values from 0:100 (inclusive) representing probabilities assigned for events in the world.

As part of calculating the score for each individual, I need to count the occurrences of each of the values in the dataset.

I first tried apply, but I need to ignore NAs and subset, so I tried the following:

apply(ans.samp, 1, sum(ans.samp[ans==0]), na.rm=TRUE)

I got the error message: 'sum(ans.samp[ans == 0])' is not a function, character or symbol

I repeated this process with sapply, vapply, tapply and do.call to no avail.

Giving up on a vectorised solution, I wrote the following for loop.

RespCount <- function (x) { for (i in (1:nrow(x))) 
  { res <- vector(mode="numeric", length=nrow(x))
    ans.tmp <- x[i,]
    res[i] <- length(ans.tmp[ans.tmp==0])
    print(res)
  }
return(res)
}

However, after I got this working, it returns only the total number of 0s in the sample.

I would appreciate some help with this, as I am under some time pressure, and I would like to be able to solve these kinds of problems in R in the future.

Sample data included for reproducibility:

structure(list(X = 1:6, X100 = c(70L, NA, 80L, 0L, 40L, NA), 
    X10 = c(30L, NA, NA, NA, NA, NA), X1 = c(50L, NA, NA, NA, 
    NA, NA), X11 = c(50L, NA, NA, NA, NA, NA), X12 = c(30L, NA, 
    NA, NA, NA, NA), X13 = c(50L, NA, NA, NA, NA, NA), X14 = c(70L, 
    NA, NA, NA, NA, NA), X15 = c(60L, NA, NA, NA, NA, NA), X158 = c(30L, 
    NA, NA, NA, NA, NA), X159 = c(50L, NA, NA, NA, NA, NA), X160 = c(80L, 
    NA, NA, NA, NA, NA), X16 = c(50L, NA, NA, NA, NA, NA), X161 = c(40L, 
    NA, NA, NA, NA, NA), X162 = c(100L, NA, NA, NA, NA, NA), 
    X163 = c(50L, NA, NA, NA, NA, NA), X164 = c(0L, NA, NA, NA, 
    NA, NA), X165 = c(0L, NA, NA, NA, NA, NA), X166 = c(20L, 
    NA, NA, NA, NA, NA), X167 = c(0L, NA, NA, NA, NA, NA), X168 = c(30L, 
    NA, NA, NA, NA, NA), X169 = c(100L, NA, NA, NA, NA, NA), 
    X170 = c(30L, NA, NA, NA, NA, NA), X17 = c(40L, NA, NA, NA, 
    NA, NA), X171 = c(50L, NA, NA, NA, NA, NA), X172 = c(20L, 
    NA, NA, NA, NA, NA), X173 = c(30L, NA, NA, NA, NA, NA), X174 = c(20L, 
    NA, NA, NA, NA, NA), X175 = c(30L, NA, NA, NA, NA, NA), X176 = c(10L, 
    NA, NA, NA, NA, NA), X177 = c(70L, NA, NA, NA, NA, NA), X178 = c(40L, 
    NA, NA, NA, NA, NA), X179 = c(70L, NA, NA, NA, NA, NA), X180 = c(0L, 
    NA, NA, NA, NA, NA), X18 = c(30L, NA, NA, NA, NA, NA), X181 = c(100L, 
    NA, NA, NA, NA, NA), X182 = c(100L, NA, NA, NA, NA, NA), 
    X183 = c(20L, NA, NA, NA, NA, NA), X184 = c(80L, NA, NA, 
    NA, NA, NA), X185 = c(90L, NA, NA, NA, NA, NA), X186 = c(0L, 
    NA, NA, NA, NA, NA), X187 = c(10L, NA, NA, NA, NA, NA), X188 = c(100L, 
    NA, NA, NA, NA, NA), X189 = c(100L, NA, NA, NA, NA, NA), 
    X190 = c(0L, NA, NA, NA, NA, NA), X19 = c(100L, NA, NA, NA, 
    NA, NA), X191 = c(0L, NA, NA, NA, NA, NA), X192 = c(90L, 
    NA, NA, NA, NA, NA), X193 = c(50L, NA, NA, NA, NA, NA), X194 = c(100L, 
    NA, NA, NA, NA, NA), X195 = c(10L, NA, NA, NA, NA, NA), X196 = c(100L, 
    NA, NA, NA, NA, NA), X197 = c(20L, NA, NA, NA, NA, NA), X198 = c(40L, 
    NA, NA, NA, NA, NA), X199 = c(20L, NA, NA, NA, NA, NA), X200 = c(0L, 
    NA, NA, NA, NA, NA), X20 = c(0L, NA, NA, NA, NA, NA), X201 = c(0L, 
    NA, NA, NA, NA, NA), X202 = c(20L, NA, NA, NA, NA, NA), X203 = c(20L, 
    NA, NA, NA, NA, NA), X204 = c(80L, NA, NA, NA, NA, NA), X205 = c(0L, 
    NA, NA, NA, NA, NA), X206 = c(80L, NA, NA, NA, NA, NA), X207 = c(0L, 
    NA, NA, NA, NA, NA), X2 = c(10L, NA, NA, NA, NA, NA), X21 = c(0L, 
    NA, NA, NA, NA, NA), X22 = c(100L, NA, NA, NA, NA, NA), X23 = c(50L, 
    NA, NA, NA, NA, NA), X24 = c(50L, NA, NA, NA, NA, NA), X25 = c(70L, 
    NA, NA, NA, NA, NA), X26 = c(60L, NA, NA, NA, NA, NA), X27 = c(40L, 
    NA, NA, NA, NA, NA), X28 = c(20L, NA, NA, NA, NA, NA), X29 = c(0L, 
    NA, NA, NA, NA, NA), X30 = c(90L, NA, NA, NA, NA, NA), X3 = c(0L, 
    NA, NA, NA, NA, NA), X31 = c(50L, NA, NA, NA, NA, NA), X32 = c(50L, 
    NA, NA, NA, NA, NA), X33 = c(0L, NA, NA, NA, NA, NA), X34 = c(50L, 
    NA, NA, NA, NA, NA), X35 = c(90L, NA, NA, NA, NA, NA), X36 = c(50L, 
    NA, NA, NA, NA, NA), X37 = c(60L, NA, NA, NA, NA, NA), X38 = c(40L, 
    NA, NA, NA, NA, NA), X39 = c(50L, NA, NA, NA, NA, NA), X40 = c(0L, 
    NA, NA, NA, NA, NA), X4 = c(50L, NA, NA, NA, NA, NA), X41 = c(90L, 
    NA, NA, NA, NA, NA), X42 = c(80L, NA, NA, NA, NA, NA), X43 = c(50L, 
    NA, NA, NA, NA, NA), X44 = c(80L, NA, NA, NA, NA, NA), X45 = c(80L, 
    NA, NA, NA, NA, NA), X46 = c(0L, NA, NA, NA, NA, NA), X47 = c(80L, 
    NA, NA, NA, NA, NA), X48 = c(20L, NA, NA, NA, NA, NA), X49 = c(100L, 
    NA, NA, NA, NA, NA), X50 = c(0L, NA, NA, NA, NA, NA), X5 = c(0L, 
    NA, NA, NA, NA, NA), X51 = c(80L, 100L, 70L, 100L, 0L, 60L
    ), X52 = c(10L, 0L, 0L, 0L, 0L, 20L), X53 = c(40L, 40L, 70L, 
    20L, 90L, 50L), X54 = c(0L, 10L, 0L, 50L, 50L, 0L), X55 = c(20L, 
    80L, 90L, 80L, 30L, 0L), X56 = c(100L, 100L, 50L, 100L, 80L, 
    100L), X57 = c(60L, 0L, 100L, 70L, 100L, 80L), X58 = c(100L, 
    100L, 100L, 50L, 100L, 100L), X59 = c(80L, 50L, 80L, 0L, 
    30L, 50L), X60 = c(70L, 50L, 60L, 50L, 100L, 100L), X6 = c(100L, 
    NA, NA, NA, NA, NA), X61 = c(50L, 50L, 50L, 30L, 70L, 50L
    ), X62 = c(20L, 50L, 40L, 40L, 50L, 100L), X63 = c(50L, 0L, 
    100L, 10L, 50L, 100L), X64 = c(60L, 30L, 0L, 50L, 50L, 50L
    ), X65 = c(50L, 50L, 70L, 80L, 50L, 50L), X66 = c(70L, 40L, 
    10L, 90L, 60L, 50L), X67 = c(30L, 50L, 50L, 0L, 50L, 60L), 
    X68 = c(30L, 0L, 0L, 40L, 70L, 80L), X69 = c(30L, NA, 70L, 
    10L, 0L, 20L), X70 = c(80L, NA, 50L, 50L, 70L, 100L), X7 = c(100L, 
    NA, NA, NA, NA, NA), X71 = c(70L, NA, 50L, 100L, 100L, 100L
    ), X72 = c(60L, NA, 70L, 50L, 80L, 50L), X73 = c(80L, NA, 
    80L, 80L, 80L, NA), X74 = c(50L, NA, 50L, 0L, 50L, NA), X75 = c(30L, 
    NA, 70L, 10L, 80L, NA), X76 = c(70L, NA, 40L, 80L, 100L, 
    NA), X77 = c(80L, NA, 50L, 100L, 40L, NA), X78 = c(80L, NA, 
    0L, 0L, 0L, NA), X79 = c(80L, NA, 50L, 50L, 50L, NA), X80 = c(40L, 
    NA, 90L, 70L, 60L, NA), X8 = c(50L, NA, NA, NA, NA, NA), 
    X81 = c(70L, NA, 60L, 40L, 80L, NA), X82 = c(80L, NA, 100L, 
    60L, 60L, NA), X83 = c(30L, NA, 100L, 30L, 0L, NA), X84 = c(80L, 
    NA, 0L, 60L, 100L, NA), X85 = c(80L, NA, 50L, 40L, 30L, NA
    ), X86 = c(50L, NA, 90L, 50L, 50L, NA), X87 = c(80L, NA, 
    50L, 70L, 20L, NA), X88 = c(40L, NA, 70L, 30L, 90L, NA), 
    X89 = c(50L, NA, 50L, 80L, 80L, NA), X90 = c(90L, NA, 100L, 
    60L, 100L, NA), X91 = c(0L, NA, 0L, 0L, 0L, NA), X9 = c(100L, 
    NA, NA, NA, NA, NA), X92 = c(50L, NA, 70L, 90L, 80L, NA), 
    X93 = c(40L, NA, 50L, 50L, 50L, NA), X94 = c(40L, NA, 0L, 
    60L, 40L, NA), X95 = c(90L, NA, 100L, 40L, 50L, NA), X96 = c(50L, 
    NA, 50L, 50L, 50L, NA), X97 = c(60L, NA, 60L, 100L, 50L, 
    NA), X98 = c(40L, NA, 40L, 0L, 0L, NA), X99 = c(30L, NA, 
    0L, 50L, 70L, NA)), .Names = c("X", "X100", "X10", "X1", 
"X11", "X12", "X13", "X14", "X15", "X158", "X159", "X160", "X16", 
"X161", "X162", "X163", "X164", "X165", "X166", "X167", "X168", 
"X169", "X170", "X17", "X171", "X172", "X173", "X174", "X175", 
"X176", "X177", "X178", "X179", "X180", "X18", "X181", "X182", 
"X183", "X184", "X185", "X186", "X187", "X188", "X189", "X190", 
"X19", "X191", "X192", "X193", "X194", "X195", "X196", "X197", 
"X198", "X199", "X200", "X20", "X201", "X202", "X203", "X204", 
"X205", "X206", "X207", "X2", "X21", "X22", "X23", "X24", "X25", 
"X26", "X27", "X28", "X29", "X30", "X3", "X31", "X32", "X33", 
"X34", "X35", "X36", "X37", "X38", "X39", "X40", "X4", "X41", 
"X42", "X43", "X44", "X45", "X46", "X47", "X48", "X49", "X50", 
"X5", "X51", "X52", "X53", "X54", "X55", "X56", "X57", "X58", 
"X59", "X60", "X6", "X61", "X62", "X63", "X64", "X65", "X66", 
"X67", "X68", "X69", "X70", "X7", "X71", "X72", "X73", "X74", 
"X75", "X76", "X77", "X78", "X79", "X80", "X8", "X81", "X82", 
"X83", "X84", "X85", "X86", "X87", "X88", "X89", "X90", "X91", 
"X9", "X92", "X93", "X94", "X95", "X96", "X97", "X98", "X99"), row.names = c(NA, 
6L), class = "data.frame")

Any insight would be greatly appreciated.

From some attempts on the small dataset above, it appears that the number is being calculated for each row, but when I return the res object, it merely gives me the final value. How can I fix this?
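
One likely cause of the behaviour described above is that res is re-created inside the loop, so every pass wipes the counts stored by earlier passes and only the last row's count survives. The sketch below moves the allocation outside the loop and counts with sum(..., na.rm = TRUE) rather than length(), since subsetting with a comparison that yields NA keeps the NA entries. The names toy and resp_count and the three-row data frame are hypothetical, made up only for illustration.

```r
# Hypothetical toy data standing in for the real data set
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 0L, NA),
                  c = c(100L, 0L, 10L),
                  row.names = c("r1", "r2", "r3"))

resp_count <- function(x, value = 0) {
  # Allocate the result once, before the loop, so earlier entries survive
  res <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    row_vals <- unlist(x[i, ], use.names = FALSE)
    # sum() over a logical vector counts the TRUEs; na.rm = TRUE skips NA
    res[i] <- sum(row_vals == value, na.rm = TRUE)
  }
  res
}

zero_counts <- resp_count(toy)
print(zero_counts)
```

The function returns one count per row; passing a different value (for example value = 100) counts that value instead.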

Comments (4)

南七夏 2024-11-14 09:10:13

There are two ways to use the apply family of functions. Either you do

apply(mat, 1, sum, na.rm=TRUE)

if you want to apply the function sum() to each row, passing additional parameters like na.rm=TRUE. Or you can do

apply(mat, 1, foo)

where foo() is a function of your own, defined externally, e.g.

foo <- function(x) sum(x==0, na.rm=TRUE)

Note that NA handling can also be controlled by a parameter of the function itself, with its default value set to TRUE, as in

foo2 <- function(x, no.na=TRUE) sum(x==0, na.rm=no.na)

and you can call it as apply(mat, 1, foo2, no.na=FALSE), although that doesn't really make sense with the sum() function (unless you want to check whether there are NA values, but in that case it's better to use is.na() :-).

Finally, you can define your function directly inline as

apply(mat, 1, function(x) sum(x==0, na.rm=TRUE))

In your case, it gives me

> apply(mat, 1, function(x) sum(x==0, na.rm=TRUE))
 1  2  3  4  5  6 
22  4  9  8  7  2 

which is equivalent to apply(mat, 1, foo).
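
The original call in the question failed because the third argument of apply() has to be a function, not the result of calling one. As a possible alternative to the apply() forms above: comparing a whole data frame with 0 already yields a logical matrix, so the per-row count can be sketched with rowSums(). The data frame toy below is hypothetical and only stands in for mat.

```r
# Hypothetical toy data standing in for 'mat'
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 0L, NA),
                  c = c(100L, 0L, 10L),
                  row.names = c("r1", "r2", "r3"))

# toy == 0 is a logical matrix; rowSums() counts the TRUEs in each row
zero_counts <- rowSums(toy == 0, na.rm = TRUE)

# The inline-function form gives the same counts
via_apply <- apply(toy, 1, function(x) sum(x == 0, na.rm = TRUE))

print(zero_counts)
print(via_apply)
```

On a data set with many rows, rowSums() avoids calling an R function once per row, which is usually the faster choice.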

江湖彼岸 2024-11-14 09:10:13

Let's call your dataset dat. You can use table() to get a table of frequencies for each value in your dataset. If you want to apply that to all data in your data frame, coerce the data to a single vector, and use table() on the resulting vector:

table(do.call('c', dat))

This gives you:

> table(do.call('c', dat))
  0   1   2   3   4   5   6  10  20  30  40  50  60  70  80  90 100 
 52   1   1   1   1   1   1  10  16  21  25  76  19  25  37  14  45 

If you want to check frequencies for individual rows (MARGIN = 1; use 2 for columns), simply do:

apply(dat, 1, table)
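
When rows contain different sets of values, apply(dat, 1, table) returns a list of tables of unequal length. If a rectangular result is more convenient, one option is to fix the set of levels before tabulating, as sketched below. The data frame toy and the vector levels_wanted are hypothetical; for the real data the levels would presumably be 0:100.

```r
# Hypothetical toy data standing in for 'dat'
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 0L, NA),
                  c = c(100L, 0L, 10L),
                  row.names = c("r1", "r2", "r3"))

# Hypothetical level set; the real data would use 0:100
levels_wanted <- c(0, 10, 50, 100)

# Every row is tabulated over the same levels, so apply() can simplify the
# result to a matrix; t() puts the data rows back in the matrix rows
counts <- t(apply(toy, 1, function(x) table(factor(x, levels = levels_wanted))))

print(counts)
```

NA values are dropped by table() by default, so they do not appear in any column.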
节枝 2024-11-14 09:10:13

For data in a data.frame named df,

sapply(df + 1, tabulate, 101)

produces a matrix of 101 x 151, where rows correspond to 0, 1, ..., 100 and columns to the 151 samples; a matrix might be convenient for subsequent computation, and tabulate is faster than table.
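
The call above tabulates each column. Since the question asks for counts per row, a row-wise variant might look like the sketch below, which applies tabulate() over the rows of the shifted matrix. The data frame toy and the name max_value are hypothetical.

```r
# Hypothetical toy data standing in for 'df'
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 0L, NA),
                  c = c(100L, 0L, 10L),
                  row.names = c("r1", "r2", "r3"))

max_value <- 100L

# Shift by 1 so that the value 0 falls into bin 1; tabulate() ignores NA
row_counts <- apply(as.matrix(toy) + 1L, 1, tabulate, nbins = max_value + 1L)

# Label the bins with the original values 0..100
rownames(row_counts) <- 0:max_value

print(dim(row_counts))
print(row_counts["0", ])
```

The result has one matrix row per value (0 to 100) and one column per data row, so row_counts["0", ] holds the zero counts for every data row.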

复古式 2024-11-14 09:10:13

I'm trying to address the problem statement, rather than correcting the coding problem in what appeared to be an initial partial effort. To count the number of occurrences in a row, use 'apply' with 'table':

> apply(dfrm, 1, table)
$`1`

  0   1  10  20  30  40  50  60  70  80  90 100 
 22   1   5  12  14  12  26   7  10  19   7  16 

$`2`

  0   2  10  30  40  50  80 100 
  4   1   1   1   2   6   1   3 

$`3`

  0   3  10  40  50  60  70  80  90 100 
  9   1   1   3  13   3   8   3   3   7 

$`4`

  0   4  10  20  30  40  50  60  70  80  90 100 
  8   1   3   1   3   5  11   4   3   5   2   5 

$`5`

  0   5  20  30  40  50  60  70  80  90 100 
  7   1   1   3   3  13   3   4   7   2   7 

$`6`

  0   6  20  50  60  80 100 
  2   1   2   7   2   2   7 

And notice that this result includes as a subset the x==0 case:

> sapply( apply(dfrm, 1, table), function(x) x['0'])
1.0 2.0 3.0 4.0 5.0 6.0 
 22   4   9   8   7   2 
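
Two caveats may be worth keeping in mind with this approach: apply() simplifies its result to a matrix when all row tables happen to have the same length, and indexing a table by "0" can return NA for a row that contains no zeros. The sketch below sidesteps both by building the list with lapply() and summing the matching entries. The data frame toy is hypothetical, and its third row deliberately contains no zeros.

```r
# Hypothetical toy data; the third row has no zeros
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 0L, NA),
                  c = c(100L, 0L, 10L),
                  row.names = c("r1", "r2", "r3"))

# lapply() always returns a list, whatever the table lengths are
per_row <- lapply(seq_len(nrow(toy)), function(i) table(unlist(toy[i, ])))

# Sum the entries whose label is "0"; an empty selection sums to 0, not NA
zero_counts <- vapply(per_row,
                      function(tab) sum(tab[names(tab) == "0"]),
                      integer(1))

print(zero_counts)
```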