How to select the most voted category from multiple columns in R

Posted on 2025-02-13 16:41:59

I have a classification problem that I need to solve in R but, to be honest, I have no idea how to do it.

I have a table (see below) in which each sample is classified by three ML models (one per column). For each sample, I need to pick the "most voted" category and write it to a new column.

Current table: (image not included)

Desired output: (image not included)

I have been reading about categorical variables in R, but nothing seems to fit my specific needs.

Any help would be highly appreciated.

Thanks in advance.

JL

Comments (2)

我做我的改变 2025-02-20 16:41:59

This is not how you ask a question. Please see the relevant thread, and in future provide your data in the form shown below (run dput() on the data frame, then copy and paste the result from the console). At any rate, here is a base R solution:

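# Compute the row-wise modal (most frequent) prediction => character vector.
# Only the three prediction columns are used, so an existing "mode" column
# (as in the dput() output under "Data") is not counted as a vote.
df1$mode <- apply(
  df1[, c("RFpred", "SVMpred", "KNNpred")],
  1,
  function(x){
    head(
      names(
        sort(
          table(x),
          decreasing = TRUE
        )
      ),
      1
    )
  }
)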

Data:

df1 <- structure(list(samples = c("S1", "D4", "S2", "D1", "D2", "S3", 
"D3", "S4"), RFpred = c("Carrier", "Absent", "Helper", "Helper", 
"Carrier", "Absent", "Resistant", "Carrier"), SVMpred = c("Absent", 
"Absent", "Helper", "Helper", "Carrier", "Helper", "Helper", 
"Resistant"), KNNpred = c("Carrier", "Absent", "Carrier", "Helper", 
"Carrier", "Absent", "Helper", "Resistant"), mode = c("Carrier", 
"Absent", "Helper", "Helper", "Carrier", "Absent", "Helper", 
"Resistant")), row.names = c(NA, -8L), class = "data.frame")
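
With three models and more than two categories, a row can end in a three-way disagreement, and the solution above will still return one of the labels. The following is a minimal sketch of one way to flag such rows instead, assuming that an NA vote is acceptable for ties. The helper name vote_or_na and the toy data frame preds are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper: return the majority label of a row,
# or NA when the highest count is shared by several labels.
vote_or_na <- function(x) {
  counts <- table(x)                                    # frequency of each label
  top <- names(counts)[as.vector(counts) == max(counts)]  # label(s) with the top count
  if (length(top) == 1L) top else NA_character_
}

# Toy predictions, invented for this sketch (not the original data)
preds <- data.frame(
  RFpred  = c("Carrier", "Absent", "Helper"),
  SVMpred = c("Absent",  "Absent", "Carrier"),
  KNNpred = c("Carrier", "Absent", "Resistant"),
  stringsAsFactors = FALSE
)

# Apply the helper row by row over the prediction columns only
preds$vote <- apply(preds[, c("RFpred", "SVMpred", "KNNpred")], 1, vote_or_na)
print(preds)
```

Rows flagged with NA can then be reviewed by hand or resolved with whatever tie-breaking rule suits the problem, for example trusting the model with the best validation accuracy.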

寂寞清仓 2025-02-20 16:41:59

Tidyverse Approach:

library(dplyr)
library(tibble)

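# Return the most frequent non-NA value of x (the first one wins on ties)
mode_char <- function(x) {
    ux <- unique(na.omit(x))               # distinct non-missing labels
    ux[which.max(tabulate(match(x, ux)))]  # label with the highest count
}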

# df: data frame with columns samples, RFpred, SVMpred, KNNpred
df %>%
    as_tibble() %>%
    rowwise() %>%
    mutate(
        Vote = mode_char(c_across(RFpred:KNNpred))
    )

#> # A tibble: 8 × 5
#> # Rowwise: 
#>   samples RFpred    SVMpred   KNNpred   Vote     
#>   <chr>   <chr>     <chr>     <chr>     <chr>    
#> 1 S1      Carrier   Absent    Carrier   Carrier  
#> 2 D4      Absent    Absent    Absent    Absent   
#> 3 S2      Helper    Helper    Carrier   Helper   
#> 4 D1      Helper    Helper    Helper    Helper   
#> 5 D2      Carrier   Carrier   Carrier   Carrier  
#> 6 S3      Absent    Helper    Absent    Absent   
#> 7 D3      Resistant Helper    Helper    Helper   
#> 8 S4      Carrier   Resistant Resistant Resistant
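
It may help to see how mode_char behaves on edge cases that the sample data does not contain. The sketch below repeats the function so that it runs on its own and calls it on two made-up vectors: one where all three labels differ, and one containing a missing prediction. Because which.max() returns the position of the first maximum, a tie resolves to the label that appears first in column order.

```r
# mode_char as defined in the answer, repeated so the snippet is self-contained
mode_char <- function(x) {
    ux <- unique(na.omit(x))
    ux[which.max(tabulate(match(x, ux)))]
}

# Made-up inputs for illustration only
tie  <- mode_char(c("Carrier", "Absent", "Helper"))      # every label appears once
miss <- mode_char(c(NA, "Carrier", "Helper", "Helper"))  # NA is dropped before counting

print(tie)
print(miss)
```

In the tie case the result is simply the first model's prediction, so the column order passed to c_across() acts as an implicit tie-breaker.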