Create a matrix that counts how many times each row/column combination occurs in a data frame

Posted on 2025-01-26 05:45:33

I am still quite new to R, so please bear with me :)

I need to create a matrix that counts how many times the combination of a row and a column of that matrix is present in a data frame.

As my description is probably quite vague, I have given an example set below. In reality, my dataset will contain many more fruits in the matrix and many more juices in the dataframe, so I'm looking for an efficient way to tackle this problem.

#Stackoverflow example
#Create empty matrix ----
newMatrix <- matrix(0, nrow = 5, ncol = 5)
colnames(newMatrix) <- c("Apple", "Pear", "Orange", "Mango", "Banana")
rownames(newMatrix) <- c("Apple", "Pear", "Orange", "Mango", "Banana")

#Create dataframe ----
newDf <- data.frame(c("Juice 1", "Juice 2", "Juice 3", "Juice 4","Juice 5"),
                    c("Banana", "Banana", "Orange", "Pear", "Apple"),
                    c("Pear", "Orange", "Pear", "Apple", "Pear"),
                    c("Orange", "Mango", NA, NA, NA))
colnames(newDf) <- c("Juice", "Fruit 1", "Fruit 2", "Fruit 3")

I want to create a for loop that goes over every element of newMatrix and adds 1 if the combination of its row and column is present in a row of newDf.
So in essence: how many juices combine, for example, Apple and Pear, how many combine Apple and Mango, and so forth.

The output should look like this:

       Apple Pear Orange Mango Banana
Apple      0    2      0     0      0
Pear       2    0      2     0      1
Orange     0    2      0     1      2
Mango      0    0      1     0      1
Banana     0    1      2     1      0

I started by trying to create a for loop but I got stuck at the if part:

for (i in 1:nrow(adj_matrix)){
  for (j in 1:ncol(adj_matrix)) {
    if (???)
      adj_matrix[i,j] <- adj_matrix[i,j] + 1
  }
}

Can somebody help me with this? Would be highly appreciated!

Comments (2)

流心雨 2025-02-02 05:45:33

With base R, you can take the combinations of your values, and then use igraph to get the adjacency matrix:

library(igraph)

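# Requires R >= 4.1 for the \(x) lambda and for apply(..., simplify = FALSE)
m <- do.call(cbind, apply(newDf[-1], 1, \(x) if (sum(complete.cases(x)) >= 2) combn(x, m = 2) else x, simplify = FALSE))
g <- graph_from_data_frame(na.omit(t(m)), directed = FALSE)
as_adjacency_matrix(g, sparse = FALSE)  # get.adjacency() is the older, deprecated name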

       Banana Pear Orange Apple Mango
Banana      0    1      2     0     1
Pear        1    0      2     2     0
Orange      2    2      0     0     1
Apple       0    2      0     0     0
Mango       1    0      1     0     0

It might be a bit convoluted, but you can also use igraph with tidyverse packages:

library(igraph)
library(tidyverse)

newDf %>% 
  pivot_longer(-Juice) %>% 
  group_by(Juice) %>% 
  summarise(new = ifelse(n() > 1, paste(combn(na.omit(value), 2), collapse = "-"), value)) %>% 
  separate_rows(new, sep = "(?:[^-]*(?:-[^-]*){1})\\K-") %>% 
  separate(new, into = c("X1", "X2")) %>% 
  select(-Juice) %>% 
  graph_from_data_frame(directed = FALSE) %>% 
  as_adjacency_matrix(sparse = FALSE)  # get.adjacency() is the older, deprecated name

       Banana Pear Orange Apple Mango
Banana      0    1      2     0     1
Pear        1    0      2     2     0
Orange      2    2      0     0     1
Apple       0    2      0     0     0
Mango       1    0      1     0     0
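
Both igraph outputs above list the fruits in a different order from the one requested in the question. Indexing a matrix by row and column names reorders it. The sketch below types the matrix in by hand from the output above so that it runs without igraph; the names `ord`, `adj` and `adj_ordered` are introduced here for illustration only.

```r
# Adjacency matrix copied by hand from the igraph output above
ord <- c("Banana", "Pear", "Orange", "Apple", "Mango")
adj <- matrix(c(0, 1, 2, 0, 1,
                1, 0, 2, 2, 0,
                2, 2, 0, 0, 1,
                0, 2, 0, 0, 0,
                1, 0, 1, 0, 0),
              nrow = 5, byrow = TRUE, dimnames = list(ord, ord))

# Order requested in the question
v <- c("Apple", "Pear", "Orange", "Mango", "Banana")

# Character indexing picks rows and columns by name, in the given order
adj_ordered <- adj[v, v]
adj_ordered
```

With a real igraph result, the same `[v, v]` indexing should apply to the dense matrix returned by the adjacency call.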
怕倦 2025-02-02 05:45:33

The for loop can be written like this.

cb <- combn(2:4, 2)  ## all pairs of fruit column indices in newDf (columns 2 to 4)

## initialize adj_matrix
v <- c("Apple", "Pear", "Orange", "Mango", "Banana")
adj_matrix <- matrix(0, length(v), length(v), dimnames=list(v, v))

for (k in seq_len(nrow(newDf))) {
    for (l in seq_len(ncol(cb))) {
        x <- unlist(newDf[k, cb[, l]])
        if (length(x[!is.na(x)]) == 2) {
            adj_matrix[x[1], x[2]] <- adj_matrix[x[1], x[2]] + 1
            adj_matrix[x[2], x[1]] <- adj_matrix[x[2], x[1]] + 1
        }
    }
}

adj_matrix
#        Apple Pear Orange Mango Banana
# Apple      0    2      0     0      0
# Pear       2    0      2     0      1
# Orange     0    2      0     1      2
# Mango      0    0      1     0      1
# Banana     0    1      2     1      0
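
For comparison, the cell-by-cell loop from the question could probably be completed as well. The following is only a sketch of that shape, not the loop used above: instead of adding 1 inside an `if`, each off-diagonal cell is set to the number of juices that contain both fruits. The helper names `fruits`, `has_i` and `has_j` are made up for this example.

```r
# Same data as in the question
newDf <- data.frame(
  Juice     = c("Juice 1", "Juice 2", "Juice 3", "Juice 4", "Juice 5"),
  `Fruit 1` = c("Banana", "Banana", "Orange", "Pear", "Apple"),
  `Fruit 2` = c("Pear", "Orange", "Pear", "Apple", "Pear"),
  `Fruit 3` = c("Orange", "Mango", NA, NA, NA),
  check.names = FALSE
)

v <- c("Apple", "Pear", "Orange", "Mango", "Banana")
adj_matrix <- matrix(0, length(v), length(v), dimnames = list(v, v))

# Character matrix holding only the fruit columns
fruits <- as.matrix(newDf[-1])

for (i in seq_len(nrow(adj_matrix))) {
  for (j in seq_len(ncol(adj_matrix))) {
    if (i != j) {
      # TRUE for every juice that contains the row fruit / the column fruit
      has_i <- rowSums(fruits == rownames(adj_matrix)[i], na.rm = TRUE) > 0
      has_j <- rowSums(fruits == colnames(adj_matrix)[j], na.rm = TRUE) > 0
      adj_matrix[i, j] <- sum(has_i & has_j)
    }
  }
}

adj_matrix
```

This version scans the whole data frame once per cell, so it is likely to be slower than the loop above on large data; it is shown mainly because it mirrors the structure the question started from.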

Data:

newDf <- structure(list(Juice = c("Juice 1", "Juice 2", "Juice 3", "Juice 4", 
"Juice 5"), `Fruit 1` = c("Banana", "Banana", "Orange", "Pear", 
"Apple"), `Fruit 2` = c("Pear", "Orange", "Pear", "Apple", "Pear"
), `Fruit 3` = c("Orange", "Mango", NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-5L))
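
Since the question asks for something efficient on larger data, a loop-free variant may be worth trying. The sketch below uses a different technique from the loop above: it builds a juice-by-fruit incidence matrix and multiplies it by its own transpose with `crossprod()`. The names `fruits`, `inc` and `adj` are introduced here for illustration, and a fruit listed twice in the same juice is counted once.

```r
# Same data as above
newDf <- data.frame(
  Juice     = c("Juice 1", "Juice 2", "Juice 3", "Juice 4", "Juice 5"),
  `Fruit 1` = c("Banana", "Banana", "Orange", "Pear", "Apple"),
  `Fruit 2` = c("Pear", "Orange", "Pear", "Apple", "Pear"),
  `Fruit 3` = c("Orange", "Mango", NA, NA, NA),
  check.names = FALSE
)

v <- c("Apple", "Pear", "Orange", "Mango", "Banana")
fruits <- as.matrix(newDf[-1])

# Incidence matrix: one row per juice, one column per fruit,
# 1 if the juice contains the fruit and 0 otherwise (NA cells are ignored)
inc <- sapply(v, function(f) as.numeric(rowSums(fruits == f, na.rm = TRUE) > 0))

# t(inc) %*% inc: cell [a, b] is the number of juices containing both a and b
adj <- crossprod(inc)

# The diagonal holds the number of juices per fruit; the question wants 0 there
diag(adj) <- 0
adj
```

Because the work is done by one matrix product, this should scale to many more fruits and juices than nested loops, as long as the incidence matrix fits in memory.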