Create a directed adjacency matrix from a data frame with many columns

Posted 2025-02-13 08:03:09

I want to create a directed adjacency matrix from data like this:

x1 x2 x3 x4 x5 x6 x7 x8
 1  1  1  1  1  1  1  2
22 22 22  3  3  3  2  3
 3  3  3  5  5  2  3 23

The columns represent states in time.

The adjacency matrix should reflect the following logic:

For column x1:
1 should link to all 3 rows in column x2,
22 should link to all 3 rows in column x2,
3 should link to all 3 rows in column x2.

For column x2, the same pattern applies going to column x3, and so on for all columns. In other words, each element in a given column is linked to every element of the following column.

The output should be an N x N matrix (where N is the number of unique values in the whole data frame) and... well, an adjacency matrix.

This dataframe is just a sample, the one I have to use has hundreds of columns.

For these 8 columns, the output should resemble something like this:

    1  2  3  5 22 23
 1  6  1  0  0  0  0
 2  0  0  2  0  0  0
 3  0  1  4  1  0  1
 5  0  1  0  1  0  0
22  0  0  1  0  2  0
23  0  0  0  0  0  0
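
A note on the table above: its counts appear to match transitions within each row between consecutive columns (for example, the first row 1, 1, 1, 1, 1, 1, 1, 2 yields six 1 -> 1 transitions and one 1 -> 2), rather than linking every element to all elements of the next column. The following is a minimal R sketch under that assumption; the names `m`, `lvls`, `from`, `to` and `row_adj` are illustrative and not part of the original post.

```r
# Sample data from the question
df <- data.frame(
  x1 = c(1L, 22L, 3L), x2 = c(1L, 22L, 3L), x3 = c(1L, 22L, 3L),
  x4 = c(1L, 3L, 5L),  x5 = c(1L, 3L, 5L),  x6 = c(1L, 3L, 2L),
  x7 = 1:3,            x8 = c(2L, 3L, 23L)
)

m    <- as.matrix(df)
lvls <- sort(unique(as.vector(m)))  # all distinct states, sorted

# Assumption: an edge goes from each cell to the cell on its right
# in the SAME row (state at time i -> state at time i + 1).
from <- factor(as.vector(m[, -ncol(m)]), levels = lvls)
to   <- factor(as.vector(m[, -1]),       levels = lvls)

# Cross-tabulate the pairs; shared levels keep the table square (N x N)
row_adj <- table(from = from, to = to)
row_adj
```

For the sample data this reproduces the 6 x 6 table shown above. If all-to-all linking between consecutive columns is really what is wanted, the counts will be larger than in that table.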

This is a representation of what the graph should look like (image not available).

I've been trying to make it work, but am really lost by now...
TIA

P.S. I'm working with R but Python could also work.


Comments (4)

尝蛊 2025-02-20 08:03:10

I don't think an adjacency matrix is quite what you are after; I guess you want a summary of the transitions. You can try the base R code below (without igraph):

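# Build a from/to edge list: for each pair of consecutive columns,
# pair every value in the earlier column with every value in the later one
d <- do.call(
  rbind,
  apply(
    embed(seq_along(df), 2),  # each row is c(i + 1, i)
    1,
    function(k) {
      expand.grid(
        setNames(
          df[rev(k)],         # reorder to columns i and i + 1
          c("from", "to")
        )
      )
    }
  )
)
# Use the same factor levels for both columns so the table is square
lvls <- sort(unique(unlist(d)))
table(list2DF(lapply(d, factor, levels = lvls)))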

which gives

    to
from 1 2 3 5 22 23
  1  6 3 7 2  2  1
  2  1 2 2 0  0  1
  3  6 3 7 2  2  1
  5  2 1 2 1  0  0
  22 3 0 3 1  2  0
  23 0 0 0 0  0  0
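
Since the real data has hundreds of columns, the same all-to-all counts can also be obtained without materialising every from/to pair. The sketch below is one possible alternative, not the answer's method: it counts how often each value occurs in each column and then multiplies the counts of columns 1..k-1 by those of columns 2..k. The names `counts` and `adj` are introduced here for illustration.

```r
# Sample data from the question
df <- data.frame(
  x1 = c(1L, 22L, 3L), x2 = c(1L, 22L, 3L), x3 = c(1L, 22L, 3L),
  x4 = c(1L, 3L, 5L),  x5 = c(1L, 3L, 5L),  x6 = c(1L, 3L, 2L),
  x7 = 1:3,            x8 = c(2L, 3L, 23L)
)

m    <- as.matrix(df)
lvls <- sort(unique(as.vector(m)))
k    <- ncol(m)

# counts[v, j] = number of times value lvls[v] occurs in column j
counts <- vapply(
  seq_len(k),
  function(j) tabulate(match(m[, j], lvls), nbins = length(lvls)),
  integer(length(lvls))
)
rownames(counts) <- lvls

# Linking every element of column j to every element of column j + 1
# contributes counts[a, j] * counts[b, j + 1] edges a -> b; summing over
# j is a single matrix product.
adj <- counts[, -k, drop = FALSE] %*% t(counts[, -1, drop = FALSE])
adj
```

For the sample data this should agree with the table above, for example `adj["1", "3"]` is 7 and `adj["22", "1"]` is 3.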

data

> dput(df)
structure(list(x1 = c(1L, 22L, 3L), x2 = c(1L, 22L, 3L), x3 = c(1L, 
22L, 3L), x4 = c(1L, 3L, 5L), x5 = c(1L, 3L, 5L), x6 = c(1L,
3L, 2L), x7 = 1:3, x8 = c(2L, 3L, 23L)), class = "data.frame", row.names = c(NA,
-3L))
烟织青萝梦 2025-02-20 08:03:10

You could do:

as.data.frame.matrix(xtabs(~factor(x1, unique(c(x1, values)))+values, cbind(df[1], stack(df[-1]))))
   1 2 3 5 22 23
1  6 1 0 0  0  0
22 0 1 4 0  2  0
3  0 1 3 2  0  1
5  0 0 0 0  0  0
2  0 0 0 0  0  0
23 0 0 0 0  0  0


xtabs(~x1+x, transform(reshape(df, names(df)[-1], dir='long', sep=''), x1 = factor(x1, unique(c(x,x1)))))
    x
x1   1 2 3 5 22 23
  1  6 1 0 0  0  0
  22 0 1 4 0  2  0
  3  0 1 3 2  0  1
  5  0 0 0 0  0  0
  2  0 0 0 0  0  0
  23 0 0 0 0  0  0

library(tidyverse)
df %>%
   mutate(x1 = factor(x1, unique(unlist(.)))) %>%
   pivot_longer(-x1) %>%
   xtabs(~x1+value,.) %>%
   as.data.frame.matrix()

   1 2 3 5 22 23
1  6 1 0 0  0  0
22 0 1 4 0  2  0
3  0 1 3 2  0  1
5  0 0 0 0  0  0
2  0 0 0 0  0  0
23 0 0 0 0  0  0
触ぅ动初心 2025-02-20 08:03:10

It seems that you may misunderstand how an adjacency matrix works.

The matrix contains Boolean values (true or false).

The nodes should be indexed 1, 2, 3, 4, ...

If there is a link from node 1 to node 2, then the cell in row 2, column 1 will be true.

Let's index your first two columns like this

1 4
2 5
3 6

So node 1 is linked to nodes 4, 5, and 6 (and likewise nodes 2 and 3),

and the adjacency matrix looks like this:

  1 2 3 4 5 6
1 
2 
3
4 1 1 1
5 1 1 1
6 1 1 1
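
The convention described above can be sketched in R as follows. This is only an illustration of the six-node example, using the answer's orientation (row = target node, column = source node); the names `n`, `edges` and `adj` are assumptions made for the sketch.

```r
n <- 6

# Boolean adjacency matrix: rows are targets, columns are sources
adj <- matrix(
  FALSE, nrow = n, ncol = n,
  dimnames = list(to = as.character(1:n), from = as.character(1:n))
)

# Nodes 1..3 (first column of the data) each link to nodes 4..6 (second column)
edges <- expand.grid(from = 1:3, to = 4:6)

# A link from node a to node b sets the cell in row b, column a
adj[cbind(edges$to, edges$from)] <- TRUE

adj * 1L  # display as 0/1 instead of TRUE/FALSE
```

Many libraries use the transposed layout (row = source, column = target), so it is worth checking which orientation a given tool expects.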
薆情海 2025-02-20 08:03:10

Starting with the data frame from @ThomasisCoding's answer:

  df <- structure(list(x1 = c(1L, 22L, 3L), x2 = c(1L, 22L, 3L), x3 = c(1L,
  22L, 3L), x4 = c(1L, 3L, 5L), x5 = c(1L, 3L, 5L), x6 = c(1L,
  3L, 2L), x7 = 1:3, x8 = c(2L, 3L, 23L)), class = "data.frame", row.names = c(NA,
  -3L))

Alternative (I) combines all nodes without regard to time (x1, x2, ...).

m1 <- formatC(as.matrix(df), width = 2, format = "d", flag = "0")

Output.

      x1   x2   x3   x4   x5   x6   x7   x8  
 [1,] "01" "01" "01" "01" "01" "01" "01" "02"
 [2,] "22" "22" "22" "03" "03" "03" "02" "03"
 [3,] "03" "03" "03" "05" "05" "02" "03" "23"

Alternative (II) takes into account the time of observation.

m2 <- 
  rbind(
  c1=paste(m1[1,], names(df), sep="_"),
  c2=paste(m1[2,], names(df), sep="_"),
  c3=paste(m1[3,], names(df), sep="_")
  )

Output.

  [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]   
c1 "01_x1" "01_x2" "01_x3" "01_x4" "01_x5" "01_x6" "01_x7" "02_x8"
c2 "22_x1" "22_x2" "22_x3" "03_x4" "03_x5" "03_x6" "02_x7" "03_x8"
c3 "03_x1" "03_x2" "03_x3" "05_x4" "05_x5" "02_x6" "03_x7" "23_x8"
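
The `m2` construction above hard-codes three rows. A possible generalisation for any number of rows is sketched below; it is an assumption about how one might extend the answer, and `m2_general` is a name introduced here.

```r
# Sample data from the question
df <- data.frame(
  x1 = c(1L, 22L, 3L), x2 = c(1L, 22L, 3L), x3 = c(1L, 22L, 3L),
  x4 = c(1L, 3L, 5L),  x5 = c(1L, 3L, 5L),  x6 = c(1L, 3L, 2L),
  x7 = 1:3,            x8 = c(2L, 3L, 23L)
)

# Zero-padded labels, as in alternative (I)
m1 <- formatC(as.matrix(df), width = 2, format = "d", flag = "0")

# col(m1) holds the column index of every cell, so indexing the column
# names with it gives the matching time label for each cell.
m2_general <- matrix(
  paste(m1, colnames(m1)[col(m1)], sep = "_"),
  nrow = nrow(m1),
  dimnames = list(NULL, colnames(m1))
)
m2_general
```

Apart from row and column names, this gives the same labels as `m2` and can be assigned to `mc` in the loop that follows.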

expand.grid() combines every value at x(i) with every value at x(i+1), for i = 1 through 7.
Choose m1 or m2 depending on the scenario at hand.

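library(igraph)

mc <- m1  # or m2 to keep the time stamps
mmm <- c()
for (i in seq(ncol(mc) - 1)) {
  # all combinations of values in column i and column i + 1
  mmm <- rbind(mmm, expand.grid(x = mc[, i], y = mc[, i + 1]))
}
table(mmm)
# directed = TRUE, so that g[] is asymmetric like the outputs shown below
g <- graph_from_data_frame(mmm, directed = TRUE)
plot(g)
g[]  # adjacency matrix as a sparse matrix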

Output (I). Check this output with table(mmm).

6 x 6 sparse Matrix of class "dgCMatrix"
   01 22 03 05 02 23
01  6  2  7  2  3  1
22  3  2  3  1  .  .
03  6  2  7  2  3  1
05  2  .  2  1  1  .
02  1  .  2  .  2  1
23  .  .  .  .  .  .

Output (II).

24 x 24 sparse Matrix of class "dgCMatrix"
   [[ suppressing 24 column names ‘01_x1’, ‘22_x1’, ‘03_x1’ ... ]]
                                                     
01_x1 . . . 1 1 1 . . . . . . . . . . . . . . . . . .
22_x1 . . . 1 1 1 . . . . . . . . . . . . . . . . . .
03_x1 . . . 1 1 1 . . . . . . . . . . . . . . . . . .
01_x2 . . . . . . 1 1 1 . . . . . . . . . . . . . . .
22_x2 . . . . . . 1 1 1 . . . . . . . . . . . . . . .
03_x2 . . . . . . 1 1 1 . . . . . . . . . . . . . . .
01_x3 . . . . . . . . . 1 1 1 . . . . . . . . . . . .
22_x3 . . . . . . . . . 1 1 1 . . . . . . . . . . . .
03_x3 . . . . . . . . . 1 1 1 . . . . . . . . . . . .
01_x4 . . . . . . . . . . . . 1 1 1 . . . . . . . . .
03_x4 . . . . . . . . . . . . 1 1 1 . . . . . . . . .
05_x4 . . . . . . . . . . . . 1 1 1 . . . . . . . . .
01_x5 . . . . . . . . . . . . . . . 1 1 1 . . . . . .
03_x5 . . . . . . . . . . . . . . . 1 1 1 . . . . . .
05_x5 . . . . . . . . . . . . . . . 1 1 1 . . . . . .
01_x6 . . . . . . . . . . . . . . . . . . 1 1 1 . . .
03_x6 . . . . . . . . . . . . . . . . . . 1 1 1 . . .
02_x6 . . . . . . . . . . . . . . . . . . 1 1 1 . . .
01_x7 . . . . . . . . . . . . . . . . . . . . . 1 1 1
02_x7 . . . . . . . . . . . . . . . . . . . . . 1 1 1
03_x7 . . . . . . . . . . . . . . . . . . . . . 1 1 1
02_x8 . . . . . . . . . . . . . . . . . . . . . . . .
03_x8 . . . . . . . . . . . . . . . . . . . . . . . .
23_x8 . . . . . . . . . . . . . . . . . . . . . . . .