Isolating a "branch" in a Sankey diagram built with networkD3

Posted on 2025-01-13 23:57:27

I am using sankeyNetwork() from the networkD3 package to visualize some data. I was wondering if there's a way to "isolate" a branch from start to finish, ignoring the irrelevant links.

Example: I've got this: [image: SankeyGot]

And I want to extract this: [image: SankeyWant]

Reproducible example:

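library(dplyr)      # tibble() and %>%
library(stringr)    # provides the `words` vector
library(networkD3)

set.seed(9)

df <- tibble(
  source = sample(stringr::words, 5) %>% rep(2),
  target = c(sample(stringr::words, 7), source[1:3]),
  values = rnorm(10, 10, 7) %>% round(0) %>% abs())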

nodes <- data.frame(names = unique(c(df$source, df$target)))

links <- tibble(
  source = match(
    df$source, nodes$names) -1,
  target = match(
    df$target, nodes$names) -1,
  value = df$values
  )

sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
              Target = "target", Value = "value", NodeID = "names",
              iterations = 64, sinksRight = FALSE, fontSize = 14)

I'd like to be able to pick a single node, "name" for example, and get everything that connects to it at all levels upstream and downstream. How would I go about doing this?


Comments (2)

ι不睡觉的鱼゛ 2025-01-20 23:57:27

Calculating the paths from a node in a graph is non-trivial, but the igraph package can help with its all_simple_paths() function. However, heed this warning in the help file:

Note that potentially there are exponentially many paths between two
vertices of a graph, and you may run out of memory when using this
function, if your graph is lattice-like.

(I don't know what your words vector is, so I recreated the links data.frame manually)

library(dplyr)
library(networkD3)

set.seed(9)

df <- read.csv(header = TRUE, text = "
source,target
summer,obvious
summer,structure
however,either
however,match
obvious,about
obvious,non
either,contract
either,produce
contract,paint
contract,name
")
df$values <- rnorm(10, 10, 7) %>% round(0) %>% abs()


# use graph to calculate the paths from a node
library(igraph)

graph <- graph_from_data_frame(df)

start_node <- "name"

# get nodes along a uni-directional path going IN to the start_node
connected_nodes_in <- 
  all_simple_paths(graph, from = start_node, mode = "in") %>% 
  unlist() %>% 
  names() %>% 
  unique()

# get nodes along a uni-directional path going OUT of the start_node
connected_nodes_out <- 
  all_simple_paths(graph, from = start_node, mode = "out") %>% 
  unlist() %>% 
  names() %>% 
  unique()

# combine them
connected_nodes <- unique(c(connected_nodes_in, connected_nodes_out))

# filter your data frame so it only includes links/edges that start and
# end at connected nodes
df <- df %>% filter(source %in% connected_nodes & target %in% connected_nodes)



nodes <- data.frame(names = unique(c(df$source, df$target)))

links <- tibble(
  source = match(
    df$source, nodes$names) -1,
  target = match(
    df$target, nodes$names) -1,
  value = df$values
)

sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
              Target = "target", Value = "value", NodeID = "names",
              iterations = 64, sinksRight = FALSE, fontSize = 14)
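
Since only the set of connected nodes is needed here, not the individual paths, one possible way around the memory warning quoted above is igraph's subcomponent(), which collects every vertex reachable from a given vertex in a given direction. A minimal sketch under that assumption, reusing the same edge list; branch_df is a name introduced here purely for illustration:

```r
library(igraph)

# same edge list as in the answer above
df <- read.csv(header = TRUE, text = "
source,target
summer,obvious
summer,structure
however,either
however,match
obvious,about
obvious,non
either,contract
either,produce
contract,paint
contract,name
")

graph <- graph_from_data_frame(df)
start_node <- "name"

# vertices that can reach start_node (upstream) and vertices reachable
# from it (downstream); each result includes start_node itself
upstream   <- as_ids(subcomponent(graph, start_node, mode = "in"))
downstream <- as_ids(subcomponent(graph, start_node, mode = "out"))
connected_nodes <- unique(c(upstream, downstream))

# keep only the links whose two ends are both on the branch
# (branch_df is a hypothetical name, not part of the original answer)
branch_df <- df[df$source %in% connected_nodes &
                df$target %in% connected_nodes, ]
branch_df
```

The filtered branch_df can then go through the same nodes/links construction and sankeyNetwork() call as before. Because subcomponent() only walks the reachable vertices rather than enumerating paths, it should avoid the blow-up on lattice-like graphs, though that has not been benchmarked here.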


短暂陪伴 2025-01-20 23:57:27

If you assign the result of sankeyNetwork() to an object, you can use str(object) to see that it is a list with an element called x, which holds your input data (the links are in x$links).

list_sankey <- sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
                             Target = "target", Value = "value", NodeID = "names",
                             iterations = 64, sinksRight = FALSE, fontSize = 14)

str(list_sankey)

You can then filter x$links so that it contains only your desired source and target nodes (the indices are zero-based positions in the nodes table):

list_sankey_filter <- list_sankey

list_sankey_filter$x$links <- list_sankey_filter$x$links %>% filter(source %in% c(4, 2, 0), target %in% c(4, 2, 0, 10))

Printing list_sankey_filter then renders the filtered diagram.
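
The indices in the filter above are hard-coded. A hedged sketch of how they could be derived from node names instead, in base R: the small nodes and links tables below are made up for illustration, and keep_names stands in for whatever set of branch nodes you have decided to keep.

```r
# illustrative data only: node names and zero-based link indices,
# laid out the same way as the question's nodes/links tables
nodes <- data.frame(names = c("however", "either", "match", "contract", "name"))
links <- data.frame(
  source = c(0, 0, 1, 3),
  target = c(1, 2, 3, 4),
  value  = c(4, 2, 3, 1)
)

# hypothetical selection of nodes that make up the branch
keep_names <- c("however", "either", "contract", "name")

# translate names to the zero-based indices used in the links table
keep_idx <- match(keep_names, nodes$names) - 1

# keep only the links whose source and target are both selected
links_kept <- links[links$source %in% keep_idx & links$target %in% keep_idx, ]
links_kept
```

The same keep_idx vector could be used in place of c(4, 2, 0) and c(4, 2, 0, 10) in the filter() call on list_sankey_filter$x$links.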
