Ordering colors on the colored bars of a dendrogram in R
The vignette for the R package dendextend (https://cran.r-project.org/web/packages/dendextend/vignettes/dendextend.html) gives an example of using the colored_bars function with cutreeDynamic from package dynamicTreeCut as follows:
# let's get the clusters
library(dynamicTreeCut)
data(iris)
x <- iris[,-5] %>% as.matrix
hc <- x %>% dist %>% hclust
dend <- hc %>% as.dendrogram
# Find special clusters:
clusters <- cutreeDynamic(hc, distM = as.matrix(dist(x)), method = "tree")
# we need to sort them to the order of the dendrogram:
clusters <- clusters[order.dendrogram(dend)]
clusters_numbers <- unique(clusters) - (0 %in% clusters)
n_clusters <- length(clusters_numbers)
library(colorspace)
cols <- rainbow_hcl(n_clusters)
true_species_cols <- rainbow_hcl(3)[as.numeric(iris[,][order.dendrogram(dend),5])]
dend2 <- dend %>%
  branches_attr_by_clusters(clusters, values = cols) %>%
  color_labels(col = true_species_cols)
plot(dend2)
clusters <- factor(clusters)
levels(clusters)[-1] <- cols[-5][c(1,4,2,3)]
# Get the clusters to have proper colors.
# fix the order of the colors to match the branches.
colored_bars(clusters, dend, sort_by_labels_order = FALSE)
The following line reorders the colors to match the branches:
levels(clusters)[-1] <- cols[-5][c(1,4,2,3)]
I wish to apply this method to my own data which has many more clusters, but I am unclear on how the revised ordering of the colors was determined. This example uses a custom ordering for the iris data. Can anyone explain how this order was determined and is there a way to automate this?
Comments (1)
To start, the example code above that uses data(iris) is missing two required packages: library(dplyr), which provides the pipe operator %>%, and library(dendextend), which provides color_labels() for the label colors.
To answer your question, the solution lies in this line of the code:
levels(clusters)[-1] <- cols[-5][c(1,4,2,3)]
As you mention, this ordering is specific to this dataset, and I do not know why the authors picked this particular order. If you do not want to set the order yourself and would rather have R handle it automatically, set sort_by_labels_order = TRUE in the colored_bars() call. Here it is set to FALSE because the authors use a custom order. With TRUE, quoting the R documentation directly, "the colors vector/matrix should be provided in the order of the original data order (and it will be re-ordered automatically to the order of the dendrogram)". For more information, see
?colored_bars
which describes how the function behaves when this argument is set to FALSE or TRUE.
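One possible way to avoid hand-picking the permutation is to derive the bar colors from the order in which the clusters first appear along the dendrogram. The sketch below rests on an assumption that should be checked against ?branches_attr_by_clusters: that the i-th entry of values is given to the i-th distinct non-zero cluster encountered from left to right. The helper cluster_bar_colors() and the toy vectors are hypothetical names used only for illustration; they are not part of dendextend.

```r
# Hypothetical helper (not part of dendextend): assign palette colors to
# cluster ids by order of first appearance along the dendrogram.
# `clusters_in_dend_order` is assumed to be already sorted with
# order.dendrogram(dend); the id 0 is treated as "unassigned".
cluster_bar_colors <- function(clusters_in_dend_order, palette,
                               na_color = "grey80") {
  ids <- unique(clusters_in_dend_order)  # ids in order of first appearance
  ids <- ids[ids != 0]                   # drop the "unassigned" id
  stopifnot(length(palette) >= length(ids))
  # match() gives each leaf the position of its cluster in `ids`;
  # leaves with id 0 get NA and fall back to `na_color`.
  bar_cols <- palette[match(clusters_in_dend_order, ids)]
  bar_cols[is.na(bar_cols)] <- na_color
  bar_cols
}

# Toy input standing in for clusters[order.dendrogram(dend)]
toy_clusters <- c(1, 1, 3, 3, 0, 4, 2, 2)
toy_palette  <- c("red", "green", "blue", "purple")
toy_bar_cols <- cluster_bar_colors(toy_clusters, toy_palette)
print(toy_bar_cols)
```

With real data, the result could be passed as colored_bars(cluster_bar_colors(clusters, cols), dend, sort_by_labels_order = FALSE), using the same cols that was given to branches_attr_by_clusters(). If the assumption above holds, this yields the same cluster-to-color mapping as the hard-coded c(1,4,2,3) for the iris example, where the clusters would appear in the order 1, 3, 4, 2; only the color of the unassigned leaves (id 0) depends on na_color.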