How do I properly color edges or draw rects in an R dendrogram?

Posted on 2024-07-16 22:22:45

I generated this dendrogram using R's hclust(), as.dendrogram() and plot.dendrogram() functions.

I used the dendrapply() function and a local function to color leaves, which is working fine.

I have results from a statistical test that indicate if a set of nodes (e.g. the cluster of "_+v\_stat5a\_01_" and "_+v\_stat5b\_01_" in the lower-right corner of the tree) are significant or important.

I also have a local function that I can use with dendrapply() that finds the exact node in my dendrogram which contains significant leaves.

I would like to either (following the example):

  1. Color the edges that join "_+v\_stat5a\_01_" and "_+v\_stat5b\_01_"; or,
  2. Draw a rect() around "_+v\_stat5a\_01_" and "_+v\_stat5b\_01_"

I have the following local function (the details of the "nodes-in-leafList-match-nodes-in-clusterList" condition aren't important; what matters is that it identifies the significant nodes):

markSignificantClusters <<- function (n) {
  if (!is.leaf(n)) {
     a <- attributes(n)
     leafList <- unlist(dendrapply(n, listLabels))
     for (clusterIndex in 1:length(significantClustersList[[1]])) {
       clusterList <- unlist(significantClustersList[[1]][clusterIndex])
       if (nodes-in-leafList-match-nodes-in-clusterList) {
          # I now have a node "n" that contains significant leaves, and
          # I'd like to use a dendrapply() call to another local function
          # which colors the edges that run down to the leaves; or, draw
          # a rect() around the leaves
       }
     }
  }
}

From within this if block, I have tried calling dendrapply(n, markEdges), but this did not work:

markEdges <<- function (n) {
  a <- attributes(n)
  attr(n, "edgePar") <- c(a$edgePar, list(lty=3, col="red"))
}

In my ideal example, the edges connecting "_+v\_stat5a\_01_" and "_+v\_stat5b\_01_" would be dashed and of a red color.
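One likely reason the dendrapply(n, markEdges) call has no visible effect is that dendrapply() builds a new tree from whatever FUN returns. markEdges as written ends with an assignment, so it returns the edgePar list rather than the node, and the result of the inner call is discarded anyway. Below is a minimal sketch of the edge-coloring idea. The toy data, the labels, the target vector and the leafLabels helper are all hypothetical stand-ins rather than part of the original code, and it assumes the significant leaves form exactly one subtree.

```r
# Toy one-dimensional data with hypothetical labels (an assumption, not the
# original scores); the two "stat5" rows end up in one cluster.
scores <- matrix(c(0, 1, 4, 5.5, 9, 11), ncol = 1,
                 dimnames = list(c("ctrl_01", "ctrl_02", "stat3_01",
                                   "stat3_02", "stat5a_01", "stat5b_01"), NULL))
dend <- as.dendrogram(hclust(dist(scores)))
target <- c("stat5a_01", "stat5b_01")   # hypothetical significant leaves

# Hypothetical helper: all leaf labels below a node.
leafLabels <- function(n) {
  if (is.leaf(n)) attr(n, "label") else unlist(lapply(n, leafLabels))
}

# Applied to every node. A node's edgePar styles the edge that runs into it
# from its parent, so nodes strictly inside the cluster get the red dashes.
markEdges <- function(n, target) {
  labs <- leafLabels(n)
  # Drop the length() test to also color the edge above the cluster's root.
  if (all(labs %in% target) && length(labs) < length(target)) {
    attr(n, "edgePar") <- c(attr(n, "edgePar"), list(lty = 3, col = "red"))
  }
  n  # return the (possibly modified) node so dendrapply() keeps it
}

dendRed <- dendrapply(dend, markEdges, target = target)
plot(dendRed, horiz = TRUE)
```

The key design choice is to do the marking in a single dendrapply() pass over the whole tree and plot the returned dendrogram, rather than modifying a subtree from inside another dendrapply() callback.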

I have also tried using rect.hclust() within this if block:

ma <- match(leafList, orderedLabels)  
rect.hclust(scoreClusterObj, h = a$height, x = c(min(ma), max(ma)), border = 2)

But the result does not work with horizontal dendrograms (i.e. dendrograms with horizontal labels). Here is an example (note the red stripe in the lower-right corner). Something is not correct about the dimensions of what rect.hclust() generates, and I don't know how it works, to be able to write my own version.

I appreciate any advice for getting edgePar or rect.hclust() to work properly, or to be able to write my own rect.hclust() equivalent.
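For reference, rect.hclust() is meant to be called after plotting an hclust object in the default vertical layout, with leaves along the x-axis, which may explain why its rectangles look wrong on a horizontal dendrogram. A minimal sketch of that baseline usage, on hypothetical toy data and labels (not the original scores):

```r
# Toy one-dimensional data with hypothetical labels (an assumption).
scores <- matrix(c(0, 1, 4, 5.5, 9, 11), ncol = 1,
                 dimnames = list(c("ctrl_01", "ctrl_02", "stat3_01",
                                   "stat3_02", "stat5a_01", "stat5b_01"), NULL))
hc <- hclust(dist(scores))

plot(hc)  # vertical layout: leaves along x, height along y

# Cut the tree into three clusters and box each one; the members of each
# boxed cluster are returned invisibly as a list.
clusters <- rect.hclust(hc, k = 3, border = 2)
```

Inspecting clusters is a quick way to check which leaves each rectangle is supposed to enclose before adapting the drawing code to a horizontal layout.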

UPDATE

Since asking this question, I used getAnywhere(rect.hclust) to get the code of the function that calculates the parameters and draws the rect object. I wrote a custom version of this function to handle horizontal and vertical leaves, and call it with dendrapply().

However, there is some kind of clipping effect that removes part of the rect. For horizontal leaves (leaves that are drawn on the right side of the tree), the rightmost edge of the rect either disappears or is thinner than the border width of the other three sides of the rect. For vertical leaves (leaves that are drawn on the bottom of the tree), the bottommost edge of the rect suffers the same display problem.

What I had done as a means of marking significant clusters is to reduce the width of the rect such that I render a vertical red stripe between the tips of the cluster edges and the (horizontal) leaf labels.

This eliminates the clipping issue, but introduces another problem, in that the space between the cluster edge tips and the leaf labels is only six or so pixels wide, which I don't have much control over. This limits the width of the vertical stripe.

The worse problem is that the x-coordinate that marks where the vertical stripe can fit between the two elements will change based on the width of the larger tree (par("usr")), which in turn depends on how the tree hierarchy ends up being structured.

I wrote a "correction" or, better termed, a hack to adjust this x value and the rect width for horizontal trees. It doesn't always work consistently, but for the trees I am making, it seems to keep from getting too close to (or overlapping) edges and labels.

Ultimately, a better fix would be to find out how to draw the rect so that there is no clipping. Or a consistent way to calculate the specific x position in between tree edges and labels for any given tree, so as to center and size the stripe properly.
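One thing that may be worth trying for the clipping: a rect() border lying exactly on the boundary of the plot region is partly clipped by default, and passing xpd = NA to rect() turns that clipping off. The sketch below assumes that this is the cause of the thin or missing edge. It uses hypothetical toy data and labels, and computes the rectangle directly in the coordinates of a horizontal dendrogram instead of going through rect.hclust().

```r
# Toy one-dimensional data with hypothetical labels (an assumption).
scores <- matrix(c(0, 1, 4, 5.5, 9, 11), ncol = 1,
                 dimnames = list(c("ctrl_01", "ctrl_02", "stat3_01",
                                   "stat3_02", "stat5a_01", "stat5b_01"), NULL))
hc <- hclust(dist(scores))
dend <- as.dendrogram(hc)
target <- c("stat5a_01", "stat5b_01")   # hypothetical significant leaves

# Leaf i (in plotting order) is drawn at coordinate i on the leaf axis.
pos <- match(target, labels(dend))
# Height at which the target leaves merge (assumes they form one subtree).
h <- max(as.matrix(cophenetic(hc))[target, target])

plot(dend, horiz = TRUE)
# With horiz = TRUE the x-axis carries height and the y-axis leaf position.
rect(xleft = h, ybottom = min(pos) - 0.4, xright = 0, ytop = max(pos) + 0.4,
     border = "red", lwd = 2, xpd = NA)  # xpd = NA: do not clip the border
```

Because the corners come from the merge height and the leaf positions, they do not depend on par("usr") or on how wide the rest of the tree is.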

I would also be very interested in a method for annotating edges with colors or line styles.

Comments (1)

爱格式化 2024-07-23 22:22:45

So you've actually asked about five questions (5 +/- 3). As for writing your own rect.hclust-like function, the source is in library/stats/R/identify.hclust.R if you want to look at it.

I took a quick glance at it myself and am not sure it does what I thought it did from reading your description: it seems to be drawing multiple rectangles. Also, the x selector appears to be hard-coded to segregate the tags horizontally (which isn't what you want, and there's no y).

I'll be back, but in the meantime you might (in addition to looking at the source) try calling rect.hclust several times with different border= colors and different h= values to see if a failure pattern emerges.

Update

I haven't had much luck poking at this either.

One possible kludge for the clipping would be to pad the labels with trailing spaces and then bring the edge of your rectangle in slightly (the idea being that just bringing the rectangle in would get it out of the clipping zone but overwrite the ends of the labels).

Another idea would be to fill the rectangle with a translucent (low alpha) color, making a shaded area rather than a bounding box.
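The shaded-area idea could look roughly like this. It is only a sketch on hypothetical toy data and labels, assuming a horizontal dendrogram and a graphics device that supports transparency; adjustcolor() is used to make the fill translucent.

```r
# Toy one-dimensional data with hypothetical labels (an assumption).
scores <- matrix(c(0, 1, 4, 5.5, 9, 11), ncol = 1,
                 dimnames = list(c("ctrl_01", "ctrl_02", "stat3_01",
                                   "stat3_02", "stat5a_01", "stat5b_01"), NULL))
hc <- hclust(dist(scores))
dend <- as.dendrogram(hc)
target <- c("stat5a_01", "stat5b_01")   # hypothetical significant leaves

pos <- match(target, labels(dend))                    # leaf positions
h <- max(as.matrix(cophenetic(hc))[target, target])   # cluster merge height
fill <- adjustcolor("red", alpha.f = 0.2)             # translucent red

plot(dend, horiz = TRUE)
# No border at all, so there is no border line left to be clipped.
rect(xleft = h, ybottom = min(pos) - 0.4, xright = 0, ytop = max(pos) + 0.4,
     col = fill, border = NA, xpd = NA)
```

Since the fill is translucent, the edges drawn underneath stay visible through the shaded area.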
