Sorting boxplots by median value

Posted on 2024-09-24 13:40:22

I'd like to use R to make a series of boxplots which are sorted by median value. Suppose then I execute:

boxplot(cost ~ type)

This would give me some boxplots where cost is shown on the y-axis and the type category is visible on the x-axis:

-----     -----
  |         |
 [ ]        |
  |        [ ]
  |         |
-----     -----
  A         B

However, what I'd like is for the boxplots to be sorted from highest to lowest median value. My suspicion is that I need to change the labels of the type (A or B) to indicate numerically which has the lowest and which the highest median value, but I wonder if there is a cleverer way to solve the problem.

Comments (3)

凉薄对峙 2024-10-01 13:40:22

Check out ?reorder. The example there seems to be what you want, but sorted in the opposite order. I changed count to -count in the first line below to sort in the order you want.

  bymedian <- with(InsectSprays, reorder(spray, -count, median))
  boxplot(count ~ bymedian, data = InsectSprays,
          xlab = "Type of spray", ylab = "Insect count",
          main = "InsectSprays data", varwidth = TRUE,
          col = "lightgray")
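
Applied to the cost ~ type setup from the question, the same idea might look like the sketch below. The data frame DF, its values, and the column name type_by_median are made up for illustration; only reorder() and boxplot() come from the answer above.

```r
# Hypothetical data standing in for the asker's cost and type columns
set.seed(1)
DF <- data.frame(
  type = factor(rep(c("A", "B", "C"), each = 20)),
  cost = c(rnorm(20, mean = 5), rnorm(20, mean = 10), rnorm(20, mean = 1))
)

# Negating cost makes reorder() sort the levels by descending median
DF$type_by_median <- with(DF, reorder(type, -cost, median))
levels(DF$type_by_median)   # the highest-median type comes first

boxplot(cost ~ type_by_median, data = DF,
        xlab = "type", ylab = "cost")
```

Storing the reordered factor in a new column leaves the original type column untouched, which helps if other plots still need the alphabetical order.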

满身野味 2024-10-01 13:40:22

Yes, that is the idea:

set.seed(42)                     # fix seed for reproducibility
# type must be a factor; since R 4.0, data.frame() no longer converts strings
DF <- data.frame(type = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
                 cost = rnorm(100))

boxplot(cost ~ type, data = DF)  # not ordered by median

# compute the order of the per-type medians and reassign the factor levels
# (ascending; use order(..., decreasing = TRUE) for highest to lowest)
oind <- order(as.numeric(by(DF$cost, DF$type, median)))
DF$type <- ordered(DF$type, levels = levels(DF$type)[oind])

boxplot(cost ~ type, data = DF)  # now it is ordered by median
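
The question asked for highest-to-lowest order, whereas order() sorts ascending by default. A minimal variant, assuming the same kind of data frame with a factor type column (the names meds, desc_levels and type_desc are illustrative), uses tapply() so the medians keep their level names and then sorts them in decreasing order:

```r
set.seed(42)
DF <- data.frame(type = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
                 cost = rnorm(100))

# Median cost per type, as a named vector
meds <- tapply(DF$cost, DF$type, median)

# Level names sorted from highest to lowest median
desc_levels <- names(sort(meds, decreasing = TRUE))
DF$type_desc <- factor(DF$type, levels = desc_levels)

boxplot(cost ~ type_desc, data = DF)  # boxes run from highest to lowest median
```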

花间憩 2024-10-01 13:40:22

Beware of missing values: you have to add na.rm = TRUE for this to work. Without it, any group that contains an NA gets an NA median and is not placed correctly. It took me hours to find that out.

  bymedian <- with(InsectSprays, reorder(spray, -count, median, na.rm = TRUE))
  boxplot(count ~ bymedian, data = InsectSprays,
          xlab = "Type of spray", ylab = "Insect count",
          main = "InsectSprays data", varwidth = TRUE,
          col = "lightgray")
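
To see why na.rm = TRUE matters, the sketch below uses a tiny made-up data set (the vectors g and x are hypothetical) in which one group contains a missing value. median() returns NA for that group unless na.rm = TRUE is passed, and reorder() forwards extra arguments to the summary function.

```r
g <- factor(c("A", "A", "A", "B", "B", "B", "C", "C", "C"))
x <- c(1, 2, 3, 10, NA, 12, 5, 6, 7)

tapply(x, g, median)                # the median for B is NA
tapply(x, g, median, na.rm = TRUE)  # A = 2, B = 11, C = 6

# Extra arguments to reorder() are passed on to median()
by_median <- reorder(g, -x, median, na.rm = TRUE)
levels(by_median)                   # "B" "C" "A"
```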