R - Ordering in box plots

Posted on 2024-10-04 10:08:40

I am trying to produce a series of box plots in R that are grouped by two factors. I've managed to make the plot, but I cannot get the boxes to appear in the correct order.

The data frame I am using looks like this:

Nitrogen    Species    Treatment
2           G          L
3           R          M
4           G          H
4           B          L
2           B          M
1           G          H

I tried:

boxplot(mydata$Nitrogen~mydata$Species*mydata$Treatment)

This ordered the boxes alphabetically (the first three were the "High" treatments, and within those three they were ordered alphabetically by species name).

I want the box plot ordered Low > Medium > High, and then within each of those groups G > R > B for the species.

So I tried using a factor in the formula:

f = ordered(interaction(mydata$Treatment, mydata$Species), 
            levels = c("L.G","L.R","L.B","M.G","M.R","M.B","H.G","H.R","H.B"))

then:

boxplot(mydata$Nitrogen~f)

However, the boxes are still showing up in the same order. The labels are now different, but the boxes have not moved.

I have pulled out each set of data and plotted them all together individually:

lg = mydata[mydata$Treatment == "L" & mydata$Species == "G", "Nitrogen"]
mg = mydata[mydata$Treatment == "M" & mydata$Species == "G", "Nitrogen"]
hg = mydata[mydata$Treatment == "H" & mydata$Species == "G", "Nitrogen"]
# etc. for the remaining Treatment/Species combinations

boxplot(lg, lr, lb, mg, mr, mb, hg, hr, hb)

This gives what I want, but I would prefer to do this in a more elegant way, so that I don't have to pull each group out individually for larger data sets.


Loadable data:

mydata <-
structure(list(Nitrogen = c(2L, 3L, 4L, 4L, 2L, 1L), Species = structure(c(2L, 
3L, 2L, 1L, 1L, 2L), .Label = c("B", "G", "R"), class = "factor"), 
    Treatment = structure(c(2L, 3L, 1L, 2L, 3L, 1L), .Label = c("H", 
    "L", "M"), class = "factor")), .Names = c("Nitrogen", "Species", 
"Treatment"), class = "data.frame", row.names = c(NA, -6L))

Comments (2)

泪意 2024-10-11 10:08:40

The following commands will create the ordering you need by rebuilding the Treatment and Species factors, with explicit manual ordering of the levels:

mydata$Treatment = factor(mydata$Treatment, levels = c("L", "M", "H"))

mydata$Species = factor(mydata$Species, levels = c("G", "R", "B"))
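
To make this easy to try end to end, here is a minimal sketch that rebuilds the six-row example from the question, applies the two factor() calls, and checks the resulting box order. The data frame is retyped from the table in the question rather than loaded from the original structure(), and the variable name bp is only illustrative.

```r
# Rebuild the example data; factor() without 'levels' sorts levels alphabetically.
mydata <- data.frame(
  Nitrogen  = c(2, 3, 4, 4, 2, 1),
  Species   = factor(c("G", "R", "G", "B", "B", "G")),
  Treatment = factor(c("L", "M", "H", "L", "M", "H"))
)

# Re-create both factors with the levels in the desired display order.
mydata$Treatment <- factor(mydata$Treatment, levels = c("L", "M", "H"))
mydata$Species   <- factor(mydata$Species,   levels = c("G", "R", "B"))

# plot = FALSE returns the box statistics (including group names) without drawing.
bp <- boxplot(Nitrogen ~ Species * Treatment, data = mydata, plot = FALSE)
print(bp$names)
# "G.L" "R.L" "B.L" "G.M" "R.M" "B.M" "G.H" "R.H" "B.H"

# Draw the actual plot; combinations with no data leave an empty slot.
boxplot(Nitrogen ~ Species * Treatment, data = mydata)
```

Because the tiny example has no rows for some combinations (for instance R under L), a few positions stay empty; with a full data set every slot would hold a box.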

Edit 1: oops, I had set it to HML instead of LMH. Fixed.

Edit 2: what factor(X, Y) does:

If you run factor(X, Y) on an existing factor, it uses the order of the values in Y to enumerate the values present in the factor X. Here are some examples with your data.

> mydata$Treatment
[1] L M H L M H
Levels: H L M
> as.integer(mydata$Treatment)
[1] 2 3 1 2 3 1
> factor(mydata$Treatment,c("L","M","H"))
[1] L M H L M H                               <-- not changed
Levels: L M H                                 <-- changed
> as.integer(factor(mydata$Treatment,c("L","M","H")))
[1] 1 2 3 1 2 3                               <-- changed

It does NOT change what the factor looks like at first glance, but it does change how the data is stored.

What's important here is that many plot functions will plot the lowest enumeration leftmost, followed by the next, etc.
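
As a rough illustration of that point: the formula interface of boxplot() groups the response by the factor, much like split() does, so the group order follows the level order rather than the row order. The vectors below are retyped from the question's table, and the names treatment, nitrogen and releveled are just for this sketch.

```r
# Treatment and Nitrogen columns from the question, as plain vectors.
treatment <- factor(c("L", "M", "H", "L", "M", "H"))
nitrogen  <- c(2, 3, 4, 4, 2, 1)

# Default (alphabetical) levels: groups come out as H, L, M.
print(names(split(nitrogen, treatment)))
# "H" "L" "M"

# Same values, levels listed in the desired order: groups come out as L, M, H.
releveled <- factor(treatment, levels = c("L", "M", "H"))
print(names(split(nitrogen, releveled)))
# "L" "M" "H"

# The visible values are unchanged; only the underlying integer codes differ.
print(identical(as.character(treatment), as.character(releveled)))
# TRUE
```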

If you create factors simply using factor(X), the enumeration is usually based on the alphabetical order of the factor levels (e.g. "H", "L", "M"). If your labels have a conventional ordering that differs from alphabetical (e.g. "H", "M", "L"), this can make your graphs seem strange.

At first glance, it may seem that the problem is due to the order of the rows in the data frame, i.e. that if only we could place all the "H" rows at the top and the "L" rows at the bottom, it would work. It would not. But if you want your labels to appear in the same order as their first occurrence in the data, you can use this form:

 mydata$Treatment = factor(mydata$Treatment, unique(mydata$Treatment))
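
A small sketch of both claims, again using values retyped from the question: sorting the rows leaves the levels untouched, while factor(x, unique(x)) takes the levels in order of first appearance. The names sorted and by_appearance are illustrative only.

```r
mydata <- data.frame(
  Nitrogen  = c(2, 3, 4, 4, 2, 1),
  Treatment = factor(c("L", "M", "H", "L", "M", "H"))
)

# Reordering the rows does not touch the levels, so the plot order stays the same.
sorted <- mydata[order(mydata$Nitrogen), ]
print(levels(sorted$Treatment))
# "H" "L" "M"

# Levels taken in order of first appearance in the data: L, then M, then H.
by_appearance <- factor(mydata$Treatment, unique(mydata$Treatment))
print(levels(by_appearance))
# "L" "M" "H"
```

This only gives the desired order when the data happen to list the groups in that order, so spelling the levels out explicitly is usually the safer choice.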
雨后彩虹 2024-10-11 10:08:40

This earlier StackOverflow question shows how to reorder a boxplot based on a numerical value; what you need here is probably just a switch from factor to the related type ordered. But it is hard to say, as we do not have your data and you didn't provide a reproducible example.

Edit: Using the dataset you posted, stored in the variable md, and relying on the solution I pointed to earlier, we get

R> md$Species <- ordered(md$Species, levels=c("G", "R", "B"))
R> md$Treatment <- ordered(md$Treatment, levels=c("L", "M", "H"))
R> with(md, boxplot(Nitrogen ~ Species * Treatment))

which creates the chart you were looking to create.

This is also equivalent to the other solution presented here.
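
Two details may be worth checking when adapting this, sketched below under the assumption that md holds the question's six rows (retyped here). First, in Species * Treatment the first factor varies fastest, which is what puts G, R, B inside each treatment. Second, ordered() additionally allows comparisons between levels, which a plain factor does not; for the plot order itself only the level order matters.

```r
md <- data.frame(
  Nitrogen  = c(2, 3, 4, 4, 2, 1),
  Species   = c("G", "R", "G", "B", "B", "G"),
  Treatment = c("L", "M", "H", "L", "M", "H")
)
md$Species   <- ordered(md$Species,   levels = c("G", "R", "B"))
md$Treatment <- ordered(md$Treatment, levels = c("L", "M", "H"))

# First factor varies fastest: species cycle within each treatment.
species_first <- levels(with(md, interaction(Species, Treatment)))
print(species_first)
# "G.L" "R.L" "B.L" "G.M" "R.M" "B.M" "G.H" "R.H" "B.H"

# Swapping the terms would instead cycle treatments within each species.
treatment_first <- levels(with(md, interaction(Treatment, Species)))
print(treatment_first)
# "L.G" "M.G" "H.G" "L.R" "M.R" "H.R" "L.B" "M.B" "H.B"

# Ordered factors support comparisons against a level.
below_high <- md$Treatment < "H"
print(below_high)
# TRUE TRUE FALSE TRUE TRUE FALSE
```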
