Plotting two vectors of data on a ggplot2 scatter plot using R

Posted on 2024-08-02 18:26:19

I've been experimenting with both ggplot2 and lattice to graph panels of data. I'm having a little trouble wrapping my mind around the ggplot2 model. In particular, how do I plot a scatter plot with two sets of data on each panel:

In lattice I could do this:

xyplot(Predicted_value + Actual_value ~ x_value | State_CD, data=dd)

and that would give me a panel for each State_CD with both columns plotted.

I can do one column with ggplot2:

# the trailing + keeps the expression going; opts() is now theme()
pg <- ggplot(dd, aes(x_value, Predicted_value)) + geom_point(shape = 2) +
      facet_wrap(~ State_CD) + theme(aspect.ratio = 1)
print(pg)

What I can't grok is how to add Actual_value to the ggplot above.

EDIT: Hadley pointed out that this really would be easier with a reproducible example. Here's code that seems to work. Is there a better or more concise way to do this with ggplot? Why is the syntax for adding another set of points to ggplot so different from adding the first set of data?

library(lattice)
library(ggplot2)

# make some example data: 72 rows of random values, 24 each for states A, B and C
dd <- data.frame(matrix(rnorm(216), 72, 3), c(rep("A", 24), rep("B", 24), rep("C", 24)))
colnames(dd) <- c("Predicted_value", "Actual_value", "x_value", "State_CD")

#plot with lattice
xyplot(Predicted_value + Actual_value ~ x_value | State_CD, data=dd)

# plot with ggplot (theme() replaces the old opts())
pg <- ggplot(dd, aes(x_value, Predicted_value)) + geom_point(shape = 2) + facet_wrap(~ State_CD) + theme(aspect.ratio = 1)
print(pg)

pg + geom_point(data=dd,aes(x_value, Actual_value,group=State_CD), colour="green")

The lattice output looks like this:
[image: lattice output (source: cerebralmastication.com)]

and the ggplot output looks like this:
[image: ggplot output (source: cerebralmastication.com)]

Comments (4)

疧_╮線 2024-08-09 18:26:19

Just following up on what Ian suggested: for ggplot2 you really want all the y-axis stuff in one column with another column as a factor indicating how you want to decorate it. It is easy to do this with melt. To wit:

library(reshape2)  # provides melt()
qplot(x_value, value, 
      data = melt(dd, measure.vars=c("Predicted_value", "Actual_value")), 
      colour=variable) + facet_wrap(~State_CD)

Here's what it looks like for me:
[image: faceted scatter plot coloured by variable (source: princeton.edu)]

To get an idea of what melt is actually doing, here's the head:

> head(melt(dd, measure.vars=c("Predicted_value", "Actual_value")))
     x_value State_CD        variable      value
1  1.2898779        A Predicted_value  1.0913712
2  0.1077710        A Predicted_value -2.2337188
3 -0.9430190        A Predicted_value  1.1409515
4  0.3698614        A Predicted_value -1.8260033
5 -0.3949606        A Predicted_value -0.3102753
6 -0.1275037        A Predicted_value -1.2945864

You see, it "melts" Predicted_value and Actual_value into one column called value and adds another column called variable letting you know what column it originally came from.
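
Recent ggplot2 releases mark qplot() as deprecated, so the same idea can also be written with ggplot() directly. The following is a minimal sketch, not the answerer's own code: it assumes the reshape2 package is installed, builds 72 rows of example data shaped like the question's, and the names dd_long and p are picked here purely for illustration.

```r
library(ggplot2)
library(reshape2)  # provides melt()

set.seed(1)  # arbitrary seed so the example is repeatable

# Example data in the same shape as the question: 24 rows per state
dd <- data.frame(
  Predicted_value = rnorm(72),
  Actual_value    = rnorm(72),
  x_value         = rnorm(72),
  State_CD        = rep(c("A", "B", "C"), each = 24)
)

# Stack the two y columns into `value`, labelled by `variable`
dd_long <- melt(dd, measure.vars = c("Predicted_value", "Actual_value"))

# A single geom_point() layer; colour is mapped from `variable`
p <- ggplot(dd_long, aes(x_value, value, colour = variable)) +
  geom_point() +
  facet_wrap(~ State_CD) +
  theme(aspect.ratio = 1)
print(p)
```

Because the series is now a column rather than a separate layer, adding a third measured column only means listing it in measure.vars; the plotting code stays the same.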

霞映澄塘 2024-08-09 18:26:19

Update: several years on now, I almost always use Jonathan's method (via the tidyr package) with ggplot2. My answer below works in a pinch, but gets tedious fast when you have 3+ variables.


I'm sure Hadley will have a better answer, but - the syntax is different because the ggplot(dd,aes()) syntax is (I think) primarily intended for plotting just one variable. For two, I would use:

ggplot() + 
geom_point(data=dd, aes(x_value, Actual_value, group=State_CD), colour="green") + 
geom_point(data=dd, aes(x_value, Predicted_value, group=State_CD), shape = 2) + 
facet_wrap(~ State_CD) + 
theme(aspect.ratio = 1)

Pulling the first set of points out of the ggplot() gives it the same syntax as the second. I find this easier to deal with because the syntax is the same and it emphasizes the "Grammar of Graphics" that is at the core of ggplot2.
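
For comparison, the tidyr route mentioned in the update might look like the sketch below. It assumes the tidyr package is installed and uses example data shaped like the question's. The names dd_long, variable and value, and the colour and shape values, are illustrative choices that mirror the layered version above; they are not prescribed by ggplot2 or tidyr.

```r
library(ggplot2)
library(tidyr)

set.seed(1)  # arbitrary seed so the example is repeatable

# Example data in the same shape as the question: 24 rows per state
dd <- data.frame(
  Predicted_value = rnorm(72),
  Actual_value    = rnorm(72),
  x_value         = rnorm(72),
  State_CD        = rep(c("A", "B", "C"), each = 24)
)

# Reshape to long form: one row per (observation, series) pair
dd_long <- pivot_longer(
  dd,
  cols      = c(Predicted_value, Actual_value),
  names_to  = "variable",
  values_to = "value"
)

# One layer; manual scales reproduce the green points and open triangles
p <- ggplot(dd_long, aes(x_value, value, colour = variable, shape = variable)) +
  geom_point() +
  scale_colour_manual(values = c(Actual_value = "green", Predicted_value = "black")) +
  scale_shape_manual(values = c(Actual_value = 19, Predicted_value = 2)) +
  facet_wrap(~ State_CD) +
  theme(aspect.ratio = 1)
print(p)
```

The trade-off is that styling moves out of the individual geom_point() calls and into the scales, which also gives a legend without extra work.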

惜醉颜 2024-08-09 18:26:19

You might just want to change the form of your data a little bit, so that you have one y-axis variable, with an additional factor variable indicating whether it is a predicted or actual value.

Is this something like what you are trying to do?

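library(ggplot2)
# long-form example data: `type` says which series each y_value belongs to
dd <- data.frame(type = rep(c("Predicted_value", "Actual_value"), 20), y_value = rnorm(40),
                 x_value = rnorm(40), State_CD = rnorm(40) > 0)
qplot(x_value, y_value, data = dd, colour = type, facets = . ~ State_CD)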
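
If the data already sits in the wide layout from the question, one way to get to this long form without extra packages is to stack the columns by hand. This is only a sketch under assumptions: the example data is generated here to match the question's column names, and y_cols and dd_long are names made up for illustration.

```r
set.seed(1)  # arbitrary seed so the example is repeatable

# Wide example data shaped like the question's: 24 rows per state
dd <- data.frame(
  Predicted_value = rnorm(72),
  Actual_value    = rnorm(72),
  x_value         = rnorm(72),
  State_CD        = rep(c("A", "B", "C"), each = 24)
)

y_cols <- c("Predicted_value", "Actual_value")

# Build one block of rows per y column, then bind the blocks together
dd_long <- do.call(rbind, lapply(y_cols, function(col) {
  data.frame(
    type     = col,
    y_value  = dd[[col]],
    x_value  = dd$x_value,
    State_CD = dd$State_CD
  )
}))

# Store the series label as a factor, in the order the columns were listed
dd_long$type <- factor(dd_long$type, levels = y_cols)
str(dd_long)
```

The result uses the same column names (type, y_value, x_value, State_CD) as the example above, so the same plotting call should apply to it with data = dd_long.
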
只怪假的太真实 2024-08-09 18:26:19

Well, after posting the question I ran across this R Help thread that may have helped me. It looks like I can do this:

 pg + geom_line(data=dd,aes(x_value, Actual_value,group=State_CD), colour="green") 

Is that a good way of doing things? It seems odd to me because adding the second item has a totally different syntax than the first.
