Plotting two vectors of data on a ggplot2 scatter plot using R
I've been experimenting with both ggplot2 and lattice to graph panels of data. I'm having a little trouble wrapping my mind around the ggplot2 model. In particular, how do I plot a scatter plot with two sets of data on each panel?
In lattice I could do this:
xyplot(Predicted_value + Actual_value ~ x_value | State_CD, data=dd)
and that would give me a panel for each State_CD, with both columns plotted in it.
I can do one column with ggplot2:
# the trailing + keeps the expression going; theme() sets the panel aspect ratio
pg <- ggplot(dd, aes(x_value, Predicted_value)) + geom_point(shape = 2) +
  facet_wrap(~ State_CD) + theme(aspect.ratio = 1)
print(pg)
What I can't grok is how to add Actual_value to the ggplot above.
EDIT: Hadley pointed out that this really would be easier with a reproducible example. Here's code that seems to work. Is there a better or more concise way to do this with ggplot? Why is the syntax for adding another set of points to ggplot so different from adding the first set of data?
library(lattice)
library(ggplot2)
#make some example data: 72 rows, so the matrix matches the 72 state labels
dd <- data.frame(matrix(rnorm(216), 72, 3), c(rep("A", 24), rep("B", 24), rep("C", 24)))
colnames(dd) <- c("Predicted_value", "Actual_value", "x_value", "State_CD")
#plot with lattice
xyplot(Predicted_value + Actual_value ~ x_value | State_CD, data=dd)
#plot with ggplot
pg <- ggplot(dd, aes(x_value, Predicted_value)) + geom_point(shape = 2) + facet_wrap(~ State_CD) + theme(aspect.ratio = 1)
print(pg)
pg + geom_point(data=dd,aes(x_value, Actual_value,group=State_CD), colour="green")
The lattice output looks like this:
(source: cerebralmastication.com)
and ggplot looks like this:
(source: cerebralmastication.com)
Just following up on what Ian suggested: for ggplot2 you really want all the y-axis values in one column, with another column as a factor indicating how you want to decorate them. It is easy to do this with melt. To wit:
Here's what it looks like for me:
(source: princeton.edu)
To get an idea of what melt is actually doing, here's the head:
You see, it "melts" Predicted_value and Actual_value into one column called value and adds another column called variable, letting you know which column each value originally came from.
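The code for the melt-based answer did not survive in this copy, so here is a minimal sketch of the reshaping it describes, not the answerer's original code. It assumes melt() comes from the reshape2 package (the answer does not name the package) and that dd has the four columns from the question; the name dd_long is made up for illustration.

```r
library(reshape2)  # assumption: reshape2 supplies melt()
library(ggplot2)

# Example data shaped like the question's dd (72 rows, three state groups)
set.seed(1)
dd <- data.frame(
  Predicted_value = rnorm(72),
  Actual_value    = rnorm(72),
  x_value         = rnorm(72),
  State_CD        = rep(c("A", "B", "C"), each = 24)
)

# Stack the two y columns into `value`; `variable` records the source column
dd_long <- melt(dd, id.vars = c("x_value", "State_CD"))
head(dd_long)

# A single point layer; colour separates predicted from actual values
p <- ggplot(dd_long, aes(x_value, value, colour = variable)) +
  geom_point() +
  facet_wrap(~ State_CD)
print(p)
```

With tidyr, `pivot_longer(dd, c(Predicted_value, Actual_value), names_to = "variable", values_to = "value")` should give the same long layout.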
Update: several years on now, I almost always use Jonathan's method (via the tidyr package) with ggplot2. My answer below works in a pinch, but gets tedious fast when you have 3+ variables.
I'm sure Hadley will have a better answer, but the syntax is different because the ggplot(dd, aes()) syntax is (I think) primarily intended for plotting just one variable. For two, I would use:
Pulling the first set of points out of ggplot() gives it the same syntax as the second. I find this easier to deal with because the syntax is the same and it emphasizes the "Grammar of Graphics" that is at the core of ggplot2.
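The code that followed "For two, I would use:" is missing from this copy. A sketch of the layered approach the answer describes might look like the following; it is a reconstruction under the assumption that dd has the question's four columns, not the answerer's exact code.

```r
library(ggplot2)

# Example data shaped like the question's dd
set.seed(1)
dd <- data.frame(
  Predicted_value = rnorm(72),
  Actual_value    = rnorm(72),
  x_value         = rnorm(72),
  State_CD        = rep(c("A", "B", "C"), each = 24)
)

# Only the data goes in ggplot(); each geom_point() layer carries its own aes()
p <- ggplot(dd) +
  geom_point(aes(x_value, Predicted_value), shape = 2) +
  geom_point(aes(x_value, Actual_value), colour = "green") +
  facet_wrap(~ State_CD) +
  theme(aspect.ratio = 1)
print(p)
```

Both layers now read the same way, which is the point the answer makes. The y-axis label is taken from the first layer, so adding `ylab("value")` may be worthwhile.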
You might just want to change the form of your data a little bit, so that you have one y-axis variable, with an additional factor variable indicating whether each value is predicted or actual.
Is this something like what you are trying to do?
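This answer's example code is also missing here. As a rough illustration of the data layout it suggests, the sketch below builds the long form with base R only; the names dd_long, y_value and type are hypothetical, chosen for this example.

```r
library(ggplot2)

# Example data shaped like the question's dd
set.seed(1)
dd <- data.frame(
  Predicted_value = rnorm(72),
  Actual_value    = rnorm(72),
  x_value         = rnorm(72),
  State_CD        = rep(c("A", "B", "C"), each = 24)
)

# One y column plus a factor saying whether each row is predicted or actual
dd_long <- data.frame(
  x_value  = rep(dd$x_value, 2),
  State_CD = rep(dd$State_CD, 2),
  y_value  = c(dd$Predicted_value, dd$Actual_value),
  type     = factor(rep(c("Predicted", "Actual"), each = nrow(dd)))
)

# The factor drives the colour, so one layer covers both sets of points
p <- ggplot(dd_long, aes(x_value, y_value, colour = type)) +
  geom_point() +
  facet_wrap(~ State_CD)
print(p)
```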
Well after posting the question I ran across this R Help thread that may have helped me. It looks like I can do this:
Is that a good way of doing things? It seems odd to me because adding the second item has a totally different syntax than the first.