Colouring a plot by factor in R

Posted 2024-12-09 13:24:43

I am making a scatter plot of two variables and would like to colour the points by a factor variable. Here is some reproducible code:

data <- iris
plot(data$Sepal.Length, data$Sepal.Width, col=data$Species)

This is all well and good, but how do I know which factor level has been given which colour?

Comments (6)

我们的影子 2024-12-16 13:24:43
data <- iris
plot(data$Sepal.Length, data$Sepal.Width, col = data$Species)
# Legend entries follow the factor's level order, which is also the colour order
legend(7, 4.3, levels(data$Species), col = seq_len(nlevels(data$Species)), pch = 1)

That should do it for you. But I prefer ggplot2 and would suggest it for better graphics in R.

挽梦忆笙歌 2024-12-16 13:24:43

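The palette() command tells you the colours and their order when col = somefactor. It can also be used to set the colours. (The output below is the default palette in R versions before 4.0.0; later versions ship a different default.)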

palette()
[1] "black"   "red"     "green3"  "blue"    "cyan"    "magenta" "yellow"  "gray"   

In order to see that in your graph you could use a legend.

legend('topright', legend = levels(iris$Species), col = 1:3, cex = 0.8, pch = 1)

You'll notice that I specified the legend colours with just 3 numbers. This works the same way as using a factor: each number indexes into the palette. I could also have used the factor originally used to colour the points, which would make everything flow together logically, but I wanted to show that you can use a variety of things.

You could also be specific about the colours. Try ?rainbow for starters and go from there. You can specify your own or have R do it for you. As long as you use the same method for both the points and the legend, you're OK.
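As a minimal sketch of the "same method for both" advice, the snippet below sets a three-colour palette with rainbow() and then reuses the same integer codes for the points and the legend. The variable names (old_palette, level_codes, level_colours) are only illustrative and not part of the answer above.

```r
# Save the current palette so it can be restored afterwards.
old_palette <- palette()

# One rainbow colour per factor level (3 for iris$Species).
palette(rainbow(nlevels(iris$Species)))

# Integer codes 1..3 index into the active palette.
level_codes <- seq_len(nlevels(iris$Species))

plot(iris$Sepal.Length, iris$Sepal.Width, col = iris$Species, pch = 1)
legend("topright", legend = levels(iris$Species),
       col = level_codes, pch = 1, cex = 0.8)

# Which colour each level received, as a named character vector.
level_colours <- setNames(palette()[level_codes], levels(iris$Species))
print(level_colours)

# Restore the palette that was active before.
palette(old_palette)
```

Printing level_colours answers the original question directly: the names are the factor levels and the values are the colours they were drawn in.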

岁月静好 2024-12-16 13:24:43

There are two ways that I know of to color plot points by factor and then also have a corresponding legend automatically generated. I'll give examples of both:

  1. Using ggplot2 (generally easier)
  2. Using R's built-in plotting functionality in combination with the colorRampPalette function (trickier, but many people prefer/need R's built-in plotting facilities)

For both examples, I will use the ggplot2 diamonds dataset. We'll be using the numeric columns diamonds$carat and diamonds$price, and the factor/categorical column diamonds$color. You can load the dataset with the following code if you have ggplot2 installed:

library(ggplot2)
data(diamonds)

Using ggplot2 and qplot

It's a one-liner. The key is to pass qplot the factor you want to colour by as the color argument. qplot will make a legend for you by default.

qplot(
  x = carat,
  y = price,
  data = diamonds,
  color = color # colour by the factor column named "color" (I know, confusing)
)

Your output should look like this:
[Figure: qplot output colored by factor diamonds$color]
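As far as I know, qplot() is marked as deprecated in recent ggplot2 releases, so here is a sketch of the same plot written with ggplot() and geom_point(). The object name p is just for illustration; the legend is still generated automatically from the colour mapping.

```r
library(ggplot2)

# Map the factor column to the colour aesthetic; ggplot2 builds the legend itself.
p <- ggplot(diamonds, aes(x = carat, y = price, colour = color)) +
  geom_point()

# Printing the plot object draws it (needed inside scripts and functions).
print(p)
```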

Using R's built in plot functionality

Using R's built-in plot functionality to get a plot colored by a factor and an associated legend is a 4-step process, and it's a little more technical than using ggplot2.

First, we will make a colorRampPalette function. colorRampPalette() returns a new function that generates a vector of colors. In the snippet below, calling color_palette_function(5) would return a vector of 5 colors on a scale from red to orange to blue:

color_palette_function <- colorRampPalette(
  colors = c("red", "orange", "blue"),
  space = "Lab" # interpolate in the CIE Lab color space
  )

Second, we need to make a vector of colors, with exactly one color per diamond color. This is the mapping we will use both to assign colors to individual plot points, and to create our legend.

num_colors <- nlevels(diamonds$color)
diamond_color_colors <- color_palette_function(num_colors)

Third, we create our plot. This is done just like any other plot you've likely done, except we index the vector of colors we made by the factor and pass the result as our col argument. As long as we always use this same vector, our mapping between plot colors and diamonds$color will be consistent across our R script.

plot(
  x = diamonds$carat,
  y = diamonds$price,
  xlab = "Carat",
  ylab = "Price",
  pch = 20, # solid dots increase the readability of this data plot
  col = diamond_color_colors[diamonds$color]
)

Fourth and finally, we add our legend so that someone reading our graph can clearly see the mapping between the plot point colors and the actual diamond colors.

legend(
  x ="topleft",
  legend = paste("Color", levels(diamonds$color)), # for readability of legend
  col = diamond_color_colors,
  pch = 19, # solid circle, slightly larger than pch = 20
  cex = .7 # scale the legend to look attractively sized
)

Your output should look like this:
[Figure: standard R plot output colored by factor diamonds$color]

Nifty, right?
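The four base-R steps could be wrapped up roughly as follows. This is only a sketch: plot_by_factor is a hypothetical helper name, not a base R function, and it uses a named colour vector so that the lookup goes by level name rather than by position.

```r
# Hypothetical helper (not part of base R): scatter plot coloured by a factor,
# with a matching legend. Returns the level-to-colour mapping invisibly.
plot_by_factor <- function(x, y, f, ramp = c("red", "orange", "blue"),
                           legend_pos = "topleft", ...) {
  f <- as.factor(f)
  make_colours <- colorRampPalette(ramp, space = "Lab")
  level_colours <- setNames(make_colours(nlevels(f)), levels(f))

  # Index by level name so the mapping does not depend on level order.
  plot(x, y, col = level_colours[as.character(f)], pch = 20, ...)
  legend(legend_pos, legend = names(level_colours),
         col = level_colours, pch = 20, cex = 0.7)

  invisible(level_colours)
}

# Usage with the built-in iris data, so ggplot2 is not required.
mapping <- plot_by_factor(iris$Sepal.Length, iris$Sepal.Width, iris$Species,
                          xlab = "Sepal length", ylab = "Sepal width")
print(mapping)
```

Because the points and the legend both read from the same named vector, they cannot drift out of sync.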

神魇的王 2024-12-16 13:24:43

Like Maiasaura, I prefer ggplot2. The transparent reference manual is one of the reasons.
However, this is one quick way to get it done.

require(ggplot2)
data(diamonds)
qplot(carat, price, data = diamonds, colour = color)
# example taken from Hadley's ggplot2 book

And because someone famous said that plot-related posts are not complete without the plot, here's the result:

[Figure: resulting qplot output]

Here are a couple of references:
The qplot.R example. Note that it uses basically the same diamonds dataset I use, but crops the data first to get better performance.

http://ggplot2.org/book/
the manual: http://docs.ggplot2.org/current/

放我走吧 2024-12-16 13:24:43

The col argument in the plot function assigns colors automatically to a vector of integers. If you convert iris$Species to numeric, you get a vector of 1s, 2s and 3s, so you can apply it like this:

plot(iris$Sepal.Length, iris$Sepal.Width, col=as.numeric(iris$Species))

Suppose you want red, blue and green instead of the default colors, then you can simply adjust it:

plot(iris$Sepal.Length, iris$Sepal.Width, col=c('red', 'blue', 'green')[as.numeric(iris$Species)])

You can probably see how to further modify the code above to get any unique combination of colors.
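To check which level ended up with which colour, a small sketch along these lines may help. The names codes and my_colours are illustrative; the cross-tabulation simply confirms the mapping, and the legend puts it on the plot.

```r
codes <- as.numeric(iris$Species)   # integer codes 1, 2, 3 in level order
my_colours <- c("red", "blue", "green")

# Cross-tabulate to confirm which level is drawn in which colour.
print(table(Species = iris$Species, Colour = my_colours[codes]))

plot(iris$Sepal.Length, iris$Sepal.Width, col = my_colours[codes])

# levels() and my_colours are in the same order, so the legend matches the points.
legend("topright", legend = levels(iris$Species), col = my_colours, pch = 1)
```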

征棹 2024-12-16 13:24:43

The lattice library is another good option. Here I've added a legend on the right side and jittered the points because some of them overlapped.

library(lattice)
xyplot(Sepal.Width ~ Sepal.Length, groups = Species, data = iris,
       auto.key = list(space = "right"), # legend on the right-hand side
       jitter.x = TRUE, jitter.y = TRUE) # jitter overlapping points

[Figure: example plot]
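If you want to pick the colours yourself in lattice, one option (a sketch, with assumed colours) is to pass them through par.settings using simpleTheme(), so that the points and the auto.key legend share the same settings.

```r
library(lattice)

# Assumed colours, one per level of iris$Species (in level order).
species_colours <- c("red", "blue", "darkgreen")

p <- xyplot(Sepal.Width ~ Sepal.Length, groups = Species, data = iris,
            par.settings = simpleTheme(col = species_colours, pch = 19),
            auto.key = list(space = "right"),
            jitter.x = TRUE, jitter.y = TRUE)

# Lattice plots are objects; print() draws them (needed inside scripts and functions).
print(p)
```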
