How do I assign a color to each class in a scatter plot in R?

Posted on 2024-12-05 18:24:26

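In a dataset, I want to take two attributes and create a supervised scatter plot. Does anyone know how to give a different color to each class?

I am trying to use col == c("red","blue","yellow") in the plot command, but I'm not sure it is right: if I include one more color, that color also shows up in the scatter plot even though I have only 3 classes.

Thanks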

执手闯天涯 2024-12-12 18:24:27

This article is old, but I spent a hot minute trying to figure this out so I figured I would post an updated response. My main source is this wonderful PowerPoint: http://www.lrdc.pitt.edu/maplelab/slides/14-Plotting.pdf. Okay, here's what I did:

In this example, my data set is called 'Data' and I was comparing 'Touch' data against 'Gaze' data. The subjects were divided into two groups: 'Red' and 'Blue'.

`plot(Data$Touch[Data$Category == "Blue"], Data$Gaze[Data$Category == "Blue"], main = "Touch v Gaze", xlab = "Touch (s)", ylab = "Gaze (s)", col = "blue", pch = 20)`

This creates a scatterplot of Touch v Gaze for my Blue group.

`par(new = TRUE)`

This tells R not to clear the current plot before the next plotting command, so when you run all the code together the second plot is drawn on top of the first.

`plot(Data$Touch[Data$Category == "Red"], Data$Gaze[Data$Category == "Red"], axes = FALSE, xlab = "", ylab = "", col = "red", pch = 2)`

This is the second plot. I found when I was coding these that R didn't just lay the data points over the Blue plot; it also drew the axes, axis titles, and main title again. To get rid of the annoying overlap, I used the `axes` argument to suppress the axes themselves and set the titles to be blank.

`legend(x = 60, y = 50, legend = c("Blue", "Red"), col = c("blue", "red"), pch = c(20, 2))`

This adds a pretty legend to round out the project.

This way may be a bit longer than the pretty ggplots but I did not want to learn something completely new today, hope this helps someone!
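
One caveat with the overlay approach: each plot() call computes its own axis ranges, so with par(new = TRUE) the two groups can end up drawn on different scales unless the limits are fixed. A minimal sketch of an alternative, assuming the same column names as in this answer (the `Data` frame below is made-up illustrative data, not the original dataset): set up the axes once from all the data, then add each group with points().

```r
# Made-up illustrative data; only the column names follow the answer above.
Data <- data.frame(
  Touch    = c(12, 30, 45, 22, 51, 38),
  Gaze     = c(20, 35, 41, 18, 48, 30),
  Category = c("Blue", "Red", "Blue", "Red", "Blue", "Red")
)

# Draw an empty plot whose axes span all of the data (type = "n" draws no points).
plot(Data$Touch, Data$Gaze, type = "n",
     main = "Touch v Gaze", xlab = "Touch (s)", ylab = "Gaze (s)")

# Add each group onto the same axes.
blue <- Data$Category == "Blue"
red  <- Data$Category == "Red"
points(Data$Touch[blue], Data$Gaze[blue], col = "blue", pch = 20)
points(Data$Touch[red],  Data$Gaze[red],  col = "red",  pch = 2)

legend("topleft", legend = c("Blue", "Red"),
       col = c("blue", "red"), pch = c(20, 2))
```

Because both groups share one coordinate system, there is no need to blank out the second set of axes and titles.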

迷迭香的记忆 2024-12-12 18:24:26

Here is a solution using traditional graphics (and Dirk's data):

> DF <- data.frame(x=1:10, y=rnorm(10)+5, z=sample(letters[1:3], 10, replace=TRUE), stringsAsFactors=TRUE)
> DF
    x        y z
1   1 6.628380 c
2   2 6.403279 b
3   3 6.708716 a
4   4 7.011677 c
5   5 6.363794 a
6   6 5.912945 b
7   7 2.996335 a
8   8 5.242786 c
9   9 4.455582 c
10 10 4.362427 a
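> with(DF, plot(x, y, col=c("red","blue","green")[z]))

This relies on DF$z being a factor: when a factor is used as a subscript, its underlying integer codes are used (1 for "a", 2 for "b", 3 for "c"), so each point gets the color at the position of its level. Since R 4.0.0, data.frame() keeps strings as character vectors by default, so z must be created as a factor explicitly (for example with stringsAsFactors = TRUE or factor()). The elements of the color vector then vary with z as follows: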

> c("red","blue","green")[DF$z]
 [1] "green" "blue"  "red"   "green" "red"   "blue"  "red"   "green" "green" "red"

You can add a legend using the legend function:

legend(x="topright", legend = levels(DF$z), col=c("red","blue","green"), pch=1)
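
If the class column might be a plain character vector rather than a factor, a named color vector avoids depending on the integer codes. The following is only a sketch under that assumption; `palette_by_class` and the data values are made up for illustration.

```r
# Made-up class labels, deliberately a plain character vector (not a factor).
z <- c("c", "b", "a", "c", "a")
y <- c(5.1, 6.4, 4.8, 7.0, 6.3)

# Hypothetical palette: names are the class labels, values are the colors.
palette_by_class <- c(a = "red", b = "blue", c = "green")

# Indexing by name picks the color for each observation;
# as.character() makes the same line work if z is a factor.
point_cols <- unname(palette_by_class[as.character(z)])
print(point_cols)

plot(seq_along(z), y, col = point_cols, pch = 19)
legend("topright", legend = names(palette_by_class),
       col = palette_by_class, pch = 19)
```

With this lookup the mapping from class to color is explicit, and the legend is built from the same vector, so the two cannot drift apart.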

变身佩奇 2024-12-12 18:24:26

Here is an example that I built based on this page.

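library(e1071); library(ggplot2)

# Fit an SVM classifier on the full iris data and predict the same rows
mysvm      <- svm(Species ~ ., iris)
Predicted  <- predict(mysvm, iris)

# Combine true and predicted labels, then plot from the combined data frame
mydf <- cbind(iris, Predicted)
ggplot(mydf, aes(Petal.Length, Petal.Width, colour = Species, shape = Predicted)) +
   geom_point()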

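In the resulting plot, color shows the true species and shape shows the predicted one, so you can easily spot the misclassified points: they are the ones whose shape does not match the shape of the rest of their color group.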

凉世弥音 2024-12-12 18:24:26

One way is to use the lattice package and xyplot():

R> library(lattice)
R> DF <- data.frame(x=1:10, y=rnorm(10)+5,
+                   z=sample(letters[1:3], 10, replace=TRUE),
+                   stringsAsFactors=TRUE)
R> DF
    x       y z
1   1 3.91191 c
2   2 4.57506 a
3   3 3.16771 b
4   4 5.37539 c
5   5 4.99113 c
6   6 5.41421 a
7   7 6.68071 b
8   8 5.58991 c
9   9 5.03851 a
10 10 4.59293 b
R> xyplot(y ~ x, data=DF, groups=z)

By giving explicit grouping information via variable z, you obtain different colors. You can specify colors etc, see the lattice documentation.

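Because z here is a factor, its levels map to integer codes (a = 1, b = 2, c = 3), which plot() interprets as indices into the current color palette, so you can also do

R> with(DF, plot(x, y, col=z))

but that is less transparent (to me, at least :) than xyplot() et al. It also needs z to be a factor; a plain character column would be read as color names instead.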
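
As a sketch of how the colors might be specified explicitly in lattice, one option is to pass them through `par.settings`, which also keeps the automatic key in step with the points. The seed, the `cols` vector, and the data below are illustrative assumptions, not part of the original answer.

```r
library(lattice)  # a recommended package that ships with standard R installations

set.seed(1)  # arbitrary seed so the made-up data is reproducible
DF <- data.frame(
  x = 1:10,
  y = rnorm(10) + 5,
  z = factor(sample(letters[1:3], 10, replace = TRUE), levels = letters[1:3])
)

cols <- c("red", "blue", "green")  # one color per level of z, in level order

p <- xyplot(y ~ x, data = DF, groups = z,
            par.settings = list(superpose.symbol = list(col = cols, pch = 19)),
            auto.key = list(columns = 3))
print(p)  # lattice objects are drawn when printed
```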

放手` 2024-12-12 18:24:26

Here is how I do it in 2018. Who knows, maybe an R newbie will see it one day and fall in love with ggplot2.

library(ggplot2)

ggplot(data = iris, aes(Petal.Length, Petal.Width, color = Species)) +
  geom_point() +
  scale_color_manual(values = c("setosa" = "red", "versicolor" = "blue", "virginica" = "yellow"))

半寸时光 2024-12-12 18:24:26

If you have the classes separated in a data frame or a matrix, then you can use matplot. For example, if we have

dat<-as.data.frame(cbind(c(1,2,5,7),c(2.1,4.2,-0.5,1),c(9,3,6,2.718)))

# matplot() sets up its own plot, so no plot.new()/plot.window() call is needed;
# each column is plotted against the row index 1..nrow(dat)
matplot(dat,col=c("red","blue","yellow"),pch=20)

Then you'll get a scatterplot where the first column of dat is plotted in red, the second in blue, and the third in yellow. Of course, if you want separate x and y values for your color classes, then you can have datx and daty, etc.
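
A minimal sketch of the separate x and y case, assuming made-up matrices named `datx` and `daty` with one column per class: matplot() pairs column j of the first argument with column j of the second.

```r
# Made-up data: one column per class, rows are observations.
datx <- cbind(c(1, 2, 3, 4), c(1.5, 2.5, 3.5, 4.5), c(0.5, 2, 3, 5))
daty <- cbind(c(1, 2, 5, 7), c(2.1, 4.2, -0.5, 1), c(9, 3, 6, 2.718))

# Column j of datx is plotted against column j of daty in the j-th color.
matplot(datx, daty, col = c("red", "blue", "yellow"), pch = 20,
        xlab = "x", ylab = "y")
```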

An alternate approach would be to tack on an extra column specifying what color you want (or keep a separate vector of colors, filling it iteratively with a for loop and some if branches). For example, this plots the same values in the same colors, although against a single running index (1 to 12) rather than 1 to 4 for each class:

dat<-data.frame(
    value=c(1,2,5,7,2.1,4.2,-0.5,1,9,3,6,2.718),
    color=c(rep("red",4),rep("blue",4),rep("yellow",4)),
    stringsAsFactors=FALSE)

# data.frame() keeps the numeric column numeric; building the columns with
# cbind() would coerce both to strings, and the values would then need as.numeric().
# stringsAsFactors=FALSE keeps the color names as strings in older versions of R too.
plot(dat$value,pch=20,col=dat$color)
