R 中带有凸多边形的主成分分析双标图

发布于 2024-12-24 17:27:08 字数 262 浏览 0 评论 0原文

我使用以下代码生成了附加的双标图：

dd = data.frame(x = runif(10), y=runif(10)) 
pcr = prcomp(~x + y, data=dd)
biplot(pcr)

这会生成一个双标图，显示 x 和 Y 轴以及 10 个数据点中的每一个。

假设 10 个数据点由 2 个不同的组组成（一组 5 个，另一组 5 个）。如何生成每个组周围具有最小凸多边形的双图，以显示 2 个组的划分？

原文

I’ve produced attached biplot using the following code:

dd = data.frame(x = runif(10), y=runif(10)) 
pcr = prcomp(~x + y, data=dd)
biplot(pcr)

This produces a biplot showing the axis for x and Y and each of the 10 data points.

Lets say that the 10 data points are made up of 2 different groups, (5 in one group, 5 in the other group). How can I produce a biplot with a Minimum Convex Polygon around each group, to show a division for the 2 groups?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

喵星人汪星人 2024-12-31 17:27:08

我研究了 stats:::biplot.default 和 stats:::biplot.prcomp ，我很接近你想要的。您可以修改此代码以满足您的需要。（我使用了 iris 数据集）

require(plyr)

data(iris)

pcr <- prcomp(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)

indiv <- data.frame(pcr$x[,1:2])

indiv$species <- iris$Species

column <- data.frame(pcr$rotation[ ,1:2])

n <- nrow(indiv)

eigenval <- pcr$sdev[1:2]

eigenval <- eigenval * sqrt(n)

indiv <- transform(indiv, pc1 = PC1 / eigenval[1], pc2  = PC2 / eigenval[2])

column <- transform(column, pc1 = PC1 * eigenval[1], pc2  = PC2 * eigenval[2])

### based on stats:::biplot.default

unsigned.range <- function(x) c(-abs(min(x, na.rm = TRUE)),  abs(max(x, na.rm = TRUE)))

rangx1 <- unsigned.range(indiv[, 4])
rangx2 <- unsigned.range(indiv[, 5])
rangy1 <- unsigned.range(column[, 3])
rangy2 <- unsigned.range(column[, 4])

mylim <- range(rangx1, rangx2)
ratio <- max(rangy1/rangx1, rangy2/rangx2)

nspecies <- table(iris$Species)

# compute the convex hull for each species
hull <- dlply(indiv[,1:3], .(species), chull)

# get points connected
hull <- llply(hull, function(x) c(x, x[1]))


plot(pc2 ~ pc1, data = indiv, cex = 0.5, col = c("blue", "yellow", "green")[iris$Species], xlim = mylim, ylim = mylim)

lines(indiv$pc1[hull$setosa], indiv$pc2[hull$setosa] , col = "blue")

lines(indiv$pc1[cumsum(nspecies)[1] + hull$versicolor], indiv$pc2[cumsum(nspecies)[1] + hull$versicolor], col = "yellow")

lines(indiv$pc1[cumsum(nspecies)[2] + hull$virginica],  indiv$pc2[cumsum(nspecies)[2] + hull$virginica], col = "green")

par(new = TRUE)

plot(pc1 ~ pc2, data = column, axes = FALSE, type = "n", xlim = mylim * ratio, ylim = mylim * ratio, xlab = "", ylab = "")

text(column$pc1, column$pc2, labels = rownames(column), cex = 0.5, col = "red")

arrows(0, 0, column$pc1 * 0.8, column$pc2 * 0.8, col = "red", length = 0.2)

axis(3, col = "red")

axis(4, col = "red")

I looked into stats:::biplot.default and stats:::biplot.prcomp and i'm close to what you want. You can modify this code to suit your need. ( I used the iris dataset )

require(plyr)

data(iris)

pcr <- prcomp(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)

indiv <- data.frame(pcr$x[,1:2])

indiv$species <- iris$Species

column <- data.frame(pcr$rotation[ ,1:2])

n <- nrow(indiv)

eigenval <- pcr$sdev[1:2]

eigenval <- eigenval * sqrt(n)

indiv <- transform(indiv, pc1 = PC1 / eigenval[1], pc2  = PC2 / eigenval[2])

column <- transform(column, pc1 = PC1 * eigenval[1], pc2  = PC2 * eigenval[2])

### based on stats:::biplot.default

unsigned.range <- function(x) c(-abs(min(x, na.rm = TRUE)),  abs(max(x, na.rm = TRUE)))

rangx1 <- unsigned.range(indiv[, 4])
rangx2 <- unsigned.range(indiv[, 5])
rangy1 <- unsigned.range(column[, 3])
rangy2 <- unsigned.range(column[, 4])

mylim <- range(rangx1, rangx2)
ratio <- max(rangy1/rangx1, rangy2/rangx2)

nspecies <- table(iris$Species)

# compute the convex hull for each species
hull <- dlply(indiv[,1:3], .(species), chull)

# get points connected
hull <- llply(hull, function(x) c(x, x[1]))


plot(pc2 ~ pc1, data = indiv, cex = 0.5, col = c("blue", "yellow", "green")[iris$Species], xlim = mylim, ylim = mylim)

lines(indiv$pc1[hull$setosa], indiv$pc2[hull$setosa] , col = "blue")

lines(indiv$pc1[cumsum(nspecies)[1] + hull$versicolor], indiv$pc2[cumsum(nspecies)[1] + hull$versicolor], col = "yellow")

lines(indiv$pc1[cumsum(nspecies)[2] + hull$virginica],  indiv$pc2[cumsum(nspecies)[2] + hull$virginica], col = "green")

par(new = TRUE)

plot(pc1 ~ pc2, data = column, axes = FALSE, type = "n", xlim = mylim * ratio, ylim = mylim * ratio, xlab = "", ylab = "")

text(column$pc1, column$pc2, labels = rownames(column), cex = 0.5, col = "red")

arrows(0, 0, column$pc1 * 0.8, column$pc2 * 0.8, col = "red", length = 0.2)

axis(3, col = "red")

axis(4, col = "red")

回复收藏 0 原文

~没有更多了~