Generating multi-dimensional data

Posted on 2024-11-15 13:26:57 · 53 words · 4 views · 0 comments

Does R have a package for generating random numbers in multi-dimensional space? For example, suppose I want to generate 1000 points inside a cuboid or a sphere.

Comments (5)

ˇ宁静的妩媚 2024-11-22 13:26:57

I have some functions for sampling from a hypercube or an n-sphere. They return data frames of Cartesian coordinates and give a uniform distribution over the hypercube or the solid n-sphere for an arbitrary number of dimensions:

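GenerateCubiclePoints <- function(nrPoints,nrDim,center=rep(0,nrDim),l=1){
    # uniform draws on [-1,1] in every dimension
    x <-  matrix(runif(nrPoints*nrDim,-1,1),ncol=nrDim)
    # scale to side length l, then shift each column to the center
    # (sweep also works when nrDim is 1)
    x <-  as.data.frame(sweep(x*(l/2),2,center,'+'))
    names(x) <- make.names(seq_len(nrDim))
    x
}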

The points lie in a cube/hypercube of nrDim dimensions centered at center, where l is the length of one side.

For an n-sphere with nrDim dimensions, you can do something similar, where r is the radius :

GenerateSpherePoints <- function(nrPoints,nrDim,center=rep(0,nrDim),r=1){
    # uniform directions: normalise standard normal vectors
    # (uniformly drawn polar angles are only uniform on the sphere for nrDim=2)
    x <-  matrix(rnorm(nrPoints*nrDim),ncol=nrDim)
    x <-  x/sqrt(rowSums(x^2))
    # uniform by volume: the radius is distributed as r*U^(1/nrDim)
    y <-  x*(r*runif(nrPoints)^(1/nrDim))

    # shift each column to the center
    y <-  as.data.frame(sweep(y,2,center,'+'))

    names(y) <- make.names(seq_len(nrDim))
    y
}

In 2 dimensions, these give:

[Image: scatter plots of the two samples, not preserved]

From code:

 T1 <- GenerateCubiclePoints(10000,2,c(4,3),5)
 T2 <- GenerateSpherePoints(10000,2,c(-5,3),2)
 op <- par(mfrow=c(1,2))
 plot(T1)
 plot(T2)
 par(op)
你列表最软的妹 2024-11-22 13:26:57

Also check out the copula package. It generates data within a cube/hypercube with uniform margins, but with a correlation structure that you set. The generated variables can then be transformed to represent other shapes while keeping that dependence structure, so they remain non-independent.

If you want more complex shapes but are happy with points that are uniform and independent within the shape, then you can just do rejection sampling: generate data within a cube that contains your shape, test whether each point is within your shape, reject the points that are not, and keep doing this until you have enough points.
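
The rejection sampling idea can be sketched in base R roughly as follows. The function name `rejection_sample`, its arguments and the `inside` test are hypothetical names chosen for this illustration; they do not come from the copula package or any other package.

```r
# Rejection sampling sketch: uniform points inside an arbitrary shape.
# `inside` is a user-supplied test (hypothetical name) that returns TRUE
# when a point lies within the shape.
rejection_sample <- function(n, lower, upper, inside, batch = 2 * n) {
  d <- length(lower)
  accepted <- matrix(numeric(0), ncol = d)
  while (nrow(accepted) < n) {
    # propose points uniformly in the bounding box [lower, upper]
    u <- matrix(runif(batch * d), ncol = d)
    prop <- sweep(sweep(u, 2, upper - lower, "*"), 2, lower, "+")
    # keep only the proposals that fall inside the shape
    keep <- apply(prop, 1, inside)
    accepted <- rbind(accepted, prop[keep, , drop = FALSE])
  }
  accepted[seq_len(n), , drop = FALSE]
}

# Example: 1000 points inside the unit ball in 3 dimensions
set.seed(1)
in_ball <- function(p) sum(p^2) <= 1
pts <- rejection_sample(1000, lower = rep(-1, 3), upper = rep(1, 3),
                        inside = in_ball)
```

The loop never ends if the shape has zero volume inside the box, and the acceptance rate drops quickly as the number of dimensions grows (for a ball inside its bounding cube it is about 52% in 3 dimensions), so this is mainly practical in low dimensions.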

旧伤还要旧人安 2024-11-22 13:26:57

A couple of years ago, I made a package called geozoo. It is available on CRAN.

install.packages("geozoo")
library(geozoo)

It has many different functions to produce objects in N-dimensions.

p = 4
n = 1000

# Cube with points on its faces.
# A 3D version would be a box with solid walls and a hollow interior.
cube.face(p)

# Hollow sphere
sphere.hollow(p, n)


# Solid cube
cube.solid.random(p, n)
cube.solid.grid(p, 10) # evenly spaced points

# Solid Sphere
sphere.solid.random(p, n)
sphere.solid.grid(p, 10) # evenly spaced points

One of my favorite ones to watch animate is a cube with points along its edges, because it was one of the first objects that I made. It also gives you a sense of distance between vertices.

# Cube with points along its edges.
cube.dotline(4)

Also, check out the website: http://streaming.stat.iastate.edu/~dicook/geometric-data/. It contains pictures and downloadable data sets.

Hope it meets your needs!

十二 2024-11-22 13:26:57

Cuboid:

df <- data.frame(
    x = runif(1000),
    y = runif(1000),
    z = runif(1000)
)

head(df)

          x           y         z
1 0.7522104 0.579833314 0.7878651
2 0.2846864 0.520284731 0.8435828
3 0.2240340 0.001686003 0.2143208
4 0.4933712 0.250840233 0.4618258
5 0.6749785 0.298335804 0.4494820
6 0.7089414 0.141114804 0.3772317
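
The code above fills the unit cube. For a cuboid with other side lengths, the same idea works with the `min` and `max` arguments of `runif`. A minimal sketch, where the bounds are made-up values for illustration:

```r
set.seed(42)
n <- 1000
# hypothetical bounds of the cuboid along each axis
lower <- c(x = 0, y = -1, z = 10)
upper <- c(x = 2, y =  1, z = 15)

# each coordinate is drawn independently and uniformly within its own range
cuboid <- data.frame(
  x = runif(n, min = lower[["x"]], max = upper[["x"]]),
  y = runif(n, min = lower[["y"]], max = upper[["y"]]),
  z = runif(n, min = lower[["z"]], max = upper[["z"]])
)
```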

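Sphere (spherical coordinates). Note that drawing the radius and both angles uniformly, as below, does not spread the points uniformly over the volume: they concentrate near the center and the poles.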

df <- data.frame(
    radius = runif(1000),
    inclination = 2*pi*runif(1000),
    azimuth = 2*pi*runif(1000)
)


head(df)

     radius inclination  azimuth
1 0.1233281    5.363530 1.747377
2 0.1872865    5.309806 4.933985
3 0.2371039    5.029894 6.160549
4 0.2438854    2.962975 2.862862
5 0.5300013    3.340892 1.647043
6 0.6972793    4.777056 2.381325

Note: edited to include code for sphere
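
The sphere code above draws radius, inclination and azimuth uniformly. If the points should instead be uniform by volume inside the ball, one common adjustment is to take the cube root of a uniform draw for the radius and to make the cosine of the inclination uniform. A sketch under that assumption, with the ball radius `ball_radius` and the Cartesian conversion added for illustration:

```r
set.seed(1)
n <- 1000
ball_radius <- 1                         # assumed radius of the ball

# volume grows as r^3, so the radius is a cube root of a uniform draw
radius      <- ball_radius * runif(n)^(1/3)
# cos(inclination) uniform on [-1, 1] gives inclination in [0, pi]
inclination <- acos(1 - 2 * runif(n))
# the azimuth is uniform on [0, 2*pi)
azimuth     <- 2 * pi * runif(n)

# convert spherical coordinates to Cartesian coordinates
ball <- data.frame(
  x = radius * sin(inclination) * cos(azimuth),
  y = radius * sin(inclination) * sin(azimuth),
  z = radius * cos(inclination)
)
```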

时光无声 2024-11-22 13:26:57

Here is one way to do it.
Say we want to generate a set of 3d points of the form y = (y_1, y_2, y_3).

  1. Sample X from a multivariate Gaussian with mean zero and covariance matrix R.

       (x_1, x_2, x_3) ~ Multivariate_Gaussian(mu = [0, 0, 0], R = [[r_11, r_12, r_13], [r_21, r_22, r_23], [r_31, r_32, r_33]])

    You can find a function which generates multivariate Gaussian samples in an R package.

  2. Take the Gaussian cdf of each coordinate: (phi(x_1), phi(x_2), phi(x_3)). Here phi is the marginal Gaussian cdf of the corresponding variable, i.e. phi(x_1) = Pr[x <= x_1]; it is the standard normal cdf when R has unit diagonal. By the probability integral transform, (phi(x_1), phi(x_2), phi(x_3)) = (u_1, u_2, u_3) will each be uniformly distributed on [0,1].

  3. Then apply the inverse cdf of each target marginal to the uniform values u_1, u_2, u_3:

    (F_1^{-1}(u_1), F_2^{-1}(u_2), F_3^{-1}(u_3)) = (y_1, y_2, y_3), where F_i is the marginal cdf of the i-th coordinate of the distribution you are trying to sample from.
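
The three steps above can be sketched in base R as follows. The correlation matrix and the three target marginals (uniform, exponential, beta) are arbitrary choices made for this illustration, and the Gaussian sample is built with a Cholesky factor rather than a package function.

```r
set.seed(123)
n <- 1000

# assumed correlation matrix (unit diagonal, positive definite)
R <- matrix(c(1.0, 0.6, 0.3,
              0.6, 1.0, 0.5,
              0.3, 0.5, 1.0), nrow = 3, byrow = TRUE)

# Step 1: correlated Gaussian sample; Z %*% chol(R) has covariance R
Z <- matrix(rnorm(n * 3), ncol = 3)
X <- Z %*% chol(R)

# Step 2: standard normal cdf gives uniform margins on [0, 1]
U <- pnorm(X)

# Step 3: inverse cdf of each target marginal (illustrative choices)
Y <- data.frame(
  y1 = qunif(U[, 1], min = 0, max = 10),
  y2 = qexp(U[, 2], rate = 2),
  y3 = qbeta(U[, 3], shape1 = 2, shape2 = 5)
)
```

The columns of Y have the chosen marginals but stay dependent through R, which matches the Gaussian-copula construction described in the steps.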
