Generating multi-dimensional data
Does R have a package for generating random numbers in multi-dimensional space? For example, suppose I want to generate 1000 points inside a cuboid or a sphere.
I have some functions for hypercube and n-sphere selection that generate data frames with Cartesian coordinates and guarantee a uniform distribution through the hypercube or n-sphere for an arbitrary number of dimensions:
The first is for a cube/hypercube of nrDim dimensions with a given center and l the length of one side.
For an n-sphere with nrDim dimensions, you can do something similar, where r is the radius:
In 2 dimensions, these give:
From code:
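As an illustration only (this is not the answer author's code), here is a minimal base-R sketch of what such helpers could look like. The function names `sample_hypercube` and `sample_nsphere` and the argument `nrPoints` are hypothetical; `nrDim`, `center`, `l` and `r` follow the description above. The sphere version assumes "n-sphere" means the filled ball, and gets uniformity by combining a random direction with a radius drawn as `r * U^(1/nrDim)`.

```r
# Hypothetical helpers; not taken from the original answer.
sample_hypercube <- function(nrPoints, nrDim, center = rep(0, nrDim), l = 1) {
  # Uniform offsets in [-l/2, l/2] for every coordinate
  x <- matrix(runif(nrPoints * nrDim, -l / 2, l / 2), ncol = nrDim)
  # Shift each column by the matching coordinate of the center
  x <- sweep(x, 2, center, "+")
  as.data.frame(x)
}

sample_nsphere <- function(nrPoints, nrDim, center = rep(0, nrDim), r = 1) {
  # Random directions: standard normal vectors scaled to unit length
  x <- matrix(rnorm(nrPoints * nrDim), ncol = nrDim)
  x <- x / sqrt(rowSums(x^2))
  # Radii with density proportional to rho^(nrDim - 1), so the ball is filled uniformly
  rho <- r * runif(nrPoints)^(1 / nrDim)
  x <- sweep(x * rho, 2, center, "+")
  as.data.frame(x)
}

set.seed(1)
cube <- sample_hypercube(1000, nrDim = 3, center = c(1, 2, 3), l = 2)
ball <- sample_nsphere(1000, nrDim = 4, r = 2)
```

Drawing the radius as plain `r * runif(nrPoints)` would pile points up near the center, which is why the `1 / nrDim` power is there.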
Also check out the copula package. This will generate data within a cube/hypercube with uniform margins, but with a correlation structure that you set. The generated variables can then be transformed to represent other shapes, while still keeping their dependence on each other.
If you want more complex shapes but are happy with points that are uniform and independent within the shape, then you can just do rejection sampling: generate data within a cube that contains your shape, test whether each point is within your shape, reject it if not, and keep doing this until there are enough points.
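The rejection sampling idea can be sketched in base R roughly as follows. Everything here is an assumption made for illustration: the helper `rejection_sample`, its arguments (`lower` and `upper` bound an axis-aligned box, `inside` is a test function for one point), and the ellipsoid used as the example shape.

```r
# Hypothetical helper: uniform points inside an arbitrary shape by rejection.
rejection_sample <- function(n, lower, upper, inside) {
  d <- length(lower)
  kept <- matrix(numeric(0), ncol = d)
  while (nrow(kept) < n) {
    # Propose a batch of points uniformly in the bounding box
    u <- matrix(runif(n * d), ncol = d)
    cand <- sweep(sweep(u, 2, upper - lower, "*"), 2, lower, "+")
    # Keep only the proposals that fall inside the shape
    keep <- apply(cand, 1, inside)
    kept <- rbind(kept, cand[keep, , drop = FALSE])
  }
  # Trim the surplus from the last batch
  kept[seq_len(n), , drop = FALSE]
}

# Example shape: an ellipsoid with semi-axes 3, 2 and 1
in_ellipsoid <- function(p) sum((p / c(3, 2, 1))^2) <= 1

set.seed(42)
pts <- rejection_sample(500, lower = c(-3, -2, -1), upper = c(3, 2, 1),
                        inside = in_ellipsoid)
```

Keep in mind that the fraction of accepted points is the volume of the shape divided by the volume of the box, and for a ball inside a cube that fraction shrinks quickly as the number of dimensions grows.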
A couple of years ago, I made a package called geozoo. It is available on CRAN.
It has many different functions to produce objects in N dimensions.
One of my favorite ones to watch animate is a cube with points along its edges, because it was one of the first objects that I made. It also gives you a sense of the distance between vertices.
Also, check out the website: http://streaming.stat.iastate.edu/~dicook/geometric-data/. It contains pictures and downloadable data sets.
Hope it meets your needs!
Cuboid:
Sphere:
Note: edited to include code for sphere
Here is one way to do it.
Say we hope to generate a bunch of 3d points of the form y = (y_1, y_2, y_3).
Sample x = (x_1, x_2, x_3) from a multivariate Gaussian with mean zero and covariance matrix R. R should have ones on its diagonal (a correlation matrix), so that each x_i is standard normal.
You can find a function which generates multivariate Gaussian samples in an R package.
Take the Gaussian cdf of each coordinate: (phi(x_1), phi(x_2), phi(x_3)). Here phi is the standard Gaussian cdf, i.e. phi(x_1) = Pr[Z <= x_1] for a standard normal Z. By the probability integral transform, these values (phi(x_1), phi(x_2), phi(x_3)) = (u_1, u_2, u_3) will each be uniformly distributed on [0,1].
Then, take the inverse cdf of each uniformly distributed marginal. In other words, apply the inverse cdfs to u_1, u_2, u_3:
(F_1^{-1}(u_1), F_2^{-1}(u_2), F_3^{-1}(u_3)) = (y_1, y_2, y_3), where F_i is the marginal cdf of the i-th coordinate of the distribution you are trying to sample from.
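The three steps above can be sketched in base R like this. The correlation matrix and the target marginals (uniform on [0, 5], exponential with rate 2, and Beta(2, 5)) are assumptions picked only for illustration, and the Gaussian draw uses a Cholesky factor instead of a package function so that the example needs nothing beyond base R.

```r
set.seed(1)
n <- 1000

# Assumed correlation matrix R (unit diagonal, so each x_i is standard normal)
R <- matrix(c(1.0, 0.6, 0.3,
              0.6, 1.0, 0.5,
              0.3, 0.5, 1.0), nrow = 3)

# Step 1: X ~ N(0, R). chol(R) is upper triangular with t(chol(R)) %*% chol(R) = R,
# so independent standard normals times chol(R) have covariance R.
Z <- matrix(rnorm(n * 3), ncol = 3)
X <- Z %*% chol(R)

# Step 2: the Gaussian cdf turns every column into a Uniform(0, 1) variable
U <- pnorm(X)

# Step 3: inverse cdfs (quantile functions) of the desired marginals
Y <- cbind(y1 = qunif(U[, 1], min = 0, max = 5),
           y2 = qexp(U[, 2], rate = 2),
           y3 = qbeta(U[, 3], shape1 = 2, shape2 = 5))
```

The columns of `Y` have the chosen marginals but are still dependent on each other, because the dependence put in through `R` survives the monotone transformations in steps 2 and 3.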