C++ 中多元正态/高斯分布的样本

发布于 2024-11-10 02:26:40 字数 894 浏览 0 评论 0原文

我一直在寻找一种从多元正态分布中采样的便捷方法。有谁知道有现成的代码片段可以做到这一点?对于矩阵/向量,我更喜欢使用 BoostEigen 或其他我不熟悉的非凡库,但是我可以在紧要关头使用 GSL 。如果该方法接受非负定协方差矩阵而不是要求正定矩阵(例如,与 Cholesky 分解一样),我也希望如此。这存在于 MATLAB、NumPy 和其他语言中,但我很难找到现成的 C/C++ 解决方案。

如果我必须自己实现它,我会抱怨,但没关系。如果我这样做,维基百科听起来就像我应该

  1. 生成n 0均值、单位方差、独立正态样本(Boost将执行此操作)
  2. 找到协方差矩阵的特征分解,
  3. 按相应特征值的平方根缩放每个n样本
  4. 通过将缩放向量预乘以分解发现的正交特征向量矩阵来旋转样本向量

我希望它能够快速工作。有人有直觉知道什么时候值得检查协方差矩阵是否为正,如果是,则使用 Cholesky 代替?

I've been hunting for a convenient way to sample from a multivariate normal distribution. Does anyone know of a readily available code snippet to do that? For matrices/vectors, I'd prefer to use Boost or Eigen or another phenomenal library I'm not familiar with, but I could use GSL in a pinch. I'd also like it if the method accepted nonnegative-definite covariance matrices rather than requiring positive-definite (e.g., as with the Cholesky decomposition). This exists in MATLAB, NumPy, and others, but I've had a hard time finding a ready-made C/C++ solution.

If I have to implement it myself, I'll grumble but that's fine. If I do that, Wikipedia makes it sound like I should

  1. generate n 0-mean, unit-variance, independent normal samples (boost will do this)
  2. find the eigen-decomposition of the covariance matrix
  3. scale each of the n samples by the square-root of the corresponding eigenvalue
  4. rotate the vector of samples by pre-multiplying the scaled vector by the matrix of orthonormal eigenvectors found by the decomposition

I would like this to work quickly. Does someone have an intuition for when it would be worthwhile to check to see if the covariance matrix is positive, and if so, use Cholesky instead?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

梦醒时光 2024-11-17 02:26:40

由于这个问题已经获得了很多观点,我想我应该发布我找到的最终答案的代码,部分是由 发帖到 Eigen 论坛。该代码使用 Boost 进行单变量正态分布,使用 Eigen 进行矩阵处理。这感觉相当不正统,因为它涉及使用“内部”名称空间,但它确实有效。如果有人建议一种方法,我愿意改进它。

#include <Eigen/Dense>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/normal_distribution.hpp>    

/*
  We need a functor that can pretend it's const,
  but to be a good random number generator 
  it needs mutable state.
*/
namespace Eigen {
namespace internal {
template<typename Scalar> 
struct scalar_normal_dist_op 
{
  static boost::mt19937 rng;    // The uniform pseudo-random algorithm
  mutable boost::normal_distribution<Scalar> norm;  // The gaussian combinator

  EIGEN_EMPTY_STRUCT_CTOR(scalar_normal_dist_op)

  template<typename Index>
  inline const Scalar operator() (Index, Index = 0) const { return norm(rng); }
};

template<typename Scalar> boost::mt19937 scalar_normal_dist_op<Scalar>::rng;

template<typename Scalar>
struct functor_traits<scalar_normal_dist_op<Scalar> >
{ enum { Cost = 50 * NumTraits<Scalar>::MulCost, PacketAccess = false, IsRepeatable = false }; };
} // end namespace internal
} // end namespace Eigen

/*
  Draw nn samples from a size-dimensional normal distribution
  with a specified mean and covariance
*/
void main() 
{
  int size = 2; // Dimensionality (rows)
  int nn=5;     // How many samples (columns) to draw
  Eigen::internal::scalar_normal_dist_op<double> randN; // Gaussian functor
  Eigen::internal::scalar_normal_dist_op<double>::rng.seed(1); // Seed the rng

  // Define mean and covariance of the distribution
  Eigen::VectorXd mean(size);       
  Eigen::MatrixXd covar(size,size);

  mean  <<  0,  0;
  covar <<  1, .5,
           .5,  1;

  Eigen::MatrixXd normTransform(size,size);

  Eigen::LLT<Eigen::MatrixXd> cholSolver(covar);

  // We can only use the cholesky decomposition if 
  // the covariance matrix is symmetric, pos-definite.
  // But a covariance matrix might be pos-semi-definite.
  // In that case, we'll go to an EigenSolver
  if (cholSolver.info()==Eigen::Success) {
    // Use cholesky solver
    normTransform = cholSolver.matrixL();
  } else {
    // Use eigen solver
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigenSolver(covar);
    normTransform = eigenSolver.eigenvectors() 
                   * eigenSolver.eigenvalues().cwiseSqrt().asDiagonal();
  }

  Eigen::MatrixXd samples = (normTransform 
                           * Eigen::MatrixXd::NullaryExpr(size,nn,randN)).colwise() 
                           + mean;

  std::cout << "Mean\n" << mean << std::endl;
  std::cout << "Covar\n" << covar << std::endl;
  std::cout << "Samples\n" << samples << std::endl;
}

Since this question has garnered a lot of views, I thought I'd post code for the final answer that I found, in part, by posting to the Eigen forums. The code uses Boost for the univariate normal and Eigen for matrix handling. It feels rather unorthodox, since it involves using the "internal" namespace, but it works. I'm open to improving it if someone suggests a way.

#include <Eigen/Dense>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/normal_distribution.hpp>    

/*
  We need a functor that can pretend it's const,
  but to be a good random number generator 
  it needs mutable state.
*/
namespace Eigen {
namespace internal {
template<typename Scalar> 
struct scalar_normal_dist_op 
{
  static boost::mt19937 rng;    // The uniform pseudo-random algorithm
  mutable boost::normal_distribution<Scalar> norm;  // The gaussian combinator

  EIGEN_EMPTY_STRUCT_CTOR(scalar_normal_dist_op)

  template<typename Index>
  inline const Scalar operator() (Index, Index = 0) const { return norm(rng); }
};

template<typename Scalar> boost::mt19937 scalar_normal_dist_op<Scalar>::rng;

template<typename Scalar>
struct functor_traits<scalar_normal_dist_op<Scalar> >
{ enum { Cost = 50 * NumTraits<Scalar>::MulCost, PacketAccess = false, IsRepeatable = false }; };
} // end namespace internal
} // end namespace Eigen

/*
  Draw nn samples from a size-dimensional normal distribution
  with a specified mean and covariance
*/
void main() 
{
  int size = 2; // Dimensionality (rows)
  int nn=5;     // How many samples (columns) to draw
  Eigen::internal::scalar_normal_dist_op<double> randN; // Gaussian functor
  Eigen::internal::scalar_normal_dist_op<double>::rng.seed(1); // Seed the rng

  // Define mean and covariance of the distribution
  Eigen::VectorXd mean(size);       
  Eigen::MatrixXd covar(size,size);

  mean  <<  0,  0;
  covar <<  1, .5,
           .5,  1;

  Eigen::MatrixXd normTransform(size,size);

  Eigen::LLT<Eigen::MatrixXd> cholSolver(covar);

  // We can only use the cholesky decomposition if 
  // the covariance matrix is symmetric, pos-definite.
  // But a covariance matrix might be pos-semi-definite.
  // In that case, we'll go to an EigenSolver
  if (cholSolver.info()==Eigen::Success) {
    // Use cholesky solver
    normTransform = cholSolver.matrixL();
  } else {
    // Use eigen solver
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigenSolver(covar);
    normTransform = eigenSolver.eigenvectors() 
                   * eigenSolver.eigenvalues().cwiseSqrt().asDiagonal();
  }

  Eigen::MatrixXd samples = (normTransform 
                           * Eigen::MatrixXd::NullaryExpr(size,nn,randN)).colwise() 
                           + mean;

  std::cout << "Mean\n" << mean << std::endl;
  std::cout << "Covar\n" << covar << std::endl;
  std::cout << "Samples\n" << samples << std::endl;
}
紫瑟鸿黎 2024-11-17 02:26:40

这是一个在 Eigen 中生成多元正态随机变量的类,它使用 C++11 随机数生成,并通过使用 Eigen::MatrixBase::unaryExpr() 避免了 Eigen::internal 的东西

struct normal_random_variable
{
    normal_random_variable(Eigen::MatrixXd const& covar)
        : normal_random_variable(Eigen::VectorXd::Zero(covar.rows()), covar)
    {}

    normal_random_variable(Eigen::VectorXd const& mean, Eigen::MatrixXd const& covar)
        : mean(mean)
    {
        Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigenSolver(covar);
        transform = eigenSolver.eigenvectors() * eigenSolver.eigenvalues().cwiseSqrt().asDiagonal();
    }

    Eigen::VectorXd mean;
    Eigen::MatrixXd transform;

    Eigen::VectorXd operator()() const
    {
        static std::mt19937 gen{ std::random_device{}() };
        static std::normal_distribution<> dist;

        return mean + transform * Eigen::VectorXd{ mean.size() }.unaryExpr([&](auto x) { return dist(gen); });
    }
};

它可以用作

int size = 2;
Eigen::MatrixXd covar(size,size);
covar << 1, .5,
        .5, 1;

normal_random_variable sample { covar };

std::cout << sample() << std::endl;
std::cout << sample() << std::endl;

Here is a class to generate multivariate normal random variables in Eigen which uses C++11 random number generation and avoids the Eigen::internal stuff by using Eigen::MatrixBase::unaryExpr():

struct normal_random_variable
{
    normal_random_variable(Eigen::MatrixXd const& covar)
        : normal_random_variable(Eigen::VectorXd::Zero(covar.rows()), covar)
    {}

    normal_random_variable(Eigen::VectorXd const& mean, Eigen::MatrixXd const& covar)
        : mean(mean)
    {
        Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigenSolver(covar);
        transform = eigenSolver.eigenvectors() * eigenSolver.eigenvalues().cwiseSqrt().asDiagonal();
    }

    Eigen::VectorXd mean;
    Eigen::MatrixXd transform;

    Eigen::VectorXd operator()() const
    {
        static std::mt19937 gen{ std::random_device{}() };
        static std::normal_distribution<> dist;

        return mean + transform * Eigen::VectorXd{ mean.size() }.unaryExpr([&](auto x) { return dist(gen); });
    }
};

It can be used as

int size = 2;
Eigen::MatrixXd covar(size,size);
covar << 1, .5,
        .5, 1;

normal_random_variable sample { covar };

std::cout << sample() << std::endl;
std::cout << sample() << std::endl;
猫九 2024-11-17 02:26:40

对于现成的解决方案,犰狳 C++ 库支持从多元高斯分布(甚至从正半分布)中采样-定协方差矩阵),使用函数 mvnrnd()

For a ready-made solution, the armadillo C++ library supports sampling from a multivariate Gaussian distribution (even from positive semi-definite covariance matrices) with the function mvnrnd().

无可置疑 2024-11-17 02:26:40

做一个 SVD 然后检查矩阵是否是 PD 怎么样?请注意,这不需要您计算 Cholskey 分解。虽然,我认为 SVD 比 Cholskey 慢,但它们的失败次数一定都是三次方。

What about doing an SVD and then checking if the matrix is PD? Note that this does not require you to compute the Cholskey factorization. Although, I think SVD is slower than Cholskey, but they must both be cubic in number of flops.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文