Monte Carlo on GPU

Posted on 2024-10-30 21:05:57

Today I talked with a friend who told me he is trying to run some Monte Carlo simulations on GPUs. Interestingly, he wanted to draw random numbers on different processors and assumed that they would be uncorrelated. But they were not.

The question is whether there is a method for drawing independent sets of numbers on several GPUs. He thought that using a different seed for each of them would solve the problem, but it does not.

If any clarification is needed, please let me know and I will ask him to provide more details.

Comments (3)

傾城如夢未必闌珊 2024-11-06 21:05:57

To generate completely independent random numbers, you need to use a parallel random number generator. Essentially, you choose a single seed and the generator produces M independent random number streams. Each of the M GPUs can then generate random numbers from its own independent stream.
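One simple way to picture "one seed, M streams" is block splitting: every stream is the same master sequence, advanced to a different, non-overlapping starting offset. The sketch below is only an illustration of that idea using the C++ standard library, not the generator the answer has in mind; the names `make_stream` and `draw_uniform` and the `block_len` parameter are hypothetical. It assumes each consumer draws at most `block_len` values, otherwise the streams run into each other.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Block splitting (illustrative): stream k is the master Mersenne Twister
// sequence for `seed`, advanced by k * block_len raw outputs.
// Streams stay disjoint only while each one consumes <= block_len outputs.
std::mt19937_64 make_stream(std::uint64_t seed, unsigned k,
                            unsigned long long block_len) {
    std::mt19937_64 gen(seed);
    // discard() steps the engine forward; in common implementations this
    // costs time proportional to the number of skipped outputs.
    gen.discard(k * block_len);
    return gen;
}

// Draw n uniform values in [0, 1) from the given stream.
std::vector<double> draw_uniform(std::mt19937_64& gen, std::size_t n) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::vector<double> out(n);
    for (auto& v : out) v = u(gen);
    return out;
}
```

For example, `make_stream(42, g, 1000000)` could be called once per GPU index `g`, with the resulting numbers generated on the host and copied to that device. Purpose-built parallel generators provide much cheaper ways to jump ahead than `discard`, which is one reason to prefer them for real simulations.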

When dealing with multiple GPUs you need to be aware that you want:

  • independent streams within GPUs (if RNs are generated by each GPU)
  • independent streams between GPUs.

It turns out that generating random numbers on each GPU core is tricky (see this question I asked a while back). In my experience playing about with GPUs and RNs, you only get a speed-up from generating random numbers on the GPU if you generate a large number of them at once.

Instead, I would generate random numbers on the CPU, since:

  • It's easier, and sometimes quicker, to generate them on the CPU and transfer them across.
  • You can use well-tested parallel random number generators.
  • The range of off-the-shelf random number generators available for GPUs is very limited.
  • Current GPU random number libraries only generate RNs from a small number of distributions.

To answer your question in the comments: What do random numbers depend on?

A very basic random number generator is the linear congruential generator. Although this generator has been surpassed by newer methods, it should give you an idea of how they work. Basically, the ith random number depends on the (i-1)th random number. As you point out, if you run two streams for long enough, they will overlap. The big problem is that you don't know when they will overlap.
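To make the overlap concrete, here is a minimal sketch of a toy linear congruential generator with a deliberately tiny state (modulus 2^16), so that the overlap can be found by brute force. The constants and the helper names `lcg_next` and `steps_until_overlap` are chosen for illustration only; the constants satisfy the Hull-Dobell conditions, so every state lies on one cycle of length 65536 and any two seeds are just different starting points on that cycle.

```cpp
#include <cstdint>

// Toy LCG: x_{i} = (LCG_A * x_{i-1} + LCG_C) mod 2^16.
// LCG_C is odd and LCG_A - 1 is divisible by 4, so the period is 2^16.
constexpr std::uint32_t LCG_A = 25173u;
constexpr std::uint32_t LCG_C = 13849u;
constexpr std::uint32_t LCG_M = 1u << 16;

// The i-th state depends only on the (i-1)-th state.
std::uint32_t lcg_next(std::uint32_t x) {
    return (LCG_A * x + LCG_C) % LCG_M;
}

// Number of steps after which the stream started at seed_a reaches the
// starting state of the stream started at seed_b. From that point on the
// first stream replays the second one exactly.
std::uint32_t steps_until_overlap(std::uint32_t seed_a, std::uint32_t seed_b) {
    std::uint32_t x = seed_a % LCG_M;
    const std::uint32_t target = seed_b % LCG_M;
    std::uint32_t steps = 0;
    while (x != target) {  // terminates because the LCG has full period
        x = lcg_next(x);
        ++steps;
    }
    return steps;
}
```

Calling `steps_until_overlap(1, 2)` returns how far the stream seeded with 1 has to run before it starts repeating the stream seeded with 2. With realistic generators the state space is far too large to search like this, which is exactly why the overlap point is unknown in practice.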

别在捏我脸啦 2024-11-06 21:05:57

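To generate iid uniform variables, you just have to initialize your generators with different seeds. With CUDA, you can use the NVIDIA cuRAND library, which offers several generators, including a Mersenne Twister variant (MTGP32); the curandState type used below is the library's default XORWOW generator.

For example, the following setup kernel, executed by 100 threads in parallel, initializes the generator states used to draw 10 samples of a uniform random vector in R^10: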

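#include <curand_kernel.h>

__global__ void setup_kernel(curandState *state, int pseed)
{
    /* Global thread index; also used as the cuRAND sequence number. */
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    /* One of 10 different seeds, offset by the caller-supplied pseed. */
    int seed = id % 10 + pseed;

    /* 10 different seeds for uncorrelated random variables,
       a different sequence number per thread, no offset. */
    curand_init(seed, id, 0, &state[id]);
}
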
星軌x 2024-11-06 21:05:57

If you take any "good" generator (e.g. Mersenne Twister), two sequences with different random seeds will be uncorrelated, be it on GPU or CPU. Hence I'm not sure what you mean by saying that taking different seeds on different GPUs was not enough. Could you elaborate?
