Does gpu.js have BigDecimal or Float64Array?

Posted 2025-02-04 05:33:12


I am writing a Mandelbrot calculator with the help of gpu.js, and so far everything works fine. The only issue I am facing is that the GPU only computes 32-bit floats, or at least that is what the official docs tell me.
But the same calculation done with Python and Numba, which also runs on the same GPU, is much more precise when rendering the Mandelbrot fractal.
With Python I can zoom to nearly 1e-15, whereas in JavaScript the image becomes blurry at around 1e-7.
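
For scale: float32 has a machine epsilon of roughly 1.2e-7 near 1.0, which matches where the image breaks down. A quick, purely illustrative host-side check:

// float32 resolves steps of about 1.2e-7 near 1.0; smaller steps collapse,
// while a regular JS number (a 64-bit double) still resolves them.
console.log(1 + 1e-8 === 1);               // false: the double keeps the step
console.log(Math.fround(1 + 1e-8) === 1);  // true: float32 loses it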

Python Kernel:

from numba import cuda

@cuda.jit(device=True)
def mandel(x, y, max_iters):
    # Iterate z = z**2 + c until |z|^2 >= 4 or max_iters is reached.
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        if (z.real * z.real + z.imag * z.imag) >= 4:
            return i

    return max_iters

JavaScript Kernel:

const recalculateMandelbrot = gpu.createKernel(function(x_start, x_end, y_start, y_end, iters){
    // Map the pixel coordinates to a point c in the complex plane.
    let c_re = x_start + (x_end - x_start) * this.thread.x / 1024;
    let c_im = y_start + (y_end - y_start) * this.thread.y / 1024;
    let z_re = 0, z_im = 0;
    let z_re_prev = 0;

    for(let i = 0; i < iters; i++) {
        // z = z^2 + c, expanded into real and imaginary parts:
        // (a + bi)^2 = a^2 - b^2 + 2abi
        z_re_prev = z_re;
        z_re = z_re * z_re - z_im * z_im + c_re;
        z_im = z_re_prev * z_im + z_re_prev * z_im + c_im;

        // Escape once |z|^2 >= 4, i.e. |z| >= 2.
        if ((z_re * z_re + z_im * z_im) >= 4) {
            return i;
        }
    }
    return iters;
}).setOutput([1024, 1024]).setPrecision('single');

The two algorithms are equivalent, except that in Python I can use the built-in complex type.

So I thought about using BigDecimal, which would give me arbitrary precision (so I could zoom in as far as I want), but I do not know how to get it into my GPU kernel.

Update:

The reason Python is more precise is that its complex type consists of two 64-bit floats, while the gpu.js kernel only works with 32-bit floats. So the lower precision of the JavaScript calculation is a limitation of the environment, not of the algorithm.
So my question now focuses on how to add big.js to my gpu.js kernel.


若水般的淡然安静女子 2025-02-11 05:33:12


There is a custom kernel test file here: https://github.com/gpujs/gpu.js/blob/1be50c09ed4edb7c7e846b1815414ef504089ab6/test/features/add-custom-native-function.js

with this sample of a native GLSL function:

const glslDivide = `float divide(float a, float b) {
  return a / b;
}`;

Maybe there is a way to use "double" in place of "float" and use such a function in place of the plain multiplications and additions.

I don't know GLSL well, but if a function like this can be attached to the backend somehow, it should work. Also, "double" needs GLSL 4.0, so it may not work.
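
As a minimal sketch of attaching such a function, here is roughly what the linked test file does with gpu.js's addNativeFunction; whether the backend would accept "double" in place of "float" is a separate question (WebGL's GLSL ES has no double type):

const { GPU } = require('gpu.js');
const gpu = new GPU();

// Register the GLSL helper so kernels can call it by name.
gpu.addNativeFunction('divide', `float divide(float a, float b) {
  return a / b;
}`);

const kernel = gpu.createKernel(function(a, b) {
    return divide(a[this.thread.x], b[this.thread.x]);
}).setOutput([4]);

console.log(kernel([1, 2, 3, 4], [4, 3, 2, 1])); // roughly [0.25, 0.6667, 1.5, 4]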

Even if it works, it would still need all of the iterations to be done recursively, because there is no "state variable" definition for a double: it would be like calling the same function until max-iteration is reached. There would have to be two function parameters to keep track of the imaginary and real parts, and two more to keep track of the number of iterations and maybe the c value. Considering that the depth of recursion is limited, it may not work, again.

Even in the WebGL backend ( https://github.com/gpujs/gpu.js/blob/1be50c09ed4edb7c7e846b1815414ef504089ab6/src/backend/web-gl/fragment-shader.js ), it looks like you could just edit the string of some of the shader functions (like the modulo helper defined there) and use it for the multiplication/addition.

You can also read here for inspiration on multiplying/adding two 64-bit values simulated by four 32-bit values: https://github.com/visgl/luma.gl/blob/master/modules/shadertools/src/modules/fp64/fp64-arithmetic.glsl.ts
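
The core trick in that file is representing one 64-bit value as a (high, low) pair of 32-bit floats. A minimal host-side sketch of that split in plain JavaScript (splitDouble is an illustrative name, not part of any library):

// Split a JS double into a high/low pair of 32-bit floats: the high part
// is the nearest float32, the low part carries the rounding residual.
function splitDouble(x) {
    const hi = Math.fround(x);
    const lo = Math.fround(x - hi);
    return [hi, lo];
}

console.log(splitDouble(Math.PI)); // roughly [3.1415927, -8.74e-8]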

In any case, buffers (between the JS environment and the GPU driver) may not work at all, and you are stuck with 100% functional programming within the kernel (no state handling for 64-bit variables).

These are not good options. You should use a library that lets you write a native kernel as a string (like this: https://github.com/kashif/node-cuda/blob/master/test/test.js ) and use it directly from JavaScript, so that you can do nearly anything you could do in CUDA/OpenCL.
