C++: Evaluating a user-defined mathematical function component-wise (array of functions for multidimensional analysis)

Posted on 2024-11-06 04:10:36

Hey there,
I want to evaluate a user-defined mathematical function in C++ that returns several values in an array (a vector-valued function f: R^n -> R^m with n input coordinates and m output components) for given parameters, e.g.:

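#include <cmath>
#include "mex.h"

// Evaluates all m = 3 components of f at one point.
// The result is heap-allocated; the caller must delete[] it.
double *my_func(const mxArray *point)
{
    const double *dat = mxGetPr(point);  // input coordinates (dat[0]..dat[3] are read)
    double *vals = new double[3];

    vals[0] = dat[0]*dat[0]*dat[0]*dat[0]*dat[0];
    vals[1] = std::sin(dat[0])*dat[1]*dat[2]*dat[2]*std::cos(dat[1]);
    vals[2] = std::exp(dat[0])*std::sin(dat[0])*dat[3];

    return vals;
}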

Currently I do this on the CPU: I call the function once and get back an array with all function values. Now I want to parallelize it on the GPU, so I have been thinking about how to do it.

I assume it would be kind of stupid to evaluate my_func() completely in each thread, since then each thread would calculate the whole function array. Is this the right assumption?

Would there be a convenient way to calculate only the n-th element of the function array and return it, so that 5 threads could calculate the function array in parallel instead of one CPU calculating it completely 'alone'?

The only way I could think of was:

double my_func0(const mxArray *point)
{
    double *dat = mxGetPr(point);
    return dat[0]*dat[0]*dat[0]*dat[0]*dat[0];
}
double my_func1(const mxArray *point)
{
    double *dat = mxGetPr(point);
    return sin(dat[0])*dat[1]*dat[2]*dat[2]*cos(dat[1]);
}
double my_func2(const mxArray *point)
{
    double *dat = mxGetPr(point);
    return exp(dat[0])*sin(dat[0])*dat[3];
}

etc. But this would be quite 'uncomfortable' for the user who works with the program later, because extending the function array would always mean writing new C++ functions instead of adapting ONE single C++ function. A further problem: the number of functions is 'dynamic', so I would have to call the functions dynamically, i.e. make a call to my_func_%%i%%, and I don't know whether that is a good way to do it. So the question is: is there a better way to deal with this problem?

空气里的味道 2024-11-13 04:10:36

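When you say "user-defined", I assume you mean that someone else writes my_func() and your code then calls it?

If that is the case, consider running multiple calls to my_func() in parallel rather than trying to break the function apart. That way, whoever writes my_func() only has to write one function, and you take care of dispatching the calls, making sure each has the right data to work on, and collecting the results.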
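As a rough illustration of "many calls in parallel", here is a minimal sketch using std::thread. Several things in it are assumptions made only for the example: plain double pointers stand in for the mxArray data, my_func() writes into a caller-supplied buffer instead of returning a new[] array, and evaluate_points(), kInputs and kOutputs are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kInputs = 4;   // n: coordinates per point (assumed)
constexpr std::size_t kOutputs = 3;  // m: component functions

// Stand-in for the user's my_func(): reads kInputs values from dat and
// writes kOutputs results to vals. Plain pointers replace mxArray here.
void my_func(const double *dat, double *vals)
{
    vals[0] = dat[0]*dat[0]*dat[0]*dat[0]*dat[0];
    vals[1] = std::sin(dat[0])*dat[1]*dat[2]*dat[2]*std::cos(dat[1]);
    vals[2] = std::exp(dat[0])*std::sin(dat[0])*dat[3];
}

// Hypothetical helper: evaluates my_func() at every point, splitting the
// points into contiguous chunks, one chunk per worker thread.
// `points` holds num_points * kInputs doubles, one point after another.
std::vector<double> evaluate_points(const std::vector<double> &points,
                                    unsigned num_threads)
{
    assert(points.size() % kInputs == 0);
    const std::size_t num_points = points.size() / kInputs;
    std::vector<double> results(num_points * kOutputs);
    num_threads = std::max(1u, num_threads);
    const std::size_t chunk = (num_points + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = std::min(num_points, begin + chunk);
        if (begin >= end) break;
        // Each worker writes only to its own slice of results,
        // so no locking is needed.
        workers.emplace_back([&points, &results, begin, end] {
            for (std::size_t p = begin; p < end; ++p)
                my_func(&points[p * kInputs], &results[p * kOutputs]);
        });
    }
    for (std::thread &w : workers) w.join();
    return results;
}
```

Calling evaluate_points(points, std::thread::hardware_concurrency()) would use one worker per hardware thread; the author of my_func() still maintains a single function.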

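Update based on comments

In your case, if the operations needed to compute each member of vals are different, the user will have to parameterise my_func() by the required index; note that it then returns a single double rather than the whole result array. Alternatively, as you suggested, provide a different my_func() for each index: double my_func_n(const mxArray *point).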
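One way the index-parameterised variant might look is sketched below. The signature is hypothetical (the answer does not spell one out), kNumOutputs is an invented name, and a plain pointer stands in for the data that mxGetPr() would return.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr std::size_t kNumOutputs = 3;  // m: number of component functions

// Hypothetical index-parameterised variant of my_func(): returns only
// component `index` of f at the point `dat` (at least 4 coordinates).
double my_func(const double *dat, std::size_t index)
{
    assert(index < kNumOutputs);
    switch (index) {
    case 0:  return dat[0]*dat[0]*dat[0]*dat[0]*dat[0];
    case 1:  return std::sin(dat[0])*dat[1]*dat[2]*dat[2]*std::cos(dat[1]);
    case 2:  return std::exp(dat[0])*std::sin(dat[0])*dat[3];
    default: return 0.0;  // not reached when the precondition holds
    }
}
```

With this layout, extending the function array means adding one case and increasing kNumOutputs, so the user still edits a single C++ function.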

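You can then call this function, or set of functions, from as many different threads as you like and get single results for further computation. We are ignoring many concurrency issues here, but simultaneous reads and writes of data do need to be considered.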
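The following sketch shows one possible way to call a set of per-index functions from several threads without generating names such as my_func_%%i%% at run time: the functions are collected in a table and selected by index. The Component alias, components() and evaluate_components() are hypothetical names, and plain pointers again stand in for mxArray.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Signature shared by every component function (an assumption of this sketch).
using Component = double (*)(const double *dat);

double my_func0(const double *dat) { return dat[0]*dat[0]*dat[0]*dat[0]*dat[0]; }
double my_func1(const double *dat) { return std::sin(dat[0])*dat[1]*dat[2]*dat[2]*std::cos(dat[1]); }
double my_func2(const double *dat) { return std::exp(dat[0])*std::sin(dat[0])*dat[3]; }

// The table replaces name-based dispatch: component i is components()[i].
const std::vector<Component> &components()
{
    static const std::vector<Component> table = {my_func0, my_func1, my_func2};
    return table;
}

// Hypothetical helper: evaluates every component at `dat`, one thread each.
std::vector<double> evaluate_components(const double *dat)
{
    assert(dat != nullptr);
    const std::vector<Component> &fs = components();
    std::vector<double> vals(fs.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < fs.size(); ++i) {
        // dat is only read, and each thread writes a distinct element
        // of vals, so there is no data race and no lock is required.
        workers.emplace_back([&fs, &vals, dat, i] { vals[i] = fs[i](dat); });
    }
    for (std::thread &w : workers) w.join();
    return vals;
}
```

For expressions this small, creating a thread costs far more than the arithmetic itself, so the sketch only illustrates the access pattern: shared read-only input, one private output slot per thread.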

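General multitasking advice

Before looking at multitasking on the GPU, get familiar with standard multithreading on the CPU first (I recommend the Boost thread library to help: http://www.boost.org/). Once you understand how to create and use threads, you will probably have a better idea of what you can do with them and how to go about it.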

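Multitasking on the GPU becomes more useful if you are applying mathematical functions to very large matrices or vectors and can use the hardware implementations of certain graphics functions to obtain your mathematical results. There are also several libraries that support GPGPU (general-purpose GPU) programming, such as OpenCL, Nvidia's CUDA, or ATI's Stream. Look at what these libraries offer to get an idea of how they might apply to your situation.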