OpenCL struct declarations in different memory spaces

Posted on 2024-12-14 00:34:50

In OpenCL, what are the consequences of, and the differences between, the following struct declarations? And if any of them are illegal, why?

struct gr_array
{
    int ndims;
    __global m_integer* dim_size;
    __global m_real* data;
};
typedef struct gr_array g_real_array;

struct lr_array
{
    int ndims;
    __local m_integer* dim_size;
    __local m_real* data;
};
typedef struct lr_array l_real_array;

__kernel void temp(...) {

        __local g_real_array A;
        g_real_array B;

        __local l_real_array C;
        l_real_array D;

}

My question is: where will the structures (and their members) be allocated? Who can access them? And is this good practice or not?

EDIT

How about this?

struct r_array
{
    __local int ndims;
};

typedef struct r_array real_array;

__kernel void temp(...) {

        __local real_array A;
        real_array B;

}

If a work-item modifies ndims in struct B, is the change visible to the other work-items in the work-group?


Comments (2)

一页 2024-12-21 00:34:50

I've rewritten your code as valid CL, or at least CL that will compile. Here:

typedef struct gr_array {
    int ndims;
    global int* dim_size;
    global float* data;
} g_float_array;

typedef struct lr_array {
    int ndims;
    local int* dim_size;
    local float* data;
} l_float_array;

kernel void temp() {
    local g_float_array A;
    g_float_array B;

    local l_float_array C;
    l_float_array D;
}

One by one, here's how this breaks down:

  • A is in local space. It's a struct that is composed of one int and two pointers. These pointers point to data in global space, but are themselves allocated in local space.

  • B is in private space; it's an automatic variable. It is composed of an int and two pointers that point to stuff in global memory.

  • C is in local space. It contains an int and two pointers to stuff in local space.

  • D, you can probably guess at this point. It's in private space, and contains an int and two pointers that point to stuff in local space.
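As a rough illustration of the breakdown above, here is a minimal sketch of a kernel that uses a private descriptor like B. The kernel name `scale_first_dim` and the convention that `dim_size[0]` holds the element count are assumptions made for this example; they are not part of the original question. The `#ifndef __OPENCL_VERSION__` shim is likewise an assumption: it only exists so that the same text also builds as plain C, emulating a single work-item, for a quick check.

```c
/* Shim (assumption, illustration only): lets this file build as plain C,
   emulating a single work-item. A real OpenCL compiler defines
   __OPENCL_VERSION__ and skips this part. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define kernel
#define global
#define get_global_id(dim) ((size_t)0)
#endif

typedef struct gr_array {
    int ndims;
    global int* dim_size;
    global float* data;
} g_float_array;

/* Hypothetical kernel: doubles the first dim_size[0] elements of data. */
kernel void scale_first_dim(global float* data, global int* dim_size, int ndims)
{
    g_float_array B;              /* private: one copy per work-item */
    size_t i = get_global_id(0);

    B.ndims = ndims;              /* stored in private memory */
    B.dim_size = dim_size;        /* the pointer is private, its target is global */
    B.data = data;

    if (i < (size_t)B.dim_size[0])
        B.data[i] *= 2.0f;        /* the write lands in the global buffer */
}
```

The struct itself only carries an int and two pointers; the arrays stay wherever the pointers say they are, so every work-item can build its own B cheaply and still operate on the same global data.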

I cannot say if either is preferable for your problem, since you haven't described what you are trying to accomplish.

EDIT: I realized I didn't address the second part of your question -- who can access the structure fields.

Well, you can access the fields anywhere the variable is in scope. I'm guessing that you were thinking that the fields you had marked as global in g_float_array were in global space (and in local space for l_float_array). But they're just pointing to stuff in global (or local) space.

So, you'd use them like this:

kernel void temp(
            global float* data, global int* global_size,
            local float* data_local, local int* local_size,
            int num) 
{
    local g_float_array A;
    g_float_array B;

    local l_float_array C;
    l_float_array D;

    A.ndims = B.ndims = C.ndims = D.ndims = num;

    A.data = B.data = data;
    A.dim_size = B.dim_size = global_size;

    C.data = D.data = data_local;
    C.dim_size = D.dim_size = local_size;
}
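On the question of which work-items see a change to a field, a minimal sketch may help. It relies on the fact that a local variable is shared by the work-group while a private one exists once per work-item. The kernel name `visibility_demo` and the layout of the `out` buffer are hypothetical, and the `#ifndef __OPENCL_VERSION__` shim is an assumption that only lets the text build as plain C with a single emulated work-item.

```c
/* Shim (assumption, illustration only): plain-C build emulating one
   work-item. A real OpenCL compiler defines __OPENCL_VERSION__. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define kernel
#define global
#define local
#define CLK_LOCAL_MEM_FENCE 0
#define barrier(flags) ((void)(flags))
#define get_global_id(dim) ((size_t)0)
#define get_local_id(dim) ((size_t)0)
#endif

typedef struct gr_array {
    int ndims;
    global int* dim_size;
    global float* data;
} g_float_array;

/* Hypothetical kernel: records what each work-item sees in A and in B. */
kernel void visibility_demo(global int* out, int num)
{
    local g_float_array A;        /* one copy shared by the work-group */
    g_float_array B;              /* one copy per work-item */
    size_t gid = get_global_id(0);
    size_t lid = get_local_id(0);

    B.ndims = num + (int)lid;     /* each work-item changes only its own B */

    if (lid == 0)
        A.ndims = num;            /* a single work-item writes the shared A */
    barrier(CLK_LOCAL_MEM_FENCE); /* after this, the whole group sees A.ndims */

    out[2 * gid]     = A.ndims;   /* same value for every work-item in the group */
    out[2 * gid + 1] = B.ndims;   /* differs from work-item to work-item */
}
```

Run as a single work-group, every work-item would report the same A.ndims but its own B.ndims, which is the practical difference between declaring the struct local and leaving it private.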

By the way -- if you're hacking CL on a Mac running Lion, you can compile .cl files using the "offline" CL compiler, which makes experimenting with this kind of stuff a bit easier. It's located here:

/System/Library/Frameworks/OpenCL.framework/Libraries/openclc

There is some sample code here.

月下凄凉 2024-12-21 00:34:50

It probably won't work, because current GPUs have separate memory spaces for OpenCL kernels and for the ordinary (host) program. You have to make explicit calls to transfer data between the two spaces, and this is often the bottleneck of the program (because the bandwidth of the PCIe link to the graphics card is quite low).
