Class template for cache-aligned memory usage in C++

Posted on 2024-10-12 11:51:25


(This question needs a lot of background to be understood; I have already condensed it as much as I can.)

I am trying to implement a class template that allocates and accesses data in a cache-aligned way. This works well for single elements, but adding support for arrays is a problem.

Semantically, the code should produce this memory layout for a single element:

cache_aligned<element_type>* my_el = 
          new(cache_line_size) cache_aligned<element_type>();
| element | buffer |

The access (so far) looks like this:

*my_el; // returns cache_aligned<element_type>
**my_el; //returns element_type
(*my_el)->member_of_element(); // calls a member of element_type

However, for an array I would like to have this:

 cache_aligned<element_type>* my_el_array = 
         new(cache_line_size)  cache_aligned<element_type>[N];
 | element 0 | buffer | element 1 | buffer | ... | element (N-1) | buffer |

So far I have the following code:

template <typename T>
class cache_aligned {
    private:
        T instance;
    public:
        cache_aligned()
        {}
        // other is a T, so it is copied directly into instance
        cache_aligned(const T& other)
        :instance(other)
        {}
        static void* operator new (size_t size, uint c_line_size) {
             return c_a_malloc(size, c_line_size);
        }
        static void* operator new[] (size_t size, uint c_line_size) {
             // assumes the compiler's array overhead is exactly one pointer
             int num_el = (size - sizeof(cache_aligned<T>*))
                              / sizeof(cache_aligned<T>);
             return c_a_array(sizeof(cache_aligned<T>), num_el, c_line_size);
        }
        static void operator delete (void* ptr) {
             free_c_a(ptr);
        }
        T* operator-> () {
             return &instance;
        }
        T& operator * () {
             return instance;
        }
};

The helper functions for the cache-aligned allocation:

void* c_a_array(uint size, ulong num_el, uint c_line_size) {
    void* mem = malloc((size + c_line_size) * num_el + sizeof(void*));
    void** ptr = (void**)((char*)mem + sizeof(void*));
    ptr[-1] = mem;
    return ptr;
}

void free_c_a(void* ptr) {
    free(((void**)ptr)[-1]);
}
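For reference, here is a minimal sketch of how such helpers could round the returned address up to a cache line boundary and give every element a stride that is a multiple of the line size. The names round_up, aligned_array_alloc and aligned_array_free are hypothetical and not part of the code above. The sketch assumes a C++11 compiler, a line size that is a power of two, and does no overflow checking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Round n up to the next multiple of line (line must be a power of two).
inline std::size_t round_up(std::size_t n, std::size_t line) {
    return (n + line - 1) & ~(line - 1);
}

// Allocate num_el slots; each slot is `size` bytes rounded up to a multiple
// of `line`, and the first slot starts on a `line`-byte boundary.
void* aligned_array_alloc(std::size_t size, std::size_t num_el, std::size_t line) {
    assert(line != 0 && (line & (line - 1)) == 0);
    const std::size_t stride = round_up(size, line);
    // Extra room: one pointer for bookkeeping plus up to line - 1 bytes of slack.
    void* raw = std::malloc(stride * num_el + sizeof(void*) + line - 1);
    if (raw == nullptr) return nullptr;
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    const std::uintptr_t aligned =
        (start + line - 1) & ~(static_cast<std::uintptr_t>(line) - 1);
    void** user = reinterpret_cast<void**>(aligned);
    user[-1] = raw;  // remember the address malloc returned, for free()
    return user;
}

// Release a block obtained from aligned_array_alloc.
void aligned_array_free(void* ptr) {
    if (ptr != nullptr) std::free(static_cast<void**>(ptr)[-1]);
}
```

With this layout, element i lives at byte offset i * round_up(size, line) from the returned pointer, which is exactly the stride that ordinary pointer arithmetic on cache_aligned<T>* does not produce.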

Here is the problem: access to the array elements should work like this:

my_el_array[i]; // returns cache_aligned<element_type>
*(my_el_array[i]); // returns element_type
my_el_array[i]->member_of_element();

My ideas for solving it are:

(1) Something like this, overloading the sizeof operator:

static size_t operator sizeof () {
   return sizeof(cache_aligned<T>) + c_line_size;
}

--> not possible, since overloading the sizeof operator is illegal

(2) Something like this, overloading operator[] for the pointer type:

static T& operator [] (uint index, cache_aligned<T>* ptr) {
    return ptr + ((sizeof(cache_aligned<T>) + c_line_size) * index);
}

--> not possible in C++ anyway: operator[] must be a non-static member function, so it cannot be overloaded for a raw pointer type

(3) A totally trivial solution:

template <typename T> class cache_aligned {
    private:
          T instance;
          bool buffer[CACHE_LINE_SIZE]; 
          // CACHE_LINE_SIZE defined as macro
    public:
          // trivial operators and methods ;)
};

--> I don't know whether this is reliable; I am using gcc-4.5.1 on Linux ...
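A related variant of idea (3), sketched here under the assumption that a C++11 compiler is available, lets the compiler do the padding through alignas instead of a hand-written buffer member. The names padded, kCacheLineSize and demo_stride are hypothetical, and 64 bytes is only an assumed line size. On older GCC versions the `__attribute__((aligned(N)))` extension plays a similar role.

```cpp
#include <cstddef>

// Assumed cache line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLineSize = 64;

// alignas forces the alignment of the wrapper to the line size. Because
// sizeof is always a multiple of the alignment, the compiler pads the
// wrapper up to the next full cache line.
// Assumes alignof(T) is not larger than kCacheLineSize.
template <typename T>
struct alignas(kCacheLineSize) padded {
    T instance;
    T* operator->() { return &instance; }
    T& operator*() { return instance; }
};

static_assert(sizeof(padded<int>) == kCacheLineSize,
              "a small element occupies exactly one line");
static_assert(sizeof(padded<char[70]>) == 2 * kCacheLineSize,
              "a 70-byte element is padded to two lines");

// Distance in bytes between neighbouring elements of a padded<int> array.
inline std::ptrdiff_t demo_stride() {
    static padded<int> arr[4];
    return reinterpret_cast<char*>(&arr[1]) - reinterpret_cast<char*>(&arr[0]);
}
```

Since the padding is part of sizeof, plain indexing such as arr[i] and arr[i]->member() lands on line boundaries without any custom operator[]. One caveat: for heap arrays, a plain new expression only has to honour the over-alignment from C++17 onwards, so with older compilers the storage would still have to come from an aligned allocator.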

(4) Replacing T instance; with T* instance_ptr; in the class template and using operator[] to calculate the position of the element, like this:

| pointer-to-instance | ----> | element 0 | buffer | ... | element (N-1) | buffer |

This is not the intended semantics, since the instance of the class template becomes the bottleneck when calculating the addresses of the elements.
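For comparison, idea (4) could look roughly like the following sketch: a handle object owns one aligned buffer and computes each element's address from a stored stride. The class name aligned_array and its members are hypothetical. The sketch assumes C++11, a line size that is a power of two and at least alignof(T), and a default-constructible T. It is not exception safe if a constructor of T throws.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

template <typename T>
class aligned_array {
public:
    aligned_array(std::size_t n, std::size_t line)
        : n_(n), stride_((sizeof(T) + line - 1) / line * line) {
        assert(line != 0 && (line & (line - 1)) == 0);
        // Over-allocate by line - 1 bytes so the start can be rounded up.
        raw_ = std::malloc(stride_ * n_ + line - 1);
        if (raw_ == nullptr) throw std::bad_alloc();
        const std::uintptr_t a =
            (reinterpret_cast<std::uintptr_t>(raw_) + line - 1)
            & ~(static_cast<std::uintptr_t>(line) - 1);
        base_ = reinterpret_cast<char*>(a);
        // Construct every element at the start of its own slot.
        for (std::size_t i = 0; i < n_; ++i) new (base_ + i * stride_) T();
    }
    ~aligned_array() {
        // Destroy in reverse order, then release the raw block.
        for (std::size_t i = n_; i > 0; --i) (*this)[i - 1].~T();
        std::free(raw_);
    }
    aligned_array(const aligned_array&) = delete;
    aligned_array& operator=(const aligned_array&) = delete;

    // Element i starts at byte offset i * stride from the aligned base.
    T& operator[](std::size_t i) { return *reinterpret_cast<T*>(base_ + i * stride_); }
    std::size_t size() const { return n_; }
    std::size_t stride() const { return stride_; }

private:
    std::size_t n_;
    std::size_t stride_;
    void* raw_;   // address returned by malloc
    char* base_;  // first cache-aligned address inside raw_
};
```

Here arr[i] yields the element itself rather than a cache_aligned<T>, so every access goes through the handle, which is the indirection described above.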

Thanks for reading! I don't know how to shorten the problem. It would be great if you could help! Any workaround would help a lot.

I know that alignment support is an extension in C++0x. However, it is not available in gcc yet.

Greetz, sema
