Preventing aliasing between the internals of two objects

Posted on 2024-11-14 14:04:43

I have a function signature similar to this:

void Multiply(const MatrixMN& a, const MatrixMN& b, MatrixMN& out);

Internally, the matrix class has a float* data; member that holds the m x n components. I'd like to tell the compiler that a and b do not alias the out matrix, so that it doesn't emit a ton of redundant loads and stores.

How would I go about doing that? I know I could pass pointers in the function signature and mark them with __restrict (in MSVC), but I'd like to keep the idiom of passing objects by reference, where each object contains a pointer to its memory.

I also know that __restrict does not work on object references.

Comments (4)

行至春深 2024-11-21 14:04:43

Depending on how the optimizer works, an assert(&a != &out && &b != &out) at the top may do the trick. You could also get rid of the out parameter and trust the optimizer to eliminate the superfluous copies (assuming it is a pure out parameter, of course). If the code is a candidate for inlining, the compiler may see on its own that nothing is aliased. If restrict really doesn't work on reference parameters, you can add an extra level to the function call and pass all three to a second function that accepts properly restricted pointers. Hopefully, that one would get inlined for you.
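To illustrate the "get rid of the out parameter" idea, here is a minimal sketch. It assumes a hypothetical value-semantic MatrixMN backed by std::vector<float> (the question's class holds a raw float* instead), so the names and layout are assumptions, not the asker's actual class. The result is written into freshly allocated storage and each element is accumulated in a local variable; whether a given optimizer actually exploits that is compiler-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's MatrixMN (row-major, owns its data).
struct MatrixMN {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<float> data;

    MatrixMN(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
};

// Returns the product by value instead of writing through an out parameter.
MatrixMN Multiply(const MatrixMN& a, const MatrixMN& b) {
    assert(a.cols == b.rows);
    MatrixMN result(a.rows, b.cols);  // fresh storage, distinct from a and b
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t j = 0; j < b.cols; ++j) {
            float sum = 0.0f;  // accumulate in a local, store once per element
            for (std::size_t p = 0; p < a.cols; ++p) {
                sum += a.data[i * a.cols + p] * b.data[p * b.cols + j];
            }
            result.data[i * b.cols + j] = sum;
        }
    }
    return result;  // eligible for copy elision, otherwise moved
}
```

A call then reads MatrixMN c = Multiply(a, b); and the caller can no longer pass an output that overlaps an input.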

晚风撩人 2024-11-21 14:04:43

Write a non-exported (file-static, private) multiplication function that takes float* arguments, and mark those arguments with restrict. Make Multiply call this function.
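A minimal sketch of that suggestion follows. The MatrixMN struct, its rows/cols fields, the row-major layout and the name MultiplyKernel are all assumptions made for illustration; only the float* data member comes from the question. The kernel lives in an anonymous namespace, so it has internal linkage, and its float* parameters carry __restrict (accepted by MSVC, GCC and Clang).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the question's MatrixMN; does not own its buffer.
struct MatrixMN {
    std::size_t rows;
    std::size_t cols;
    float* data;  // rows * cols floats, row-major
};

namespace {

// File-local kernel: the caller promises that out does not overlap a or b.
void MultiplyKernel(const float* __restrict a, const float* __restrict b,
                    float* __restrict out,
                    std::size_t m, std::size_t k, std::size_t n) {
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                sum += a[i * k + p] * b[p * n + j];
            }
            out[i * n + j] = sum;
        }
    }
}

}  // namespace

// Public interface keeps the pass-by-reference idiom.
void Multiply(const MatrixMN& a, const MatrixMN& b, MatrixMN& out) {
    assert(a.cols == b.rows && out.rows == a.rows && out.cols == b.cols);
    // Cheap debug check; it catches identical buffers, not partial overlap.
    assert(a.data != out.data && b.data != out.data);
    MultiplyKernel(a.data, b.data, out.data, a.rows, a.cols, b.cols);
}
```

Passing the raw data pointers, rather than pointers to the matrix objects, puts the no-alias promise on the float buffers themselves.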

顾铮苏瑾 2024-11-21 14:04:43

Since you seem to be comfortable with __restrict pointers, I would use what you know, but you can still wrap it and provide an interface using references:

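#include <cassert>
#include <cstdlib>

// Declared first so that Multiply can call it.
void DoMultiply(MatrixMN const * __restrict a, MatrixMN const * __restrict b,
                MatrixMN * __restrict out);

void Multiply(const MatrixMN& a, const MatrixMN& b, MatrixMN& out) {
  if (&a == &b || &a == &out || &b == &out) {
    // indicate precondition violation however you like
    assert(!"precondition violated");
    abort();  // assert is compiled out when NDEBUG is defined
  }
  else {
    DoMultiply(&a, &b, &out);
  }
}

void DoMultiply(MatrixMN const * __restrict a, MatrixMN const * __restrict b,
                MatrixMN * __restrict out)
{
  // Note: __restrict here applies to the MatrixMN objects themselves; the
  // float buffers behind their data members are not covered by it.
  //...
}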

Make the pointer version "non-public", such as placing it in a "details" namespace, giving it internal linkage (not applicable in this exact case), or giving it a special name. You could even use local variables instead of parameters and put the function body within the "else", but I find the above cleaner.
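The "local variables instead of parameters" variant could look roughly like this. It is a sketch that assumes a hypothetical MatrixMN with rows, cols and a row-major float* data, and it assumes your compiler honors __restrict on local pointer declarations (how much the optimizer uses it varies).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the question's MatrixMN; does not own its buffer.
struct MatrixMN {
    std::size_t rows;
    std::size_t cols;
    float* data;  // rows * cols floats, row-major
};

void Multiply(const MatrixMN& a, const MatrixMN& b, MatrixMN& out) {
    assert(a.cols == b.rows && out.rows == a.rows && out.cols == b.cols);
    if (a.data == out.data || b.data == out.data) {
        // Precondition violated. This only detects identical buffers,
        // not partially overlapping ones.
        std::abort();
    } else {
        // Local restricted views of the three buffers.
        const float* __restrict pa = a.data;
        const float* __restrict pb = b.data;
        float* __restrict po = out.data;
        const std::size_t m = a.rows, k = a.cols, n = b.cols;
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                float sum = 0.0f;
                for (std::size_t p = 0; p < k; ++p) {
                    sum += pa[i * k + p] * pb[p * n + j];
                }
                po[i * n + j] = sum;
            }
        }
    }
}
```

This keeps everything in one function at the cost of mixing the precondition check with the arithmetic.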

埋情葬爱 2024-11-21 14:04:43

How about a macro wrapper, so that the __restrict qualification is applied at the call site at compile time (the code below is pseudocode, not checked):

#define Multiply(A,B,C) Multiply_restrict(&(A), &(B), &(C))

Now the intermediate method is defined as,

inline void Multiply_restrict(const MatrixMN* __restrict pA,
            const MatrixMN* __restrict pB, MatrixMN* __restrict pC)
{
  Multiply_(*pA, *pB, *pC);
}

And finally, just add an _ after the name of your original Multiply (declared before Multiply_restrict so that the call above compiles):

void Multiply_(const MatrixMN& a, const MatrixMN& b, MatrixMN& out);

So the final effect is exactly the same as calling:

Multiply(x, y, answer);