Is a state check always the efficient thing to do?

Posted on 2024-12-11 23:49:55

Suppose a value only needs to be bound to a certain data member of a
certain object when bState is true. When bState is false, the binding
is unnecessary, but it does no harm either.

Which of the following pieces of code would be more efficient, and why?

(EDIT: updated, state is now a member of the object)

const int x = 100;      // number of objects (illustrative value)
int i = 0;
int iToBind = 1;        // value to bind (illustrative value)
Classname pObject[x];

for (; i < x; ++i) {
    if (pObject[i].bState) {
        pObject[i].somedatamember = iToBind;
    }
}

Versus:

for (; i < x; ++i) {
    pObject[i].somedatamember = iToBind;
}

煮茶煮酒煮时光 2024-12-18 23:49:55

I would say the latter is definitely quicker. The first version both reads from and writes to memory, while the latter only writes.

In this version:

for (; i < x; ++i) {
  if (pObject[i].bState) {
    pObject[i].somedatamember = iToBind;
  }
}

there is a stall at the if statement, because the CPU must wait for bState to be read from memory. How long the read takes depends on where the data resides. The further from the CPU, the longer it takes: L1 cache (fastest), L2, L3, RAM, disk (slowest).

In this version:

for (; i < x; ++i) {
  pObject[i].somedatamember = iToBind;
}

there are only writes to memory. Writes generally do not stall the CPU, because stores can be buffered while execution continues.

Besides the memory access times, the latter loop has no conditional jump inside the loop body. Conditional jumps can be a significant overhead, especially if the taken/not-taken decision is effectively random.
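To make the comparison concrete, here is a minimal C++ sketch of the two variants as functions. `Item` is a hypothetical stand-in for the question's `Classname`, and `bindChecked`, `bindAlways` and `activeItemsHold` are names invented for this illustration. It only shows that both loops leave the elements with `bState` set in the same condition. It says nothing about which one is faster on a given machine.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the question's Classname.
struct Item {
    bool bState = false;
    int somedatamember = 0;
};

// Variant 1: read bState, then write only when it is set.
void bindChecked(std::vector<Item>& items, int iToBind) {
    for (Item& item : items) {
        if (item.bState) {
            item.somedatamember = iToBind;
        }
    }
}

// Variant 2: write unconditionally; bState is never read.
void bindAlways(std::vector<Item>& items, int iToBind) {
    for (Item& item : items) {
        item.somedatamember = iToBind;
    }
}

// Helper: true when every element with bState set holds `expected`.
bool activeItemsHold(const std::vector<Item>& items, int expected) {
    for (const Item& item : items) {
        if (item.bState && item.somedatamember != expected) {
            return false;
        }
    }
    return true;
}
```

The only observable difference is in the elements whose `bState` is false: the first variant leaves them untouched, the second overwrites them, which the question states is harmless.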

甜心小果奶 2024-12-18 23:49:55

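It all depends on what you simplified for the post. If you are adding a branch only to skip setting a variable, you probably gain nothing, and you may lose something when branch prediction fails. I would remove the test.

Now, if the object being updated is not a simple int, then... As always: measure, profile, and then decide based on facts rather than hunches. If this is not part of a tight loop, chances are you will not notice a difference either way.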
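As a rough illustration of "measure first", the following C++ sketch times both loops with `std::chrono::steady_clock`. Everything here is an assumption made for the example: the `Item` type, the helper names `timeNanoseconds`, `makeItems` and `measureBoth`, and the alternating `bState` pattern (which is easy for a branch predictor, so real data may behave differently). A single run like this is noisy; a proper profiler or benchmark harness gives more trustworthy numbers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Classname.
struct Item {
    bool bState = false;
    int somedatamember = 0;
};

// Time a single call of `work` in nanoseconds using a monotonic clock.
template <typename F>
long long timeNanoseconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Build n items; bState is true at even indices (an assumed test pattern).
std::vector<Item> makeItems(std::size_t n) {
    std::vector<Item> items(n);
    for (std::size_t i = 0; i < n; ++i) {
        items[i].bState = (i % 2 == 0);
    }
    return items;
}

struct Timings {
    long long checkedNs;
    long long alwaysNs;
    long long checksum;  // sum of all written values, keeps the work observable
};

Timings measureBoth(std::size_t n, int iToBind) {
    std::vector<Item> checked = makeItems(n);
    std::vector<Item> always = makeItems(n);

    Timings t{};
    t.checkedNs = timeNanoseconds([&] {
        for (Item& item : checked) {
            if (item.bState) item.somedatamember = iToBind;
        }
    });
    t.alwaysNs = timeNanoseconds([&] {
        for (Item& item : always) item.somedatamember = iToBind;
    });

    // Read the results back so the loops cannot be discarded as dead code.
    for (const Item& item : checked) t.checksum += item.somedatamember;
    for (const Item& item : always) t.checksum += item.somedatamember;
    return t;
}
```

Calling `measureBoth` with a size close to the real workload and comparing `checkedNs` against `alwaysNs` over many runs gives a first impression; the numbers depend heavily on compiler flags, data layout and hardware.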

汹涌人海 2024-12-18 23:49:55

Have you ever heard of Loop Invariant Code Motion (LICM)?

It is a compiler optimization pass that moves code out of loop bodies whenever possible.

For example, given the following code:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  for (int i = 0; i < argc; ++i) {
    if (argc < 100) {
      printf("%d\n", atoi(argv[1]));
    }
  }
}

Clang generates the following IR:

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind {
  %1 = icmp sgt i32 %argc, 0
  br i1 %1, label %.lr.ph, label %._crit_edge

.lr.ph:                                           ; preds = %0
  %2 = icmp slt i32 %argc, 100
  %3 = getelementptr inbounds i8** %argv, i64 1
  br i1 %2, label %4, label %._crit_edge

; <label>:4                                       ; preds = %4, %.lr.ph
  %i.01.us = phi i32 [ %9, %4 ], [ 0, %.lr.ph ]
  %5 = load i8** %3, align 8, !tbaa !0
  %6 = tail call i64 @strtol(i8* nocapture %5, i8** null, i32 10) nounwind
  %7 = trunc i64 %6 to i32
  %8 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 %7) nounwind
  %9 = add nsw i32 %i.01.us, 1
  %exitcond = icmp eq i32 %9, %argc
  br i1 %exitcond, label %._crit_edge, label %4

._crit_edge:                                      ; preds = %4, %.lr.ph, %0
  ret i32 0
}

Which can be translated back to C:

int main(int argc, char** argv) {
  if (argc <= 0) { return 0; }

  if (argc >= 100) { return 0; }

  for (int i = 0; i < argc; ++i) {
    printf("%d\n", atoi(argv[1]));
  }

  return 0;
}
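The same hoisting can be written by hand. The sketch below is a C++ illustration with invented names (`sumWithCheckInside`, `sumWithCheckHoisted`, `threshold`): the first function tests a loop-invariant condition on every iteration, the second tests it once before the loop. LLVM refers to hoisting an invariant branch like this as loop unswitching. An optimizing compiler can often perform it on its own, so writing it manually is mainly useful for understanding the transformation.

```cpp
#include <cassert>
#include <vector>

// Invariant test inside the loop: `count < threshold` gives the same
// answer on every iteration because neither value changes in the body.
int sumWithCheckInside(const std::vector<int>& values, int threshold) {
    const int count = static_cast<int>(values.size());
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        if (count < threshold) {
            sum += values[i];
        }
    }
    return sum;
}

// Same behavior with the invariant test hoisted out of the loop.
int sumWithCheckHoisted(const std::vector<int>& values, int threshold) {
    const int count = static_cast<int>(values.size());
    if (count >= threshold) {
        return 0;  // the loop body would never do any work
    }
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        sum += values[i];
    }
    return sum;
}
```

Once the condition depends on the element being visited, as with `pObject[i].bState` in the edited question, it is no longer invariant and cannot be hoisted this way.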

Conclusion: don't bother with micro-optimizations unless a profiler reveals they are not as micro as you thought.

EDIT:

The edit radically changed the question (god I hate that :p). LICM no longer applies, and the two loops now behave quite differently.

The conclusion however remains identical. A single if check within a for loop does not change the fundamental complexity of your code (remember that the loop condition is tested at each iteration too...).
