继续循环正在大大减慢Clang的运行时间
问题
我遇到了一个leetcode问题 gas-station> gas-station
但是我发现我的代码稍微有点如果/else 而不是,如果/继续
。
编辑:在用测试案例进行扭曲之后。我发现问题是继续
Clang似乎正在认真对待继续
。我的猜测是,帽子叮当声以某种方式试图将零件放在继续之前,然后循环进行检查。
简而言之:我想知道:
- 我如何“说服”叮当声理解我对继续的实施,例如/否则?因为我不想改变我的编码风格。
编辑 sidenote:删除分支预测标签。
相对问题
我应该使用返回/继续语句而不是if-else?看来它们几乎没有差异。
调查
编辑:演示表明问题是继续
我使用a 10k数据集测试代码,发现问题是继续
本身。
如果 的身体结束,然后将运行时增加了,然后将运行时间增加到 30%〜50%,这对我来说仍然令人惊讶。
live demo
数量差:11070至15060(w/继续
)
于2022/5/30编辑:对不起,我发现我以前的天真分析错误地从一部分计算中优化了...这会导致差异按100 ... 糟糕的演示
propoling code
#include <iostream>
#include <vector>
#include <chrono>
#include <cstdlib>
using namespace std; // Yeah I know it's bad.
// input , output
class Solution {
public:
int canCompleteCircuit(vector<int>& gas, vector<int>& cost) {
if(gas.empty()){
return -1;
}
int len = gas.size();
// start at the last station, move start backward if we cannot finished the loop
int start = len - 1;
int tank = gas[start] - cost[start];
int cur = 0; // from first station
while(start != cur){ // other condition?
if(tank < 0){
// if not, move start backward, check the value
// continue...start > 0;
start--;
tank += gas[start] - cost[start];
// continue;
}
else{
tank += gas[cur] - cost[cur];
cur++;
}
}
//cout << tank ;
if(tank < 0){
return -1;
}
return start;
}
};
volatile int tryNoToBeOptimizedOut = 0;
int main() {
Solution s;
vector<int> gas{...}; // 10k data here
vector<int> cost{...};
int count = 100;
vector<int> results;
for(int i = 0; i <count; i++){
// size_t rIndex = rand()%gas.size();
// gas[rIndex] += i;
gas[i] += i;
auto start = std::chrono::system_clock::now();
int ret = s.canCompleteCircuit(gas,cost);
auto end = std::chrono::system_clock::now();
auto elapsed = end - start;
std::cout << elapsed.count() << '\n';
tryNoToBeOptimizedOut = ret;
results.push_back(ret);
}
tryNoToBeOptimizedOut = results[rand()%results.size()];
return 0;
}
旧测试
在第一位,我检查了我检查了我检查的。在GCC中实施,发现两个版本在组装中几乎相同。
gcc中的实时演示
我知道leetcode使用 clang 11 with o2 。
因此,我检查了Clang 11的差异。
clang 11
实时演示
clang 14中的实时演示没有更好的
它们之间的结果有所不同。
看来 cur ++;
合并为,而
在 If/code> If/contine
版本中。
我不确定为什么....这与优化政策有关吗?
我的代码:
int canCompleteCircuit(vector<int>& gas, vector<int>& cost) {
if(gas.empty()){
return -1;
}
int len = gas.size();
// start at the last station, move start backward if we cannot finished the loop
int start = len - 1;
int tank = gas[start] - cost[start];
int cur = 0; // from first station
while(start != cur){ // other condition?
if(tank < 0){
// if not, move start backward, check the value
// continue...start > 0;
start--;
tank += gas[start] - cost[start];
continue;
}
// else{
tank += gas[cur] - cost[cur];
cur++;
// continue; <-- If I put another continue there then the runtime is similar to if/else version. But is there a better way than adding conitnue manually?
// }
}
if(tank < 0){
return -1;
}
return start;
}
/ else
版本的
.LBB0_5: # in Loop: Header=BB0_3 Depth=1
movsxd rcx, ecx # `else` part
mov edx, dword ptr [r14 + 4*rcx]
sub edx, dword ptr [rbx + 4*rcx]
add ecx, 1
.LBB0_6: # in Loop: Header=BB0_3 Depth=1
add eax, edx
mov edx, eax
shr edx, 31
cmp esi, ecx # `while(start != cur){`
je .LBB0_7
.LBB0_3: # =>This Inner Loop Header: Depth=1
test dl, 1 # `if(tank < 0){`
je .LBB0_5 # go to else
movsxd rdi, esi
add esi, -1
mov edx, dword ptr [r14 + 4*rdi - 4]
sub edx, dword ptr [rbx + 4*rdi - 4]
jmp .LBB0_6
中的 indembly in /继续
继续继续版本
.LBB0_3: # =>This Loop Header: Depth=1
mov r9, rdx
mov edx, r10d
sub edx, r9d
add rdx, 4
movsxd rdi, r9d
lea rsi, [r12 + 4*rdi]
lea rbx, [r14 + 4*rdi]
xor edi, edi
.LBB0_4: # Parent Loop BB0_3 Depth=1
test cl, 1 # `if` part
jne .LBB0_5
add eax, dword ptr [rbx + 4*rdi] # after `if` statement
sub eax, dword ptr [rsi + 4*rdi]
mov ecx, eax
shr ecx, 31
add rdi, 1
cmp edx, edi
jne .LBB0_4
jmp .LBB0_7
.LBB0_5: # in Loop: Header=BB0_3 Depth=1
add eax, dword ptr [r14 + 4*r8 - 4] # Body of `if`
sub eax, dword ptr [r12 + 4*r8 - 4]
add r8, -1
mov esi, r10d
sub esi, r9d
add esi, 3
mov ecx, eax
shr ecx, 31
movsxd rdx, r9d
add rdx, rdi
add r10d, -1
cmp esi, edi
jne .LBB0_3
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
简而言之,
继续
在您的代码中不是理想的选择,但是如果> ,我不认为我可以给您一个具体的答案但是我认为这是分支大小的某种比例,而相同情况的机会。运行此示例我得到的结果是野性的,从跑步到运行可能会改变大量数量。在示例中,我尝试以编译器不会尝试压缩主内部所有内容的方式将主循环混合在一起,但基本上由您的解决方案组成,并使用IF/Regine和两个LOOP测试。我添加了
&lt; type&gt; _loop()
以比较循环如何在没有条件的情况下运行并实现I used the AMD profiler, this is the first time i used one of these and a least with this one is realy nice to use
Small rant: The leetcode testing system is so bad that makes overwatch look balanced, just by rerunning my code the timing changed drastically (min->
It is not, in short,
continue
is not ideal in your code but it is not slower or faster thanif
and i don't think i can give you a concrete answer but i think it's some kind of ratio of branch size over the chance of same condition. Running this example the results i get are wild, from run to run it can change a massive amount; in the example i tried to mix up a bit the main loop in a way that the compiler wont try to compress everything inside main but it basically consists of your solution and mine with if/continue and with two loop test.I added the
<type>_loop()
to compare how the loop would run without the condition and implementing something like a duff or a loop unroling (to me they are the same thing but writen in a slightly different way); somehow the "faster" version runs slower and i only suspect is a bandwidth problem just from the time waitedI used the AMD profiler, this is the first time i used one of these and a least with this one is realy nice to use
Small rant: The leetcode testing system is so bad that makes overwatch look balanced, just by rerunning my code the timing changed drastically (min->71ms, max->155ms; with 26 runs) and the memory usage is just a lie, this example runs with ~10mb and it is telling me that just the function is consuming close to 70mb
<sample.hpp>
is defines the gas/cost arrays from that long as sample you gave