当前位置：文江博客话题详情

RUST optimization syntax

与/Rust中的其他情况相比，在图案上匹配的速度是否快？

发布于 2025-01-20 03:43:30 字数 939 浏览 3 评论 0 原文

假设您有一个枚举：

enum Expr {
    binExp,
    unExp, 
    Literal, 
    Group,
}

您应该选择 match 还是 if/else 链来调用基于 Expr 的方法？

match self { 
    Expr::binExp  => todo!(),
    Expr::unExp   => todo!(),
    Expr::Literal => todo!(),
    Expr::Group   => todo!()
}

if let Expr::binExp = self {
    todo!()
} else if let Expr::unExp = self {
    todo!()
} else if let Expr::Lit = self {
    todo!()
} else if let Expr::Group(e) = self {
    todo!()
} else {
    todo!()
}

换句话说，match 是并行操作吗？我知道 if/else 会按顺序遍历每个表达式，当它们遇到 false 时对其进行评估，最后停止。

match 语句是模仿这种行为还是直接“跳转”到正确的模式？我特别问这个问题是因为 match 可以匹配多个模式，但始终选择首先匹配的模式。所以听起来 match 在这方面可能是连续的。

原文

Let's say you have an enum:

enum Expr {
    binExp,
    unExp, 
    Literal, 
    Group,
}

Should you prefer match or if/else chains to call methods based on Expr?

match self { 
    Expr::binExp  => todo!(),
    Expr::unExp   => todo!(),
    Expr::Literal => todo!(),
    Expr::Group   => todo!()
}

if let Expr::binExp = self {
    todo!()
} else if let Expr::unExp = self {
    todo!()
} else if let Expr::Lit = self {
    todo!()
} else if let Expr::Group(e) = self {
    todo!()
} else {
    todo!()
}

Stated another way, is match a parallel operation? I know if/else will go through each expression sequentially, evaluating them as they are encountered as false and finally make a stop.

Do match statements mimic this behavior or do they directly "jump" to the correct pattern? I ask this specifically because match can match on multiple patterns, but always select the pattern that matches it first. So it does sound like match may be sequential in that regard.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

墨离汐 2025-01-27 03:43:30

在发布模式下编译时，编译器旨在在两种情况下的每种情况下产生相同的代码。但是，匹配方法是更轻松供编译器处理，实际上，在某些示例中，编译器无法为产生最佳代码，如果/ else 链。

应该如何编译此类代码

该Rust编译器由两半编译。由Rust Developers撰写的一半，检查您的源代码是否有效（例如执行借用检查器规则），然后将其编译为LLVM IR；下半年是LLVM，它是多种不同语言之间共享的编译器后端，它采用LLVM IR并产生可执行文件。

对于您编写的代码，a）编写了各种 expr 的测试的顺序无关将匹配。这些是有关如何在可执行文件中表达比赛的灵活性的信息（至少在我测试的处理器上，LLVM的选择是使用一系列可能的跳跃位置，并使用它们索引。判断您的枚举，即实际上是在并行执行所有测试）。

当您编写匹配示例时，汇编的生锈部分将了解所有信息本身。它使用llvm ir的 switch> switch 语句来写“请根据枚举判别的价值跳到这四个位置之一”的LLVM IR，该语句可与Rust的 Match> Match 相媲美。，并且IR看起来像这样（请注意第一行中的“无法实现的”，指定比赛的默认情况是无法实现的 - 编译器已确定以外的四个是不可能的）：

  switch i64 %0, label %default.unreachable [
    i64 0, label %bb3
    i64 1, label %bb4
    i64 2, label %bb5
    i64 3, label %bb2
  ]

使用>如果/ else 示例，则在较早的编译阶段，编译器不明白可以使用单个测试进行此操作，并且它的写入相当于四个>如果语句，每种语句都有自己的测试（尽管从技术上讲，实际上只有三个测试是必要的，因为如果前三个失败，那么第四个将始终通过）。如果您以调试模式进行编译，则最终的可执行文件实际上分别进行了每个测试。

如果您在发布模式下编译怎么办？好吧，编译器具有许多优化（仅在发布模式下运行），这些优化旨在检测效率较低的代码并用更有效的代码替换。在这种情况下，优化注意到整个链只是对单个整数（枚举判别）的检查。因此，优化最终将IF/else链重写为一个单个开关块，几乎与 Match 案例所获得的IR几乎相同。在许多情况下，这意味着您以任何一种方式获得相同的代码。

不幸的是，在这种情况下

，在某些情况下，编译器确实无法处理这种情况并产生略有优势的代码。假设我们有一个 enum 具有七个选项，每个选项都没有字段，然后尝试 Match 和，如果/ else else < /code>链上：

pub enum ExampleEnum {
    A, B, C, D, E, F, G
}

pub fn match1(e: ExampleEnum) {
    match e { 
        ExampleEnum::A => panic!("a"),
        ExampleEnum::B => panic!("b"),
        ExampleEnum::C => panic!("c"),
        ExampleEnum::D => panic!("d"),
        ExampleEnum::E => panic!("e"),
        ExampleEnum::F => panic!("f"),
        ExampleEnum::G => panic!("g"),
    }
}

pub fn match2(e: ExampleEnum) {
    if let ExampleEnum::A = e {
        panic!("a")
    } else if let ExampleEnum::B = e {
        panic!("b")
    } else if let ExampleEnum::C = e {
        panic!("c")
    } else if let ExampleEnum::D = e {
        panic!("d")
    } else if let ExampleEnum::E = e {
        panic!("e")
    } else if let ExampleEnum::F = e {
        panic!("f")
    } else if let ExampleEnum::G = e {
        panic!("g")
    } else {
        unreachable!()
    }
}

在发布模式下，我们希望它们产生相同的代码。不幸的是，有了这么多选项，编译器失去了最终 else 块是无法到达的事实，而它折叠了整个，如果/ else 将单个切换链链链，因此LLVM IR在两种情况下看起来略有不同：

match1 （使用 match> match ） match2 代码>（使用使用/ else ）

`match1` （使用 `match> match` ）	`match2` 代码>（使用使用/ `else` ）
switch i8％0，label％bb1 [ i8 0，标签％bb3 i8 1，标签％bb4 i8 2，标签％bb5 i8 3，标签％bb6 i8 4，标签％bb7 i8 5，标签％bb8 i8 6，标签％bb2 这是给 BB1：无法到达	开关i8％0，标签％bb14 [ i8 0，标签％bb1 i8 1，标签％bb3 i8 2，标签％bb5 i8 3，标签％BB7 i8 4，标签％bb9 i8 5，标签％bb11 i8 6，标签％bb13 这是给 BB14： ;呼叫核心:: panicking :: panic 下

 switch i8％0，label％bb1 [
    i8 0，标签％bb3
    i8 1，标签％bb4
    i8 2，标签％bb5
    i8 3，标签％bb6
    i8 4，标签％bb7
    i8 5，标签％bb8
    i8 6，标签％bb2
  这是给

BB1：
  无法到达

开关i8％0，标签％bb14 [
    i8 0，标签％bb1
    i8 1，标签％bb3
    i8 2，标签％bb5
    i8 3，标签％BB7
    i8 4，标签％bb9
    i8 5，标签％bb11
    i8 6，标签％bb13
  这是给

BB14：
;呼叫核心:: panicking :: panic
   下

在两种情况代码>无法实现的（相当于Rust的 Unrackable_unchecked！（）），而使用 if / else ，则使用，无法实现的！ panic（）），并且由于这是LLVM所拥有的所有信息，因此它不会优化 Unterach！（）从最终程序中调用。

尽管如此，在可执行文件中拥有死亡代码并不重要 - 如果不运行，它不会占用太多时间 - 但是在此七元素 enum 情况下，由此产生的可执行文件是如果/ else 案例，>在中会略慢。编译X86_64（作为一个常用的示例），大多数生成的汇编代码都是相同的，除了行标签的名称，但函数开始附近的一个小部分（我手动更改了空格以显示差异），以显示差异）：

match1 （使用 match ） match2 （使用如果/ else> else ）

`match1` （使用 `match` ）	`match2` （使用如果/ `else> else` ）
match1： subq $ 56，％rsp movzbl％dil，％eax Leaq .ljti0_0（％RIP），％rcx MOVSLQ（％RCX，％RAX，4），％rax ADDQ％RCX，％rax jmpq *％rax	match2： subq $ 56，％rsp CMPB $ 6，％DIL JA .lbb1_9 movzbl％dil，％eax leaq .ljti1_0（％RIP），％rcx MOVSLQ（％RCX，％RAX，4），％rax ADDQ％RCX，％rax JMPQ *％rax

 match1：
    subq $ 56，％rsp


    movzbl％dil，％eax
    Leaq .ljti0_0（％RIP），％rcx
    MOVSLQ（％RCX，％RAX，4），％rax
    ADDQ％RCX，％rax
    jmpq *％rax

 match2：
    subq $ 56，％rsp
    CMPB $ 6，％DIL
    JA .lbb1_9
    movzbl％dil，％eax
    leaq .ljti1_0（％RIP），％rcx
    MOVSLQ（％RCX，％RAX，4），％rax
    ADDQ％RCX，％rax
    JMPQ *％rax

代码正在测试，以查看枚举判别物的值是否高于6（ cmpb $ 6，％dil; ja ，即“比较，然后跳跃，如果超过”），如果是这样，正在执行跳跃。当然，作为7个元素的枚举（其内部酌处量为0到6（包含在内），它不能在6上方具有判别物，因此该测试永远不会通过 - 但是，只要函数是，该测试仍将进行打电话给，因此它的运行速度会稍慢一些，因为测试本身将需要一段时间。请注意，Rust知道，这种枚举的歧视性永远无法超过6-但 llvm ，因此它不能安全地将代码块声明为无法到达。

（

）大多数情况并不重要，但是编译器必须做更多的工作来处理/ else 链条 - 这是对该公司的不自然构造编译器要处理（例如，它不能从句法上确定最终 else 块是无法实现的，并且必须进行静态分析才能解决）。编译器希望使用 Match 块 - 如果平台上的快速，则可以通过单个测试直接跳至正确的代码行，或以一系列测试进行测试否则，它的选择 - 实际上，它将在内部尝试重写/ else 链条中的链接链接到 match> match block。但这在编译中已经很晚了，在这一点上，编译器可能已经失去了对判别物所能拥有的一组价值的确切了解。

简单地编写匹配块直接将使生活变得更加轻松 - 它对阅读代码的人类程序员和compiiler更清楚地表达了您的意图，这将需要做更少的工作以了解这是 enum 的所有选项的详尽匹配。由于编译器正在做更多的工作来处理/ else 链条，因此增加了某些问题会导致代码不完美地优化代码的机会。（我怀疑具有 Match 的版本可能还会为您提供略快的编译时间。）

The compiler aims to produce the same code in each of the two cases, when compiling in release mode. However, the match approach is easier for the compiler to deal with, and in fact there are some examples where the compiler is unable to produce optimal code for the if/else chain.

How this sort of code is supposed to be compiled

The Rust compiler is compiled of two halves. One half, written by the Rust developers, checks that your source code is valid (e.g. by enforcing the borrow checker rules), and then compiles it into LLVM IR; the second half is LLVM, which is a compiler backend shared between multiple different languages, that takes the LLVM IR and produces an executable.

In the case of the code you've written, a) it doesn't matter which order the tests for the various sorts of Expr are written in, and b) it's guaranteed that at least one of them will match. Those are pieces of information that give LLVM a lot of flexibility as to how to express the match in an executable (and at least on the processor I tested, LLVM's choice is to use an array of possible jump locations, and index into them using the discriminant of your enum, i.e. it is actually performing all the tests in parallel).

When you write the match example, the Rust to LLVM IR part of the compilation understands all that information itself. It writes LLVM IR that says "please jump to one of these four locations based on the value of the enum discriminant", using LLVM IR's switch statement which is comparable to Rust's match, and the IR looks something like this (note the "unreachable" in the first line, specifying that the default case of the match is unreachable – the compiler has determined that no options other than these four are possible):

  switch i64 %0, label %default.unreachable [
    i64 0, label %bb3
    i64 1, label %bb4
    i64 2, label %bb5
    i64 3, label %bb2
  ]

With the if/else example, then in earlier stages of compilation, the compiler doesn't understand that doing this with a single test is possible, and it writes out the equivalent of four if statements, each with its own test (even though, technically, only three of the tests are actually necessary because if the first three fail the fourth will always pass). If you compile in debug mode, then the resulting executable actually does each of those tests separately.

What about if you compile in release mode? Well, the compiler has a number of optimizations (running only in release mode) that are designed to detect less efficient code and replace it with more efficient code. In this case, the optimizations notice that the entire chain is simply a check on a single integer (the enum discriminant). So the optimizations end up rewriting the if/else chain into a single switch block, producing almost the same IR as you would get with the match case. In many cases, this means that you get identical code either way.

A case where it does make a difference

Unfortunately, there are some cases where the compiler does fail to handle this situation and produces slightly suboptimal code. Let's say we have an enum with seven options, each of which has no fields, and try the match and if/else chain on it:

pub enum ExampleEnum {
    A, B, C, D, E, F, G
}

pub fn match1(e: ExampleEnum) {
    match e { 
        ExampleEnum::A => panic!("a"),
        ExampleEnum::B => panic!("b"),
        ExampleEnum::C => panic!("c"),
        ExampleEnum::D => panic!("d"),
        ExampleEnum::E => panic!("e"),
        ExampleEnum::F => panic!("f"),
        ExampleEnum::G => panic!("g"),
    }
}

pub fn match2(e: ExampleEnum) {
    if let ExampleEnum::A = e {
        panic!("a")
    } else if let ExampleEnum::B = e {
        panic!("b")
    } else if let ExampleEnum::C = e {
        panic!("c")
    } else if let ExampleEnum::D = e {
        panic!("d")
    } else if let ExampleEnum::E = e {
        panic!("e")
    } else if let ExampleEnum::F = e {
        panic!("f")
    } else if let ExampleEnum::G = e {
        panic!("g")
    } else {
        unreachable!()
    }
}

Playground link

In release mode, we'd expect these to produce the same code. Unfortunately, with this many options, the compiler loses track of the fact that the final else block is unreachable while it's collapsing the whole if/else chain into a single switch, so the LLVM IR looks slightly different in the two cases:

match1 (using match) match2 (using if/else)

`match1` (using `match`)	`match2` (using `if`/`else`)
switch i8 %0, label %bb1 [ i8 0, label %bb3 i8 1, label %bb4 i8 2, label %bb5 i8 3, label %bb6 i8 4, label %bb7 i8 5, label %bb8 i8 6, label %bb2 ] bb1: unreachable	switch i8 %0, label %bb14 [ i8 0, label %bb1 i8 1, label %bb3 i8 2, label %bb5 i8 3, label %bb7 i8 4, label %bb9 i8 5, label %bb11 i8 6, label %bb13 ] bb14: ; call core::panicking::panic tail call void @_ZN4core9panicking5panic…

  switch i8 %0, label %bb1 [
    i8 0, label %bb3
    i8 1, label %bb4
    i8 2, label %bb5
    i8 3, label %bb6
    i8 4, label %bb7
    i8 5, label %bb8
    i8 6, label %bb2
  ]

bb1:
  unreachable

  switch i8 %0, label %bb14 [
    i8 0, label %bb1
    i8 1, label %bb3
    i8 2, label %bb5
    i8 3, label %bb7
    i8 4, label %bb9
    i8 5, label %bb11
    i8 6, label %bb13
  ]

bb14:
; call core::panicking::panic
  tail call void @_ZN4core9panicking5panic…

The compiler has produced a switch block in both cases, but the match specifies the default case as a branch to the LLVM intrinsic unreachable (the equivalent of Rust's unreachable_unchecked!()), whereas with the if/else, the unreachable!() at the end hasn't been proven to be necessarily unreachable, so the default case branches to Rust's unreachable!() macro (which translates into a call to panic()), and because that's all the information that LLVM has, it doesn't optimize the unreachable!() call out of the final program.

Still, having dead code in an executable mostly doesn't matter – if it doesn't run, it isn't taking up much time – but in this seven-element enum case, the resulting executable is going to be marginally slower in the if/else case. Compiling for x86_64 (as a commonly used example), most of the generated assembly code is the same except for the names of line labels, but one small portion near the start of the function differs (I manually changed the whitespace to show the difference):

match1 (using match) match2 (using if/else)

`match1` (using `match`)	`match2` (using `if`/`else`)
match1: subq $56, %rsp movzbl %dil, %eax leaq .LJTI0_0(%rip), %rcx movslq (%rcx,%rax,4), %rax addq %rcx, %rax jmpq *%rax	match2: subq $56, %rsp cmpb $6, %dil ja .LBB1_9 movzbl %dil, %eax leaq .LJTI1_0(%rip), %rcx movslq (%rcx,%rax,4), %rax addq %rcx, %rax jmpq *%rax

match1:
    subq   $56, %rsp


    movzbl %dil, %eax
    leaq   .LJTI0_0(%rip), %rcx
    movslq (%rcx,%rax,4), %rax
    addq   %rcx, %rax
    jmpq   *%rax

match2:
    subq   $56, %rsp
    cmpb   $6, %dil
    ja     .LBB1_9
    movzbl %dil, %eax
    leaq   .LJTI1_0(%rip), %rcx
    movslq (%rcx,%rax,4), %rax
    addq   %rcx, %rax
    jmpq   *%rax

The code is testing to see whether the enum discriminant has a value above 6 (cmpb $6, %dil; ja, i.e. "compare, then jump if above"), and if so, is performing a jump. Of course, as a 7-element enum (whose discrminants are internally numbered 0 to 6 inclusive) it can't have a discriminant above 6, so this test will never pass – but the test is still going to be made whenever the function is called, and so it will run very slightly slower, because the test itself will take some amount of time. (Note that Rust knows that a discriminant of this sort of enum can never go above 6 – but LLVM doesn't, so it can't safely declare the block of code to be unreachable.)

Conclusions

In most cases it isn't going to matter, but the compiler has to do a lot more work to handle the if/else chain – it's something of an unnatural construct for the compiler to handle (e.g. it can't syntactically determine that the final else block is unreachable, and has to do a static analysis to work that out). The compiler would prefer to work with the match block – that allows it to jump directly to the correct line of code with a single test if that's fast on the platform, or to perform the tests in a sequence of its choice otherwise – and in fact it will internally attempt to rewrite the if/else chain into a match block. But that's done quite late in the compile, by which point it's possible that the compiler has lost track of exactly what is known about the set of values that the discriminant can have.

Simply writing the match block directly is going to make life a lot easier – it expresses your intent more clearly both to human programmers reading the code, and to the compiiler, which will need to do less work to understand that this is an exhaustive match on all options of the enum. Because the compiler is doing more work to handle the if/else chain, it increases the chance that something will go wrong which causes the code to be optimized imperfectly. (I suspect that the version with match will probably also give you marginally faster compile times.)

回复收藏 0 原文

~没有更多了~

关于作者

早茶月光

暂无简介

文章

28 人气

关注发私信

友情链接

文江博客

与/Rust中的其他情况相比，在图案上匹配的速度是否快？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（1）

应该如何编译此类代码

不幸的是，在这种情况下

（

How this sort of code is supposed to be compiled

A case where it does make a difference

Conclusions

关于作者

相关话题

热门标签

推荐作者

alipaysp_snBf0MSZIv

梦断已成空

瞎闹

凯凯我们等你回来

寄意

似梦非梦

友情链接

与/Rust中的其他情况相比，在图案上匹配的速度是否快？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（1）

应该如何编译此类代码

不幸的是，在这种情况下

（

How this sort of code is supposed to be compiled

A case where it does make a difference

Conclusions

关于作者

相关话题

热门标签

推荐作者

alipaysp_snBf0MSZIv

梦断已成空

瞎闹

凯凯我们等你回来

寄意

似梦非梦

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。