匹配
表达式 在高层实现?为了让编译器知道如何将某些代码片段定向到一个分支与另一个分支,并在编译时弄清楚它,在幕后会发生什么?我不明白如果不存储运行时使用的类型信息,这怎么可能。
像这样的例子:
fn tree_weight_v1(t: BinaryTree) -> i32 {
match t {
BinaryTree::Leaf(payload) => payload,
BinaryTree::Node(left, payload, right) => {
tree_weight_v1(*left) + payload + tree_weight_v1(*right)
}
}
}
/// Returns tree that Looks like:
///
/// +----(4)---+
/// | |
/// +-(2)-+ [5]
/// | |
/// [1] [3]
///
fn sample_tree() -> BinaryTree {
let l1 = Box::new(BinaryTree::Leaf(1));
let l3 = Box::new(BinaryTree::Leaf(3));
let n2 = Box::new(BinaryTree::Node(l1, 2, l3));
let l5 = Box::new(BinaryTree::Leaf(5));
BinaryTree::Node(n2, 4, l5)
}
#[test]
fn tree_demo_1() {
let tree = sample_tree();
assert_eq!(tree_weight_v1(tree), (1 + 2 + 3) + 4 + 5);
}
通过运行时类型信息,我的意思是你实际上存储了一个指向类型定义的指针或类似的东西,所以当运行时执行到达 match
时,它只是检查 value.type ==给定_模式_表达式
或它的一些变体。据我了解,事实并非如此。也就是说,Rust 不会在运行时将类型与结构/记录一起存储。根据我的理解,它是在编译时计算的。
如果我在每个结构/记录上存储.type
,我可能可以实现这种功能。然后您只需查找类型并查看是否匹配即可。但是,我看不到在编译器级别实现此功能的方法,因此它在运行时提前计算出匹配。
我认为它发生在编译时,因为像 是否可以对 Rust 中的泛型进行编译时类型检查? 或 运行时特征实现检查?,以及许多其他帖子似乎暗示了一切发生在编译时,除了极少数情况下您可以选择加入运行时检查。 (这是我这几天总结几十篇文章的理解)。或者另一个引用是:
确实需要检查给定的二叉树是叶子还是节点,但编译器静态地确保完成此类检查:您不能意外地将叶子的数据解释为节点,反之亦然。
对我来说,编译器静态地计算出我上面发布的 BinaryTree 案例中的模式匹配。不知何故。
将模式匹配与类型检查集成是一篇关于编程语言的随机帖子这与我的理解一致,您需要运行时类型信息来完成此操作。所以我很困惑。
如果类型是模式,则您需要能够在运行时确定值的类型。一般来说,这需要运行时类型信息(许多语言故意删除这些信息);但它还需要一些包含这两种情况的超类型(许多语言故意没有)。
为 Rust
How is the match
expression implemented at a high level? What happens under the hood for the compiler to know how to direct certain strains of code to one branch vs. the other, figuring it out at compile time? I don't see how this is possible without storing type information for use at runtime.
Something like this example:
fn tree_weight_v1(t: BinaryTree) -> i32 {
match t {
BinaryTree::Leaf(payload) => payload,
BinaryTree::Node(left, payload, right) => {
tree_weight_v1(*left) + payload + tree_weight_v1(*right)
}
}
}
/// Returns tree that Looks like:
///
/// +----(4)---+
/// | |
/// +-(2)-+ [5]
/// | |
/// [1] [3]
///
fn sample_tree() -> BinaryTree {
let l1 = Box::new(BinaryTree::Leaf(1));
let l3 = Box::new(BinaryTree::Leaf(3));
let n2 = Box::new(BinaryTree::Node(l1, 2, l3));
let l5 = Box::new(BinaryTree::Leaf(5));
BinaryTree::Node(n2, 4, l5)
}
#[test]
fn tree_demo_1() {
let tree = sample_tree();
assert_eq!(tree_weight_v1(tree), (1 + 2 + 3) + 4 + 5);
}
By runtime type information, I mean you literally store a pointer to the type definition or something like that, so under the hood when the runtime execution reaches the match
, it simply checks value.type == given_pattern_expression
or some variant of that. From my understanding, this is not the case. That is, Rust doesn't store the type with the structs/records in the runtime. From my understanding, somehow it is computed at compile time.
I could probably implement this sort of feature if I stored the .type
on each struct/record. Then you just simply look up the type and see if it matches. However, I cannot see a way to implement this at the compiler level, so it figures out matching in advance of runtime.
I think it happens at compile time because of posts like Is it possible to do a compile time type check on a generic in Rust?, or Runtime trait implementation checking?, amongst many other posts which seems to suggest everything happens at compile time, except for a tiny few cases where you can opt-into runtime checking. (this is my understanding from summarizing dozens of articles the past few days). Or another quote is:
One does need to check whether a given BinaryTree is a Leaf or is a Node, but the compiler statically ensures such checks are done: you cannot accidentally interpret the data of a Leaf as if it were a Node, nor vice versa.
That to me says the compiler statically figures out the pattern matching in the BinaryTree
case I posted above. Somehow.
Integrating Pattern Matching with Type Checking is a random post about programming languages that agrees with my understanding, that you need runtime type information to accomplish this. So I am confused.
If types are patterns, you need to be able to determine the type of a value at runtime. In general, this requires runtime type-information (which many languages intentionally erase); but it also requires some supertype that contains the two cases (which many languages intentionally do not have).
Building a runtime reflection system for Rust ????️ (Part 1) also explains how Rust doesn't have runtime reflection, which aids in my thinking that everything happens at compile time.
https://oswalt.dev/2021/06/polymorphism-in-rust/
发布评论
评论(2)
match
表达式不需要需要运行时类型信息;由于match
只接受单个已知类型的单个表达式,根据定义,它可以利用编译时信息。另请参阅:
匹配
在编译时与运行时在编译时,每个
match
表达式将被验证为详尽:处理值的所有可能形状。在运行时,代码将确定执行哪个特定的匹配臂。您可以将
match
视为通过奇特的if
-else
链实现的。由于我们人类在沟通时往往不是非常精确,因此某些资源很可能模糊了这两个方面之间的界限。
具体关注枚举
枚举变体不是独立的类型。也就是说,给定枚举
Foo
,Foo::Bar
不是类型 — 它是Foo
类型的值。这与false
不是类型相同 - 它是bool
类型的值。相同的逻辑适用于i32
(类型)和42
(值)。在大多数情况下,枚举被实现为与每个枚举变体可能的值相对应的大量字节,每个变体的数据彼此分层。这称为联合。
然后在这堆字节旁边添加一个判别式。这是一个整数值,指定该值是哪个变体。添加判别式使其成为标记联合。
概念上类似于此伪 Rust:
另请参阅:
枚举上的匹配在
我不同意它没有有它,但它默认肯定没有它。 Rust 中的运行时类型信息通常会利用
Any< /code>
特征:
匹配时缺少
Any
特征是一个额外的提示,表明不存在运行时类型信息。其他语言
只有当你允许一个值有多种类型时,这才是正确的——Rust 不允许,因为它是一种静态类型语言。在动态类型语言中,您想要使用类型来匹配值,您确实需要拥有一定数量的运行时类型信息。
A
match
expression does not need runtime type information; as amatch
only accepts a single expression of a single known type, by definition it can leverage compile time information.See also:
match
at compile time vs runtimeAt compile time, every
match
expression will be verified to be exhaustive: all possible shapes of the value are handled.At run time, the code will determine which specific match arm is executed. You can think of a
match
as implemented via a fancyif
-else
chain.As we humans tend to be not-extremely-precise when communicating, it's highly likely that some resources blur the line between these two aspects.
Concretely focusing on an enum
Enum variants are not standalone types. That is, given an enum
Foo
,Foo::Bar
is not a type — it's a value of typeFoo
. This is the same as howfalse
is not a type — it's a value of typebool
. The same logic applies fori32
(type) and42
(value).In most cases, enums are implemented as a sea of bytes that correspond to the values each enum variant might be, with each variant's data layered on top of each other. This is known as a union.
Then a discriminant is added next to this soup of bytes. This is an integer value that specifies which variant the value is. Adding the discriminant makes it into a tagged union.
Matching on an enum is conceptually similar to this pseudo-Rust:
See also:
Reflection
I wouldn't agree that it doesn't have it, but it certainly doesn't have it by default. Runtime type information in Rust will usually leverage the
Any
trait:The absence of the
Any
trait when matching is an additional hint that no runtime type information is present.Other languages
This is only true if you allow a value to be multiple types — Rust does not as it's a statically-typed language. In a dynamically-typed language where you want to match a value using a type, you would indeed need to have some amount of runtime type information.
Rust 使用的可以保存数据的枚举类型通常被称为“标记联合”。一般概念是它们或多或少通过应用区分标签以及可能值的联合来工作。
Rust 中的标记联合
例如,这里我使用匹配在 Rust 中打印一个枚举值:
内存中的标记联合
下面是如何使用没有标记联合的语言(如 C)做同样的事情:
现在,我可能犯了一个错误或我的 C 版本中有两个,有人可能能够指出一些改进,但我希望它通常能够显示标记联合如何在获取数据时区分变体。
Rust 中的标记联合
我不太了解 Rust 中如何实现标记联合,但很容易想象,Rust 可能会在这些系统上进行扩展,以进一步提高匹配效率。例如,如果您有一个枚举类型嵌套在另一个枚举类型中,您可以将它们的标签合并在一起,这样您就不需要进行多次比较(例如:编译器可能会压缩
Option> )
到OptionResultAB
中,只需要一个标签)。另一个众所周知的例子是
Option>
与*mut T
具有相同的大小,因为 null 可以用作None
> 价值。还有一些其他类似的情况,例如函数指针。虽然我只是推测 Rust 中标记联合的形式,但
我确实找到了一个更好的研究解释,它很好地解释了这个概念:
The style of enums Rust uses which can hold data are generally referred to as "tagged unions". The general concept is they more or less work by applying a distinguishing tag along with the union of possible values.
Tagged Unions in Rust
For example, here I print an enum value in Rust using a match:
Tagged Unions in Memory
And here is how you might do the same thing using a language without tagged unions like C:
Now, I probably made a mistake or two in my C version and someone will probably be able to point out some improvements, but I hope it generally shows how a tagged union can distinguish between variants when fetching data.
Tagged Unions in Rust
I don't know much about how tagged unions are implemented in rust, but it is easy to imagine that rust could likely expand on these systems to further increase the efficiency of the match. For instance if you have one enum type nested in another you may be able to merge their tags together so you don't need to do multiple comparisons (Ex: The compiler might compress
Option<Result<A, B>>
into aOptionResultAB
which only requires a single tag).Another well known example is how
Option<NonNull<T>>
has the same size as*mut T
since null can be used as theNone
value. There are also a few other similar cases like function pointers.References
While I am just speculating on the form of tagged unions in Rust, I did find a better researched explanation which does a good job explaining the concept: