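Why can't I store a value and a reference to that value in the same struct?
I have a value and I want to store that value and a reference to something inside that value in my own type:

    struct Thing {
        count: u32,
    }

    struct Combined<'a>(Thing, &'a u32);

    fn make_combined<'a>() -> Combined<'a> {
        let thing = Thing { count: 42 };
        Combined(thing, &thing.count)
    }

Sometimes, I have a value and I want to store that value and a reference to that value in the same structure:

    struct Combined<'a>(Thing, &'a Thing);

    fn make_combined<'a>() -> Combined<'a> {
        let thing = Thing::new();
        Combined(thing, &thing)
    }

Sometimes, I'm not even taking a reference to the value and I get the same error:

    struct Combined<'a>(Parent, Child<'a>);

    fn make_combined<'a>() -> Combined<'a> {
        let parent = Parent::new();
        let child = parent.child();
        Combined(parent, child)
    }

In each of these cases, I get an error that one of the values "does not live long enough". What does this error mean?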
A slightly different issue which causes very similar compiler messages is object lifetime dependency, rather than storing an explicit reference. An example of that is the ssh2 library. When developing something bigger than a test project, it is tempting to try to put the `Session` and the `Channel` obtained from that session alongside each other into a struct, hiding the implementation details from the user. However, note that the `Channel` definition has the `'sess` lifetime in its type annotation, while `Session` doesn't.

This causes similar compiler errors related to lifetimes.

One simple way to solve it is to declare the `Session` outside, in the caller, and then annotate the reference within the struct with a lifetime, similar to the answer in this Rust User's Forum post talking about the same issue while encapsulating SFTP. This will not look elegant and may not always apply, because now you have two entities to deal with, rather than the one that you wanted!

It turns out the rental crate or the owning_ref crate from the other answer are solutions for this issue too. Let's consider owning_ref, which has a special object for this exact purpose: `OwningHandle`. To avoid the underlying object moving, we allocate it on the heap using a `Box`, which gives us the following possible solution:

The result of this code is that we cannot use the `Session` anymore, but it is stored alongside the `Channel` which we will be using. Because the `OwningHandle` object dereferences to `Box`, which dereferences to `Channel`, when storing it in a struct, we name it as such. NOTE: This is just my understanding. I have a suspicion this may not be correct, since it appears to be quite close to the discussion of `OwningHandle` unsafety.

One curious detail here is that the `Session` logically has a similar relationship with `TcpStream` as `Channel` has to `Session`, yet its ownership is not taken and there are no type annotations around doing so. Instead, it is up to the user to take care of this, as the documentation of the handshake method says:

So with the `TcpStream` usage, it is completely up to the programmer to ensure the correctness of the code. With the `OwningHandle`, attention is drawn to where the "dangerous magic" happens by the `unsafe {}` block.

A further and more high-level discussion of this issue is in this Rust User's Forum thread, which includes a different example and its solution using the rental crate, which does not contain unsafe blocks.
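As a rough illustration of the first approach (keeping the session in the caller and annotating the struct with a lifetime), here is a minimal sketch. The `Session`, `Channel`, and `Connection` types below are hypothetical stand-ins written for this example; they are not the ssh2 API and only mimic the shape of a `Channel<'sess>` that borrows from its session.

```rust
// Hypothetical stand-ins for illustration only; these are NOT the ssh2 types.
struct Session {
    host: String,
}

// Like a `Channel<'sess>`, this type borrows from the session that created it.
struct Channel<'sess> {
    session: &'sess Session,
    id: u32,
}

impl Session {
    fn connect(host: &str) -> Session {
        Session { host: host.to_string() }
    }

    fn open_channel(&self, id: u32) -> Channel<'_> {
        Channel { session: self, id }
    }
}

// The wrapper stores only the borrowing half; its lifetime parameter ties it
// to a `Session` that the caller keeps alive.
struct Connection<'sess> {
    channel: Channel<'sess>,
}

impl<'sess> Connection<'sess> {
    fn new(session: &'sess Session) -> Connection<'sess> {
        Connection { channel: session.open_channel(1) }
    }

    fn describe(&self) -> String {
        format!("channel {} on {}", self.channel.id, self.channel.session.host)
    }
}

fn main() {
    // The caller owns the session, so it outlives the connection wrapper.
    let session = Session::connect("example.invalid");
    let connection = Connection::new(&session);
    println!("{}", connection.describe());
}
```

The cost is the one the answer describes: the caller now handles two values (`session` and `connection`) instead of one.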
I've found the `Arc` (read-only) or `Arc<Mutex>` (read-write with locking) patterns to sometimes be quite a useful tradeoff between performance and code complexity (the latter mostly caused by lifetime annotations).

`Arc` for read-only access:

`Arc` + `Mutex` for read-write access:

See also `RwLock` (When or why should I use a Mutex over an RwLock?)
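A minimal sketch of the two patterns named in this answer, using only the standard library. The `Parent`, `Child`, and `SharedChild` types and the function names are assumptions made for this example, not code from the original answer.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct Parent {
    count: u32,
}

// The child holds a shared, owning handle instead of a `&Parent`,
// so no lifetime parameter is needed.
struct Child {
    parent: Arc<Parent>,
}

struct SharedChild {
    parent: Arc<Mutex<Parent>>,
}

// `Arc` for read-only access.
fn read_only() -> u32 {
    let parent = Arc::new(Parent { count: 42 });
    let child = Child { parent: Arc::clone(&parent) };
    // Both handles can be moved freely, even into another thread.
    let handle = thread::spawn(move || child.parent.count);
    handle.join().unwrap() + parent.count
}

// `Arc` + `Mutex` for read-write access.
fn read_write() -> u32 {
    let parent = Arc::new(Mutex::new(Parent { count: 42 }));
    let child = SharedChild { parent: Arc::clone(&parent) };
    // Mutation goes through the lock.
    child.parent.lock().unwrap().count += 1;
    let count = parent.lock().unwrap().count;
    count
}

fn main() {
    println!("{} {}", read_only(), read_write());
}
```

The price is a reference-count update on each clone and, for the `Mutex` variant, a lock on each access, in exchange for types that carry no lifetime annotations.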
This answer is supposed to serve as a collection of examples for the crates mentioned in the accepted answer.

Stop and listen! Chances are, you don't really need self-referential crates. As you can see from the examples below, they can get really unwieldy quickly. In addition, soundness holes are still being discovered, and maintaining such a crate is a constant burden.

If you fully control the types involved, prefer a design with no self-referentiality. The only reason to use a self-referential crate is if you are given some external library that has an API that forces you to be self-referential.
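One way a design without self-reference can look, as a sketch: store a position (here an index range) instead of a reference, and build the reference on demand. The `Document` type and its methods are hypothetical names chosen for this illustration.

```rust
use std::ops::Range;

// Owns the text and remembers *where* the first word is, not a `&str` into it.
struct Document {
    text: String,
    first_word: Range<usize>,
}

impl Document {
    fn new(text: String) -> Document {
        let end = text.find(' ').unwrap_or(text.len());
        Document { text, first_word: 0..end }
    }

    // The reference is created on demand and borrows from `self`.
    fn first_word(&self) -> &str {
        &self.text[self.first_word.clone()]
    }
}

fn make_document() -> Document {
    // Returning (moving) the value is fine: it contains no references.
    Document::new("hello world".to_string())
}

fn main() {
    let doc = make_document();
    println!("{}", doc.first_word());
}
```

Because an index stays meaningful wherever the `String` ends up, the struct can be moved and returned without any lifetime parameter.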
ouroboros

The oldest still-maintained crate, with many downloads.

self_cell

You can see it is very similar to `ouroboros`, but it is a declarative macro, so its flexibility is limited:

yoke

The big disadvantage of this crate is that it cannot be used with third-party types safely. The type that refers to the other type must implement a specific trait (`Yokeable`), and while you can derive it for your types, for foreign types you are out of luck. Mutation is also limited, since the original intention was for it to be used for zero-copy deserialization. Still, this can be a useful crate:

nolife

This crate was created to suggest a new approach to self-referential structs, based on the self-referentiality of async blocks that the compiler generates for you.
As a newcomer to Rust, I had a case similar to your last example:

In the end, I solved it by using this pattern:

This is far from a universal solution! But it worked in my case, and only required usage of the `main_simple` pattern above (not the `main_complex` variant), because in my case the "parent" object was just something temporary (a database "Client" object) that I had to construct to pass to the "child" object (a database "Transaction" object) so I could run some database commands.

Anyway, it accomplished the encapsulation/simplification-of-boilerplate that I needed (since I had many functions that needed creation of a Transaction/"child" object, and now all they need is that generic anchor-object creation line), while avoiding the need for using a whole new library.

These are the libraries I'm aware of that may be relevant:

However, I scanned through them, and they all seem to have issues of one kind or another (not being updated in years, having multiple unsoundness issues/concerns raised, etc.), so I was hesitant to use them.

So while this isn't as generic a solution, I figured I would mention it for people with similar use-cases:

(... `start_transaction` function)
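For readers trying to picture an "anchor object" approach like the one described in this answer, here is one possible shape, sketched under assumptions. The `Client` and `Transaction` types are hypothetical stand-ins rather than a real database crate, and using an `Option` slot as the anchor is a guess at the idea, not the answer's actual code.

```rust
// Hypothetical stand-ins for a database client and transaction;
// they do not correspond to any real database crate.
struct Client {
    name: String,
}

struct Transaction<'a> {
    client: &'a mut Client,
}

impl Client {
    fn connect(name: &str) -> Client {
        Client { name: name.to_string() }
    }

    fn transaction(&mut self) -> Transaction<'_> {
        Transaction { client: self }
    }
}

impl Transaction<'_> {
    fn run(&mut self, command: &str) -> String {
        format!("{} ran: {}", self.client.name, command)
    }
}

// The caller provides an empty "anchor" slot. The parent is stored there,
// so it outlives this function, and the child borrows from the slot.
fn start_transaction(anchor: &mut Option<Client>) -> Transaction<'_> {
    let client = anchor.insert(Client::connect("db"));
    client.transaction()
}

fn main() {
    let mut anchor = None; // the one extra line each caller needs
    let mut tx = start_transaction(&mut anchor);
    println!("{}", tx.run("SELECT 1"));
}
```

The parent still lives in the caller's stack frame, so nothing self-referential is ever moved; the helper only hides the construction of the parent.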
Let's look at a simple implementation of this:

This will fail with the error:

To completely understand this error, you have to think about how the values are represented in memory and what happens when you move those values. Let's annotate `Combined::new` with some hypothetical memory addresses that show where values are located:

What should happen to `child`? If the value was just moved like `parent` was, then it would refer to memory that no longer is guaranteed to have a valid value in it. Any other piece of code is allowed to store values at memory address 0x1000. Accessing that memory assuming it was an integer could lead to crashes and/or security bugs, and is one of the main categories of errors that Rust prevents.
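To make the "a move changes the address" point concrete, here is a small sketch that records where a value lives before and after it is moved. The `Parent` type and the helper names are assumptions for this example; the move into a `Box` is chosen because it relocates the value from the stack to the heap.

```rust
struct Parent {
    count: u32,
}

// Returns the memory address the value currently occupies.
fn address_of(parent: &Parent) -> usize {
    parent as *const Parent as usize
}

// Returns (address before the move, address after the move, count).
fn move_and_compare() -> (usize, usize, u32) {
    let parent = Parent { count: 42 };
    let before = address_of(&parent);

    // Moving the value into a `Box` relocates it from the stack to the heap.
    let boxed = Box::new(parent);
    let after = address_of(&boxed);

    (before, after, boxed.count)
}

fn main() {
    let (before, after, count) = move_and_compare();
    println!("before: {:#x}, after: {:#x}, count: {}", before, after, count);
}
```

A reference taken at the `before` address would be dangling after the move, which is exactly what the borrow checker refuses to allow.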
This is exactly the problem that lifetimes prevent. A lifetime is a bit of metadata that allows you and the compiler to know how long a value will be valid at its current memory location. That's an important distinction, as it's a common mistake Rust newcomers make. Rust lifetimes are not the time period between when an object is created and when it is destroyed!

As an analogy, think of it this way: During a person's life, they will reside in many different locations, each with a distinct address. A Rust lifetime is concerned with the address you currently reside at, not about whenever you will die in the future (although dying also changes your address). Every time you move it's relevant because your address is no longer valid.

It's also important to note that lifetimes do not change your code; your code controls the lifetimes, your lifetimes don't control the code. The pithy saying is "lifetimes are descriptive, not prescriptive".
Let's annotate `Combined::new` with some line numbers which we will use to highlight lifetimes:

The concrete lifetime of `parent` is from 1 to 4, inclusive (which I'll represent as `[1,4]`). The concrete lifetime of `child` is `[2,4]`, and the concrete lifetime of the return value is `[4,5]`. It's possible to have concrete lifetimes that start at zero - that would represent the lifetime of a parameter to a function or something that existed outside of the block.

Note that the lifetime of `child` itself is `[2,4]`, but that it refers to a value with a lifetime of `[1,4]`. This is fine as long as the referring value becomes invalid before the referred-to value does. The problem occurs when we try to return `child` from the block. This would "over-extend" the lifetime beyond its natural length.
This new knowledge should explain the first two examples. The third one requires looking at the implementation of `Parent::child`. Chances are, it will look something like this:

This uses lifetime elision to avoid writing explicit generic lifetime parameters. It is equivalent to:
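As a sketch of what such a method could look like in both forms, assuming a hypothetical `Child` that simply stores a `&Parent` (the field names are invented for this example, and the elided form is written as `Child<'_>` to make the borrow visible):

```rust
struct Parent {
    count: u32,
}

struct Child<'a> {
    parent: &'a Parent,
}

impl Parent {
    // Elided form: the compiler ties the returned `Child` to `&self`.
    fn child(&self) -> Child<'_> {
        Child { parent: self }
    }

    // Explicit form with the same meaning.
    fn child_explicit<'a>(&'a self) -> Child<'a> {
        Child { parent: self }
    }
}

fn main() {
    let parent = Parent { count: 42 };
    let a = parent.child();
    let b = parent.child_explicit();
    println!("{} {}", a.parent.count, b.parent.count);
}
```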
In both cases, the method says that a `Child` structure will be returned that has been parameterized with the concrete lifetime of `self`. Said another way, the `Child` instance contains a reference to the `Parent` that created it, and thus cannot live longer than that `Parent` instance.

This also lets us recognize that something is really wrong with our creation function:

Although you are more likely to see this written in a different form:

In both cases, there is no lifetime parameter being provided via an argument. This means that the lifetime that `Combined` will be parameterized with isn't constrained by anything - it can be whatever the caller wants it to be. This is nonsensical, because the caller could specify the `'static` lifetime and there's no way to meet that condition.
How do I fix it?

The easiest and most recommended solution is to not attempt to put these items in the same structure together. By doing this, your structure nesting will mimic the lifetimes of your code. Place types that own data into a structure together and then provide methods that allow you to get references or objects containing references as needed.
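A minimal sketch of that recommendation, assuming the same hypothetical `Parent` and `Child` shapes as in the question (the fields and the `doubled` helper are invented for illustration): the struct owns only the parent, and the borrowing child is produced by a method when needed.

```rust
struct Parent {
    count: u32,
}

struct Child<'a> {
    parent: &'a Parent,
}

impl Child<'_> {
    fn doubled(&self) -> u32 {
        self.parent.count * 2
    }
}

// Only owned data lives in the struct, so it can be moved and returned freely.
struct Combined {
    parent: Parent,
}

impl Combined {
    fn new() -> Combined {
        Combined { parent: Parent { count: 42 } }
    }

    // The borrowing `Child` is built on demand and cannot outlive `self`.
    fn child(&self) -> Child<'_> {
        Child { parent: &self.parent }
    }
}

fn main() {
    let combined = Combined::new();
    println!("{}", combined.child().doubled());
}
```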
There is a special case where the lifetime tracking is overzealous: when you have something placed on the heap. This occurs when you use a `Box<T>`, for example. In this case, the structure that is moved contains a pointer into the heap. The pointed-at value will remain stable, but the address of the pointer itself will move. In practice, this doesn't matter, as you always follow the pointer.
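The stability of the pointed-at value can be observed directly. This sketch (the function name is made up for the example) compares the heap address of a boxed value before and after the `Box` itself is moved:

```rust
// Returns the heap address of the boxed value before and after the `Box` is moved.
fn heap_address_across_move() -> (usize, usize) {
    let boxed = Box::new(42u32);
    let before = &*boxed as *const u32 as usize;

    // This moves the `Box` (the pointer), not the heap allocation it points to.
    let moved = boxed;
    let after = &*moved as *const u32 as usize;

    (before, after)
}

fn main() {
    let (before, after) = heap_address_across_move();
    println!("before: {:#x}, after: {:#x}", before, after);
}
```

The borrow checker cannot express this fact on its own, which is why it still rejects a struct holding both the `Box` and a reference into it.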
Some crates provide ways of representing this case, but they require that the base address never move. This rules out mutating vectors, which may cause a reallocation and a move of the heap-allocated values.

rental (no longer maintained or supported)
owning_ref (has multiple soundness issues)

Examples of problems solved with Rental:

In other cases, you may wish to move to some type of reference-counting, such as by using `Rc` or `Arc`.

More information
While it is theoretically possible to do this, doing so would introduce a large amount of complexity and overhead. Every time that the object is moved, the compiler would need to insert code to "fix up" the reference. This would mean that copying a struct is no longer a very cheap operation that just moves some bits around. It could even mean that code like this is expensive, depending on how good a hypothetical optimizer would be:
Instead of forcing this to happen for every move, the programmer gets to choose when this will happen by creating methods that will take the appropriate references only when you call them.
A type with a reference to itself

There's one specific case where you can create a type with a reference to itself. You need to use something like `Option` to make it in two steps though:

This does work, in some sense, but the created value is highly restricted - it can never be moved. Notably, this means it cannot be returned from a function or passed by-value to anything. A constructor function shows the same problem with the lifetimes as above:
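For the two-step construction itself (not the failing constructor), a minimal sketch could look like the following. The struct and field names are assumptions for illustration: the value is first created with `None`, and the self-borrow is filled in afterwards.

```rust
#[derive(Debug)]
struct WhatAboutThis<'a> {
    name: String,
    nickname: Option<&'a str>,
}

fn main() {
    // Step one: create the value without the self-reference.
    let mut tricky = WhatAboutThis {
        name: "Annabelle".to_string(),
        nickname: None,
    };
    // Step two: borrow from the value's own field.
    tricky.nickname = Some(&tricky.name[..4]);

    println!("{:?}", tricky);
}
```

From this point on `tricky` is borrowed by itself, so it can be read in place but not moved.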
If you try to do this same code with a method, you'll need the alluring but ultimately useless `&'a self`. When that's involved, this code is even more restricted and you will get borrow-checker errors after the first method call:
What about `Pin`?

`Pin`, stabilized in Rust 1.33, has this in the module documentation:

It's important to note that "self-referential" doesn't necessarily mean using a reference. Indeed, the example of a self-referential struct specifically says (emphasis mine):

The ability to use a raw pointer for this behavior has existed since Rust 1.0. Indeed, owning-ref and rental use raw pointers under the hood.

The only thing that `Pin` adds to the table is a common way to state that a given value is guaranteed to not move.
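To illustrate the raw-pointer flavor of self-reference together with `Pin`, here is a sketch in the spirit of the self-referential struct example in the `std::pin` documentation. The `Unmovable` type, its fields, and `points_at_own_data` are names assumed for this example. It deliberately only compares pointers and never dereferences the stored one, to stay clear of aliasing subtleties.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

// A self-referential struct built with a raw pointer rather than a reference.
struct Unmovable {
    data: String,
    // Points at `data` once construction has finished.
    data_ptr: NonNull<String>,
    // Opts out of `Unpin`, so safe code cannot move the value out of its `Pin`.
    _pin: PhantomPinned,
}

impl Unmovable {
    fn new(data: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(Unmovable {
            data,
            data_ptr: NonNull::dangling(),
            _pin: PhantomPinned,
        });
        let data_ptr = NonNull::from(&boxed.data);
        // SAFETY: assigning a field does not move the pinned value.
        unsafe {
            boxed.as_mut().get_unchecked_mut().data_ptr = data_ptr;
        }
        boxed
    }

    // Checks that the stored pointer still refers to this value's own field.
    fn points_at_own_data(&self) -> bool {
        self.data_ptr == NonNull::from(&self.data)
    }
}

fn main() {
    let value = Unmovable::new("hello".to_string());
    // Moving the `Pin<Box<_>>` moves only the pointer; the pinned value stays put.
    let moved = value;
    println!("{}", moved.points_at_own_data());
}
```

Note that the lifetime system plays no part here: the guarantee comes from `Pin` plus the `unsafe` block, and upholding it is the programmer's responsibility.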