Getting a reference to a struct inside an array
I want to modify a field of a struct that is inside an array without having to set the entire struct. In the example below, I want to set one field of element 543 in the array. I don't want to copy the entire element (because copying MassiveStruct would hurt performance).
class P
{
    struct S
    {
        public int a;
        public MassiveStruct b;
    }

    void f(ref S s)
    {
        s.a = 3;
    }

    public static void Main()
    {
        S[] s = new S[1000];
        f(ref s[543]);  // Error: An object reference is required for the non-static field, method, or property
    }
}
Is there a way to do this in C#? Or do I always have to copy the entire struct out of the array, modify the copy, and then put the modified copy back into the array?
[Edit 2017: see the important comments regarding C# 7 at the end of this post.]
After many years of wrestling with this exact problem, I'll summarize the few techniques and solutions I have found. Stylistic tastes aside, arrays of structs are really the only in-memory bulk storage method available in C#. If your app truly processes millions of medium-sized objects under high-throughput conditions, there's no other managed alternative.
I agree with @kaalus that object headers and GC pressure can quickly mount; nevertheless my NLP grammar processing system can manipulate 8-10 gigabytes (or more) of structural analyses in less than a minute when parsing and/or generating lengthy natural language sentences. Cue the chorus: “C# isn't meant for such problems...,” “Switch to assembly language...,” “Wire-wrap up an FPGA...,” etc.
Well, instead let's run some tests. First of all, it is critical to have a thorough understanding of the full spectrum of value-type (struct) management issues and the class vs. struct tradeoff sweet-spots. Also, of course, boxing, pinning/unsafe code, fixed buffers, GCHandle, IntPtr, and more, but most importantly of all, in my opinion, wise use of managed pointers (a.k.a. "interior pointers").

Your mastery of these topics will also include knowledge of the fact that, should you happen to include in your struct one or more references to managed types (as opposed to just blittable primitives), then your options for accessing the struct with unsafe pointers are greatly reduced. This is not a problem for the managed pointer method I'll mention below. So generally, including object references is fine and doesn't change much regarding this discussion.

Oh, and if you really do need to preserve your unsafe access, you can use a GCHandle in 'Normal' mode to store object reference(s) in your struct indefinitely. Fortunately, putting the GCHandle into your struct does not trigger the unsafe-access prohibition. (Note that GCHandle is itself a value type, and you can even define and go to town with... and so forth.) As a value type, the GCHandle itself is imaged directly into your struct, but obviously the GC instances it references are not. They are out in the heap, not included in the physical layout of your array. Notice that the GCHandle does not have to be in "pinned" mode. Finally, on GCHandle, beware of its copy semantics, because you'll have a memory leak if you don't eventually Free each GCHandle you allocate.

@Ani reminds us that some people consider mutable struct instances "evil," but the real problem is that they are accident-prone. Indeed, the OP's example illustrates exactly what we're trying to achieve: access our data records in-situ. (Beware: the syntax for an array of reference-type 'class' instances has an identical appearance, but in this article we're specifically discussing only non-jagged arrays of user-defined value types.) For my own programs, I generally consider it a severe bug if I encounter an oversized blittable struct that has (accidentally) been wholly imaged out of its array storage row.

As far as how big (wide) your struct can or should be, it won't matter, because you are going to be careful never to let the struct do just that, that is, migrate in toto out of its embedding array. In fact, this points to a fundamental premise of this entire article: a struct stored in an array should never be imaged, whole, out of that array.

Unfortunately, the C# language offers no way to systematically flag or forbid code that violates this rule, so success here generally depends on careful programming discipline.
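To make the distinction concrete, here is a minimal sketch contrasting in-situ field access with the accidental copy-out described above. The type Rec and the class InSituDemo are hypothetical names chosen for illustration; they are not the code from the original post.

```csharp
using System;

// Hypothetical record type, for illustration only.
struct Rec
{
    public int A, B, C;
}

static class InSituDemo
{
    public static int[] Run()
    {
        var rows = new Rec[4];

        // In-situ: an array element is a variable, so this writes
        // straight into the array's own storage.
        rows[2].A = 7;

        // Copy-out: assigning an element to a local copies the whole struct.
        Rec copy = rows[2];
        copy.A = 99;   // changes only the local copy, not rows[2]

        return new[] { rows[2].A, copy.A };   // { 7, 99 }
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine(r[0] + " " + r[1]);
    }
}
```

With a three-int struct the copy is cheap; the point is that nothing in the syntax warns you when the same assignment copies a much wider struct.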
Since our "jumbo-structs" are never imaged out of their array, they're really just templates over memory. In other words, the right thinking is to conceive of the struct as overlaying the array elements. We always think of each as a vacuous "memory template," as opposed to a transferable or portable encapsulator or data container. For array-bound "jumbo" value types, we never want to invoke that most existential characteristic of a struct, namely, pass-by-value.

For example, suppose we overlay 6 ints for a total of 24 bytes per "record." You'll want to consider and be aware of packing options to obtain an alignment-friendly size. But excessive padding can cut into your memory budget, because a more important consideration is the 85,000 byte limit on non-LOH (Large Object Heap) objects. Make sure your record size multiplied by the expected number of rows does not exceed this limit.

So for the example given here, you would be best advised to keep your arrays of recs to no more than 3,000 rows each. Hopefully your application can be designed around this sweet-spot. This is not so limiting when you remember that, alternatively, each row would be a separate garbage-collected object, instead of just the one array. You've cut your object proliferation by three orders of magnitude, which is pretty good for a day's work. Thus the .NET environment here is strongly steering us with a fairly specific constraint: it seems that if you target your app's memory design towards monolithic allocations in the 30-70 KB range, then you really can get away with lots and lots of them, and in fact you'll instead become limited by a thornier set of performance bottlenecks (namely, bandwidth on the hardware memory bus).

So now you have a single .NET reference type (array) with 3,000 6-tuples in physically contiguous tabular storage. First and foremost, we must be super-careful to never "pick up" one of the structs. As Jon Skeet notes above, "Massive structs will often perform worse than classes," and this is absolutely correct. There's no better way to paralyze your memory bus than to start throwing plump value types around willy-nilly.
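The sizing arithmetic above can be checked with a short sketch. The 6-int struct Rec, the explicit Pack = 4 layout, and the LohBudget helper are assumptions made for illustration; Marshal.SizeOf reports the marshaled size, which matches the managed size for a blittable struct like this one.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 6-int record; sequential layout with Pack = 4 is an assumption.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct Rec
{
    public int A, B, C, D, E, F;
}

static class LohBudget
{
    // Large-object threshold as quoted in the text above.
    const int LohThresholdBytes = 85000;

    // 6 ints * 4 bytes = 24 bytes per record.
    public static int RecordSize()
    {
        return Marshal.SizeOf<Rec>();
    }

    // Upper bound on rows per array. This ignores the array object's own
    // header, so keep real arrays comfortably below it.
    public static int MaxRows()
    {
        return LohThresholdBytes / RecordSize();
    }

    public static void Main()
    {
        Console.WriteLine(RecordSize() + " bytes per record, at most " + MaxRows() + " rows");
    }
}
```

The bound works out to 3,541 rows, so the 3,000-row guideline leaves headroom for the array header and for a slightly wider record.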
So let's capitalize on an infrequently-mentioned aspect of the array of structs: All objects (and fields of those objects or structs) of all rows of the entire array are always initialized to their default values. You can start plugging values in, one at a time, in any row or column (field), anywhere in the array. You can leave some fields at their default values, or replace neighbor fields without disturbing one in the middle. Gone is that annoying manual initialization required with stack-resident (local variable) structs before use.
Sometimes it's hard to maintain the field-by-field approach because .NET is always trying to get us to blast in an entire new'd-up struct, but to me, this so-called "initialization" is just a violation of our taboo (against plucking the whole struct out of the array), in a different guise.

Now we get to the crux of the matter. Clearly, accessing your tabular data in-situ minimizes data-shuffling busywork. But often this is an inconvenient hassle. Array accesses can be slow in .NET, due to bounds-checking. So how do you maintain a "working" pointer into the interior of an array, so as to avoid having the system constantly recompute the indexing offsets?
Evaluation
Let's evaluate the performance of five different methods for the manipulation of individual fields within value-type array storage rows. The test below is designed to measure the efficiency of intensively accessing the data fields of a struct positioned at some array index, in situ—that is, "where they lie," without extracting or rewriting the entire struct (array element). Five different access methods are compared, with all other factors held the same.
The five methods are as follows:
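1. Normal, direct array access via brackets and the field-specifier dot. (This syntax cannot be used to change individual fields of struct elements using other collection types, such as a List<T> where T: struct.)
2. Using the undocumented __makeref C# language keyword.
3. A managed pointer via a delegate which uses the ref keyword.
4. "Unsafe" pointers.
5. Same as #3, but using a C# function instead of a delegate.

Before I give the C# test results, here's the test harness implementation. These tests were run on .NET 4.5, an AnyCPU release build running on x64, Workstation GC. (Note that, because the test isn't interested in the efficiency of allocating and de-allocating the array itself, the LOH consideration mentioned above does not apply.)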
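As a rough stand-in, a timing harness for this kind of comparison could be sketched as follows. This is not the author's harness: the Harness class, its Time method, and the Rec struct are hypothetical, and the only detail taken from the text is that elapsed time is reported in ticks.

```csharp
using System;
using System.Diagnostics;

// Hypothetical 6-int record used as the test subject.
struct Rec
{
    public int A, B, C, D, E, F;
}

static class Harness
{
    // Runs `body` `iterations` times and returns the elapsed Stopwatch ticks.
    public static long Time(Action body, int iterations)
    {
        body();   // warm-up call so JIT compilation is not part of the measurement

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            body();
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        var rows = new Rec[3000];

        // Baseline: normal bracket-plus-dot access to one field per row.
        long ticks = Time(() =>
        {
            for (int i = 0; i < rows.Length; i++)
                rows[i].A += i;
        }, 1000);

        Console.WriteLine("normal indexing: " + ticks + " ticks");
    }
}
```

Each access method would be passed to Time in the same way, with the array allocated once outside the timed region so that allocation cost stays out of the numbers.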
Because the code fragments which implement the test for each specific method are long-ish, I'll give the results first. Time is 'ticks;' lower means better.
I was surprised that these results were so unequivocal. TypedReferences are slowest, presumably because they lug around type information along with the pointer. Considering the heft of the IL code for the belabored "Normal" version, it performed surprisingly well. Mode transitions seem to hurt unsafe code to the point where you really have to justify, plan, and measure each place you're going to deploy it.

But the hands-down fastest times are achieved by leveraging the ref keyword in functions' parameter passing for the purpose of pointing to an interior part of the array, thus eliminating the "per-field-access" array indexing computation.

Perhaps the design of my test favors this one, but the test scenarios are representative of empirical use patterns in my app. What surprised me about those numbers is that the advantage of staying in managed mode (while having your pointers, too) was not cancelled by having to call a function or invoke through a delegate.
The Winner
The fastest one (and perhaps the simplest, too?) passes the array element as a ref argument to a static function.
But it has the disadvantage that you can't keep related logic together in your program: the implementation of the function is divided across two C# functions, f and test_f.
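A minimal sketch of this pattern follows. Only the function names f and test_f come from the text; the Rec struct, the field assignments, and the RefParamDemo class are assumptions made for illustration.

```csharp
using System;

// Hypothetical 6-int record.
struct Rec
{
    public int A, B, C, D, E, F;
}

static class RefParamDemo
{
    // Callee: `e` is a managed pointer to the element inside the array,
    // so every field access goes through it without re-indexing.
    static void f(ref Rec e)
    {
        e.A = 1;
        e.B = e.A + 1;
        e.C = e.B + 1;
    }

    // Caller: index the array once per row and hand the element over by ref.
    public static Rec[] test_f(int rowCount)
    {
        var rows = new Rec[rowCount];
        for (int i = 0; i < rows.Length; i++)
            f(ref rows[i]);
        return rows;
    }

    public static void Main()
    {
        Rec[] rows = test_f(3);
        Console.WriteLine(rows[2].C);   // prints 3
    }
}
```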
We can address this particular problem with only a tiny sacrifice in performance. The next one is basically identical to the foregoing, but embeds one of the functions within the other as a lambda function...
A Close Second
Replacing the static function in the preceding example with an inline delegate requires the use of ref arguments, which in turn precludes the use of the Func<T> lambda syntax; instead you must use an explicit delegate from old-style .NET.

After declaring such a delegate type once, globally, we can use it throughout the program to ref directly into elements of the array rec[], accessing them inline.

Also, although it may look like a new lambda function is being instantiated on each call, this won't happen if you're careful: when using this method, make sure you do not "close over" any local variables (that is, refer, from within the lambda's body, to variables which are outside the lambda function), or do anything else that will bar your delegate instance from being static. If a local variable happens to fall into your lambda and the lambda thus gets promoted to an instance/class, you'll "probably" notice a difference as it tries to create five million delegates.

As long as you keep the lambda function clear of these side effects, there won't be multiple instances; what's happening here is that, whenever C# determines that a lambda has no non-explicit dependencies, it lazily creates (and caches) a static singleton. It's a little unfortunate that a performance alteration this drastic is hidden from our view as a silent optimization. Overall, I like this method. It's fast and clutter-free, except for the bizarre parentheses, none of which can be omitted here.
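The delegate variant can be sketched like this. The delegate name RecAction, the Rec struct, and the field assignments are hypothetical; the shape of the call, including the stacked parentheses, is the part that matters.

```csharp
using System;

// Hypothetical 6-int record.
struct Rec
{
    public int A, B, C, D, E, F;
}

// Declared once. Func<>/Action<> cannot express a ref parameter,
// so an explicit delegate type is needed. The name is an assumption.
delegate void RecAction(ref Rec e);

static class RefDelegateDemo
{
    public static Rec[] Run(int rowCount)
    {
        var rows = new Rec[rowCount];
        for (int i = 0; i < rows.Length; i++)
        {
            // The lambda refers to nothing outside its own parameter, so it
            // captures no locals and the compiler can reuse one delegate.
            ((RecAction)((ref Rec e) =>
            {
                e.A = 1;
                e.B = e.A + 1;
            }))(ref rows[i]);
        }
        return rows;
    }

    public static void Main()
    {
        Rec[] rows = Run(3);
        Console.WriteLine(rows[2].B);   // prints 2
    }
}
```

Referring to a local such as i inside the lambda body would turn it into a closure, which is exactly the per-call allocation the text warns about.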
And the rest
For completeness, here are the rest of the tests: normal bracketing-plus-dot; TypedReference; and unsafe pointers.
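For orientation, the first two of these can be sketched as below. The Rec struct and OtherAccessDemo class are hypothetical, and __makeref/__refvalue are undocumented, compiler-specific keywords, so treat that half as an assumption about your toolchain. The unsafe-pointer variant is left out because it needs unsafe code to be enabled at compile time.

```csharp
using System;

// Hypothetical two-field record, kept small for the example.
struct Rec
{
    public int A, B;
}

static class OtherAccessDemo
{
    public static Rec[] Run()
    {
        var rows = new Rec[2];

        // Normal bracketing-plus-dot: the element is re-indexed on each access.
        rows[0].A = 10;
        rows[0].B = 20;

        // TypedReference: __makeref captures a typed reference to the element,
        // and __refvalue dereferences it as a Rec variable.
        TypedReference tr = __makeref(rows[1]);
        __refvalue(tr, Rec).A = 30;
        __refvalue(tr, Rec).B = 40;

        return rows;
    }

    public static void Main()
    {
        Rec[] rows = Run();
        Console.WriteLine(rows[0].A + " " + rows[0].B + " " + rows[1].A + " " + rows[1].B);
    }
}
```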
Summary
For memory-intensive work in large-scale C# apps, using managed pointers to directly access the fields of value-typed array elements in-situ is the way to go.
If you're really serious about performance, this might be enough reason to use C++/CLI (or CIL, for that matter) instead of C# for the relevant parts of your app, because those languages allow you to directly declare managed pointers within a function body.

In C# [see C# 7 below], the only way to create a managed pointer is to declare a function with a ref or out argument, and then the callee will observe the managed pointer. Thus, to get the performance benefits in C#, you have to use one of the (top two) methods shown above.

Sadly, these deploy the kludge of splitting a function into multiple parts just for the purpose of accessing an array element. Although considerably less elegant than the equivalent C++/CLI code would be, tests indicate that even in C#, for high-throughput applications we still obtain a big performance benefit versus naïve value-type array access.

[Edit 2017] While perhaps conferring a small degree of prescience on this article's exhortations in general, the release of C# 7 in Visual Studio 2017 at the same time renders the specific methods described above entirely obsolete. In short, the new ref locals feature in the language permits you to declare your own managed pointer as a local variable, and use it to consolidate the single array dereferencing operation. So, given the test structure from above, the same test function can now be written with a ref local in place of the separate helper function.
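A minimal sketch of the ref-local version, assuming C# 7 or later. The Rec struct, the field assignments, and the RefLocalDemo class are hypothetical; the name test_f is reused from the earlier discussion.

```csharp
using System;

// Hypothetical 6-int record.
struct Rec
{
    public int A, B, C, D, E, F;
}

static class RefLocalDemo
{
    public static Rec[] test_f(int rowCount)
    {
        var rows = new Rec[rowCount];
        for (int i = 0; i < rows.Length; i++)
        {
            // One indexing operation; `e` is a managed pointer into the array.
            ref Rec e = ref rows[i];
            e.A = 1;
            e.B = e.A + 1;
            e.C = e.B + 1;
        }
        return rows;
    }

    public static void Main()
    {
        Rec[] rows = test_f(3);
        Console.WriteLine(rows[2].C);   // prints 3
    }
}
```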
Notice how this completely eliminates the need for kludges such as those I discussed above. The sleeker use of a managed pointer avoids the unnecessary function call that was used in "the winner," the best-performing methodology of those I reviewed. Therefore, the performance with the new feature can only be better than the winner of methods compared above.
Ironically enough, C# 7 also adds local functions, a feature which would directly solve the complaint about poor encapsulation I raised for two of the aforementioned hacks. Happily enough the whole enterprise of proliferating dedicated functions just for the purpose of gaining access to managed pointers is now completely moot.
The only problem is that you're trying to call an instance method from a static method, without an instance of P.

Make f a static method (or create an instance of P on which to call it) and it'll be fine. It's all about reading the compiler error :)

Having said that, I would strongly advise you to:
While Jon Skeet is correct about why your program doesn't compile, you can just assign to the field directly through the array indexer, and it will operate directly on the struct in the array rather than on a copy.
Note that this idea works for arrays only; other collections such as lists will return a copy from the indexer getter (giving you a compiler error if you try something similar on the resulting value).
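Putting the two answers together, a corrected version of the question's program might look like the sketch below. The long field stands in for MassiveStruct, which the question never defines, and the Run method and the List<S> part are additions for illustration.

```csharp
using System;
using System.Collections.Generic;

class P
{
    struct S
    {
        public int a;
        public long b;   // stand-in for MassiveStruct (an assumption)
    }

    // static, so Main can call it without an instance of P
    static void f(ref S s)
    {
        s.a = 3;
    }

    public static int[] Run()
    {
        S[] s = new S[1000];
        f(ref s[543]);    // passes a reference to the element itself
        s[544].a = 4;     // direct assignment also writes in place

        var list = new List<S> { new S() };
        // list[0].a = 5; // does not compile (CS1612): the indexer returns a copy
        S copy = list[0];
        copy.a = 5;
        list[0] = copy;   // a list needs copy, modify, write back

        return new[] { s[543].a, s[544].a, list[0].a };   // { 3, 4, 5 }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run()));
    }
}
```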
On another note, mutable structs are considered evil. Is there a strong reason why you don't want to make S a class?
You could try to use a forwarding empty struct, which does not hold the actual data but only keeps an index into a data provider object. This way you can store huge amounts of data without complicating the object graph.
I am fairly certain that in your case it should be quite easy to replace your giant struct with a forwarding empty struct, as long as you do not try to marshal it into unmanaged code.
Such a struct can appear to contain as much data as you wish. The trick is that you store the actual data in another object. This way you get reference semantics together with the advantages of structs, which consume less memory than class objects and give faster GC cycles due to a simpler object graph (if you have many instances, say millions, of them around).
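One way such a forwarding struct could look is sketched below. DataProvider, Item, and their members are hypothetical names, and keeping a provider reference next to the index is a design assumption; a single static provider would let the struct hold the index alone.

```csharp
using System;

// Hypothetical backing store: one array per field, indexed by item number.
sealed class DataProvider
{
    public readonly int[] A;
    public readonly double[] B;

    public DataProvider(int capacity)
    {
        A = new int[capacity];
        B = new double[capacity];
    }
}

// Forwarding struct: holds no data of its own, only where to find it.
struct Item
{
    private readonly DataProvider _data;
    private readonly int _index;

    public Item(DataProvider data, int index)
    {
        _data = data;
        _index = index;
    }

    public int A
    {
        get { return _data.A[_index]; }
        set { _data.A[_index] = value; }
    }

    public double B
    {
        get { return _data.B[_index]; }
        set { _data.B[_index] = value; }
    }
}

static class ForwardingDemo
{
    public static int Run()
    {
        var provider = new DataProvider(10);
        var x = new Item(provider, 3);

        Item y = x;   // copies only the reference and the index
        y.A = 42;     // writes through to the shared provider

        return x.A;   // 42: both copies see the same data
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Copying an Item is cheap regardless of how many fields the provider holds, which is the reference-semantics benefit described above.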