Would C# benefit from aggregate structs/classes?

Posted on 2024-10-12 06:00:54

Foreword

tl;wr: This is a discussion.

I am aware that this "question" is more of a discussion, hence why I will mark it as community wiki. However, according to the How to Ask page, it could belong here, as it is specifically programming related, not discussed anywhere on the web after an hour of research, specific, relevant to most C# programmers, and on-topic. Moreover, the question itself is meant to obtain an answer, for which I'd stay open-minded regardless of my bias: would C# really benefit from aggregate structs? Notwithstanding this foreword, I'd understand this to be closed, but would appreciate if the users with the authority and intention to close redirected me to an appropriate discussion spot on the Web.


Introduction


Lack of struct immutability

Structs are flexible but debated types in C#. They offer the stack-allocated value type organizational paradigm, but not the immutability of other value types.

Some say structs should represent values, and values do not change (e.g. int i = 5;, 5 is immutable), while some perceive them as OOP layouts with subfields.

The debate on struct immutability (1, 2, 3), for which the current solution seems to be having the programmer enforce immutability, is also unsolved.

For instance, the C# compiler will detect possible data loss when structs are accessed as a reference (bottom of this page) and restrict assignment. Moreover, since struct constructors, properties and functions can perform arbitrary operations, with the only limit (for constructors) being that all fields must be assigned before returning control, structs cannot be declared as constant, which would be a valid declaration if they were limited to data representation.


Immutable subset of structs, aggregates

Aggregate classes (Wikipedia) are strict data structures with limited functionality, meant to offer syntactic sugar in exchange for their lack of flexibility. In C++, they have "no user-declared constructors, no private or protected non-static data members, no base classes, and no virtual functions". The theoretical specifics of such classes in C# are herein open for debate, although the core concept remains the same.

Since aggregate structs are strictly data holders with labeled accessors, their immutability (in a possible C# context) would be ensured. Aggregates also couldn't be null unless the nullable operator (?) is specified, as for other pure value types. For this reason, many currently illegal struct operations would become possible, as well as some syntactic sugar.


Uses


  1. Aggregates could be declared as const, since their constructor would be enforced to do nothing but assign the fields.
  2. Aggregates could be used as default values for method parameters.
  3. Aggregates could be implicitly Sequential, facilitating interaction with native libraries.
  4. Aggregates would be immutable, enforcing no data loss for reference access. Compiler detection of such subfield modifications could lead to a complete, implicit reassignment.
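For comparison, several of these uses can be approximated in C# as it exists today. The following is only a sketch, assuming C# 10 or later (for `record struct`); the `OrGravity` helper and the nullable-parameter fallback are illustrative choices, not part of the proposal.

```csharp
using System;

// Stand-in for the proposed `aggregate Vector`: an immutable value type
// whose positional parameters carry default values.
public readonly record struct Vector(double X = 0, double Y = 0, double Z = 0);

public static class Foo
{
    // A struct cannot be declared `const`; `static readonly` is the closest equivalent.
    public static readonly Vector Gravity = new Vector(0, -9.8, 0);

    // Named arguments stand in for the proposed `{ y: 1 }` labeled construction.
    public static readonly Vector Up = new Vector(Y: 1);

    // A struct-typed parameter can only default to `default` (or a parameterless `new`),
    // so a nullable parameter plus a fallback stands in for `Vector gravity = Foo.Gravity`.
    // OrGravity is a hypothetical helper name.
    public static Vector OrGravity(Vector? gravity = null) => gravity ?? Gravity;
}
```

This covers immutability, default values and labeled construction, but not true constants or struct-valued default parameters, which is the gap the proposal is aimed at.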

Hypothetical Syntax


Taking from the C++ syntax, we could imagine something along the lines of:
(Remember, this is a community wiki, improvement is welcome and encouraged)

aggregate Size
{
    int Width;
    int Height;
}

aggregate Vector
{
    // Default values for constructor.
    double X = 0, Y = 0, Z = 0;
}

aggregate Color
{
    byte R, G, B, A = 255;
}

aggregate Bar
{
    int X;
    Qux Qux;
}

aggregate Qux
{
    int X, Y;
}

static class Foo
{
    // Constant is possible.
    const Size Big = new Size(200, 100);

    // Inline constructor.
    const Vector Gravity = { 0, -9.8, 0 };

    // Default value / labeled parameter.
    const Color Fuschia = { 255, 0, 255 };
    const Vector Up = { y: 1 };

    // Sub-aggregate initialization
    const Bar Test = { 20, { 4, 3 } };

    static void SetVelocity(Vector velocity = { 0, 1, 0 }) { ... }
    static void SetGravity(Vector gravity = Foo.Gravity) { ... }

    static void Main()
    {
        Vector v = { 1, 2, 3 };

        double y = v.Y; // Valid.

        v.Y = 5; // Invalid, immutable.
    }
}

Implicit (re)Assignment

As of today, assigning a subfield of a struct in C# 4.0 is valid:

Vector v = new Vector(1, 2, 3);
v.Z = 5; // Legal in current C#.

However, the compiler can sometimes detect when structs are mistakenly accessed as references, and will forbid changing subfields. For example (example question):

//(in a Windows.Forms context)
control.Size.Width = 20; // Illegal in current C#.

Because Size is a property and struct Size is a value type, we would be editing a copy/clone of the actual property value, which would be useless in such a case. As C# users, we tend to assume most things are accessed by reference, especially in OOP designs, which would make us think that such a call is legitimate (and it would be, if struct Size were a class).

Moreover, when accessing collections, the compiler also forbids us from modifying a struct subfield (example question):

List<Vector> vectors = ... // Imagine populated data.
vectors[4].Y = 10; // Illegal in current C#.
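For reference, the usual workaround in current C# is to copy the element out, change the copy, and write it back through the indexer. A minimal sketch, assuming a mutable `Vector` struct like the one in the example; `IndexerWorkaround.SetY` is a hypothetical helper name:

```csharp
using System.Collections.Generic;

// Mutable on purpose, to mirror the struct used in the example above.
public struct Vector
{
    public double X, Y, Z;
    public Vector(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class IndexerWorkaround
{
    public static void SetY(List<Vector> vectors, int index, double y)
    {
        // vectors[index].Y = y;      // rejected: the indexer returns a copy (error CS1612)
        Vector copy = vectors[index]; // 1. read a copy through the get accessor
        copy.Y = y;                   // 2. change the local copy
        vectors[index] = copy;        // 3. write it back through the set accessor
    }
}
```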

The good news about these unfortunate restrictions is that the compiler does half of the possible aggregate solution for such cases: detect when they occur. The other half would be to implicitly reassign a new aggregate with the changed value.

  • When in local scope, simply reassign the vector.
  • When in external scope, locate a get, and if a matching set accessor is accessible, reassign to this one.

For this to be done, and in order to avoid confusion, the aggregate must be marked as implicit:

implicit aggregate Vector { ... }
implicit aggregate Size { ... }


// Example 1
{
    Vector v = new Vector(1, 2, 3);
    v.Z = 5; // Legal with implicit aggregates.

    // What is implicitly done:
    v = new Vector(v.X, v.Y, 5); // Local variable, simply reassign.
}

// Example 2
{
    //(in a Windows.Forms context)
    control.Size.Width = 20; // Legal with implicit aggregates.

    // What is implicitly done:
    Size old = control.Size.__get(); // External, MSIL detects a get.
    // If MSIL can find a matching, accessible __set:
    control.Size.__set({ 20, old.Height });
}

// Example 3
{
    List<Vector> vectors = ... // Imagine populated data.
    vectors[4].Y = 10; // Legal with implicit aggregates.

    // What is implicitly done:
    Vector old = vectors[4].__get(); // External, MSIL detects a get.
    // If MSIL can find a matching, accessible __set:
    vectors[4].__set({ old.X, 10, old.Z });
}

// Example 4
{
    Vector The5thVector(List<Vector> vectors) { return vectors[4]; }
    ...
    List<Vector> vectors = ...;
    The5thVector(vectors).Y = 10; // Illegal with implicit aggregates.

    // This is illegal because the compiler cannot find an implicit
    // "set" to match, as the value is a function return, not a
    // property or indexer.
}
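Written out by hand in existing C#, the reassignments in Examples 1 to 3 look roughly like the sketch below. It assumes C# 10 or later, where `with` expressions work on record structs; `Body` and its `Position` property are hypothetical stand-ins for `control.Size`, and the `__get`/`__set` steps are simply the property and indexer accessors.

```csharp
using System.Collections.Generic;

public readonly record struct Vector(double X, double Y, double Z);

// Hypothetical owner type exposing a get/set property of a value type.
public sealed class Body
{
    public Vector Position { get; set; }
}

public static class ImplicitReassignment
{
    // Example 1: a local variable is simply reassigned.
    public static Vector Local()
    {
        Vector v = new Vector(1, 2, 3);
        v = v with { Z = 5 };
        return v;
    }

    // Example 2: read through the getter, write a changed copy through the setter.
    public static void Property(Body body) =>
        body.Position = body.Position with { X = 20 };

    // Example 3: same pattern through the list indexer's get and set.
    public static void Indexer(List<Vector> vectors) =>
        vectors[4] = vectors[4] with { Y = 10 };
}
```

Example 4 has no equivalent here for the same reason given in its comment: a method return value offers no setter to write back to.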

Of course, this last implicit reassignment is only a syntactic simplification that may or may not be adopted. I simply propose it because the compiler already seems able to detect such reference access to structs and could easily convert the code for the programmer if the type were an aggregate.


Summary

  • Aggregates can have fields;
  • Aggregates are value types;
  • Aggregates are immutable;
  • Aggregates are allocated on the stack;
  • Aggregates cannot inherit;
  • Aggregates have a sequential layout;
  • Aggregates have a sequential default constructor;
  • Aggregates cannot have user defined constructors;
  • Aggregates can have default values and labeled constructions;
  • Aggregates can be defined inline;
  • Aggregates can be declared as constant;
  • Aggregates can be used as default parameters;
  • Aggregates are non-nullable unless specified (?);

Possibly:

  • Aggregates (could) be implicitly reassigned; See Marcelo Cantos' reply and comment.
  • Aggregates (could) have interfaces;
  • Aggregates (could) have methods;

Cons

As aggregates wouldn't replace structs but would rather be another organizational scheme, I cannot find many cons, but I hope that the C# veterans of S/O will be able to populate this CW section. On a last note, please answer the question directly as well as discussing it: would C# benefit from aggregate classes as described in this post? I am by no means a C# expert, only an enthusiast of the language, and I miss this feature, which seems crucial to me. I'm seeking advice and comments from experienced programmers regarding this case. I am aware that numerous workarounds exist and I actively use them every day; I simply think that they are too common to be ignored.

Comments (3)

素罗衫 2024-10-19 06:00:54

I wish that structs had been defined with something like your proposed semantics in the first place.

However, we're stuck with what we've got now and I think it is unlikely that we'll ever get a whole new "kind of type" into the CLR. Introducing a new kind of type means introducing it to every .NET language, not just C#, and that's a big change.

I think what is more likely -- and remember, when I talk about hypothetical language features for hypothetical, unannounced future products that don't exist and may never exist, I'm doing so for entertainment purposes only -- is that we'll find some way to make better immutability annotations and enforcements on both classes and structs. The compiler could do a better job of both enforcing immutability and making it easier to program in an immutable style, regardless of whether the type in question is a value type or a reference type. And the compiler or CLR could also potentially do a better job of optimizing code that works on multi core machines if it had more immutability guarantees known at compile time or jit time.

While you are noodling away at your proposal, an interesting question you might want to consider is: if aggregate types have methods, is "this" a value or a variable? For example:

aggregate Vector
{
    int x, y, z;
    public void M(Action action)
    {
         Console.WriteLine(this.x);
         action();
         Console.WriteLine(this.x);
    }
}
...
Vector v = new Vector(1, 2, 3);
Action action = ()=>{ v = new Vector(4, 5, 6); };
v.M(action);

What happens? Does "this" get passed to M by value, in which case it writes out "1" twice, or does it get passed as a reference to the variable, in which case your so-called "immutable" type is observed to mutate? (Because what is mutating is the variable; by definition variables are allowed to mutate, that's why they're called "vary-able".)
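The same experiment can be run against today's structs. The sketch below assumes a `readonly struct` stand-in for the aggregate and changes `M` to return the two observed values instead of printing them; `ThisDemo` is a hypothetical wrapper.

```csharp
using System;

public readonly struct Vector
{
    public readonly int X, Y, Z;
    public Vector(int x, int y, int z) { X = x; Y = y; Z = z; }

    // Returns X as observed before and after the callback runs.
    public (int Before, int After) M(Action action)
    {
        int before = this.X;
        action();
        int after = this.X;
        return (before, after);
    }
}

public static class ThisDemo
{
    public static (int Before, int After) Run()
    {
        Vector v = new Vector(1, 2, 3);
        // The lambda captures the variable v, not its current value.
        Action action = () => { v = new Vector(4, 5, 6); };
        return v.M(action);
    }
}
```

With current struct semantics, "this" refers to the variable `v`, so `Run()` is expected to return (1, 4): every field is read-only, yet the value seen through "this" changes because the variable was reassigned.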

也只是曾经 2024-10-19 06:00:54

What would this do?

List<Vector> vectors = ...;
Vector v = vectors[4];
v.Y = 10;

or this?

Vector The5thVector(List<Vector> vectors) { return vectors[4]; }
...
List<Vector> vectors = ...;
The5thVector(vectors).Y = 10;

Replacement of diagnostics with implicit assignment won't get you very far. There's a reason mutable structs are so problematic, and simply declaring a new concept, aggregates, won't fix any of these problems.

The best solution would have been to disallow mutable structs in the language in the first place. The second best solution is to behave as if they were disallowed. Structs are supposed to be small and self-contained, which eliminates any disadvantages to making them immutable.
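The first snippet above can be checked against current C#. A minimal sketch, assuming a mutable `Vector` struct; `CopySemantics.Run` is a hypothetical helper that returns both values so they can be compared:

```csharp
using System.Collections.Generic;

public struct Vector
{
    public double X, Y, Z;
    public Vector(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class CopySemantics
{
    public static (double Local, double Stored) Run()
    {
        var vectors = new List<Vector> { new Vector(1, 2, 3) };
        Vector v = vectors[0]; // copies the element out of the list
        v.Y = 10;              // changes only the local copy
        return (v.Y, vectors[0].Y);
    }
}
```

`Run()` returns (10, 2): the list element is untouched. Under the proposal's own rule for local variables, an implicit reassignment of `v` would presumably behave the same way, so the surprise for readers who expect reference semantics would remain.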

倒带 2024-10-19 06:00:54

No, it would not benefit. Structs are better as mutable types anyway.

First of all... "Immutability with implicit reassignment" is really just "inefficient mutability".

Given a "Point" structure, if you intend to change only the value of X, why force a rewrite of the entire memory structure? Just overwriting X alone is more efficient than overwriting X with a new value and pointlessly overwriting Y with its current value. There would be no benefit to such a scheme.

Honestly, the whole topic of mutability is a matter of perspective. It really only makes sense to talk about mutability when referring to a complex object as a whole, and asking whether its individual pieces change value while maintaining references to the object as a whole.

For example, it makes sense to call a string immutable, because you refer to it as a particular block of memory representing a collection of characters, in which the characters don't change value from the perspective of anything that has a reference to it. An int struct, on the other hand, is mutable, because its value can be changed by a simple assignment, and any references (pointers) to the int struct will see those changes.

As for "this" in struct or aggregate methods, of course it should refer to the struct/aggregate's memory location on the stack at all times, so updates via anonymous methods and delegates that change the struct's value should be reflected and seen as mutable. To summarize, mutability is a good idea at a fundamental variable level, and immutability is best handled at a higher level where complex objects are represented and the "immutable" behavior is explicitly coded.
