How to deal with the immutability of returned structs?

Posted 2024-09-17 05:20:29

I'm writing a game that has a huge 2D array of "cells". A cell takes only 3 bytes. I also have a class called CellMap, which contains the 2D array as a private field, and provides access to it via a public indexer.

Profiling showed that a performance problem is caused by garbage collection of too many Cell objects. So I decided to make Cell a struct (it was a class).

But now code like this doesn't work:

cellMap[x, y].Population++;

I can think of many options, but I don't really like any of them.

1. Make the array public, and write cellMap.Data[x, y].Population = 5;
2. Stop using a CellMap class, and just use a 2D array directly. But CellMap is very convenient because it implements its own optimized serialization, and it exposes Width and Height properties that are more convenient than writing cellMap.GetLength(0).
3. Make Cell immutable. But then how would the code look? cellMap[x, y] = IncrementCellPopulation(cellMap[x, y])? Very verbose.
4. A couple of utility functions like cellMap.SetPopulationAt(x, y, 5)
5. In every class that owns a CellMap, add a utility property like private Cell[,] CellData { get { return this.CellMap.GetInternalArray(); } }, so then my code can look like CellData[x, y].Population++

How is this problem traditionally solved?

拿命拼未来 2024-09-24 05:21:12

Eric Lippert's approach is good, but I would suggest using a base class rather than an interface for the indirect accessor. The following program demonstrates a class which acts like a sparse array of points. Provided that one never persists any item of type PointRef(*), things should work beautifully. Saying:

  MyPointHolder(123) = somePoint

or

  MyPointHolder(123).thePoint = somePoint

will both create a temporary PointRef object (a PointRef.onePoint in one case; a pointHolder.IndexedPointRef in the other), but the widening conversions maintain value semantics. Of course, things would have been much easier if (1) methods on value types could be marked as mutators, and (2) writing a field of a structure accessed via a property could automatically read the property, edit the temporary structure, and write it back. The approach used here works, though alas I don't know any way to make it generic.

(*) Items of type PointRef should only be returned by properties, and should never be stored in a variable or used as parameters to anything other than a setter property which will convert to a Point.

Imports System.Collections.Generic
Imports System.Drawing

MustInherit Class PointRef
    Public MustOverride Property thePoint() As Point
    Public Property X() As Integer
        Get
            Return thePoint.X
        End Get
        Set(ByVal value As Integer)
            Dim mypoint As Point = thePoint
            mypoint.X = value
            thePoint = mypoint
        End Set
    End Property
    Public Property Y() As Integer
        Get
            Return thePoint.Y
        End Get
        Set(ByVal value As Integer)
            Dim mypoint As Point = thePoint
            mypoint.Y = value
            thePoint = mypoint
        End Set
    End Property
    Public Shared Widening Operator CType(ByVal val As Point) As PointRef
        Return New onePoint(val)
    End Operator
    Public Shared Widening Operator CType(ByVal val As PointRef) As Point
        Return val.thePoint
    End Operator
    Private Class onePoint
        Inherits PointRef

        Dim myPoint As Point

        Sub New(ByVal pt As Point)
            myPoint = pt
        End Sub

        Public Overrides Property thePoint() As System.Drawing.Point
            Get
                Return myPoint
            End Get
            Set(ByVal value As System.Drawing.Point)
                myPoint = value
            End Set
        End Property
    End Class
End Class


Class pointHolder
    Dim myPoints As New Dictionary(Of Integer, Point)
    Private Class IndexedPointRef
        Inherits PointRef

        Dim ref As pointHolder
        Dim index As Integer
        Sub New(ByVal ref As pointHolder, ByVal index As Integer)
            Me.ref = ref
            Me.index = index
        End Sub
        Public Overrides Property thePoint() As System.Drawing.Point
            Get
                Dim mypoint As New Point(0, 0)
                ref.myPoints.TryGetValue(index, mypoint)
                Return mypoint
            End Get
            Set(ByVal value As System.Drawing.Point)
                ref.myPoints(index) = value
            End Set
        End Property
    End Class

    Default Public Property item(ByVal index As Integer) As PointRef
        Get
            Return New IndexedPointRef(Me, index)
        End Get
        Set(ByVal value As PointRef)
            myPoints(index) = value.thePoint
        End Set
    End Property

    Shared Sub test()
        Dim theH1, theH2 As New pointHolder
        theH1(5).X = 9
        theH1(9).Y = 20
        theH2(12).X = theH1(9).Y
        theH1(20) = theH2(12)
        theH2(12).Y = 6
        Dim h5, h9, h12, h20 As Point
        h5 = theH1(5)
        h9 = theH1(9)
        h12 = theH2(12)
        h20 = theH1(20)
    End Sub
End Class

日裸衫吸 2024-09-24 05:21:07

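Encapsulate the operations you want CellMap to perform, and allow access to the actual array only through appropriate methods such as IncrementPopulation(int x, int y). Making an array (or any variable, for that matter) public is a serious code smell in most cases, as is returning arrays in .NET.

For performance, consider using a one-dimensional array; in .NET these are typically faster than multidimensional arrays.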
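
As a rough sketch of the two suggestions above, the map could keep a flat, row-major array and expose mutations only as named methods. The Cell layout (a ushort Population plus a byte Color), the IndexOf helper, and the bounds check are assumptions made for illustration; they are not taken from the question's actual code.

```csharp
using System;

// Hypothetical layout: the question only says that a cell takes 3 bytes.
public struct Cell
{
    public ushort Population;
    public byte Color;
}

public sealed class CellMap
{
    // Row-major, one-dimensional backing store.
    private readonly Cell[] cells;

    public int Width { get; }
    public int Height { get; }

    public CellMap(int width, int height)
    {
        Width = width;
        Height = height;
        cells = new Cell[width * height];
    }

    // Maps 2D coordinates to an index into the flat array.
    private int IndexOf(int x, int y)
    {
        if (x < 0 || x >= Width || y < 0 || y >= Height)
            throw new ArgumentOutOfRangeException();
        return y * Width + x;
    }

    // Reads return a copy, so callers cannot change the map through it.
    public Cell this[int x, int y] => cells[IndexOf(x, y)];

    // Mutations write to the array element in place; no copy is involved.
    public void IncrementPopulation(int x, int y) => cells[IndexOf(x, y)].Population++;

    public void SetPopulationAt(int x, int y, ushort population) =>
        cells[IndexOf(x, y)].Population = population;
}
```

A caller would then write map.IncrementPopulation(x, y) instead of map[x, y].Population++. Whether the flat array is actually faster for a given workload is something to measure with a profiler.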

看透却不说透 2024-09-24 05:21:02

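If your cell map is in fact "sparse", that is, if many adjacent cells have no value or a default value, I would suggest not creating a Cell object for those. Create objects only for cells that actually have some non-default state. (This could reduce the total number of cells considerably, taking pressure off the garbage collector.)

This approach would of course require you to find a new way to store your cell map. You would have to move away from storing the cells in an array (since arrays are not sparse) and adopt a different kind of data structure, probably a tree.

For example, you could subdivide your map into a number of uniform regions, so that any cell coordinate can be translated into its region. (You could subdivide each region further into sub-regions along the same idea.) You could then keep a search tree per region, with the cell coordinates acting as the keys into the tree.

Such a scheme would let you store only the cells you need while still offering fast access to any cell in your map. If no cell is found in the tree at a given coordinate, it can be assumed to be a default cell.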
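
One possible shape for such a scheme is sketched below, under several assumptions: coordinates are non-negative, regions are 64 by 64 cells, a cell is "default" when all of its fields are zero, and SortedDictionary (a balanced search tree in .NET) plays the role of the per-region tree. The names SparseCellMap, RegionKey, LocalKey, and StoredCellCount are made up for this example.

```csharp
using System.Collections.Generic;

// Hypothetical layout: the question only says that a cell takes 3 bytes.
public struct Cell
{
    public ushort Population;
    public byte Color;
}

public sealed class SparseCellMap
{
    private const int RegionSize = 64; // assumed edge length of one region

    // Region coordinate -> search tree of the non-default cells in that region.
    private readonly Dictionary<long, SortedDictionary<int, Cell>> regions =
        new Dictionary<long, SortedDictionary<int, Cell>>();

    // Packs the region's column and row into one key (non-negative coordinates assumed).
    private static long RegionKey(int x, int y) =>
        ((long)(x / RegionSize) << 32) | (uint)(y / RegionSize);

    // Position of the cell inside its region, used as the tree key.
    private static int LocalKey(int x, int y) =>
        (y % RegionSize) * RegionSize + (x % RegionSize);

    public Cell this[int x, int y]
    {
        get
        {
            SortedDictionary<int, Cell> region;
            Cell cell;
            if (regions.TryGetValue(RegionKey(x, y), out region) &&
                region.TryGetValue(LocalKey(x, y), out cell))
                return cell;
            return default(Cell); // a missing entry means "default cell"
        }
        set
        {
            bool isDefault = value.Population == 0 && value.Color == 0;
            long key = RegionKey(x, y);
            SortedDictionary<int, Cell> region;
            if (!regions.TryGetValue(key, out region))
            {
                if (isDefault) return; // nothing to store
                region = new SortedDictionary<int, Cell>();
                regions[key] = region;
            }
            if (isDefault) region.Remove(LocalKey(x, y));
            else region[LocalKey(x, y)] = value;
        }
    }

    // Number of cells that are actually stored.
    public int StoredCellCount
    {
        get
        {
            int count = 0;
            foreach (var region in regions.Values) count += region.Count;
            return count;
        }
    }
}
```

Because the indexer still hands out copies of a struct, an update is a read, a modification, and a write back, for example: var c = map[x, y]; c.Population++; map[x, y] = c;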

臻嫒无言 2024-09-24 05:20:58

6. Use a ref parameter in a method that mutates the value, and call it as IncrementCellPopulation(ref cellMap[x, y])
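
A caveat and a sketch: passing ref cellMap[x, y] compiles when cellMap is an array, but the result of an ordinary indexer cannot be passed by ref. From C# 7.0 onward an indexer can itself return a reference, which makes both the ref call and the original Population++ line work against a class like CellMap. The code below assumes C# 7.0 or later; the Cell layout and the CellOps class are hypothetical.

```csharp
// Hypothetical layout: the question only says that a cell takes 3 bytes.
public struct Cell
{
    public ushort Population;
    public byte Color;
}

public sealed class CellMap
{
    private readonly Cell[,] cells;

    public CellMap(int width, int height)
    {
        cells = new Cell[width, height];
    }

    // Returns a reference to the array element rather than a copy (C# 7.0+).
    public ref Cell this[int x, int y] => ref cells[x, y];
}

public static class CellOps
{
    // Mutates the caller's cell in place through the ref parameter.
    public static void IncrementCellPopulation(ref Cell cell)
    {
        cell.Population++;
    }
}
```

With this indexer, both CellOps.IncrementCellPopulation(ref map[x, y]) and map[x, y].Population++ update the stored cell. The trade-off is that callers get direct write access to the map's storage, much as if the array were public.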

夢归不見 2024-09-24 05:20:54

If you want to make Cell immutable - as you should if it is a struct - then a good technique is to make a factory that is an instance method on the Cell:

struct C
{
    public int Foo { get; private set; }
    public int Bar { get; private set; }
    private C (int foo, int bar) : this()
    {
        this.Foo = foo;
        this.Bar = bar;
    }
    public static C Empty = default(C);
    public C WithFoo(int foo)
    {
        return new C(foo, this.Bar);
    }
    public C WithBar(int bar)
    {
        return new C(this.Foo, bar);
    }
    public C IncrementFoo()
    {
        return new C(this.Foo + 1, this.Bar);
    }
    // etc
}
...
C c = C.Empty;
c = c.WithFoo(10);
c = c.WithBar(20);
c = c.IncrementFoo();
// c is now 11, 20

So your code would be something like

map[x,y] = map[x,y].IncrementPopulation();

However, I think this is possibly a blind alley; it might be better to simply not have so many Cells around in the first place, rather than trying to optimize a world where there are thousands of them. I'll write up another answer on that.

↙厌世 2024-09-24 05:20:48

So there are actually two problems here. There's the question you actually asked: what are techniques to deal with the fact that structs ought to be immutable because they are copied by value, but you want to mutate one. And then there's the question which is motivating this one, which is "how can I make the performance of my program acceptable?"

My other answer addresses the first question, but the second question is interesting as well.

First off, if the profiler has actually identified that the performance problem is due to garbage collection of cells, then it is possible that making cell into a struct will help. It is also possible that it will not help at all, and it is possible that doing so will make it worse.

Your cells do not contain any reference types; we know this because you've said they are only three bytes. If someone else reading this is thinking that they could make a performance optimization by turning a class into a struct then it might not help at all because the class might contain a field of reference type, in which case the garbage collector still has to collect every instance, even if it is turned into a value type. The reference types in it need to be collected too! I would only recommend attempting this for performance reasons if Cell contains only value types, which apparently it does.

It might make it worse because value types are not a panacea; they have costs too. Value types are often more expensive to copy than reference types (which are pretty much always the size of a register, almost always aligned on the appropriate memory boundary, and therefore the chip is highly optimized for copying them). And value types are copied all the time.

Now, in your case you have a struct which is smaller than a reference; references are four or eight bytes typically. And you're putting them in an array, which means that you are packing the array down; if you have a thousand of them, it'll take three thousand bytes. Which means that three out of every four structs in there are misaligned, meaning more time (on many chip architectures) to get the value out of the array. You might consider measuring the impact of padding your struct out to four bytes to see if that makes a difference, provided you're still going to keep them in an array, which brings me to my next point...

The Cell abstraction might simply be a bad abstraction for the purpose of storing data about lots of cells. If the problem is that Cells are classes, you're keeping an array of thousands of Cells, and collecting them is expensive, then there are solutions other than making Cell into a struct. Suppose for example that a Cell contains two bytes of Population and one byte of Color. That is the mechanism of Cell, but surely that is not the interface you want to expose to the users. There is no reason why your mechanism has to use the same type as the interface. And therefore you could manufacture instances of the Cell class on demand:

interface ICell
{
   // Interface members are implicitly public and take no access modifier.
   int Population { get; set; }
   Color Color { get; set; }
}
class CellMap
{
    private ushort[,] populationData; // Profile the memory burden vs speed cost of ushort vs int
    private byte[,] colorData; // Same here. 
    public ICell this[int x, int y] 
    {
        get { return new Cell(this, x, y); }
    }

    private sealed class Cell : ICell
    {
        private readonly CellMap map;
        private readonly int x;
        private readonly int y;
        public Cell(CellMap map, int x, int y)
        {
            this.map = map;
            this.x = x;
            this.y = y;
        }
        public int Population  
        {
            get { return this.map.populationData[this.x, this.y]; } 
            set { this.map.populationData[this.x, this.y] = (ushort) value; } 
        }

and so on. Manufacture the cells on demand. They will almost immediately be collected if they are short-lived. CellMap is an abstraction, so use the abstraction to hide the messy implementation details.

With this architecture you don't have any garbage collection problems because you have almost no live Cell instances, but you can still say

map[x,y].Population++;

no problem, because the first indexer manufactures an immutable object which knows how to update the state of the map. The Cell doesn't need to be mutable; notice that the Cell class is completely immutable. (Heck, the Cell could be a struct here, though of course casting it to ICell would just box it anyway.) It is the map which is mutable, and the cell mutates the map for the user.
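
For readers who want something that compiles, here is one way the fragment above might be completed. It is a sketch under assumptions, not the answer's actual code: Color is stored and exposed as a plain byte to avoid depending on a particular Color type, the constructor taking width and height is invented, and the checked cast (which throws when a population does not fit in a ushort) is a choice made for this example.

```csharp
public interface ICell
{
    int Population { get; set; }
    byte Color { get; set; } // byte stands in for the Color type (assumption)
}

public sealed class CellMap
{
    // The mechanism: two compact arrays instead of one object per cell.
    private readonly ushort[,] populationData;
    private readonly byte[,] colorData;

    public CellMap(int width, int height)
    {
        populationData = new ushort[width, height];
        colorData = new byte[width, height];
    }

    public int Width => populationData.GetLength(0);
    public int Height => populationData.GetLength(1);

    // A short-lived view object is manufactured on every access.
    public ICell this[int x, int y] => new Cell(this, x, y);

    // Immutable view: it holds only the map and the coordinates.
    private sealed class Cell : ICell
    {
        private readonly CellMap map;
        private readonly int x;
        private readonly int y;

        public Cell(CellMap map, int x, int y)
        {
            this.map = map;
            this.x = x;
            this.y = y;
        }

        public int Population
        {
            get { return map.populationData[x, y]; }
            set { map.populationData[x, y] = checked((ushort)value); }
        }

        public byte Color
        {
            get { return map.colorData[x, y]; }
            set { map.colorData[x, y] = value; }
        }
    }
}
```

With this in place, map[x, y].Population++ reads through the view's getter and writes back through its setter, so the change lands in populationData. Each access allocates one small object, so it is worth profiling that allocation rate against the struct-based alternatives.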
