多维数组“[,]”之间的差异 以及数组“[][]”的数组 在 C# 中?

发布于 2024-07-15 02:46:42 字数 114 浏览 8 评论 0原文

C# 中多维数组 double[,] 和数组数组 double[][] 之间有什么区别?

有什么区别吗?
每一种的最佳用途是什么?

What are the differences between multidimensional arrays double[,] and array of arrays double[][] in C#?

If there is a difference?
What is the best use for each one?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(13

等风也等你 2024-07-22 02:46:42

数组的数组(锯齿状数组)比多维数组更快,并且可以更有效地使用。 多维数组有更好的语法。

如果您使用锯齿状和多维数组编写一些简单的代码,然后使用 IL 反汇编器检查编译后的程序集,您将看到锯齿状(或单维)数组的存储和检索是简单的 IL 指令,而多维数组的相同操作是方法调用总是比较慢。

考虑以下方法:

static void SetElementAt(int[][] array, int i, int j, int value)
{
    array[i][j] = value;
}

static void SetElementAt(int[,] array, int i, int j, int value)
{
    array[i, j] = value;
}

它们的 IL 如下:

.method private hidebysig static void  SetElementAt(int32[][] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       7 (0x7)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldelem.ref
  IL_0003:  ldarg.2
  IL_0004:  ldarg.3
  IL_0005:  stelem.i4
  IL_0006:  ret
} // end of method Program::SetElementAt

.method private hidebysig static void  SetElementAt(int32[0...,0...] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       10 (0xa)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldarg.2
  IL_0003:  ldarg.3
  IL_0004:  call       instance void int32[0...,0...]::Set(int32,
                                                           int32,
                                                           int32)
  IL_0009:  ret
} // end of method Program::SetElementAt

使用锯齿状数组时,您可以轻松执行行交换和行调整大小等操作。 也许在某些情况下,使用多维数组会更安全,但即使 Microsoft FxCop 也告诉您,当您使用它来分析项目时,应该使用锯齿状数组而不是多维数组。

Array of arrays (jagged arrays) are faster than multi-dimensional arrays and can be used more effectively. Multidimensional arrays have nicer syntax.

If you write some simple code using jagged and multidimensional arrays and then inspect the compiled assembly with an IL disassembler you will see that the storage and retrieval from jagged (or single dimensional) arrays are simple IL instructions while the same operations for multidimensional arrays are method invocations which are always slower.

Consider the following methods:

static void SetElementAt(int[][] array, int i, int j, int value)
{
    array[i][j] = value;
}

static void SetElementAt(int[,] array, int i, int j, int value)
{
    array[i, j] = value;
}

Their IL will be the following:

.method private hidebysig static void  SetElementAt(int32[][] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       7 (0x7)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldelem.ref
  IL_0003:  ldarg.2
  IL_0004:  ldarg.3
  IL_0005:  stelem.i4
  IL_0006:  ret
} // end of method Program::SetElementAt

.method private hidebysig static void  SetElementAt(int32[0...,0...] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       10 (0xa)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldarg.2
  IL_0003:  ldarg.3
  IL_0004:  call       instance void int32[0...,0...]::Set(int32,
                                                           int32,
                                                           int32)
  IL_0009:  ret
} // end of method Program::SetElementAt

When using jagged arrays you can easily perform such operations as row swap and row resize. Maybe in some cases usage of multidimensional arrays will be more safe, but even Microsoft FxCop tells that jagged arrays should be used instead of multidimensional when you use it to analyse your projects.

苍风燃霜 2024-07-22 02:46:42

多维数组创建了一个很好的线性内存布局,而锯齿状数组则意味着几个额外的间接级别。

在锯齿状数组中查找值 jagged[3][6] var jagged = new int[10][5] 的工作方式如下:

  • 查找索引处的元素3(这是一个数组)。
  • 查找该数组中索引 6 处的元素(这是一个值)。

对于本例中的每个维度,都有一个额外的查找(这是一种昂贵的内存访问模式)。

多维数组在内存中线性排列,通过将索引相乘得出实际值。 但是,给定数组 var mult = new int[10,30],该多维数组的 Length 属性返回元素总数,即 10 * 30 = 300。

锯齿状数组的 Rank 属性始终为 1,但多维数组可以具有任意等级。 任何数组的 GetLength 方法都可用于获取每个维度的长度。 对于本例中的多维数组,mult.GetLength(1) 返回 30。对

多维数组建立索引会更快。 例如,给定本例中的多维数组 mult[1,7] = 30 * 1 + 7 = 37,获取索引 37 处的元素。这是一种更好的内存访问模式,因为只有一个内存位置涉及到的,就是数组的基地址。

因此,多维数组分配连续的内存块,而交错数组不必是方形的,例如 jagged[1].Length 不必等于 jagged[2].Length< /code>,这对于任何多维数组都是正确的。

性能

从性能角度来看,多维数组应该更快。 速度要快得多,但由于 CLR 实现非常糟糕,所以速度并没有那么快。

 23.084  16.634  15.215  15.489  14.407  13.691  14.695  14.398  14.551  14.252 
 25.782  27.484  25.711  20.844  19.607  20.349  25.861  26.214  19.677  20.171 
  5.050   5.085   6.412   5.225   5.100   5.751   6.650   5.222   6.770   5.305 

第一行是锯齿状数组的计时,第二行显示多维数组,第三行显示应该是这样的。 该程序如下所示,仅供参考,这是在 Mono 上进行测试的。 (Windows 计时有很大不同,主要是由于 CLR 实现的变化)。

在 Windows 上,锯齿状数组的计时要优越得多,与我自己对多维数组查找的解释大致相同,请参阅“Single()”。 遗憾的是 Windows JIT 编译器实在是太愚蠢了,不幸的是这使得这些性能讨论变得困难,有太多的不一致之处。

这些是我在 Windows 上得到的计时,这里同样的处理,第一行是锯齿状数组,第二行是多维,第三行是我自己的多维实现,请注意与 Mono 相比,这在 Windows 上慢了多少。

  8.438   2.004   8.439   4.362   4.936   4.533   4.751   4.776   4.635   5.864
  7.414  13.196  11.940  11.832  11.675  11.811  11.812  12.964  11.885  11.751
 11.355  10.788  10.527  10.541  10.745  10.723  10.651  10.930  10.639  10.595

源代码:

using System;
using System.Diagnostics;
static class ArrayPref
{
    const string Format = "{0,7:0.000} ";
    static void Main()
    {
        Jagged();
        Multi();
        Single();
    }

    static void Jagged()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var jagged = new int[dim][][];
            for(var i = 0; i < dim; i++)
            {
                jagged[i] = new int[dim][];
                for(var j = 0; j < dim; j++)
                {
                    jagged[i][j] = new int[dim];
                    for(var k = 0; k < dim; k++)
                    {
                        jagged[i][j][k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Multi()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var multi = new int[dim,dim,dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        multi[i,j,k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Single()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var single = new int[dim*dim*dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        single[i*dim*dim+j*dim+k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }
}

A multidimensional array creates a nice linear memory layout while a jagged array implies several extra levels of indirection.

Looking up the value jagged[3][6] in a jagged array var jagged = new int[10][5] works like this:

  • Look up the element at index 3 (which is an array).
  • Look up the element at index 6 in that array (which is a value).

For each dimension in this case, there's an additional look up (this is an expensive memory access pattern).

A multidimensional array is laid out linearly in memory, the actual value is found by multiplying together the indexes. However, given the array var mult = new int[10,30], the Length property of that multidimensional array returns the total number of elements i.e. 10 * 30 = 300.

The Rank property of a jagged array is always 1, but a multidimensional array can have any rank. The GetLength method of any array can be used to get the length of each dimension. For the multidimensional array in this example mult.GetLength(1) returns 30.

Indexing the multidimensional array is faster. e.g. given the multidimensional array in this example mult[1,7] = 30 * 1 + 7 = 37, get the element at that index 37. This is a better memory access pattern because only one memory location is involved, which is the base address of the array.

A multidimensional array therefore allocates a continuous memory block, while a jagged array does not have to be square, e.g. jagged[1].Length does not have to equal jagged[2].Length, which would be true for any multidimensional array.

Performance

Performance wise, multidimensional arrays should be faster. A lot faster, but due to a really bad CLR implementation they are not.

 23.084  16.634  15.215  15.489  14.407  13.691  14.695  14.398  14.551  14.252 
 25.782  27.484  25.711  20.844  19.607  20.349  25.861  26.214  19.677  20.171 
  5.050   5.085   6.412   5.225   5.100   5.751   6.650   5.222   6.770   5.305 

The first row are timings of jagged arrays, the second shows multidimensional arrays and the third, well that's how it should be. The program is shown below, FYI this was tested running Mono. (The Windows timings are vastly different, mostly due to the CLR implementation variations).

On Windows, the timings of the jagged arrays are greatly superior, about the same as my own interpretation of what multidimensional array look up should be like, see 'Single()'. Sadly the Windows JIT-compiler is really stupid, and this unfortunately makes these performance discussions difficult, there are too many inconsistencies.

These are the timings I got on Windows, same deal here, the first row are jagged arrays, second multidimensional and third my own implementation of multidimensional, note how much slower this is on Windows compared to Mono.

  8.438   2.004   8.439   4.362   4.936   4.533   4.751   4.776   4.635   5.864
  7.414  13.196  11.940  11.832  11.675  11.811  11.812  12.964  11.885  11.751
 11.355  10.788  10.527  10.541  10.745  10.723  10.651  10.930  10.639  10.595

Source code:

using System;
using System.Diagnostics;
static class ArrayPref
{
    const string Format = "{0,7:0.000} ";
    static void Main()
    {
        Jagged();
        Multi();
        Single();
    }

    static void Jagged()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var jagged = new int[dim][][];
            for(var i = 0; i < dim; i++)
            {
                jagged[i] = new int[dim][];
                for(var j = 0; j < dim; j++)
                {
                    jagged[i][j] = new int[dim];
                    for(var k = 0; k < dim; k++)
                    {
                        jagged[i][j][k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Multi()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var multi = new int[dim,dim,dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        multi[i,j,k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Single()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var single = new int[dim*dim*dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        single[i*dim*dim+j*dim+k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }
}
轮廓§ 2024-07-22 02:46:42

简单地说,多维数组类似于 DBMS 中的表。
数组的数组(交错数组)可让每个元素保存另一个相同类型的可变长度数组。

因此,如果您确定数据结构看起来像表格(固定行/列),则可以使用多维数组。 锯齿状数组是固定元素& 每个元素可以保存一个可变长度的数组

,例如 伪代码:

int[,] data = new int[2,2];
data[0,0] = 1;
data[0,1] = 2;
data[1,0] = 3;
data[1,1] = 4;

将上面的内容视为一个 2x2 表:

<前><代码>1 | 2
3 | 4

int[][] jagged = new int[3][]; 
jagged[0] = new int[4] {  1,  2,  3,  4 }; 
jagged[1] = new int[2] { 11, 12 }; 
jagged[2] = new int[3] { 21, 22, 23 }; 

将上面的内容视为每行具有可变的列数:

<前><代码> 1 | 2 | 3 | 4
11 | 11 12
21 | 21 22 | 22 23

Simply put multidimensional arrays are similar to a table in DBMS.
Array of Array (jagged array) lets you have each element hold another array of the same type of variable length.

So, if you are sure that the structure of data looks like a table (fixed rows/columns), you can use a multi-dimensional array. Jagged array are fixed elements & each element can hold an array of variable length

E.g. Psuedocode:

int[,] data = new int[2,2];
data[0,0] = 1;
data[0,1] = 2;
data[1,0] = 3;
data[1,1] = 4;

Think of the above as a 2x2 table:

1 | 2
3 | 4
int[][] jagged = new int[3][]; 
jagged[0] = new int[4] {  1,  2,  3,  4 }; 
jagged[1] = new int[2] { 11, 12 }; 
jagged[2] = new int[3] { 21, 22, 23 }; 

Think of the above as each row having variable number of columns:

 1 |  2 |  3 | 4
11 | 12
21 | 22 | 23
吃兔兔 2024-07-22 02:46:42

更新 .NET 6:

随着 .NET 6 的发布,我认为现在是重新审视这个主题的好时机。 我为新的.NET重写了测试代码并运行它,要求每个部分至少运行一秒钟。 该基准测试是在 AMD Ryzen 5600x 上完成的。

结果? 情况很复杂。 似乎单阵列对于较小和大型阵列(<~25x25x25 >~200x200x200)来说性能最佳,而锯齿状阵列在两者之间速度最快。 不幸的是,从我的测试来看,多维是迄今为止最慢的选择。 最好的性能是最快选项的两倍。 但! 这取决于您需要数组的用途,因为在 50^3 立方体上,锯齿状数组可能需要更长的时间来初始化,初始化时间大约是单维数组的 3 倍。 多维只比单维慢一点点。

结论? 如果您需要快速代码,请在将要运行的机器上自行对其进行基准测试。 CPU架构可以完全改变每种方法的相对性能。

数字!

Method name         Ticks/Iteration     Scaled to the best
Array size 1x1x1 (10,000,000 iterations):
Jagged:             0.15                4.28
Single:             0.035               1
Multi-dimensional:  0.77                22

Array size 10x10x10 (25,000 iterations):
Jagged:             15                  1.67
Single:             9                   1
Multi-dimensional:  56                  6.2

Array size 25x25x25 (25,000 iterations):
Jagged:             157                 1.3
Single:             120                 1
Multi-dimensional:  667                 5.56

Array size 50x50x50 (10,000 iterations):
Jagged:             1,140               1
Single:             2,440               2.14
Multi-dimensional:  5,210               4.57

Array size 100x100x100 (10,000 iterations):
Jagged:             9,800               1
Single:             19,800              2
Multi-dimensional:  41,700              4.25

Array size 200x200x200 (1,000 iterations):
Jagged:             161,622             1
Single:             175,507             1.086
Multi-dimensional:  351,275             2.17

Array size 500x500x500 (100 iterations):
Jagged:             4,057.413           1.5
Single:             2,709,301           1
Multi-dimensional:  5,359,393           1.98

不相信我? 自己运行一下并验证一下。

注意:恒定大小似乎给锯齿状数组带来了优势,但不足以改变我的基准测试中的顺序。 我在某些情况下测量到,当使用用户输入的大小来处理锯齿状数组时,性能下降约 7%,对于单个数组没有差异,对于多维数组,差异非常小(约 1% 或更少)。 它在中间最为突出,锯齿状阵列占据主导地位。

    using System.Diagnostics;

const string Format = "{0,7:0.000} ";
const int TotalPasses = 25000;
const int Size = 50;
Stopwatch timer = new();

var functionList = new List<Action> { Jagged, Single, SingleStandard, Multi };

Console.WriteLine("{0,5}{1,20}{2,20}{3,20}{4,20}", "Run", "Ticks", "ms", "Ticks/Instance", "ms/Instance");

foreach (var item in functionList)
{
    var warmup = Test(item);
    var run = Test(item);

    Console.WriteLine($"{item.Method.Name}:");
    PrintResult("warmup", warmup);
    PrintResult("run", run);
    Console.WriteLine();
}

static void PrintResult(string name, long ticks)
{
    Console.WriteLine("{0,10}{1,20}{2,20}{3,20}{4,20}", name, ticks, string.Format(Format, (decimal)ticks / TimeSpan.TicksPerMillisecond), (decimal)ticks / TotalPasses, (decimal)ticks / TotalPasses / TimeSpan.TicksPerMillisecond);
}

long Test(Action func)
{
    timer.Restart();
    func();
    timer.Stop();
    return timer.ElapsedTicks;
}

static void Jagged()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var jagged = new int[Size][][];
        for (var i = 0; i < Size; i++)
        {
            jagged[i] = new int[Size][];
            for (var j = 0; j < Size; j++)
            {
                jagged[i][j] = new int[Size];
                for (var k = 0; k < Size; k++)
                {
                    jagged[i][j][k] = i * j * k;
                }
            }
        }
    }
}

static void Multi()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var multi = new int[Size, Size, Size];
        for (var i = 0; i < Size; i++)
        {
            for (var j = 0; j < Size; j++)
            {
                for (var k = 0; k < Size; k++)
                {
                    multi[i, j, k] = i * j * k;
                }
            }
        }
    }
}

static void Single()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var single = new int[Size * Size * Size];
        for (var i = 0; i < Size; i++)
        {
            int iOffset = i * Size * Size;
            for (var j = 0; j < Size; j++)
            {
                var jOffset = iOffset + j * Size;
                for (var k = 0; k < Size; k++)
                {
                    single[jOffset + k] = i * j * k;
                }
            }
        }
    }
}

static void SingleStandard()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var single = new int[Size * Size * Size];
        for (var i = 0; i < Size; i++)
        {
            for (var j = 0; j < Size; j++)
            {
                for (var k = 0; k < Size; k++)
                {
                    single[i * Size * Size + j * Size + k] = i * j * k;
                }
            }
        }
    }
}

经验教训:始终将 CPU 纳入基准测试中,因为它会产生影响。 这次有吗? 我不知道,但我怀疑可能是这样。


原始答案:

我想对此进行更新,因为在 .NET Core 中多维数组比锯齿状数组更快。 我从 John Leidegren 运行了测试,这些是 .NET Core 2.0 预览版 2 上的结果。我将维度值增加到使后台应用程序的任何可能影响变得不那么明显。

Debug (code optimalization disabled)
Running jagged 
187.232 200.585 219.927 227.765 225.334 222.745 224.036 222.396 219.912 222.737 

Running multi-dimensional  
130.732 151.398 131.763 129.740 129.572 159.948 145.464 131.930 133.117 129.342 

Running single-dimensional  
 91.153 145.657 111.974  96.436 100.015  97.640  94.581 139.658 108.326  92.931 


Release (code optimalization enabled)
Running jagged 
108.503 95.409 128.187 121.877 119.295 118.201 102.321 116.393 125.499 116.459 

Running multi-dimensional 
 62.292  60.627  60.611  60.883  61.167  60.923  62.083  60.932  61.444  62.974 

Running single-dimensional 
 34.974  33.901  34.088  34.659  34.064  34.735  34.919  34.694  35.006  34.796 

我研究了反汇编,发现

jagged[i][j][k] = i * j * k; 需要 34 条指令来执行

multi[i, j, k] = i * j * k; 需要 11 条指令来执行

single[i * dim * dim + j * dim + k] = i * j * k; 需要 23 条指令来执行

I无法确定为什么单维数组仍然比多维数组更快,但我的猜测是这与 CPU 上进行的一些优化有关

Update .NET 6:

With the release of .NET 6 I decided it was a good time to revisit this topic. I rewrote the test code for new .NET and ran it with the requirement of each part running at least a second. The benchmark was done on AMD Ryzen 5600x.

Results? It's complicated. It seems that Single array is the most performant for smaller and large arrays (< ~25x25x25 & > ~200x200x200) and Jagged arrays being fastest in between. Unfortunately it seems from my testing that multi-dimensional are by far the slowest option. At best performing twice as slow as the fastest option. But! It depends on what you need the arrays for because jagged arrays can take much longer to initialize on 50^3 cube the initialization was roughly 3 times longer than single dimensional. Multi-dimensional was only a little bit slower than single dimensional.

The conclusion? If you need fast code, benchmark it yourself on the machine it's going to run on. CPU architecture can complete change the relative performance of each method.

Numbers!

Method name         Ticks/Iteration     Scaled to the best
Array size 1x1x1 (10,000,000 iterations):
Jagged:             0.15                4.28
Single:             0.035               1
Multi-dimensional:  0.77                22

Array size 10x10x10 (25,000 iterations):
Jagged:             15                  1.67
Single:             9                   1
Multi-dimensional:  56                  6.2

Array size 25x25x25 (25,000 iterations):
Jagged:             157                 1.3
Single:             120                 1
Multi-dimensional:  667                 5.56

Array size 50x50x50 (10,000 iterations):
Jagged:             1,140               1
Single:             2,440               2.14
Multi-dimensional:  5,210               4.57

Array size 100x100x100 (10,000 iterations):
Jagged:             9,800               1
Single:             19,800              2
Multi-dimensional:  41,700              4.25

Array size 200x200x200 (1,000 iterations):
Jagged:             161,622             1
Single:             175,507             1.086
Multi-dimensional:  351,275             2.17

Array size 500x500x500 (100 iterations):
Jagged:             4,057.413           1.5
Single:             2,709,301           1
Multi-dimensional:  5,359,393           1.98

Don't trust me? Run it yourself and verify.

Note: the constant size seems to give jagged arrays an edge, but is not significant enough to change the order in my benchmarks. I have measured in some instance ~7% decrease in performance when using size from user input for jagged arrays, no difference for single arrays and very small difference (~1% or less) for multi-dimensional arrays. It is most prominent in the middle where jagged arrays take the lead.

    using System.Diagnostics;

const string Format = "{0,7:0.000} ";
const int TotalPasses = 25000;
const int Size = 50;
Stopwatch timer = new();

var functionList = new List<Action> { Jagged, Single, SingleStandard, Multi };

Console.WriteLine("{0,5}{1,20}{2,20}{3,20}{4,20}", "Run", "Ticks", "ms", "Ticks/Instance", "ms/Instance");

foreach (var item in functionList)
{
    var warmup = Test(item);
    var run = Test(item);

    Console.WriteLine(
quot;{item.Method.Name}:");
    PrintResult("warmup", warmup);
    PrintResult("run", run);
    Console.WriteLine();
}

static void PrintResult(string name, long ticks)
{
    Console.WriteLine("{0,10}{1,20}{2,20}{3,20}{4,20}", name, ticks, string.Format(Format, (decimal)ticks / TimeSpan.TicksPerMillisecond), (decimal)ticks / TotalPasses, (decimal)ticks / TotalPasses / TimeSpan.TicksPerMillisecond);
}

long Test(Action func)
{
    timer.Restart();
    func();
    timer.Stop();
    return timer.ElapsedTicks;
}

static void Jagged()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var jagged = new int[Size][][];
        for (var i = 0; i < Size; i++)
        {
            jagged[i] = new int[Size][];
            for (var j = 0; j < Size; j++)
            {
                jagged[i][j] = new int[Size];
                for (var k = 0; k < Size; k++)
                {
                    jagged[i][j][k] = i * j * k;
                }
            }
        }
    }
}

static void Multi()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var multi = new int[Size, Size, Size];
        for (var i = 0; i < Size; i++)
        {
            for (var j = 0; j < Size; j++)
            {
                for (var k = 0; k < Size; k++)
                {
                    multi[i, j, k] = i * j * k;
                }
            }
        }
    }
}

static void Single()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var single = new int[Size * Size * Size];
        for (var i = 0; i < Size; i++)
        {
            int iOffset = i * Size * Size;
            for (var j = 0; j < Size; j++)
            {
                var jOffset = iOffset + j * Size;
                for (var k = 0; k < Size; k++)
                {
                    single[jOffset + k] = i * j * k;
                }
            }
        }
    }
}

static void SingleStandard()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var single = new int[Size * Size * Size];
        for (var i = 0; i < Size; i++)
        {
            for (var j = 0; j < Size; j++)
            {
                for (var k = 0; k < Size; k++)
                {
                    single[i * Size * Size + j * Size + k] = i * j * k;
                }
            }
        }
    }
}

Lesson learned: Always include CPU in benchmarks, because it makes a difference. Did it this time? I don't know but I suspect it might've.


Original answer:

I would like to update on this, because in .NET Core multi-dimensional arrays are faster than jagged arrays. I ran the tests from John Leidegren and these are the results on .NET Core 2.0 preview 2. I increased the dimension value to make any possible influences from background apps less visible.

Debug (code optimalization disabled)
Running jagged 
187.232 200.585 219.927 227.765 225.334 222.745 224.036 222.396 219.912 222.737 

Running multi-dimensional  
130.732 151.398 131.763 129.740 129.572 159.948 145.464 131.930 133.117 129.342 

Running single-dimensional  
 91.153 145.657 111.974  96.436 100.015  97.640  94.581 139.658 108.326  92.931 


Release (code optimalization enabled)
Running jagged 
108.503 95.409 128.187 121.877 119.295 118.201 102.321 116.393 125.499 116.459 

Running multi-dimensional 
 62.292  60.627  60.611  60.883  61.167  60.923  62.083  60.932  61.444  62.974 

Running single-dimensional 
 34.974  33.901  34.088  34.659  34.064  34.735  34.919  34.694  35.006  34.796 

I looked into disassemblies and this is what I found

jagged[i][j][k] = i * j * k; needed 34 instructions to execute

multi[i, j, k] = i * j * k; needed 11 instructions to execute

single[i * dim * dim + j * dim + k] = i * j * k; needed 23 instructions to execute

I wasn't able to identify why single-dimensional arrays were still faster than multi-dimensional but my guess is that it has to do with some optimalization made on the CPU

有木有妳兜一样 2024-07-22 02:46:42

前言:此评论旨在解决okutane提供的答案有关锯齿状数组和多维的。

由于方法调用而导致一种类型比另一种类型慢的断言是不正确的。 由于边界检查算法更复杂,其中一种比另一种慢。 您可以通过查看已编译的程序集而不是查看 IL 来轻松验证这一点。 例如,在我的 4.5 安装中,访问存储在 ecx 指向的二维数组中的元素(通过 edx 中的指针),索引存储在 eax 和 edx 中,如下所示:

sub eax,[ecx+10]
cmp eax,[ecx+08]
jae oops //jump to throw out of bounds exception
sub edx,[ecx+14]
cmp edx,[ecx+0C]
jae oops //jump to throw out of bounds exception
imul eax,[ecx+0C]
add eax,edx
lea edx,[ecx+eax*4+18]

这里,您可以看到该方法没有任何开销来电。 由于非零索引的可能性,边界检查非常复杂,这是锯齿状数组不提供的功能。 如果我们删除非零情况下的 subcmpjmp,代码几乎会解析为 (x *y_max+y)*sizeof(ptr)+sizeof(array_header)。 这种计算与随机访问元素的任何其他计算一样快(一次乘法可以用移位代替,因为这就是我们选择字节大小为两位的幂的全部原因)。

另一个复杂之处是,在很多情况下,现代编译器会在迭代一维数组时优化元素访问的嵌套边界检查。 结果是代码基本上只是在数组的连续内存上推进索引指针。 多维数组上的朴素迭代通常涉及额外的嵌套逻辑层,因此编译器不太可能优化操作。 因此,即使访问单个元素的边界检查开销在数组维度和大小方面分摊到恒定的运行时间,用于测量差异的简单测试用例的执行时间可能会长很多倍。

Preface: This comment is intended to address the answer provided by okutane regarding the performance difference between jagged arrays and multidimensional ones.

The assertion that one type is slower than the other because of the method calls isn't correct. One is slower than the other because of more complicated bounds-checking algorithms. You can easily verify this by looking, not at the IL, but at the compiled assembly. For example, on my 4.5 install, accessing an element (via pointer in edx) stored in a two-dimensional array pointed to by ecx with indexes stored in eax and edx looks like so:

sub eax,[ecx+10]
cmp eax,[ecx+08]
jae oops //jump to throw out of bounds exception
sub edx,[ecx+14]
cmp edx,[ecx+0C]
jae oops //jump to throw out of bounds exception
imul eax,[ecx+0C]
add eax,edx
lea edx,[ecx+eax*4+18]

Here, you can see that there's no overhead from method calls. The bounds checking is just very convoluted thanks to the possibility of non-zero indexes, which is a functionality not on offer with jagged arrays. If we remove the sub, cmp, and jmps for the non-zero cases, the code pretty much resolves to (x*y_max+y)*sizeof(ptr)+sizeof(array_header). This calculation is about as fast (one multiply could be replaced by a shift, since that's the whole reason we choose bytes to be sized as powers of two bits) as anything else for random access to an element.

Another complication is that there are plenty of cases where a modern compiler will optimize away the nested bounds-checking for element access while iterating over a single-dimension array. The result is code that basically just advances an index pointer over the contiguous memory of the array. Naive iteration over multi-dimensional arrays generally involves an extra layer of nested logic, so a compiler is less likely to optimize the operation. So, even though the bounds-checking overhead of accessing a single element amortizes out to constant runtime with respect to array dimensions and sizes, a simple test-case to measure the difference may take many times longer to execute.

很酷又爱笑 2024-07-22 02:46:42

多维数组是 (n-1) 维矩阵。

所以int[,] square = new int[2,2]是方阵2x2,int[,,]cube = new int [3,3,3]是立方体 - 方阵 3x3。 不要求比例。

锯齿状数组只是数组的数组 - 每个单元格都包含一个数组的数组。

所以MDA是成正比的,JD可能不是! 每个单元格可以包含任意长度的数组!

Multi-dimension arrays are (n-1)-dimension matrices.

So int[,] square = new int[2,2] is square matrix 2x2, int[,,] cube = new int [3,3,3] is a cube - square matrix 3x3. Proportionality is not required.

Jagged arrays are just array of arrays - an array where each cell contains an array.

So MDA are proportional, JD may be not! Each cell can contains an array of arbitrary length!

花之痕靓丽 2024-07-22 02:46:42

上面的答案可能已经提到了这一点,但没有明确提及:对于锯齿状数组,您可以使用 array[row] 来引用整行数据,但这对于多维数组是不允许的。

This might have been mentioned in the above answers but not explicitly: with jagged array you can use array[row] to refer a whole row of data, but this is not allowed for multi-d arrays.

梦纸 2024-07-22 02:46:42

除了其他答案之外,请注意,多维数组在堆上被分配为一个大块对象。 这有一些含义:

  1. 一些多维数组将在大对象堆 (LOH) 上分配,而它们的等效锯齿数组对应项则不会分配。
  2. GC 需要找到一个连续的空闲内存块来分配多维数组,而锯齿状数组可能能够填充堆碎片造成的间隙...由于压缩,这在 .NET 中通常不是问题,但默认情况下 LOH 不会被压缩(您必须要求它,并且每次需要时都必须询问)。
  3. 您需要查看 对于多维数组方式,如果您只使用锯齿状数组,那么问题就会出现。

In addition to the other answers, note that a multidimensional array is allocated as one big chunky object on the heap. This has some implications:

  1. Some multidimensional arrays will get allocated on the Large Object Heap (LOH) where their equivalent jagged array counterparts would otherwise not have.
  2. The GC will need to find a single contiguous free block of memory to allocate a multidimensional array, whereas a jagged array might be able to fill in gaps caused by heap fragmentation... this isn't usually an issue in .NET because of compaction, but the LOH doesn't get compacted by default (you have to ask for it, and you have to ask every time you want it).
  3. You'll want to look into <gcAllowVeryLargeObjects> for multidimensional arrays way before the issue will ever come up if you only ever use jagged arrays.
千仐 2024-07-22 02:46:42

我想我会在未来在这里插话一下 .NET 5 的一些性能结果,因为这将是从现在起每个人都使用的平台。

这些与 John Leidegren 使用的测试相同(2009 年)。

我的结果(.NET 5.0.1):

  Debug:
  (Jagged)
  5.616   4.719   4.778   5.524   4.559   4.508   5.913   6.107   5.839   5.270
  
  (Multi)
  6.336   7.477   6.124   5.817   6.516   7.098   5.272   6.091  25.034   6.023
  
  (Single)
  4.688   3.494   4.425   6.176   4.472   4.347   4.976   4.754   3.591   4.403


  Release(code optimizations on):
  (Jagged)
  2.614   2.108   3.541   3.065   2.172   2.936   1.681   1.724   2.622   1.708

  (Multi)
  3.371   4.690   4.502   4.153   3.651   3.637   3.580   3.854   3.841   3.802

  (Single)
  1.934   2.102   2.246   2.061   1.941   1.900   2.172   2.103   1.911   1.911

在 6 核 3.7GHz AMD Ryzen 1600 机器上运行。

看起来性能比还是大致相同的。 我想说,除非你真的在努力优化,否则只需使用多维数组,因为语法稍微更容易使用。

I thought I'd chime in here from the future with some performance results from .NET 5, seen as that will be the platform which everyone uses from now on.

These are the same tests that John Leidegren used (in 2009).

My results (.NET 5.0.1):

  Debug:
  (Jagged)
  5.616   4.719   4.778   5.524   4.559   4.508   5.913   6.107   5.839   5.270
  
  (Multi)
  6.336   7.477   6.124   5.817   6.516   7.098   5.272   6.091  25.034   6.023
  
  (Single)
  4.688   3.494   4.425   6.176   4.472   4.347   4.976   4.754   3.591   4.403


  Release(code optimizations on):
  (Jagged)
  2.614   2.108   3.541   3.065   2.172   2.936   1.681   1.724   2.622   1.708

  (Multi)
  3.371   4.690   4.502   4.153   3.651   3.637   3.580   3.854   3.841   3.802

  (Single)
  1.934   2.102   2.246   2.061   1.941   1.900   2.172   2.103   1.911   1.911

Ran on a a 6 core 3.7GHz AMD Ryzen 1600 machine.

It looks as though the performance ratio is still roughly the same. I'd say unless you're really optimizing hard, just use multi-dimensional arrays as the syntax is slightly easier to use.

獨角戲 2024-07-22 02:46:42

交错数组是数组的数组或数组的数组,其中每行都包含一个自己的数组。

这些数组的长度可以与其他行中的长度不同。

声明和分配数组的数组

与常规多维数组相比,交错数组声明的唯一区别在于我们不只有一对括号。 对于锯齿状数组,每个维度都有一对括号。 我们这样分配它们:

int[][] exampleJaggedArray;
jaggedArray = new int[2][];
jaggedArray[0] = new int[5];
jaggedArray[1] = new int[3];

初始化数组的数组

int[][] exampleJaggedArray = {
new int[] {5, 7, 2},
new int[] {10, 20, 40},
new int[] {3, 25}
};

内存分配

交错数组是引用的聚合。 交错数组不直接包含任何数组,而是具有指向它们的元素。 大小未知,这就是 CLR 只保留对内部数组的引用的原因。 当我们为交错数组的一个数组元素分配内存后,引用开始指向动态内存中新创建的块。

变量exampleJaggedArray存储在程序的执行堆栈中,指向动态内存中的一个块,该块包含对内存中其他三个块的三个引用的序列; 它们每个都包含一个整数数组——锯齿状数组的元素:

Jagged arrays are arrays of arrays or arrays in which each row contains an array of its own.

These arrays can have lengths different than those in the other rows.

Declaration and Allocation an Array of Arrays

The only difference in the declaration of the jagged arrays compared to the regular multidimensional array is that we do not have just one pair of brackets. With the jagged arrays, we have a pair of brackets per dimension. We allocate them this way:

int[][] exampleJaggedArray;
jaggedArray = new int[2][];
jaggedArray[0] = new int[5];
jaggedArray[1] = new int[3];

The Initializing an array of arrays

int[][] exampleJaggedArray = {
new int[] {5, 7, 2},
new int[] {10, 20, 40},
new int[] {3, 25}
};

Memory Allocation

Jagged arrays are an aggregation of references. A jagged array does not directly contain any arrays, but rather has elements pointing to them. The size is unknown and that is why CLR just keeps references to the internal arrays. After we allocate memory for one array-element of the jagged array, then the reference starts pointing to the newly created block in the dynamic memory.

The variable exampleJaggedArray is stored in the execution stack of the program and points to a block in the dynamic memory, which contains a sequence of three references to other three blocks in memory; each of them contains an array of integer numbers – the elements of the jagged array:

迷途知返 2024-07-22 02:46:42

我正在解析 ildasm 生成的 .il 文件,以构建程序集、类、方法和存储过程的数据库,以用于进行转换。 我遇到了以下内容,这打破了我的解析。

.method private hidebysig instance uint32[0...,0...] 
        GenerateWorkingKey(uint8[] key,
                           bool forEncryption) cil managed

《Expert .NET 2.0 IL Assembler》一书,作者 Serge Lidin,Apress,于 2006 年出版,第 8 章,原始类型和签名,第 149-150 页进行了解释。

[] 被称为 的向量,[

[**] ] 被称为 的数组,

** 表示可以重复,[ ] 表示可选。

示例:让 = int32

1) int32[...,...] 是一个未定义下限和大小的二维数组

2) int32[2...5] 是一个下限为 2、大小为 4 的一维数组。

3) int32[0...,0...] 是下限为 0、大小未定义的二维数组。

汤姆

I am parsing .il files generated by ildasm to build a database of assemnblies, classes, methods, and stored procedures for use doing a conversion. I came across the following, which broke my parsing.

.method private hidebysig instance uint32[0...,0...] 
        GenerateWorkingKey(uint8[] key,
                           bool forEncryption) cil managed

The book Expert .NET 2.0 IL Assembler, by Serge Lidin, Apress, published 2006, Chapter 8, Primitive Types and Signatures, pp. 149-150 explains.

<type>[] is termed a Vector of <type>,

<type>[<bounds> [<bounds>**] ] is termed an array of <type>

** means may be repeated, [ ] means optional.

Examples: Let <type> = int32.

1) int32[...,...] is a two-dimensional array of undefined lower bounds and sizes

2) int32[2...5] is a one-dimensional array of lower bound 2 and size 4.

3) int32[0...,0...] is a two-dimensional array of lower bounds 0 and undefined size.

Tom

自控 2024-07-22 02:46:42

使用基于 John Leidegren 的测试,我使用 .NET 4.7.2 对结果进行了基准测试,这是相关的版本适合我的目的,我想我可以分享。 我最初是从 dotnet core GitHub 存储库中的此评论开始的。

看起来,随着阵列大小的变化,性能变化很大,至少在我的设置中是这样,1 个 xeon 处理器,4 个物理处理器,8 个逻辑处理器。

w = 初始化一个数组,并将 int i * j 放入其中。
wr = do w,然后在另一个循环中将 int x 设置为 [i,j]

随着数组大小的增长,多维似乎表现得更好。

大小rw方法平均误差StdDevGen 0/1k OpGen 1/1k OpGen 2/1k Op已分配内存/Op
1800*500wJagged2.445 ms0.0959 ms0.1405 ms578.1250281.250085.93753.46 MB
1800*500wMulti3.079 ms.2419 毫秒0.3621 ms269.5313269.5313269.53133.43 MB
2000*4000w锯齿状50.29 ms3.262 ms4.882 ms5937.50003375.0000937.500030.62 MB
2000 *4000w26.34 毫秒1.797 毫秒2.690 毫秒218.7500218.750030.52MB
2000*4000WR 锯齿状218.750055.30 毫秒3.066 毫秒4.589 毫秒5937.50003375.0000937.500030.62 MB
2000*400032.23 ms2.798 ms4.187 ms285.7143285.7143285.714330.52 MB
1000*2000写锯齿11.18 毫秒0.5397 毫秒0.8078 毫秒1437.5000578.1250234.37507.69 MB
1000*2000wr6.622 毫秒0.3238 毫秒0.4847 毫秒210.9375210.9375210.93757.63 MB

更新:最后两次测试使用 double[,] 而不是 int[,]。 考虑到错误,差异似乎很显着。 对于 int,锯齿状与 md 的平均值之比在 1.53x 和 1.86x 之间,对于 doubles 则为 1.88x 和 2.42x。

大小rw方法平均误差StdDevGen 0/1k OpGen 1/1k OpGen 2/1k Op已分配内存/Op
1000*2000wr锯齿状26.83 ms1.221 ms1.790 ms3062.50001531.2500531.250015.31 MB
1000*2000wrMulti12 .61 毫秒1.018 毫秒1.524 毫秒156.2500156.2500156.250015.26 MB

Using a test based on the one by John Leidegren, I benchmarked the result using .NET 4.7.2, which is the relevant version for my purposes and thought I could share. I originally started with this comment in the dotnet core GitHub repository.

It appears that the performance varies greatly as the array size changes, at least on my setup, 1 processor xeon with 4physical 8logical.

w = initialize an array, and put int i * j in it.
wr = do w, then in another loop set int x to [i,j]

As array size grows, multidimensional appears to outperform.

SizerwMethodMeanErrorStdDevGen 0/1k OpGen 1/1k OpGen 2/1k OpAllocated Memory/Op
1800*500wJagged2.445 ms0.0959 ms0.1405 ms578.1250281.250085.93753.46 MB
1800*500wMulti3.079 ms0.2419 ms0.3621 ms269.5313269.5313269.53133.43 MB
2000*4000wJagged50.29 ms3.262 ms4.882 ms5937.50003375.0000937.500030.62 MB
2000*4000wMulti26.34 ms1.797 ms2.690 ms218.7500218.7500218.750030.52 MB
2000*4000wrJagged55.30 ms3.066 ms4.589 ms5937.50003375.0000937.500030.62 MB
2000*4000wrMulti32.23 ms2.798 ms4.187 ms285.7143285.7143285.714330.52 MB
1000*2000wrJagged11.18 ms0.5397 ms0.8078 ms1437.5000578.1250234.37507.69 MB
1000*2000wrMulti6.622 ms0.3238 ms0.4847 ms210.9375210.9375210.93757.63 MB

Update: last two tests with double[,] instead of int[,]. The difference appears significant considering the errors. With int, ratio of mean for jagged vs md is between 1.53x and 1.86x, with doubles it is 1.88x and 2.42x.

SizerwMethodMeanErrorStdDevGen 0/1k OpGen 1/1k OpGen 2/1k OpAllocated Memory/Op
1000*2000wrJagged26.83 ms1.221 ms1.790 ms3062.50001531.2500531.250015.31 MB
1000*2000wrMulti12.61 ms1.018 ms1.524 ms156.2500156.2500156.250015.26 MB
﹂绝世的画 2024-07-22 02:46:42

主要区别在于它们的结构:交错数组是具有不同长度的数组的数组,而多维数组的每个维度都有固定的长度。

与多维数组相比,锯齿状数组速度更快。 然而,多维数组的语法结构是干净的。

我在此处找到了一篇关于数组的非常好的文章

The main difference is in their structure: jagged arrays are arrays of arrays with different lengths, while multi-dimensional arrays have a fixed length for each dimension.

Jagged arrays are more faster compared to multi dimensional arrays. However, multi dimensional arrays are clean in syntax structure.

I have found a very good article on arrays at here

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文