What are the differences between a multidimensional array "[,]" and an array of arrays "[][]" in C#?
What are the differences between multidimensional arrays double[,] and arrays of arrays double[][] in C#?
If there is a difference, what is the best use for each one?
Arrays of arrays (jagged arrays) are faster than multidimensional arrays and can be used more effectively. Multidimensional arrays have nicer syntax.
If you write some simple code using jagged and multidimensional arrays and then inspect the compiled assembly with an IL disassembler, you will see that storing to and retrieving from jagged (or single-dimensional) arrays are simple IL instructions, while the same operations on multidimensional arrays are method invocations, which are always slower.
Consider the following methods:
Their IL will be the following:
With jagged arrays you can easily perform operations such as row swap and row resize. In some cases multidimensional arrays may be safer to use, but even Microsoft FxCop, when you use it to analyse your projects, recommends jagged arrays over multidimensional ones.
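The listings this answer refers to are not reproduced above. As a rough sketch of the kind of methods it describes, the following uses hypothetical helper names (SetJagged, SetMulti) and made-up sizes. The comments about the emitted IL describe what the C# compiler usually produces and are worth confirming with a disassembler on your own build.

```csharp
using System;

// Hypothetical helpers, for illustration only.
// Jagged: load the row reference, then store into it. This usually compiles
// to plain array instructions (ldelem.ref followed by stelem.i4).
static void SetJagged(int[][] array, int i, int j, int value) => array[i][j] = value;

// Multidimensional: one indexed store, which the compiler usually emits as a
// call to the array type's Set accessor instead of a stelem instruction.
static void SetMulti(int[,] array, int i, int j, int value) => array[i, j] = value;

int[][] jagged = { new int[3], new int[3] };
int[,] multi = new int[2, 3];

SetJagged(jagged, 1, 2, 42);
SetMulti(multi, 1, 2, 42);

// Row swap on a jagged array is just a swap of two references.
(jagged[0], jagged[1]) = (jagged[1], jagged[0]);

// Row resize replaces one inner array; the other rows are untouched.
Array.Resize(ref jagged[1], 5);

Console.WriteLine($"{jagged[0][2]} {multi[1, 2]} {jagged[1].Length}");
```

Neither the swap nor the resize has a direct equivalent for int[,]; there the whole block would have to be copied.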
A multidimensional array creates a nice linear memory layout, while a jagged array implies several extra levels of indirection.
Looking up the value jagged[3][6] in a 10x5 jagged array (written loosely as var jagged = new int[10][5]; in real C# the rows have to be allocated one by one) works like this: first look up the element at index 3, which is itself an array, then look up the element at index 6 in that array. For each dimension there's an additional lookup, which is an expensive memory access pattern.
A multidimensional array is laid out linearly in memory, and the position of an element is found by multiplying and adding the indexes. Given the array var mult = new int[10,30], the Length property of that multidimensional array returns the total number of elements, i.e. 10 * 30 = 300.
The Rank property of a jagged array is always 1, but a multidimensional array can have any rank. The GetLength method of any array can be used to get the length of each dimension. For the multidimensional array in this example, mult.GetLength(1) returns 30.
Indexing the multidimensional array is faster. For example, given the multidimensional array above, mult[1,7] resolves to 30 * 1 + 7 = 37, so the element at flat index 37 is fetched. This is a better memory access pattern because only one memory location is involved, which is the base address of the array.
A multidimensional array therefore allocates one contiguous memory block, while a jagged array does not have to be square: jagged[1].Length does not have to equal jagged[2].Length, whereas the rows of any multidimensional array always have equal length.
Performance
Performance-wise, multidimensional arrays should be faster, a lot faster, but due to a really bad CLR implementation they are not.
The first row shows the timings for jagged arrays, the second shows multidimensional arrays, and the third, well, that's how it should be. The program is shown below. FYI, this was tested on Mono. (The Windows timings are vastly different, mostly due to the CLR implementation variations.)
On Windows, the timings of the jagged arrays are greatly superior, about the same as my own interpretation of what multidimensional array lookup should be like; see 'Single()'. Sadly the Windows JIT compiler is really stupid, and this makes these performance discussions difficult because there are too many inconsistencies.
These are the timings I got on Windows. Same deal here: the first row is jagged arrays, the second multidimensional, and the third my own implementation of multidimensional. Note how much slower this is on Windows compared to Mono.
Source code:
Simply put, multidimensional arrays are similar to a table in a DBMS.
An array of arrays (jagged array) lets each element hold another array of the same type but of variable length.
So, if you are sure that the structure of the data looks like a table (fixed rows and columns), you can use a multidimensional array. A jagged array has a fixed number of elements, and each element can hold an array of variable length.
E.g. pseudocode:
Think of the above as a 2x2 table:
Think of the above as each row having a variable number of columns:
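The pseudocode and the two tables are not shown above, so here is a small sketch of the same idea. The values and variable names are invented for illustration. It contrasts a fixed 2x2 table with a jagged array whose rows have different column counts.

```csharp
using System;

// A 2x2 table: every row has exactly two columns.
int[,] table = { { 1, 2 }, { 3, 4 } };

// A jagged array: three rows, each with its own column count.
int[][] rows =
{
    new[] { 1, 2 },
    new[] { 3, 4, 5, 6 },
    new[] { 7 },
};

Console.WriteLine($"table is {table.GetLength(0)}x{table.GetLength(1)}");
for (int r = 0; r < rows.Length; r++)
    Console.WriteLine($"row {r} has {rows[r].Length} column(s)");
```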
Update .NET 6:
With the release of .NET 6 I decided it was a good time to revisit this topic. I rewrote the test code for the new .NET and ran it with the requirement that each part runs for at least a second. The benchmark was done on an AMD Ryzen 5600x.
Results? It's complicated. It seems that a single array is the most performant for small and large arrays (below roughly 25x25x25 and above roughly 200x200x200), with jagged arrays being fastest in between. Unfortunately, it seems from my testing that multi-dimensional arrays are by far the slowest option, at best running twice as slow as the fastest option. But! It depends on what you need the arrays for, because jagged arrays can take much longer to initialize: on a 50^3 cube the initialization took roughly 3 times longer than for a single-dimensional array. Multi-dimensional was only a little bit slower than single-dimensional.
The conclusion? If you need fast code, benchmark it yourself on the machine it's going to run on. CPU architecture can completely change the relative performance of each method.
Numbers!
Don't trust me? Run it yourself and verify.
Note: a constant size seems to give jagged arrays an edge, but not a large enough one to change the order in my benchmarks. In some instances I measured a ~7% decrease in performance for jagged arrays when the size came from user input, no difference for single arrays, and a very small difference (~1% or less) for multi-dimensional arrays. The effect is most prominent in the middle range, where jagged arrays take the lead.
Lesson learned: always include the CPU in benchmarks, because it makes a difference. Did it this time? I don't know, but I suspect it might have.
Original answer:
I would like to update on this, because in .NET Core multi-dimensional arrays are faster than jagged arrays. I ran the tests from John Leidegren and these are the results on .NET Core 2.0 preview 2. I increased the dimension value to make any possible influences from background apps less visible.
I looked into the disassembly and this is what I found:
jagged[i][j][k] = i * j * k; needed 34 instructions to execute
multi[i, j, k] = i * j * k; needed 11 instructions to execute
single[i * dim * dim + j * dim + k] = i * j * k; needed 23 instructions to execute
I wasn't able to identify why single-dimensional arrays were still faster than multi-dimensional ones, but my guess is that it has to do with some optimization made on the CPU.
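For context, the three statements compared above can be put side by side in one loop. This is only a sketch with a hypothetical, deliberately small dim; it shows that the three containers address the same logical cell, and says nothing about instruction counts or timings.

```csharp
using System;

const int dim = 4; // hypothetical small size, for illustration only

var jagged = new int[dim][][];
var multi = new int[dim, dim, dim];
var single = new int[dim * dim * dim];

for (int i = 0; i < dim; i++)
{
    jagged[i] = new int[dim][];          // jagged needs each level allocated
    for (int j = 0; j < dim; j++)
    {
        jagged[i][j] = new int[dim];
        for (int k = 0; k < dim; k++)
        {
            jagged[i][j][k] = i * j * k;
            multi[i, j, k] = i * j * k;
            single[i * dim * dim + j * dim + k] = i * j * k;
        }
    }
}

// All three hold the same value at the same logical position.
Console.WriteLine(jagged[3][2][1]);
Console.WriteLine(multi[3, 2, 1]);
Console.WriteLine(single[3 * dim * dim + 2 * dim + 1]);
```

The single array does by hand the index arithmetic that the runtime does for the multidimensional one, while the jagged version pays for extra allocations up front.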
Preface: This comment is intended to address the answer provided by okutane regarding the performance difference between jagged arrays and multidimensional ones.
The assertion that one type is slower than the other because of the method calls isn't correct. One is slower than the other because of more complicated bounds-checking algorithms. You can easily verify this by looking, not at the IL, but at the compiled assembly. For example, on my 4.5 install, accessing an element (via pointer in edx) stored in a two-dimensional array pointed to by ecx with indexes stored in eax and edx looks like so:
Here, you can see that there's no overhead from method calls. The bounds checking is just very convoluted thanks to the possibility of non-zero base indexes, which is a functionality not on offer with jagged arrays. If we remove the sub, cmp, and jmps for the non-zero cases, the code pretty much resolves to (x*y_max+y)*sizeof(ptr)+sizeof(array_header). This calculation is about as fast as anything else for random access to an element (one multiply could be replaced by a shift, since that's the whole reason we choose bytes to be sized as powers of two bits).
Another complication is that there are plenty of cases where a modern compiler will optimize away the nested bounds-checking for element access while iterating over a single-dimension array. The result is code that basically just advances an index pointer over the contiguous memory of the array. Naive iteration over multi-dimensional arrays generally involves an extra layer of nested logic, so a compiler is less likely to optimize the operation. So, even though the bounds-checking overhead of accessing a single element amortizes out to constant runtime with respect to array dimensions and sizes, a simple test case to measure the difference may take many times longer to execute.
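To make the "non-zero base index" case and the x*y_max+y arithmetic concrete, here is a sketch. The array shape, the lower bounds, and the Offset helper are all assumptions chosen for illustration. It relies on the documented Array.CreateInstance overload that takes lengths and lower bounds, and on the fact that enumerating a multidimensional array visits elements in row-major order.

```csharp
using System;
using System.Linq;

// A 2x3 array whose indexes start at 1 in both dimensions (hypothetical shape).
var grid = (int[,])Array.CreateInstance(typeof(int), new[] { 2, 3 }, new[] { 1, 1 });

int lb0 = grid.GetLowerBound(0), lb1 = grid.GetLowerBound(1);
int yMax = grid.GetLength(1);

// Subtract the lower bounds, then x * y_max + y gives the element's
// position inside the contiguous block.
int Offset(int x, int y) => (x - lb0) * yMax + (y - lb1);

grid[2, 3] = 99; // the last element

// Flatten in enumeration (row-major) order and read back via the offset.
int[] flat = grid.Cast<int>().ToArray();
Console.WriteLine(flat[Offset(2, 3)]);

// Index 0 is below the lower bound, so the bounds check rejects it.
bool threw = false;
try { grid[0, 0] = 1; }
catch (IndexOutOfRangeException) { threw = true; }
Console.WriteLine(threw);
```

The subtraction and comparison per dimension are the extra work the answer attributes to bounds checking. A jagged int[][] never needs them because its rows always start at index 0.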
Multidimensional arrays are n-dimensional matrices.
So int[,] square = new int[2,2] is a 2x2 square matrix, and int[,,] cube = new int[3,3,3] is a 3x3x3 cube. The dimensions are not required to be equal in size.
Jagged arrays are just arrays of arrays: an array where each cell contains an array.
So multidimensional arrays (MDA) are rectangular, while jagged arrays (JD) may not be! Each cell can contain an array of arbitrary length!
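The shapes described above can be inspected with the Rank, Length, and GetLength members that every .NET array exposes. The following is a small sketch; the wide and jagged variables are extra examples added here for contrast.

```csharp
using System;

int[,] square = new int[2, 2];
int[,,] cube = new int[3, 3, 3];
int[,] wide = new int[2, 5];      // dimensions need not be equal

// Each cell of a jagged array holds an array of its own length.
int[][] jagged = { new int[1], new int[4], new int[0] };

Console.WriteLine($"square: rank {square.Rank}, {square.Length} elements");
Console.WriteLine($"cube:   rank {cube.Rank}, {cube.Length} elements");
Console.WriteLine($"wide:   {wide.GetLength(0)}x{wide.GetLength(1)}");
Console.WriteLine($"jagged: rank {jagged.Rank}, row lengths " +
                  string.Join(", ", Array.ConvertAll(jagged, r => r.Length)));
```

Note that jagged.Rank is 1: as far as the runtime is concerned it is a one-dimensional array whose elements happen to be arrays.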
This might have been mentioned in the above answers, but not explicitly: with a jagged array you can use array[row] to refer to a whole row of data, but this is not allowed for multidimensional arrays.
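A short sketch of the row-access difference, with invented data: a jagged row is an ordinary int[] that aliases the original storage, while a row of an int[,] has to be copied out element by element.

```csharp
using System;

int[][] jagged = { new[] { 1, 2, 3 }, new[] { 4, 5, 6 } };
int[,] multi = { { 1, 2, 3 }, { 4, 5, 6 } };

// Jagged: a row is an ordinary int[] and can be referenced directly.
int[] row = jagged[1];
row[0] = 40;                      // writes through to jagged[1][0]

// Multidimensional: multi[1] does not compile, so the row is copied out.
int cols = multi.GetLength(1);
int[] copy = new int[cols];
for (int c = 0; c < cols; c++)
    copy[c] = multi[1, c];
copy[0] = 40;                     // multi[1, 0] is unchanged

Console.WriteLine($"{jagged[1][0]} {multi[1, 0]}");
```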
In addition to the other answers, note that a multidimensional array is allocated as one big chunky object on the heap. This has some implications. For instance, you may need to look into <gcAllowVeryLargeObjects> for multidimensional arrays way before the issue would ever come up if you only ever use jagged arrays.
I thought I'd chime in here from the future with some performance results from .NET 5, seeing as that will be the platform everyone uses from now on.
These are the same tests that John Leidegren used (in 2009).
My results (.NET 5.0.1):
Run on a 6-core 3.7GHz AMD Ryzen 1600 machine.
It looks as though the performance ratio is still roughly the same. I'd say that unless you're really optimizing hard, just use multi-dimensional arrays, as the syntax is slightly easier to use.
Jagged arrays are arrays of arrays, that is, arrays in which each row contains an array of its own.
These arrays can have lengths different from those in the other rows.
Declaring and Allocating an Array of Arrays
The only difference between declaring a jagged array and declaring a regular multidimensional array is that we do not have just one pair of brackets. With jagged arrays, we have a pair of brackets per dimension. We allocate them this way:
Initializing an Array of Arrays
Memory Allocation
Jagged arrays are an aggregation of references. A jagged array does not directly contain any arrays, but rather has elements pointing to them. The size of the inner arrays is unknown in advance, which is why the CLR just keeps references to them. After we allocate memory for one array-element of the jagged array, the reference starts pointing to the newly created block in the dynamic memory.
The variable exampleJaggedArray is stored in the execution stack of the program and points to a block in the dynamic memory, which contains a sequence of three references to three other blocks in memory. Each of those blocks contains an array of integer numbers: the elements of the jagged array.
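The allocation and initialization listings are missing from the text above, so the following is a sketch of what they plausibly cover. Only the name exampleJaggedArray and the count of three inner integer arrays come from the answer; the element values and the second array are assumptions.

```csharp
using System;

// Declaration and allocation: only the outer array is created here.
// Its three elements are null references until rows are assigned.
int[][] exampleJaggedArray = new int[3][];
bool rowsStartNull = exampleJaggedArray[0] is null;

// Each row is allocated separately and may have its own length.
exampleJaggedArray[0] = new int[] { 5, 7, 2 };
exampleJaggedArray[1] = new int[] { 10, 20, 40 };
exampleJaggedArray[2] = new int[] { 3, 25 };

// Initialization in one step with an array initializer.
int[][] another =
{
    new int[] { 1 },
    new int[] { 2, 3 },
};

Console.WriteLine(rowsStartNull);
Console.WriteLine($"{exampleJaggedArray.Length} rows, last row has {exampleJaggedArray[2].Length} elements");
Console.WriteLine(another[1][1]);
```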
I am parsing .il files generated by ildasm to build a database of assemblies, classes, methods, and stored procedures for use in a conversion. I came across the following, which broke my parsing.
The book Expert .NET 2.0 IL Assembler by Serge Lidin (Apress, 2006), Chapter 8, Primitive Types and Signatures, pp. 149-150, explains:
<type>[] is termed a vector of <type>.
<type>[<bounds> [<bounds>**] ] is termed an array of <type>.
** means may be repeated, [ ] means optional.
Examples: let <type> = int32.
1) int32[...,...] is a two-dimensional array of undefined lower bounds and sizes.
2) int32[2...5] is a one-dimensional array of lower bound 2 and size 4.
3) int32[0...,0...] is a two-dimensional array of lower bounds 0 and undefined size.
Tom
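From C#, an array shaped like example 2 (lower bound 2, size 4) can be created with Array.CreateInstance. This is a sketch for comparison with the IL notation, not something taken from the book. The variable names are invented.

```csharp
using System;

// One dimension, length 4, lower bound 2: valid indexes are 2, 3, 4, 5.
Array bounded = Array.CreateInstance(typeof(int), new[] { 4 }, new[] { 2 });

for (int i = bounded.GetLowerBound(0); i <= bounded.GetUpperBound(0); i++)
    bounded.SetValue(i * 10, i);

Console.WriteLine($"{bounded.GetLowerBound(0)}..{bounded.GetUpperBound(0)}, length {bounded.Length}");
Console.WriteLine(bounded.GetValue(5));

// A C# int[] is the "vector" case: one dimension, lower bound always 0.
int[] vector = new int[4];
Console.WriteLine(vector.GetLowerBound(0));
```

The bounded array is accessed through GetValue and SetValue here because a one-dimensional array with a non-zero lower bound is not an int[] and cannot be cast to one.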
Using a test based on the one by John Leidegren, I benchmarked the result using .NET 4.7.2, which is the relevant version for my purposes, and thought I could share. I originally started with this comment in the dotnet core GitHub repository.
It appears that the performance varies greatly as the array size changes, at least on my setup: one Xeon processor with 4 physical and 8 logical cores.
w = initialize an array, and put int i * j in it.
wr = do w, then in another loop set int x to [i,j]
As array size grows, multidimensional appears to outperform.
Update: the last two tests use double[,] instead of int[,]. The difference appears significant considering the errors. With int, the ratio of means for jagged vs. md is between 1.53x and 1.86x; with doubles it is between 1.88x and 2.42x.
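The w and wr tests are only described in words above. A rough sketch of what they could look like follows; the function names, the size n, and the use of Stopwatch are assumptions, and a single timed run like this is far noisier than a proper benchmark harness.

```csharp
using System;
using System.Diagnostics;

// "w": allocate the array and store i * j in every cell.
static int[,] WriteMulti(int size)
{
    var a = new int[size, size];
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            a[i, j] = i * j;
    return a;
}

static int[][] WriteJagged(int size)
{
    var a = new int[size][];
    for (int i = 0; i < size; i++)
    {
        a[i] = new int[size];
        for (int j = 0; j < size; j++)
            a[i][j] = i * j;
    }
    return a;
}

// "wr": do "w", then read every cell back in a second loop.
static long WriteReadMulti(int size)
{
    var a = WriteMulti(size);
    long sum = 0;
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            sum += a[i, j];
    return sum;
}

static long WriteReadJagged(int size)
{
    var a = WriteJagged(size);
    long sum = 0;
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            sum += a[i][j];
    return sum;
}

int n = 200; // hypothetical size; vary it, since the ranking depends on it

var sw = Stopwatch.StartNew();
long multiSum = WriteReadMulti(n);
TimeSpan multiTime = sw.Elapsed;

sw.Restart();
long jaggedSum = WriteReadJagged(n);
TimeSpan jaggedTime = sw.Elapsed;

Console.WriteLine($"multi:  {multiTime.TotalMilliseconds} ms (sum {multiSum})");
Console.WriteLine($"jagged: {jaggedTime.TotalMilliseconds} ms (sum {jaggedSum})");
```

The sums act as a sanity check that both versions did the same work and keep the read loops from being discarded as dead code.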
The main difference is in their structure: jagged arrays are arrays of arrays whose rows can have different lengths, while multidimensional arrays have a fixed length for each dimension.
Jagged arrays are faster than multidimensional arrays. However, multidimensional arrays have a cleaner syntax.
I have found a very good article on arrays here