如何生成具有不同“分支”的二维数组长度非常快

发布于 2024-10-27 16:21:39 字数 404 浏览 4 评论 0原文

我是一名德尔福程序员。 在程序中,我必须生成具有不同“分支”长度的二维数组。 它们非常大,操作需要几秒钟(烦人)。

例如:

var a: array of array of Word;
  i: Integer;

begin
   SetLength(a, 5000000);
   for i := 0 to 4999999 do
      SetLength(a[i], Diff_Values);
end;

我知道命令 SetLength(a, dim1, dim2) 但不适用。甚至没有为dim2设置最小值(> 0)并从那里继续,因为dim2的最小值为0(某些“分支”可以为空)。

那么,有没有办法让它快一点呢?不只是 5..10%,而且真的很快...

谢谢。

I am a Delphi programmer.
In a program I have to generate bidimensional arrays with different "branch" lengths.
They are very big and the operation takes a few seconds (annoying).

For example:

var a: array of array of Word;
  i: Integer;

begin
   SetLength(a, 5000000);
   for i := 0 to 4999999 do
      SetLength(a[i], Diff_Values);
end;

I am aware of the command SetLength(a, dim1, dim2) but is not applicable. Not even setting a min value (> 0) for dim2 and continuing from there because min of dim2 is 0 (some "branches" can be empty).

So, is there a way to make it fast? Not just by 5..10% but really FAST...

Thank you.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

如果没有 2024-11-03 16:21:39

处理大量数据时,需要完成大量工作,这对完成这些工作所需的时间设置了理论上的最小值。

500 万次迭代中的每一次,您需要:

  • 对于 以某种方式
  • 从内存管理器中分配一个适当大小的新数组
  • 将新数组使用的所有内存清零(SetLength 自动为您执行此操作)

步骤 1 完全在您的控制之下,并且可以进行优化。不过,如果您使用的是现代版本的 Delphi,则 2 和 3 的速度大约是最快的。 (如果您使用的是旧版本,您可能会受益于安装 FastMM 和 FastCode,它们可以加快这些操作的速度。)

如果合适的话,您可以做的另一件事是延迟初始化。不要尝试一次分配所有 500 万个数组,只需首先执行 SetLength(a, 5000000); 即可。然后,当你需要到达一个“分支”时,首先检查它的长度是否= 0。如果是,则它尚未初始化,因此将其初始化为适当的长度。这并不能节省总体时间,事实上,它总共花费的时间会稍长一些,但它确实分散了初始化时间,因此用户不会注意到。

如果你的初始化已经尽可能快,并且你的情况是这里不能使用延迟初始化,那么你基本上就不走运了。这就是处理大量数据的代价。

When dealing with a large amount of data, there's a lot of work that has to be done, and this places a theoretical minimum on the amount of time it can be done in.

For each of 5 million iterations, you need to:

  • Determine the size of the "branch" somehow
  • Allocate a new array of the appropriate size from the memory manager
  • Zero out all the memory used by the new array (SetLength does this for you automatically)

Step 1 is completely under your control and can possibly be optimized. 2 and 3, though, are about as fast as they're gonna get if you're using a modern version of Delphi. (If you're on an old version, you might benefit from installing FastMM and FastCode, which can speed up these operations.)

The other thing you might do, if appropriate, is lazy initialization. Instead of trying to allocate all 5 million arrays at once, just do the SetLength(a, 5000000); at first. Then when you need to get at a "branch", first check if its length = 0. If so, it hasn't been initialized, so initialize it to the proper length. This doesn't save time overall, in fact it will take slightly longer in total, but it does spread out the initialization time so the user doesn't notice.

If your initialization is already as fast as it will get, and your situation is such that lazy initialization can't be used here, then you're basically out of luck. That's the price of dealing with large amounts of data.

后eg是否自 2024-11-03 16:21:39

我刚刚测试了您的确切代码,使用 Diff_Values 常量,使用 GetTickCount() 对其进行计时以进行基本计时。如果 Diff_Values186,则需要 1466 毫秒,如果 Diff_Values187,则失败并显示 内存不足。您知道,内存不足 意味着地址空间不足,而不是真正的内存不足

在我看来,您分配的数据太多,导致 RAM 耗尽,Windows 开始分页,这就是速度缓慢的原因。在我的系统上,我有足够的 RAM 供进程分配所需的内存;确实如此,直到失败。

可能的解决方案

  • 最明显的一个是:不要分配那么多!
  • 找出一种将所有数据分配到一个连续的内存块中的方法:有助于解决地址空间碎片问题。与“分支”上具有固定大小的二维数组的分配方式类似,但如果您的“分支”具有不同的大小,则需要根据您的数据计算不同的数学公式。
  • 查看其他数据结构,可能是在磁盘上缓存的数据结构(以打破 2Gb 地址空间限制)。

I just tested your exact code, with a constant for Diff_Values, timed it using GetTickCount() for rudimentary timing. If Diff_Values is 186 it takes 1466 milliseconds, if Diff_Values is 187 it fails with Out of Memory. You know, Out of Memory means Out of Address Space, not really Out of Memory.

In my opinion you're allocating so much data you run out of RAM and Windows starts paging, that's why it's slow. On my system I've got enough RAM for the process to allocate as much as it wants; And it does, until it fails.

Possible solutions

  • The obvious one: Don't allocate that much!
  • Figure out a way to allocate all data into one contiguous block of memory: helps with address space fragmentation. Similar to how a bi dimensional array with fixed size on the "branches" is allocated, but if your "branches" have different sizes, you'll need to figure a different mathematical formula, based on your data.
  • Look into other data structures, possibly ones that cache on disk (to brake the 2Gb address space limit).
撩人痒 2024-11-03 16:21:39

除了梅森的观点之外,这里还有一些需要考虑的想法:

如果分支长度在分配后永远不会改变,并且您对将存储在所有分支的数组中的项目总数有一个上限,那么您通过分配一大块内存并自己分配该内存块中的“分支”,也许可以节省一些时间。您的数组将成为一维指针数组,该数组中的每个条目都指向该分支的数据的开头。您可以使用单个指针变量跟踪大块中已用空间的“结束”,并且当您需要为新“分支”保留空间时,您可以将当前“结束”指针值作为新“分支”的开始分支并将“结束”指针增加分支所需的空间量。不要忘记四舍五入到双字边界以避免错位惩罚。

这种技术需要更多地使用指针,但它提供了消除所有堆分配开销的潜力,或者至少用与您的特定使用模式相匹配的专门构建的非常简单、非常快速的子分配器来替换通用堆分配。它的执行速度应该更快,但需要更多的时间来编写和测试。

此技术还将避免堆碎片,并将所有内存的释放减少到一次释放(而不是当前模型中的数百万次单独分配)。

另一个需要考虑的提示:如果您对每个新分配的数组“分支”所做的第一件事是将数据分配到每个槽中,那么您可以消除 Mason 示例中的步骤 3 - 如果全部都将内存清零,则无需将内存清零您要做的就是立即将真实数据分配给它。这将使您的内存写入操作减少一半。

In addition to Mason's points, here are some more ideas to consider:

If the branch lengths never change after they are allocated, and you have an upper bound on the total number of items that will be stored in the array across all branches, then you might be able to save some time by allocating one huge chunk of memory and divvying up the "branches" within that chunk yourself. Your array would become a 1 dimensional array of pointers, and each entry in that array points to the start of the data for that branch. You keep track of the "end" of the used space in your big block with a single pointer variable, and when you need to reserve space for a new "branch" you take the current "end" pointer value as the start of the new branch and increment the "end" pointer by the amount of space that branch requires. Don't forget to round up to dword boundaries to avoid misalignment penalties.

This technique will require more use of pointers, but it offers the potential of eliminating all the heap allocation overhead, or at least replacing the general purpose heap allocation with a purpose-built very simple, very fast suballocator that matches your specific use pattern. It should be faster to execute, but it will require more time to write and test.

This technique will also avoid heap fragmentation and reduces the releasing of all the memory to a single deallocation (instead of millions of separate allocations in your present model).

Another tip to consider: If the first thing you always do with the each newly allocated array "branch" is assign data into every slot, then you can eliminate step 3 in Mason's example - you don't need to zero out the memory if all you're going to do is immediately assign real data into it. This will cut your memory write operations by half.

蓝梦月影 2024-11-03 16:21:39

假设您可以将整个数据结构放入连续的内存块中,则可以一次性完成分配,然后接管索引。

注意:即使您无法将数据放入单个连续的内存块中,您仍然可以通过分配多个大块然后将它们拼凑在一起来使用此技术。

首先形成一个辅助数组 colIndex,它包含每行第一列的索引。将 colIndex 的长度设置为 RowCount+1。您可以通过设置 colIndex[0] := 0 然后设置 colIndex[i+1] := colIndex[i] + ColCount[i] 来构建它。在 for 循环中执行此操作,该循环运行到并包括 RowCount。因此,在最后一个条目 colIndex[RowCount] 中,您存储元素的总数。

现在将 a 的长度设置为 colIndex[RowCount]。这可能需要一些时间,但会比您之前所做的更快。

现在您需要编写几个索引器。将它们放入一个类或一个记录中。

getter 看起来像这样:

 function GetItem(row, col: Integer): Word;
 begin
   Result := a[colIndex[row]+col];
 end;

setter 很明显。您可以内联这些访问方法以提高性能。将它们公开为索引属性以方便对象的客户端。

您需要添加一些代码来检查 rowcol 的有效性。对于后者,您需要使用 colIndex 。如果您想模仿本机索引的范围检查,可以使用 {$IFOPT R+} 将此检查设置为可选。

当然,如果您想在初始实例化后更改任何列计数,那么这是完全行不通的!

Assuming you can fit the entire data structure into a contiguous block of memory, you can do the allocation in one shot and then take over the indexing.

Note: Even if you can't fit the data into a single contiguous block of memory, you can still use this technique by allocating multiple large blocks and then piecing them together.

First off form a helper array, colIndex, which is to contain the index of the first column of each row. Set the length of colIndex to RowCount+1. You build this by setting colIndex[0] := 0 and then colIndex[i+1] := colIndex[i] + ColCount[i]. Do this in a for loop which runs up to and including RowCount. So, in the final entry, colIndex[RowCount], you store the total number of elements.

Now set the length of a to be colIndex[RowCount]. This may take a little while, but it will be quicker than what you were doing before.

Now you need to write a couple of indexers. Put them in a class or a record.

The getter looks like this:

 function GetItem(row, col: Integer): Word;
 begin
   Result := a[colIndex[row]+col];
 end;

The setter is obvious. You can inline these access methods for increased performance. Expose them as an indexed property for convenience to the object's clients.

You'll want to add some code to check for validity of row and col. You need to use colIndex for the latter. You can make this checking optional with {$IFOPT R+} if you want to mimic range checking for native indexing.

Of course, this is a total non-starter if you want to change any of your column counts after the initial instantiation!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文