Does unsetting array values during iteration save memory?

Posted 2024-10-11 21:07:21 | 924 characters | 2 views | 0 comments

This is a simple programming question that comes from my lack of knowledge of how PHP handles array copying and unsetting during a foreach loop. I have an array that comes to me from an outside source, formatted in a way I want to change. A simple example would be:

$myData = array('Key1' => array('value1', 'value2'));

But what I want would be something like:

$myData = array(0 => array('MyKey' => array('Key1' => array('value1', 'value2'))));

So I take the first $myData and format it like the second $myData. I'm totally fine with my formatting algorithm. My question lies in finding a way to conserve memory since these arrays might get a little unwieldy. So, during my foreach loop I copy the current array value(s) into the new format, then I unset the value I'm working with from the original array. E.g.:

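$formattedData = array();
foreach ($myData as $key => $val) {
    // do some formatting here, copy to $reformattedVal
    // (illustrative placeholder that produces the target shape shown above)
    $reformattedVal = array('MyKey' => array($key => $val));

    $formattedData[] = $reformattedVal;

    // drop the original entry once it has been copied
    unset($myData[$key]);
}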

Is the call to unset() a good idea here? I.e., does it conserve memory since I have copied the data and no longer need the original value? Or, does PHP automatically garbage collect the data since I don't reference it in any subsequent code?

The code runs fine, and so far my datasets have been too negligible in size to test for performance differences. I just don't know if I'm setting myself up for some weird bugs or CPU hits later on.

Thanks for any insights.
-sR

Comments (5)

看春风乍起 2024-10-18 21:07:21

I was running out of memory while processing lines of a text (xml) file within a loop. For anyone with a similar situation, this worked for me:

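while ($xml_data) {                 // loop until the array is empty, even if an element is falsy
    $data = array_pop($xml_data);   // takes the last element, so the array shrinks as we go (reverse order)
    // process $data
}

撧情箌佬 2024-10-18 21:07:21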

Use a reference to the variable in the foreach loop using the & operator. This avoids making a copy of the array in memory for foreach to iterate over.

Edit: as Artefacto pointed out, unsetting the variable only decreases the number of references to the original value, so the memory saved is only that of the pointers, not of the value itself. Bizarrely, using a reference actually increases total memory usage, presumably because the value is copied to a new memory location instead of being referenced.

"Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself. foreach has some side effects on the array pointer. Don't rely on the array pointer during or after the foreach without resetting it."

Use memory_get_usage() to identify how much memory you are using.

There is a good write up on memory usage and allocation here.

This is useful test code to see memory allocation - try uncommenting the commented lines to see total memory usage in different scenarios.

echo memory_get_usage() . PHP_EOL;
$test = $testCopy = array();
$i = 0;
while ($i++ < 100000) {
    $test[] = $i;
}
echo memory_get_usage() . PHP_EOL;
foreach ($test as $k => $v) {
//foreach ($test as $k => &$v) {
    $testCopy[$k] = $v;
    //unset($test[$k]);
}
echo memory_get_usage() . PHP_EOL;

っ左 2024-10-18 21:07:21

Please remember the rules of Optimization Club:

  1. The first rule of Optimization Club is, you do not Optimize.
  2. The second rule of Optimization Club is, you do not Optimize without measuring.
  3. If your app is running faster than the underlying transport protocol, the optimization is over.
  4. One factor at a time.
  5. No marketroids, no marketroid schedules.
  6. Testing will go on as long as it has to.
  7. If this is your first night at Optimization Club, you have to write a test case.

Rules #1 and #2 are especially relevant here. Unless you know that you need to optimize, and unless you have measured that need, don't do it. Adding the unset will add a run-time hit and will make future programmers wonder why you are doing it.

Leave it alone.
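
For anyone who does want to measure first, here is a minimal sketch of such a test case. The helper names `measure()` and `reformat()`, the wrapping format, and the data size are assumptions made up for illustration, not code from the question; it also assumes PHP 7.4 or later (arrow functions, `hrtime()`). It runs the same loop with and without the `unset()` and prints the elapsed time and the memory still in use afterwards, so the two variants can be compared on your own data.

```php
<?php
// Hypothetical helper (not from the thread): run $fn and report elapsed
// milliseconds plus the change in memory in use once it has returned.
function measure(callable $fn): array
{
    $memBefore = memory_get_usage();
    $start = hrtime(true);                 // nanoseconds
    $result = $fn();
    $ms = (hrtime(true) - $start) / 1e6;   // convert to milliseconds

    return [
        'ms'    => $ms,
        'bytes' => memory_get_usage() - $memBefore, // $result is still alive here
        'count' => count($result),
    ];
}

// Hypothetical version of the question's loop; the wrapping is illustrative.
function reformat(array $myData, bool $unsetAsYouGo): array
{
    $formattedData = [];
    foreach ($myData as $key => $val) {
        $formattedData[] = ['MyKey' => [$key => $val]];
        if ($unsetAsYouGo) {
            unset($myData[$key]);
        }
    }
    return $formattedData;
}

// Build some made-up input at run time.
$source = [];
for ($i = 0; $i < 50000; $i++) {
    $source["Key$i"] = ["value$i", "other$i"];
}

$plain     = measure(fn() => reformat($source, false));
$withUnset = measure(fn() => reformat($source, true));

printf("without unset: %.2f ms, %d bytes\n", $plain['ms'], $plain['bytes']);
printf("with unset:    %.2f ms, %d bytes\n", $withUnset['ms'], $withUnset['bytes']);
```

The numbers depend on the PHP version and the shape of the data, so treat the output as a starting point for your own comparison, not as a general result.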

死开点丶别碍眼 2024-10-18 21:07:21

If at any point in the "formatting" you do something like:

$reformattedVal['a']['b'] = $myData[$key];

Then doing unset($myData[$key]); is irrelevant memory-wise because you are only decreasing the reference count of the variable, which now exists in two places (inside $myData[$key] and $reformattedVal['a']['b']). Actually, you save the memory of indexing the variable inside the original array, but that's almost nothing.
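
A minimal sketch of that effect, assuming a large value built at run time with `range()` (the size and the variable names other than `$myData` and `$reformattedVal` are made up for illustration). It records `memory_get_usage()` after the assignment, after the `unset()`, and after the last reference is dropped.

```php
<?php
// Build a large value at run time so it is a real heap allocation.
$myData = ['Key1' => range(1, 100000)];
$start = memory_get_usage();

// The assignment only adds a second reference to the same underlying array.
$reformattedVal = [];
$reformattedVal['a']['b'] = $myData['Key1'];
$afterCopy = memory_get_usage();

// Removes one reference; the data stays alive through $reformattedVal.
unset($myData['Key1']);
$afterUnset = memory_get_usage();

// Dropping the last reference is what actually releases the memory.
unset($reformattedVal);
$afterRelease = memory_get_usage();

printf("after copy:    %+d bytes\n", $afterCopy - $start);
printf("after unset:   %+d bytes\n", $afterUnset - $afterCopy);
printf("after release: %+d bytes\n", $afterRelease - $afterUnset);
```

On a typical build the first two deltas should be tiny compared with the last one, which is where the array itself is freed.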

小霸王臭丫头 2024-10-18 21:07:21

Unless you're accessing the element by reference, unsetting will do nothing whatsoever, as you can't alter the array from within the iterator.

That said, it's generally considered bad practice to modify the collection you're iterating over - a better approach would be to break down the source array into smaller chunks (by only loading a portion of the source data at a time) and process these, unsetting each entire array "chunk" as you go.
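
A rough sketch of the chunked approach, assuming the source can be read a portion at a time. The generator `loadInChunks()` is a hypothetical stand-in that fabricates data; in practice it would read from the file, API, or database the array comes from. The wrapping format is likewise only illustrative.

```php
<?php
// Hypothetical stand-in for an external source that delivers data in portions.
function loadInChunks(int $total, int $chunkSize): Generator
{
    for ($offset = 0; $offset < $total; $offset += $chunkSize) {
        $chunk = [];
        $end = min($offset + $chunkSize, $total);
        for ($i = $offset; $i < $end; $i++) {
            $chunk["Key$i"] = ["value$i"];
        }
        yield $chunk; // only one chunk is built at a time
    }
}

$processed = 0;
foreach (loadInChunks(10, 4) as $chunk) {
    foreach ($chunk as $key => $val) {
        $reformattedVal = ['MyKey' => [$key => $val]];
        // ... hand $reformattedVal to its consumer (write it out, send it on, etc.)
        $processed++;
    }
    // Drop our reference to the whole chunk; the generator drops its own
    // copy when it starts building the next one.
    unset($chunk);
}

echo $processed . PHP_EOL;
```

This only helps if each reformatted chunk is consumed as it is produced; collecting every result into one big array would bring the memory use back up.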
