Does unsetting array values during iteration save memory?
This is a simple programming question, coming from my lack of knowledge of how PHP handles array copying and unsetting during a foreach loop. I have an array that comes to me from an outside source, formatted in a way I want to change. A simple example would be:
$myData = array('Key1' => array('value1', 'value2'));
But what I want would be something like:
$myData = array(0 => array('MyKey' => array('Key1' => array('value1', 'value2'))));
So I take the first $myData and format it like the second $myData. I'm totally fine with my formatting algorithm. My question lies in finding a way to conserve memory, since these arrays might get a little unwieldy. So, during my foreach loop I copy the current array value(s) into the new format, then I unset the value I'm working with from the original array. E.g.:
$formattedData = array();
foreach ($myData as $key => $val) {
    // do some formatting here, copy to $reformattedVal
    $formattedData[] = $reformattedVal;
    unset($myData[$key]);
}
Is the call to unset() a good idea here? I.e., does it conserve memory, since I have copied the data and no longer need the original value? Or does PHP automatically garbage collect the data, since I don't reference it in any subsequent code?
The code runs fine, and so far my datasets have been too small to test for performance differences. I just don't know if I'm setting myself up for some weird bugs or CPU hits later on.
Thanks for any insights.
-sR
Comments (5)
I was running out of memory while processing lines of a text (xml) file within a loop. For anyone with a similar situation, this worked for me:
Use a reference to the variable in the foreach loop using the & operator. This avoids making a copy of the array in memory for foreach to iterate over.
Edit: as pointed out by Artefacto, unsetting the variable only decreases the number of references to the original variable, so the memory saved is only on pointers rather than on the value of the variable. Bizarrely, using a reference actually increases the total memory usage, as presumably the value is copied to a new memory location instead of being referenced.
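As a minimal sketch of the by-reference loop described above: the sample data and the 'MyKey' wrapper come from the question, while rewriting each element in place and re-indexing with array_values() afterwards are assumptions made for illustration, not the answerer's code. Whether this actually lowers memory use should be measured rather than assumed.

```php
<?php
// Sample data in the shape used in the question.
$myData = [
    'Key1' => ['value1', 'value2'],
    'Key2' => ['value3'],
];

// Iterate by reference so each element can be rewritten in place.
foreach ($myData as $key => &$val) {
    $val = ['MyKey' => [$key => $val]];
}
// Break the reference; otherwise $val stays bound to the last element.
unset($val);

// Re-index with numeric keys to match the target format (illustrative step).
$formattedData = array_values($myData);
```

The unset($val) after the loop matters: without it, a later assignment to $val would silently overwrite the last array element.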
Use memory_get_usage() to identify how much memory you are using. There is a good write-up on memory usage and allocation here.
This is useful test code to see memory allocation - try uncommenting the commented lines to see total memory usage in different scenarios.
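The answerer's test code is not reproduced here. A rough sketch of such a measurement might look like the following; buildSampleData() and runScenario() are hypothetical helper names, the data sizes are arbitrary, and the byte counts it prints will vary with the PHP version and configuration.

```php
<?php
// Hypothetical helper: builds sample data shaped like the question's array.
function buildSampleData(int $count): array
{
    $data = [];
    for ($i = 0; $i < $count; $i++) {
        // str_repeat() is used so the values are built at run time.
        $data["Key$i"] = [str_repeat('a', 64), str_repeat('b', 64)];
    }
    return $data;
}

// Hypothetical helper: runs the loop from the question and reports the
// change in memory_get_usage() across it.
function runScenario(bool $unsetAsYouGo, int $count = 10000): array
{
    $myData = buildSampleData($count);
    $before = memory_get_usage();

    $formattedData = [];
    foreach ($myData as $key => $val) {
        $formattedData[] = ['MyKey' => [$key => $val]];
        if ($unsetAsYouGo) {
            unset($myData[$key]);
        }
    }

    return [
        'formatted' => count($formattedData),
        'remaining' => count($myData),
        'bytes'     => memory_get_usage() - $before,
    ];
}

// Compare the loop without and with unset().
foreach ([false, true] as $unsetAsYouGo) {
    $r = runScenario($unsetAsYouGo);
    printf(
        "unset=%s formatted=%d remaining=%d delta=%d bytes\n",
        $unsetAsYouGo ? 'yes' : 'no',
        $r['formatted'],
        $r['remaining'],
        $r['bytes']
    );
}
```

Comparing the two printed deltas on the PHP version you actually deploy is more reliable than reasoning about what unset() ought to do.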
Please remember the rules of Optimization Club:
Rules #1 and #2 are especially relevant here. Unless you know that you need to optimize, and unless you have measured that need, don't do it. Adding the unset will add a run-time hit and will make future programmers wonder why you are doing it.
Leave it alone.
If at any point in the "formatting" you do something like:
$reformattedVal['a']['b'] = $myData[$key];
Then doing
unset($myData[$key]);
is irrelevant memory-wise, because you are only decreasing the reference count of the variable, which now exists in two places (inside $myData[$key] and $reformattedVal['a']['b']). Actually, you save the memory of indexing the variable inside the original array, but that's almost nothing.
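The reference-counting point above can be checked with a small sketch. The 1 MiB string is an arbitrary size chosen so that a real copy or a real free would be obvious in the numbers; the variable names $beforeCopy, $afterCopy and $afterUnset are illustrative.

```php
<?php
// Sample data: one entry holding a deliberately large (1 MiB) string.
$myData = ['Key1' => [str_repeat('x', 1024 * 1024), 'value2']];
$key = 'Key1';

$beforeCopy = memory_get_usage();

// The "formatting" step: the value is now reachable from two places.
$reformattedVal = [];
$reformattedVal['a']['b'] = $myData[$key];
$afterCopy = memory_get_usage();

// Removing the original entry only drops one of the two references.
unset($myData[$key]);
$afterUnset = memory_get_usage();

printf("assignment: %+d bytes\n", $afterCopy - $beforeCopy);
printf("unset:      %+d bytes\n", $afterUnset - $afterCopy);
```

Both deltas should be far smaller than the 1 MiB payload, since the assignment shares the value instead of duplicating it and the unset cannot free a value that is still referenced elsewhere.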
Unless you're accessing the element by reference, unsetting will do nothing whatsoever, as you can't alter the array from within the iterator.
That said, it's generally considered bad practice to modify the collection you're iterating over - a better approach would be to break down the source array into smaller chunks (by only loading a portion of the source data at a time) and process these, unsetting each entire array "chunk" as you go.
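One way to sketch the chunked approach suggested above is with a generator. loadChunks() is a hypothetical helper, and the in-memory $source array is only a stand-in for an external source that would be read piece by piece in practice.

```php
<?php
// Hypothetical helper: yields the source in chunks of at most $chunkSize
// entries, so only one chunk has to be held in memory at a time.
function loadChunks(iterable $source, int $chunkSize): Generator
{
    $chunk = [];
    foreach ($source as $key => $val) {
        $chunk[$key] = $val;
        if (count($chunk) === $chunkSize) {
            yield $chunk;
            $chunk = [];
        }
    }
    if ($chunk !== []) {
        yield $chunk;
    }
}

// Stand-in for the external source; a real one might read a file or a
// database cursor instead of an in-memory array.
$source = [
    'Key1' => ['value1', 'value2'],
    'Key2' => ['value3'],
    'Key3' => ['value4'],
];

$formattedData = [];
$chunkSizes = [];
foreach (loadChunks($source, 2) as $chunk) {
    $chunkSizes[] = count($chunk);
    foreach ($chunk as $key => $val) {
        $formattedData[] = ['MyKey' => [$key => $val]];
    }
    // Drop the whole chunk at once instead of unsetting element by element.
    unset($chunk);
}
```

The benefit only materializes when the source itself is streamed; chunking an array that is already fully loaded does not reduce its footprint.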