Performance comparison: calling explode() in the foreach() signature vs. passing the exploded data to foreach() as a variable

Posted on 2024-11-05 06:43:02

foreach (explode(',', $foo) as $bar) { ... }

vs

$test = explode(',', $foo);
foreach($test as $bar) { ... }

In the first example, does it explode the $foo string for each iteration or does PHP keep it in memory exploded in its own temporary variable? From an efficiency point of view, does it make sense to create the extra variable $test or are both pretty much equal?

Comments (4)

夕色琉璃 2024-11-12 06:43:02

I could make an educated guess, but let's try it out!

I figured there were three main ways to approach this.

  1. explode and assign before entering the loop
  2. explode within the loop, no assignment
  3. string tokenize

My hypotheses:

  1. probably consume more memory due to assignment
  2. probably identical to #1 or #3, not sure which
  3. probably both quicker and much smaller memory footprint

Approach

Here's my test script:

<?php

ini_set('memory_limit', '1024M');

$listStr = 'text';
$listStr .= str_repeat(',text', 9999999);

$timeStart = microtime(true);

/*****
 * {INSERT LOOP HERE}
 */

$timeEnd = microtime(true);
$timeElapsed = $timeEnd - $timeStart;

printf("Memory used: %s kB\n", memory_get_peak_usage()/1024);
printf("Total time: %s s\n", $timeElapsed);

And here are the three versions:

1)

// explode separately 
$arr = explode(',', $listStr);
foreach ($arr as $val) {}

2)

// explode inline-ly 
foreach (explode(',', $listStr) as $val) {}

3)

// tokenize
$tok = strtok($listStr, ',');
while ($tok !== false) { $tok = strtok(','); } // strict check, so a falsy token such as "0" does not end the loop early
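
For anyone who wants to rerun the comparison without editing the script three times, here is a minimal sketch that times all three variants in one run. The helper `runVariant()` and the variant labels are hypothetical names introduced for illustration, and the list is shortened to 100,000 items. Since `memory_get_peak_usage()` reports a process-wide peak, this sketch compares time only; memory still has to be measured one variant per run, as in the script above.

```php
<?php
// Hypothetical helper (not part of the original script): runs one loop
// variant, and returns its label, the number of items seen, and the time taken.
function runVariant(string $name, callable $loop): array
{
    $start = microtime(true);
    $count = $loop();
    return ['name' => $name, 'count' => $count, 'seconds' => microtime(true) - $start];
}

// Assumption: a smaller list than the original benchmark (100,000 items).
$listStr = 'text' . str_repeat(',text', 99999);

$results = [
    runVariant('explode separately', function () use ($listStr) {
        $n = 0;
        $arr = explode(',', $listStr);
        foreach ($arr as $val) { $n++; }
        return $n;
    }),
    runVariant('explode inline', function () use ($listStr) {
        $n = 0;
        foreach (explode(',', $listStr) as $val) { $n++; }
        return $n;
    }),
    runVariant('strtok', function () use ($listStr) {
        $n = 0;
        // Strict comparison so that a falsy token would not stop the loop.
        for ($tok = strtok($listStr, ','); $tok !== false; $tok = strtok(',')) { $n++; }
        return $n;
    }),
];

foreach ($results as $r) {
    printf("%-20s %d items in %.4f s\n", $r['name'], $r['count'], $r['seconds']);
}
```

Each variant counts its items so that the loop bodies do comparable work and the counts can be checked against each other.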

Results

[Figure: explode() benchmark results]

Conclusions

Looks like some assumptions were disproven. Don't you love science? :-)

  • In the big picture, any of these methods is sufficiently fast for a list of "reasonable size" (few hundred or few thousand).
  • If you're iterating over something huge, the time difference is relatively minor, but memory usage can differ by an order of magnitude!
  • When you explode() inline without pre-assignment, it's a fair bit slower for some reason.
  • Surprisingly, tokenizing is a bit slower than explicitly iterating a declared array. Working on such a small scale, I believe that's due to the call stack overhead of making a function call to strtok() every iteration. More on this below.

In terms of the number of function calls, explode() clearly beats tokenizing: one call in total, O(1), versus one strtok() call per token, O(n).
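
If the token-by-token approach is still attractive for its memory profile, one way to keep the loop readable is to wrap strtok() in a generator. This is only a sketch: `tokens()` is a hypothetical helper, not something from the benchmark. Two caveats are worth knowing. strtok() skips empty tokens, so its output can differ from explode() when delimiters are adjacent, and it keeps a single internal state, so two such generators should not be interleaved.

```php
<?php
// Hypothetical helper: yields one token at a time instead of building
// the whole array up front, at the cost of one strtok() call per token.
function tokens(string $str, string $delim): Generator
{
    for ($tok = strtok($str, $delim); $tok !== false; $tok = strtok($delim)) {
        yield $tok;
    }
}

$seen = [];
foreach (tokens('a,0,,b', ',') as $tok) {
    $seen[] = $tok;
}

// The "0" token is kept because of the strict !== false check;
// the empty token between the two adjacent commas is skipped by strtok().
print_r($seen);
```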

I added a bonus to the chart where I run method 1) with a function call in the loop. I used strlen($val), thinking it would be a relatively similar execution time. That's subject to debate, but I was only trying to make a general point. (I only ran strlen($val) and ignored its output. I did not assign it to anything, for an assignment would be an additional time-cost.)

// explode separately 
$arr = explode(',', $listStr);
foreach ($arr as $val) {strlen($val);}

As you can see from the results table, it then becomes the slowest method of the three.

Final thought

This is interesting to know, but my suggestion is to do whatever you feel is most readable/maintainable. Only if you're really dealing with a significantly large dataset should you be worried about these micro-optimizations.

陌上青苔 2024-11-12 06:43:02

In the first case, PHP explodes it once and keeps it in memory.

The impact of creating a separate variable versus exploding inline is negligible. Either way, the PHP interpreter has to keep a pointer to the next item, whether or not the array is bound to a user-defined variable.
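
The "explodes it once" claim is easy to check with a small sketch. `countingExplode()` is a hypothetical wrapper introduced here only to count how often the expression in the foreach signature is evaluated.

```php
<?php
// Hypothetical wrapper: behaves like explode() but increments a counter
// every time it is called.
function countingExplode(string $delim, string $str, int &$calls): array
{
    $calls++;
    return explode($delim, $str);
}

$calls = 0;
$iterations = 0;
foreach (countingExplode(',', 'a,b,c,d', $calls) as $bar) {
    $iterations++;
}

printf("explode() calls: %d, iterations: %d\n", $calls, $iterations);
```

The loop body runs four times while the wrapper runs once, which shows that the expression is evaluated before iteration starts rather than on every pass.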

初见 2024-11-12 06:43:02

From a memory point of view it will not make a difference, because PHP uses copy-on-write.

Apart from that, I personally would opt for the first option - it's a line less, but not less readable (imho!).
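
A rough way to see copy-on-write in action is to compare memory usage before and after an assignment and then after the first write. This is a sketch under assumptions: `copyOnWriteDemo()` is a hypothetical function, and the exact byte counts depend on the PHP version and build, so only the relative sizes matter.

```php
<?php
// Hypothetical demo function: returns the memory cost of assigning an
// array to a second variable, and the cost of the first write to it.
function copyOnWriteDemo(): array
{
    $a = range(1, 100000);

    $before = memory_get_usage();
    $b = $a;                          // no copy yet: both names share one array
    $afterAssign = memory_get_usage();

    $b[] = 0;                         // first write forces PHP to duplicate the array
    $afterWrite = memory_get_usage();

    return [$afterAssign - $before, $afterWrite - $afterAssign];
}

[$assignCost, $writeCost] = copyOnWriteDemo();
printf("assignment: %d bytes, first write: %d bytes\n", $assignCost, $writeCost);
```

Neither variant in the question writes to the array, so naming the result `$test` should not trigger a duplication.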

(り薆情海 2024-11-12 06:43:02

Efficiency in what sense: memory management or processor time? For the processor it wouldn't make a difference. For memory, you can always reuse the variable: $foo = explode(',', $foo)
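
A minimal sketch of that suggestion, assuming the original string is not needed after the loop. The `unset()` at the end is an optional addition that is not part of the answer.

```php
<?php
$foo = 'a,b,c';

// Overwriting $foo lets PHP release the original string,
// provided nothing else still refers to it.
$foo = explode(',', $foo);

$count = 0;
foreach ($foo as $bar) {
    $count++;
}

// Optional: drop the array as well once the loop is done.
unset($foo);

printf("items: %d\n", $count);
```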
