C++映射查找性能与 PHP 数组查找性能
我无法理解以下内容,我希望有人能为我解释一下:
在 C++ 中,如果我创建一个包含 2M 不同文本位(测试数据)的测试数据向量,然后使用这些字符串作为索引值,然后查找所有值,如下所示:
//Create test data
for(int f=0; f<loopvalue; f++)
{
stringstream convertToString;
convertToString << f;
string strf = convertToString.str();
testdata[f] = "test" + strf;
}
time_t startTimeSeconds = time(NULL);
for(int f=0; f<2000000; f++) testmap[ testdata[f] ] = f; //Write to map
for(int f=0; f<2000000; f++) result = testmap[ testdata[f] ]; //Lookup
time_t endTimeSeconds = time(NULL);
cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << endl;
需要 10 秒。
如果我在 PHP 中做看似至少相同的事情:
<?php
$starttime = time();
$loopvalue = 2000000;
//fill array
for($f=0; $f<$loopvalue; $f++)
{
$filler = "test" . $f;
$testarray[$filler] = $f;
}
//look up array
for($f=0; $f<$loopvalue; $f++)
{
$filler = "test" . $f;
$result = $testarray[$filler];
}
$endtime = time();
echo "Time taken ".($endtime-$starttime)." seconds.";
?>
...只需要 3 秒。
鉴于 PHP 是用 C 编写的,有谁知道 PHP 如何实现如此快速的文本索引查找?
谢谢 C
I can't understand the following and I'm hoping someone can shed some light on it for me:
In C++ if I create a vector of test data containing 2M different bits of text (testdata) then create a map using these strings as the index values, then look up all the values, like this:
//Create test data
for(int f=0; f<loopvalue; f++)
{
stringstream convertToString;
convertToString << f;
string strf = convertToString.str();
testdata[f] = "test" + strf;
}
time_t startTimeSeconds = time(NULL);
for(int f=0; f<2000000; f++) testmap[ testdata[f] ] = f; //Write to map
for(int f=0; f<2000000; f++) result = testmap[ testdata[f] ]; //Lookup
time_t endTimeSeconds = time(NULL);
cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << endl;
It takes 10 seconds.
If I do seemingly at least the same in PHP:
<?php
$starttime = time();
$loopvalue = 2000000;
//fill array
for($f=0; $f<$loopvalue; $f++)
{
$filler = "test" . $f;
$testarray[$filler] = $f;
}
//look up array
for($f=0; $f<$loopvalue; $f++)
{
$filler = "test" . $f;
$result = $testarray[$filler];
}
$endtime = time();
echo "Time taken ".($endtime-$starttime)." seconds.";
?>
...it takes only 3 seconds.
Given that PHP is written in C does anyone know how PHP achieves this much faster text index lookup?
Thanks
C
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您的循环不是绝对等效的算法。
请注意,在 C++ 版本中,您有
在 PHP 版本中,您只需在第一个循环中插入并在第二个循环中查找一。
PHP 是解释型的 - 一般来说,如果您用 PHP 编写代码速度更快,请先检查代码! ;-)
Your loops are not absolutely equivalent algorithms.
Note that in the C++ version you have
In the PHP versions you just have insert in the first loop and lookup in the second one.
PHP is interpreted - generally if you code is faster in PHP, check the code first ! ;-)
我怀疑你对错误的事情进行了基准测试。
不管怎样,我使用了你的代码(必须对你的数据类型做出一些假设),这是我的机器的结果:
PHP:
耗时2秒。
C++(使用 std::map):
耗时3秒。
C++(使用 std::tr1::unordered_map):
耗时1秒。
C++ 编译如下
是我的测试 C++ 代码:
结论:
您测试了未优化的 C++ 代码,甚至可能使用 VC++ 进行编译,默认情况下,在调试模式下编译时,它在 std::vector::operator[] 中进行边界检查。
当我们使用 std::map 时,由于查找复杂性的差异(参见 n0rd 的答案),PHP 与优化的 C++ 代码仍然存在差异,但是当您使用 Hashmap 时,C++ 速度更快。
I suspect you benchmark the wrong things.
Anyway, I used your code (had to make some assumptions on your data types) and here are the results from my machine:
PHP:
Time taken 2 seconds.
C++ (using std::map):
Time taken 3 seconds.
C++ (using std::tr1::unordered_map):
Time taken 1 seconds.
C++ compiled with
Here is my test C++ code:
Conclusion:
You tested unoptimized C++ code, probably even compiled with VC++, which by default has a bounds check in std::vector::operator[] when compiled in debug mode.
There still is a difference of PHP to the optimised C++ code, when we use std::map, because of the difference in lookup complexity (see n0rd's answer), but C++ is faster when you use a Hashmap.
根据另一个问题,PHP中的关联数组实现为哈希表,平均搜索复杂度为 O(1),而 std::map 在 C++ 中是一棵二叉树,搜索复杂度为 O(log n ),速度较慢。
According to another question, associative arrays in PHP are implemented as hash tables, which have search complexity of O(1) on average, while std::map in C++ is a binary tree with search complexity of O(log n), which is slower.