C++映射查找性能与 PHP 数组查找性能

发布于 2024-12-14 14:20:14 字数 1197 浏览 0 评论 0原文

我无法理解以下内容,我希望有人能为我解释一下:

在 C++ 中,如果我创建一个包含 2M 不同文本位(测试数据)的测试数据向量,然后使用这些字符串作为索引值,然后查找所有值,如下所示:

 //Create test data
for(int f=0; f<loopvalue; f++)
{   
    stringstream convertToString;
    convertToString << f;
    string strf = convertToString.str();
    testdata[f] = "test" + strf;
}

    time_t startTimeSeconds = time(NULL);

   for(int f=0; f<2000000; f++) testmap[ testdata[f] ] = f; //Write to map
   for(int f=0; f<2000000; f++) result = testmap[ testdata[f] ]; //Lookup

   time_t endTimeSeconds = time(NULL);
   cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << endl;

需要 10 秒。

如果我在 PHP 中做看似至少相同的事情:

<?php
$starttime = time();
$loopvalue = 2000000;

//fill array
for($f=0; $f<$loopvalue; $f++)
{
    $filler = "test" . $f;
    $testarray[$filler] = $f;
}

//look up array
for($f=0; $f<$loopvalue; $f++)
{
    $filler = "test" . $f;
    $result = $testarray[$filler];
}

$endtime = time();
echo "Time taken ".($endtime-$starttime)." seconds.";
?>

...只需要 3 秒。

鉴于 PHP 是用 C 编写的,有谁知道 PHP 如何实现如此快速的文本索引查找?

谢谢 C

I can't understand the following and I'm hoping someone can shed some light on it for me:

In C++ if I create a vector of test data containing 2M different bits of text (testdata) then create a map using these strings as the index values, then look up all the values, like this:

 //Create test data
for(int f=0; f<loopvalue; f++)
{   
    stringstream convertToString;
    convertToString << f;
    string strf = convertToString.str();
    testdata[f] = "test" + strf;
}

    time_t startTimeSeconds = time(NULL);

   for(int f=0; f<2000000; f++) testmap[ testdata[f] ] = f; //Write to map
   for(int f=0; f<2000000; f++) result = testmap[ testdata[f] ]; //Lookup

   time_t endTimeSeconds = time(NULL);
   cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << endl;

It takes 10 seconds.

If I do seemingly at least the same in PHP:

<?php
$starttime = time();
$loopvalue = 2000000;

//fill array
for($f=0; $f<$loopvalue; $f++)
{
    $filler = "test" . $f;
    $testarray[$filler] = $f;
}

//look up array
for($f=0; $f<$loopvalue; $f++)
{
    $filler = "test" . $f;
    $result = $testarray[$filler];
}

$endtime = time();
echo "Time taken ".($endtime-$starttime)." seconds.";
?>

...it takes only 3 seconds.

Given that PHP is written in C does anyone know how PHP achieves this much faster text index lookup?

Thanks
C

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

メ斷腸人バ 2024-12-21 14:20:14

您的循环不是绝对等效的算法。
请注意,在 C++ 版本中,您有

  1. testmap[ testdata[f] ] - 这实际上是一个查找 + 插入
  2. testmap[ testdata[f] ] - 2 个查找

在 PHP 版本中,您只需在第一个循环中插入并在第二个循环中查找一。

PHP 是解释型的 - 一般来说,如果您用 PHP 编写代码速度更快,请先检查代码! ;-)

Your loops are not absolutely equivalent algorithms.
Note that in the C++ version you have

  1. testmap[ testdata[f] ] - this is actually a lookup + insert
  2. testmap[ testdata[f] ] - 2 lookups

In the PHP versions you just have insert in the first loop and lookup in the second one.

PHP is interpreted - generally if you code is faster in PHP, check the code first ! ;-)

静赏你的温柔 2024-12-21 14:20:14

我怀疑你对错误的事情进行了基准测试。
不管怎样,我使用了你的代码(必须对你的数据类型做出一些假设),这是我的机器的结果:

PHP:
耗时2秒。

C++(使用 std::map):
耗时3秒。

C++(使用 std::tr1::unordered_map):
耗时1秒。

C++ 编译如下

g++ -03

是我的测试 C++ 代码:

#include <map>
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
#include <tr1/unordered_map>


int main(){
    const int loopvalue=2000000;
    std::vector<std::string> testdata(loopvalue);
    std::tr1::unordered_map<std::string, int> testmap;
    std::string result;
    for(int f=0; f<loopvalue; f++)
    {   
        std::stringstream convertToString;
        convertToString << f;
        std::string strf = convertToString.str();
        testdata[f] = "test" + strf;
    }

    time_t startTimeSeconds = time(NULL);

    for(int f=0; f<loopvalue; f++) testmap[ testdata[f] ] = f; //Write to map
    for(int f=0; f<loopvalue; f++) result = testmap[ testdata[f] ]; //Lookup

    time_t endTimeSeconds = time(NULL);
    std::cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << std::endl;
}

结论:

您测试了未优化的 C++ 代码,甚至可能使用 VC++ 进行编译,默认情况下,在调试模式下编译时,它在 std::vector::operator[] 中进行边界检查。

当我们使用 std::map 时,由于查找复杂性的差异(参见 n0rd 的答案),PHP 与优化的 C++ 代码仍然存在差异,但是当您使用 Hashmap 时,C++ 速度更快。

I suspect you benchmark the wrong things.
Anyway, I used your code (had to make some assumptions on your data types) and here are the results from my machine:

PHP:
Time taken 2 seconds.

C++ (using std::map):
Time taken 3 seconds.

C++ (using std::tr1::unordered_map):
Time taken 1 seconds.

C++ compiled with

g++ -03

Here is my test C++ code:

#include <map>
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
#include <tr1/unordered_map>


int main(){
    const int loopvalue=2000000;
    std::vector<std::string> testdata(loopvalue);
    std::tr1::unordered_map<std::string, int> testmap;
    std::string result;
    for(int f=0; f<loopvalue; f++)
    {   
        std::stringstream convertToString;
        convertToString << f;
        std::string strf = convertToString.str();
        testdata[f] = "test" + strf;
    }

    time_t startTimeSeconds = time(NULL);

    for(int f=0; f<loopvalue; f++) testmap[ testdata[f] ] = f; //Write to map
    for(int f=0; f<loopvalue; f++) result = testmap[ testdata[f] ]; //Lookup

    time_t endTimeSeconds = time(NULL);
    std::cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << std::endl;
}

Conclusion:

You tested unoptimized C++ code, probably even compiled with VC++, which by default has a bounds check in std::vector::operator[] when compiled in debug mode.

There still is a difference of PHP to the optimised C++ code, when we use std::map, because of the difference in lookup complexity (see n0rd's answer), but C++ is faster when you use a Hashmap.

真心难拥有 2024-12-21 14:20:14

根据另一个问题,PHP中的关联数组实现为哈希表,平均搜索复杂度为 O(1),而 std::map 在 C++ 中是一棵二叉树,搜索复杂度为 O(log n ),速度较慢。

According to another question, associative arrays in PHP are implemented as hash tables, which have search complexity of O(1) on average, while std::map in C++ is a binary tree with search complexity of O(log n), which is slower.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文