Fastest way to find the two smallest int64 elements in an array

Posted 2024-12-10 11:36:32

I have arrays with sizes from 1000 to 10000 (1k .. 10k). Each element is int64. My task is to find the two smallest elements of each array: the minimum element and the minimum of the remaining elements.

I want the fastest possible single-threaded C++ code for Intel Core2 or Core i7 (CPU mode is 64-bit).

This function (getting the 2 smallest from an array) is the hotspot; it is nested in two or three for loops with a huge iteration count.

Current code is like:

int f()
{
    int best; // index of the minimum element
    int64 min_cost = 1LL << 61;
    int64 second_min_cost = 1LL << 62;
    for (int i = 1; i < width; i++) {
        int64 cost = get_ith_element_from_array(i); // it is inlined
        if (cost < min_cost) {
            best = i;
            second_min_cost = min_cost;
            min_cost = cost;
        } else if (cost < second_min_cost) {
            second_min_cost = cost;
        }
    }
    save_min_and_next(min_cost, best, second_min_cost);
}

寒尘 2024-12-17 11:36:32

Look at partial_sort and nth_element

std::vector<int64_t> arr(10000); // large

std::partial_sort(arr.begin(), arr.begin()+2, arr.end());
// arr[0] and arr[1] are minimum two values

If you only wanted the second lowest value, nth_element is your guy
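
For the nth_element case mentioned above, a minimal sketch might look like the following. The helper name second_smallest is hypothetical (it is not from the answer), and the vector is taken by value only because std::nth_element reorders its input; in a hot loop that copy would likely cost more than a plain linear scan.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the second-smallest value of v.
// v is passed by value because std::nth_element rearranges the elements.
std::int64_t second_smallest(std::vector<std::int64_t> v) {
    assert(v.size() >= 2);
    // Afterwards v[1] holds the element that a full sort would place at
    // index 1, and v[0] is not greater than v[1].
    std::nth_element(v.begin(), v.begin() + 1, v.end());
    return v[1];
}
```

Note that neither partial_sort nor nth_element reports the original index of the minimum, which the code in the question needs for best.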

吃素的狼 2024-12-17 11:36:32

Try inverting the if:

if (cost < second_min_cost) 
{ 
    if (cost < min_cost) 
    { 
    } 
    else
    {
    }
}

And you should probably initialize min_cost and second_min_cost with the same value, using the max value of int64 (or even better use the suggestion of qbert220)
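
Filled in, the inverted test with both values initialized to the int64 maximum could look roughly like this. It is only a sketch under some assumptions: the TwoMin struct and the two_smallest name are made up for illustration, the data is assumed to live in a std::vector, and the scan starts at index 0 rather than 1 as in the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical result type; the question passes these three values
// to save_min_and_next() instead.
struct TwoMin {
    std::int64_t min_cost;
    std::int64_t second_min_cost;
    std::size_t best;  // index of the minimum element
};

TwoMin two_smallest(const std::vector<std::int64_t>& a) {
    const std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
    TwoMin r{kMax, kMax, 0};  // both start at the same sentinel value
    for (std::size_t i = 0; i < a.size(); ++i) {
        const std::int64_t cost = a[i];
        if (cost < r.second_min_cost) {      // most elements fail this test
            if (cost < r.min_cost) {
                r.second_min_cost = r.min_cost;
                r.min_cost = cost;
                r.best = i;
            } else {
                r.second_min_cost = cost;
            }
        }
    }
    return r;
}
```

Because the comparisons are strict, an element equal to the sentinel never updates the result, so for arrays with fewer than two elements (or containing only the maximum value) the sentinel is returned unchanged.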

淡莣 2024-12-17 11:36:32

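Some small things (which may already be happening, but could be worth trying, I guess).

Slightly unroll the loop - say, iterate in strides of 8 (i.e. one cache line at a time), prefetch the next cache line in the body, then process the 8 items. To avoid lots of checks, make sure the end condition is a multiple of 8, and handle the leftover items (fewer than 8) outside the loop, unrolled...
For items that are not of interest you are doing two checks in the body; maybe you can trim that to 1? I.e. if cost is less than second_min, then check against min as well, otherwise no need to bother...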
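
A rough sketch of the stride-8 idea, under several assumptions: the data is a contiguous array of int64 (so 8 elements fill one 64-byte cache line), the names TwoMin, consider and two_smallest_unrolled are invented for illustration, and the prefetch uses the GCC/Clang builtin __builtin_prefetch, which is compiler-specific and is simply skipped elsewhere. Whether this beats the plain loop needs measuring; the hardware prefetcher may already handle a sequential scan like this.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical result type (not from the thread).
struct TwoMin {
    std::int64_t min_cost;
    std::int64_t second_min_cost;
    std::size_t best;  // index of the minimum element
};

// One step of the scan: a single outer comparison rejects most elements.
static inline void consider(TwoMin& r, std::int64_t cost, std::size_t i) {
    if (cost < r.second_min_cost) {
        if (cost < r.min_cost) {
            r.second_min_cost = r.min_cost;
            r.min_cost = cost;
            r.best = i;
        } else {
            r.second_min_cost = cost;
        }
    }
}

TwoMin two_smallest_unrolled(const std::int64_t* p, std::size_t n) {
    const std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
    TwoMin r{kMax, kMax, 0};
    std::size_t i = 0;
    // Main loop: 8 elements per iteration, no per-element bounds check.
    for (; i + 8 <= n; i += 8) {
#if defined(__GNUC__) || defined(__clang__)
        __builtin_prefetch(p + i + 8);  // hint for the next block of 8
#endif
        consider(r, p[i + 0], i + 0);
        consider(r, p[i + 1], i + 1);
        consider(r, p[i + 2], i + 2);
        consider(r, p[i + 3], i + 3);
        consider(r, p[i + 4], i + 4);
        consider(r, p[i + 5], i + 5);
        consider(r, p[i + 6], i + 6);
        consider(r, p[i + 7], i + 7);
    }
    // Leftover items (fewer than 8) handled outside the main loop.
    for (; i < n; ++i) {
        consider(r, p[i], i);
    }
    return r;
}
```

The prefetched address never goes beyond one past the end of the array, since the loop condition guarantees i + 8 <= n.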

能否归途做我良人 2024-12-17 11:36:32

You'd better check second_min_cost first, since it is the only condition that requires modifying the result. This way, you'll get one branch, instead of 2, in your main loop. This should help quite a bit.

Other than that, there is very little to optimise; you are already close to optimal. Unrolling may help, but I doubt it will bring any significant advantage in this scenario.

So it becomes:

int f()
{
    int best; // index of the minimum element
    int64 min_cost = 1LL << 61;
    int64 second_min_cost = 1LL << 62;
    for (int i = 1; i < width; i++) {
        int64 cost = get_ith_element_from_array(i); // it is inlined
        if (cost < second_min_cost) // the only test most elements go through
        {
            if (cost < min_cost)
            {
                best = i;
                second_min_cost = min_cost;
                min_cost = cost;
            }
            else second_min_cost = cost;
        }
    }
    save_min_and_next(min_cost, best, second_min_cost);
}
