How to get the median value from a sorted map

Posted on 2024-09-13 19:30:07

I am using a std::map. Sometimes I need to find the median value of all items. For example, if I add

1 "s"
2 "sdf"
3 "sdfb"
4 "njw"
5 "loo"

then the median is 3.

Is there a solution that does not require iterating over half the items in the map?

Comments (10)

十年不长 2024-09-20 19:30:07

I think you can solve this by using two std::map objects: one for the smaller half of the items (mapL) and one for the larger half (mapU). An insert is then one of two cases:

  • add the item to mapU and move mapU's smallest element to mapL
  • add the item to mapL and move mapL's greatest element to mapU

If the maps differ in size and you insert into the one with fewer elements, skip the move step.
The basic idea is to keep the maps balanced so that their sizes differ by at most one element.
As far as I know, each of these std::map operations runs in O(log n) time. The smallest and greatest elements of a map can be reached through its iterators (begin() and the element just before end()).
For a middle-position query, compare the map sizes and return either the greatest element of mapL or the smallest element of mapU.

The scenario above covers insertion only. You can extend it to deletion, but you then have to track which map holds an item, or try to erase from both.

Here is my code with sample usage (the two maps are named Left and Right in the code):

#include <iostream>
#include <string>
#include <map>
using namespace std;

typedef pair<int,string> pis;
typedef map<int,string>::iterator itis;

map<int,string>Left;
map<int,string>Right;

itis get_last(map<int,string> &m){
    return (--m.end());
}

int add_element(int key, string val){
    if (Left.empty()){
        Left.insert(make_pair(key,val));
        return 1;
    }

    pis maxl = *get_last(Left);
    if (key <= maxl.first){
        Left.insert(make_pair(key,val));
        if (Left.size() > Right.size() + 1){
            itis to_rem = get_last(Left);
            pis cpy = *to_rem;
            Left.erase(to_rem);
            Right.insert(cpy);
        }
        return 1;
    } else {
        Right.insert(make_pair(key,val));
        if (Right.size() > Left.size()){
            itis to_rem = Right.begin();
            pis cpy = *to_rem;
            Right.erase(to_rem);
            Left.insert(cpy);  // to_rem is invalid after erase, so insert the saved copy
        }
        return 2;
    }   
}

pis get_mid(){
    size_t size = Left.size() + Right.size();  // unsigned, matches map::size()
    if (Left.size() >= size / 2){
        return *(get_last(Left));
    }
    return *(Right.begin());
}

int main(){
    Left.clear();
    Right.clear();

    int key;
    string val;
    while (cin >> key >> val){  // stops on EOF or malformed input, no stale last read
        add_element(key,val);
        pis mid = get_mid();
        cout << "mid " << mid.first << " " << mid.second << endl;
    }
}
↘紸啶 2024-09-20 19:30:07

I think the answer is no. You cannot jump straight to the item N / 2 past the beginning, because std::map only provides bidirectional iterators. You must iterate through half of the items in the map. If you had access to the underlying red-black tree implementation that is typically used for std::map, you might be able to get close, as in Dani's answer. However, you don't have access to it, because it is encapsulated as an implementation detail.

不知所踪 2024-09-20 19:30:07

Try:

#include <iterator>  // std::advance
typedef std::map<int,std::string>  Data;
Data           data;
Data::iterator median = data.begin();
std::advance(median, data.size() / 2);  // advance() returns void and moves 'median' in place

Works if the size() is odd. I'll let you work out how to do it when size() is even.
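
One way the even case could be handled is to return both middle elements and let the caller decide what to do with them. The sketch below is only an illustration: the name median_range is made up here, and it uses std::next (C++11) instead of std::advance. It still walks about half of the map.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>

using Data = std::map<int, std::string>;

// Hypothetical helper: returns iterators to the lower and upper middle
// elements. For an odd size both iterators refer to the same element.
// Precondition: the map is not empty. Cost: O(n) iterator steps.
std::pair<Data::const_iterator, Data::const_iterator>
median_range(const Data& data) {
    assert(!data.empty());
    const std::size_t n = data.size();
    // Index of the lower middle element: (n - 1) / 2.
    auto lower = std::next(data.begin(),
                           static_cast<std::ptrdiff_t>((n - 1) / 2));
    // For an even size the upper middle element is the next one.
    auto upper = (n % 2 == 0) ? std::next(lower) : lower;
    return {lower, upper};
}
```

With keys 1 to 5 both iterators point at key 3. With keys 1 to 6 they point at keys 3 and 4.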

倦话 2024-09-20 19:30:07

In a self-balancing binary tree (which std::map typically is), the root is a rough approximation of the median.
For the exact value, cache the median together with a balance indicator: each time an item is added below the median, decrement the indicator, and each time an item is added above it, increment the indicator. When the indicator reaches 2 or -2, move the median one step up or down respectively and reset the indicator.
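
A minimal sketch of the cached-median idea, assuming insert-only usage and unique int keys. The class name MedianMap and its members are invented for this illustration. It relies on the fact that std::map iterators stay valid across insertions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical wrapper around std::map that keeps an iterator to a median
// element up to date on every insertion.
class MedianMap {
public:
    MedianMap() = default;
    MedianMap(const MedianMap&) = delete;             // a copy would leave median_
    MedianMap& operator=(const MedianMap&) = delete;  // pointing into the source map

    // Inserts key/value in O(log n); returns false if the key already exists.
    bool insert(int key, const std::string& value) {
        auto result = items_.emplace(key, value);
        if (!result.second) return false;   // duplicate key: nothing changes
        if (items_.size() == 1) {           // the first element is the median
            median_ = result.first;
            balance_ = 0;
            return true;
        }
        // +1 for an insert above the median, -1 for an insert below it.
        balance_ += (key > median_->first) ? 1 : -1;
        if (balance_ == 2)  { ++median_; balance_ = 0; }
        if (balance_ == -2) { --median_; balance_ = 0; }
        return true;
    }

    // O(1). Precondition: at least one element has been inserted.
    const std::pair<const int, std::string>& median() const {
        assert(!items_.empty());
        return *median_;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::map<int, std::string> items_;
    std::map<int, std::string>::iterator median_;
    int balance_ = 0;  // (items above median) - (items below median)
};
```

For an even number of elements the cached iterator may sit on either of the two middle elements, depending on insertion order. Supporting erase would need the same bookkeeping in reverse, plus care when the median element itself is erased.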

野生奥特曼 2024-09-20 19:30:07

If you can switch data structures, store the items in a std::vector and sort it. That lets you access the middle item by position without iterating. (It may be surprising, but a sorted vector often outperforms a map because of locality. For lookups by the sort key you can use binary search, which has much the same performance as a map anyway. See Scott Meyers' Effective STL.)
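
A sketch of the sorted-vector approach, assuming items are (key, value) pairs that are sorted once after loading. The names Item, sort_by_key, median_item and find_by_key are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Item = std::pair<int, std::string>;

// Sorts by key. Call again after appending new items.
void sort_by_key(std::vector<Item>& items) {
    std::sort(items.begin(), items.end(),
              [](const Item& a, const Item& b) { return a.first < b.first; });
}

// Middle element by position, O(1). For an even size this is the lower
// of the two middle elements. Precondition: sorted and non-empty.
const Item& median_item(const std::vector<Item>& items) {
    assert(!items.empty());
    return items[(items.size() - 1) / 2];
}

// Binary search by key, O(log n). Returns nullptr if the key is absent.
// Precondition: items is sorted by key.
const Item* find_by_key(const std::vector<Item>& items, int key) {
    auto it = std::lower_bound(
        items.begin(), items.end(), key,
        [](const Item& item, int k) { return item.first < k; });
    return (it != items.end() && it->first == key) ? &*it : nullptr;
}
```

The trade-off is insertion cost: keeping the vector sorted after each single insert is O(n), so this fits best when data is loaded in bulk and then queried.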

猫卆 2024-09-20 19:30:07

If you know the map will be sorted, then get the element at floor(length / 2). If you're in a bit twiddly mood, try (length >> 1).

醉生梦死 2024-09-20 19:30:07

I know of no way to get the median from a pure STL map quickly for big maps. If your map is small, or you rarely need the median, I think you should use the linear advance to n/2 anyway, for the sake of simplicity and staying standard.

You can use the map to build a new container that offers a median: Jethro suggested using two maps. Building on that, a single map with a continuously updated median iterator might be better. Both methods have the drawback that you have to reimplement every modifying operation and, in Jethro's case, even the reading operations.

A custom-written container will also do what you want, probably most efficiently, but at the price of custom code. You could try, as was suggested, to modify an existing STL map implementation. You can also look for existing implementations.

There is a very efficient C implementation called Judy Arrays that offers most map functionality as well as random access. It works for integer, string and byte-array keys.

挽清梦 2024-09-20 19:30:07

Since it sounds like insert and find are your two common operations while median is rare, the simplest approach is to use the map and advance an iterator from m.begin() by m.size()/2 steps with std::advance, as originally suggested by David Rodríguez. Note that std::advance needs a named iterator, not the temporary returned by m.begin(). This is linear time, but easy to understand, so I'd only consider another approach if profiling shows that the median calls are too expensive relative to the work your app is doing.

眼眸 2024-09-20 19:30:07

The std::nth_element() algorithm is there for you for this :) It is typically implemented with quicksort-style partitioning, and you don't need your vector (or array) to be sorted. It requires random-access iterators, so it applies to a vector or array rather than to a std::map directly.
Its time complexity is O(n) on average (while sorting costs O(n log n)).
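
A short sketch of how std::nth_element could be used on an unsorted vector. The function name median_unsorted is made up for this example, and for an even size it returns the lower of the two middle values.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper. Takes the values by copy because std::nth_element
// reorders its input. Precondition: values is not empty.
int median_unsorted(std::vector<int> values) {
    assert(!values.empty());
    auto mid = values.begin() + (values.size() - 1) / 2;
    // After this call *mid is the element that would be at this position
    // if the whole range were sorted.
    std::nth_element(values.begin(), mid, values.end());
    return *mid;
}
```

For example, median_unsorted({5, 1, 4, 2, 3}) yields 3.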

拒绝两难 2024-09-20 19:30:07

For a sorted list, here it is in Java code, but I assume it is very easy to port to C++:

    if (input.length % 2 != 0) {
        return input[((input.length + 1) / 2 - 1)];
    } else {
        return 0.5d * (input[(input.length / 2 - 1)] + input[(input.length / 2 + 1) - 1]);
    }
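
A possible C++ port of the Java snippet above, assuming the input is a sorted, non-empty std::vector<double>. The function name median_sorted is chosen here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Precondition: input is sorted in ascending order and not empty.
double median_sorted(const std::vector<double>& input) {
    assert(!input.empty());
    const std::size_t n = input.size();
    if (n % 2 != 0) {
        return input[n / 2];  // odd size: the single middle element
    }
    // Even size: average of the two middle elements.
    return 0.5 * (input[n / 2 - 1] + input[n / 2]);
}
```

The index arithmetic is the same as in the Java version, only simplified: (n + 1) / 2 - 1 equals n / 2 for odd n.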