algorithm Java median

How do I calculate the median of a Map?

Posted on 2024-09-05 20:56:04

Given a map where each key is a number from a sequence and each value is the count of how often that number appears in the sequence, what would a Java implementation of an algorithm to calculate the median look like?

For example:

1,1,2,2,2,2,3,3,3,4,5,6,6,6,7,7

in a map:

Map<Integer, Integer> map = ...
map.put(1, 2);
map.put(2, 4);
map.put(3, 3);
map.put(4, 1);
map.put(5, 1);
map.put(6, 3);
map.put(7, 2);

double median = calculateMedian(map);
print(median);

would result in:

> print(median);
3
>

So what I am looking for is a Java implementation of calculateMedian.

许仙没带伞 2024-09-12 20:56:09

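Use a SortedMap, i.e. a TreeMap.
Iterate over the map once to compute the total number of elements, i.e. the sum of all occurrence counts.
Iterate again and add up the occurrence counts until you reach half of the total. The number whose count pushes the running sum past half of the total is the median.
Test extensively for off-by-one errors.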
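The steps above could be sketched roughly as follows. This is only one possible reading of the answer, not the poster's own code: the class name MapMedian is made up for illustration, counts are assumed to be non-negative and non-null, and for an even total the two middle values are averaged (a detail the answer leaves open).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class; the name is an assumption for this sketch.
final class MapMedian {

    /** Median of the sequence described by a value -> occurrence-count map. */
    static double calculateMedian(Map<Integer, Integer> counts) {
        // Copy into a TreeMap so that iteration is in ascending key order.
        TreeMap<Integer, Integer> sorted = new TreeMap<>(counts);

        // First pass: total number of elements (sum of all counts).
        long total = 0;
        for (int count : sorted.values()) {
            total += count;
        }
        if (total == 0) {
            throw new IllegalArgumentException("median of an empty sequence is undefined");
        }

        // 1-based ranks of the two middle elements; they coincide when total is odd.
        long lowerRank = (total + 1) / 2;
        long upperRank = total / 2 + 1;

        // Second pass: keep a running sum until both ranks are covered.
        long seen = 0;
        Integer lower = null;
        for (Map.Entry<Integer, Integer> e : sorted.entrySet()) {
            seen += e.getValue();
            if (lower == null && seen >= lowerRank) {
                lower = e.getKey();
            }
            if (seen >= upperRank) {
                return (lower.doubleValue() + e.getKey()) / 2;
            }
        }
        throw new AssertionError("unreachable: both ranks are at most total");
    }
}
```

Working with 1-based ranks instead of "half of the total" is one way to sidestep the off-by-one trap the answer warns about: both middle positions are computed once, up front, and the loop only compares the running sum against them.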


浅紫色的梦幻 2024-09-12 20:56:09

For an easy but maybe not-so-efficient algorithm, I'd do it like this:

1. Expand the map to a list.

Practically speaking: iterate through the map and add each key 'value' times to a new list. Finally, sort the list.

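//... (needs java.util.ArrayList, Collections, List and Map)
List<Integer> field = new ArrayList<Integer>();
for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
  // add the key once per occurrence
  for (int i = 0; i < entry.getValue(); i++) {
    field.add(entry.getKey());
  }
}
Collections.sort(field);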

2. Calculate the median.

Now you have to implement a method int calculateMedian(List<Integer> sorted). This depends on the kind of median you need. If it's just the sample median, the result is either the middlemost value (for lists with an odd number of elements) or the average of the two middlemost values (for lists of even length). Note that the list needs to be sorted!

(Ref: Sample Median / wikipedia)
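A minimal sketch of that second step, assuming the sample median is what is wanted. The class name SortedListMedian is hypothetical, and the method returns double rather than int so that the average of two middle values is not truncated (this matches the double median in the question).

```java
import java.util.List;

// Hypothetical helper class; the name is an assumption for this sketch.
final class SortedListMedian {

    /** Sample median of a list that is already sorted in ascending order. */
    static double calculateMedian(List<Integer> sorted) {
        if (sorted.isEmpty()) {
            throw new IllegalArgumentException("median of an empty list is undefined");
        }
        int n = sorted.size();
        int mid = n / 2; // 0-based index of the (upper) middle element
        if (n % 2 == 1) {
            // odd length: the single middlemost value
            return sorted.get(mid);
        }
        // even length: average of the two middlemost values
        return (sorted.get(mid - 1).doubleValue() + sorted.get(mid)) / 2;
    }
}
```

The method does not check that its input is sorted; passing the list produced in step 1 after Collections.sort is the intended use.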

OK, OK, even though Chris didn't mention efficiency, here's an idea for how to calculate the sample median (!) without expanding the map...

Set<Integer> sortedKeys = new TreeSet<Integer>(map.keySet()); // iterate keys in ascending order
Double median = null;  // null serves as the 'invalid/not found/etc' state, e.g. for an empty map
int total = 0;
for (Integer key : sortedKeys) {
  total += map.get(key);
}
if (total % 2 == 1) {
  int counter = total / 2;  // 0-based index of the middle element
  for (Integer key : sortedKeys) {
    counter -= map.get(key);
    if (counter < 0) {
      // the sample median is in this bin
      median = key.doubleValue();
      break;
    }
  }
} else {
  int lower = total / 2 - 1;  // 0-based indices of the two middle elements
  int upper = total / 2;
  Integer lowerKey = null;
  for (Integer key : sortedKeys) {
    lower -= map.get(key);
    upper -= map.get(key);
    if (lowerKey == null && lower < 0) {
      lowerKey = key; // the lower middle element is in this bin
    }
    if (upper < 0) {
      // the upper middle element is in this bin (possibly the same one); now we need the average
      median = (lowerKey.doubleValue() + key) / 2;
      break;
    }
  }
}

(I have no compiler at hand - if it has too many syntax errors, please treat it as pseudocode ;) )


扮仙女 2024-09-12 20:56:08

Using Guava:

Multiset<Integer> values = TreeMultiset.create();
Collections.addAll(values, 1,1,2,2,2,2,3,3,3,4,5,6,6,6,7,7);

Now the answer to your question is:

return Iterables.get(values, (values.size() - 1) / 2);

Really. That's it. (Or check if size is even and average the two central values, to be precise about it.)

If the counts are particularly large, it would be faster to use the multiset's entrySet and keep a running sum, but the simplest way is usually fine.


相思碎 2024-09-12 20:56:07

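Linear time

If you know the total number of elements (16 in your case), you can start from the beginning or the end of the map and sum up the counts until you reach the round(n/2)-th element or, if the total is even, the average of the (n/2)-th and (n/2 + 1)-th elements, which is the median.

If you don't know the total count, you have to go through all entries at least once.

Sublinear time

If you can choose the data structure and can do preprocessing, see Wikipedia.

Edit:
So, assuming we have a sequence with counts, what we can do is:

While inserting the key -> count pairs, maintain another map: key -> running_total.
This way you have a structure in which you can get total_count by looking at the running_total of the last key.
And you can do a binary search to locate the element whose running total is close to total_count/2.

This doubles the memory usage, but gives O(log n) performance for the median and O(1) for total_count.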
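As a rough illustration of the running-total idea, here is a sketch that builds the totals once from a finished map and then answers median queries with a binary search. It deviates from the answer in one respect, labeled here as an assumption: the running totals are stored in arrays built up front instead of being maintained in a second map during insertion. The class name RunningTotalMedian and its methods are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class; names and the array-based layout are assumptions for this sketch.
final class RunningTotalMedian {
    private final int[] keys;          // distinct values in ascending order
    private final long[] runningTotal; // runningTotal[i] = number of elements <= keys[i]

    RunningTotalMedian(Map<Integer, Integer> counts) {
        TreeMap<Integer, Integer> sorted = new TreeMap<>(counts);
        keys = new int[sorted.size()];
        runningTotal = new long[sorted.size()];
        long sum = 0;
        int i = 0;
        for (Map.Entry<Integer, Integer> e : sorted.entrySet()) {
            sum += e.getValue();
            keys[i] = e.getKey();
            runningTotal[i] = sum;
            i++;
        }
    }

    /** O(1): the running total of the last key. */
    long totalCount() {
        return keys.length == 0 ? 0 : runningTotal[keys.length - 1];
    }

    /** O(log n): binary search for the first key whose running total reaches the rank. */
    private int valueAtRank(long rank) {
        int lo = 0;
        int hi = keys.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (runningTotal[mid] >= rank) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return keys[lo];
    }

    double median() {
        long n = totalCount();
        if (n == 0) {
            throw new IllegalStateException("median of an empty sequence is undefined");
        }
        // 1-based ranks of the two middle elements; identical when n is odd.
        long lowerRank = (n + 1) / 2;
        long upperRank = n / 2 + 1;
        return (valueAtRank(lowerRank) + (double) valueAtRank(upperRank)) / 2;
    }
}
```

The trade-off is the one the answer describes: the extra array roughly doubles memory, and any later change to a count would require updating the running totals of all larger keys.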
