Computing the most common values

Posted 2024-08-13 02:16:43

I have a matrix A with n values in the range 65:90. How do I get the 10 most common values in A? I want the result to be a 10x2 matrix B with the 10 most common values in the first column and the number of times each one appears in the second column.


Comments (5)

羁〃客ぐ 2024-08-20 02:16:43
A = [65 82 65 90; 90 70 72 82]; % Your data
values = 65:90; % Every value that may occur in A
res = [values; histc(A(:)', values)]'; % res has values in the first column, counts in the second

Now all you have to do is sort the res array by the second column in descending order and take the first 10 rows.

sortedres = sortrows(res, -2); % sort by second column, descending
first10 = sortedres(1:10, :)
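
As a hedged sketch of the same histc approach: when several values share a count, which of them ends up in the top 10 depends on how sortrows orders ties. Passing a second sort column makes that order explicit. The variable names and the tie-breaking rule (smaller value first) are assumptions made for illustration, not part of the original answer.

```matlab
A = [65 82 65 90; 90 70 72 82];     % sample data from the answer above
values = 65:90;                      % every value that may occur in A
counts = histc(A(:)', values);       % one count per candidate value
res = [values; counts]';             % 26x2 matrix: value, count

% Sort by count (descending), then by value (ascending) to break ties.
sortedres = sortrows(res, [-2 1]);
first10 = sortedres(1:10, :);
disp(first10)
```

On this sample only five distinct values occur, so the last five rows of first10 have a count of zero. Filtering with res(res(:, 2) > 0, :) before sorting would drop them, at the cost of possibly returning fewer than 10 rows.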
雨后彩虹 2024-08-20 02:16:43

This is easily solved using arrayfun():

A = [...]; % Your target matrix with values 65:90
labels = 65:90; % Possible values to look for
nTimesOccurred = arrayfun(@(x) sum(A(:) == x), labels); % count of each label in A
[sorted, sortidx] = sort(nTimesOccurred, 'descend');

B = [labels(sortidx(1:10))' sorted(1:10)'];
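
The same counts can also be computed without arrayfun. The sketch below is a variation, not the answer's own method: it assumes a release that supports implicit expansion (MATLAB R2016b or later) and uses a small sample matrix in place of the elided data.

```matlab
A = [65 82 65 90; 90 70 72 82];     % sample data (assumed for illustration)
labels = 65:90;                      % possible values to look for

% A(:) is N-by-1 and labels is 1-by-26, so the comparison expands to an
% N-by-26 logical matrix; summing down the columns gives one count per label.
counts = sum(A(:) == labels, 1);

[sortedCounts, order] = sort(counts, 'descend');
B = [labels(order(1:10))', sortedCounts(1:10)'];
```

The intermediate logical matrix has numel(A) rows and one column per label, so this trades memory for brevity. For a narrow range such as 65:90 that is rarely a concern.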
我要还你自由 2024-08-20 02:16:43

We can add a fourth option using tabulate from the Statistics Toolbox:

A = randi([65 90], [1000 1]);   % a thousand random integers in the range 65:90
t = sortrows(tabulate(A), -2);  % compute the frequency table, sorted by count
B = t(1:10, 1:2);               % take the top 10 (value, count)
埖埖迣鎅 2024-08-20 02:16:43

Heck, here is another solution, using only simple built-in commands:

[V, I] = unique(sort(A(:)), 'last');
M = sortrows([V, diff([0; I])], -2);
Top10 = M(1:10, :);

First line: sorts all values, then finds the offset at which each distinct value's run ends in the sorted list. The 'last' flag matters: since R2013a, unique returns the index of the first occurrence by default, which would make the differences in the second line wrong.
Second line: computes the offset differences per unique value, which are the counts, and sorts the results by count in descending order.

BTW, I would only suggest this method if the range of possible numbers is really large, such as [0, 1E8]. In that case, some of the other methods might run out of memory. Note that Top10 requires A to contain at least 10 distinct values.
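
A small worked sketch of the two steps above, on assumed sample data. It requests the last index of each run explicitly with the 'last' flag, because the difference trick relies on end offsets, and it guards against A having fewer than 10 distinct values. The names sortedVals, lastIdx, counts, and TopK are illustrative.

```matlab
A = [65 82 65 90; 90 70 72 82];             % sample data (assumed)
sortedVals = sort(A(:));                     % 65 65 70 72 82 82 90 90
[V, lastIdx] = unique(sortedVals, 'last');   % end offset of each run of equal values
counts = diff([0; lastIdx(:)]);              % run lengths, i.e. occurrences per value

M = sortrows([V, counts], [-2 1]);           % count descending, ties by value
k = min(10, size(M, 1));                     % fewer than 10 distinct values is possible
TopK = M(1:k, :);
disp(TopK)
```

Memory use here scales with numel(A) rather than with the width of the value range, which is why this variant suits very wide ranges.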

遥远的她 2024-08-20 02:16:43

This can also be solved with accumarray:

ncounts = accumarray(A(:), 1);  % ncounts is a max(A(:)) x 1 vector of counts (90 x 1 if 90 occurs in A)
[vals, sidx] = sort(ncounts, 'descend');  % vals has the counts, sidx has the corresponding values
B = [sidx(1:10), vals(1:10)];

accumarray is not as fast as it should be, but it is often faster than other operations of its type. It took me several passes through its help page to understand what it is doing. For your purposes, it is probably slower than the histc solution, but a little more straightforward.

--edit: forgot the '1' in the accumarray call.
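
A possible refinement, sketched under the assumption that the bounds of the range (65 and 90) are known in advance: shifting the subscripts so that 65 maps to bin 1 avoids allocating 64 unused bins, and fixing the output size makes the result independent of which values happen to occur. The names lo, hi, counts, and idx are illustrative.

```matlab
A = [65 82 65 90; 90 70 72 82];     % sample data (assumed)
lo = 65;                             % smallest possible value
hi = 90;                             % largest possible value

% Map value v to bin v - lo + 1 and force a (hi - lo + 1)-by-1 result.
counts = accumarray(A(:) - lo + 1, 1, [hi - lo + 1, 1]);

[sortedCounts, idx] = sort(counts, 'descend');
B = [idx(1:10) + lo - 1, sortedCounts(1:10)];   % shift bin numbers back to values
```

The first column of B is recovered by undoing the shift, so it holds the original values rather than bin numbers.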
