Getting the index of the data point closest to the centroid in K-means clustering in MATLAB

Posted on 2024-10-06 12:30:24

I am doing some clustering using K-means in MATLAB. As you might know, the usage is as follows:

[IDX,C] = kmeans(X,k)

where IDX gives the cluster number for each data point in X, and C gives the centroid of each cluster. For each cluster, I need to get the index (the row number in the actual data set X) of the data point closest to the centroid. Does anyone know how I can do that?
Thanks


浮世清欢 2024-10-13 12:30:25

The "brute-force approach" mentioned by @Dima would go as follows:

% preallocate one index per cluster
closestIdx = zeros(max(IDX),1);
% loop through all clusters
for iCluster = 1:max(IDX)
    % find the points that are part of the current cluster
    currentPointIdx = find(IDX==iCluster);
    % find the index (among the points in the cluster) of the point
    % with the smallest Euclidean distance from the centroid:
    % bsxfun subtracts the centroid coordinates from each point, the
    % squared differences are summed along each row, and min picks
    % the smallest squared distance
    [~,minIdx] = min(sum(bsxfun(@minus,X(currentPointIdx,:),C(iCluster,:)).^2,2));
    % store the index into X (among all the points)
    closestIdx(iCluster) = currentPointIdx(minIdx);
end

To get the coordinates of the point that is closest to the center of cluster k, use

X(closestIdx(k),:)
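
As a quick sanity check, the same loop can be run on a tiny hand-made data set where the answer is easy to verify by eye. This is only a sketch: the data, the cluster assignment IDX, and the centroids C below are made up for illustration (C is computed as the per-cluster mean, which is what kmeans would report for this assignment), so kmeans itself is not called.

```matlab
% Hypothetical data: rows 1-3 form one group, rows 4-6 another.
X   = [0 0; 2 0; 1 1; 10 10; 12 10; 11 11];
IDX = [1; 1; 1; 2; 2; 2];                       % assumed cluster labels
C   = [mean(X(IDX==1,:),1); mean(X(IDX==2,:),1)]; % centroids = cluster means

closestIdx = zeros(size(C,1),1);                % one row index per cluster
for iCluster = 1:size(C,1)
    % rows of X that belong to this cluster
    currentPointIdx = find(IDX==iCluster);
    % squared Euclidean distance of each member to the centroid
    d2 = sum(bsxfun(@minus,X(currentPointIdx,:),C(iCluster,:)).^2,2);
    [~,minIdx] = min(d2);
    % map the within-cluster position back to a row number in X
    closestIdx(iCluster) = currentPointIdx(minIdx);
end

disp(closestIdx)   % rows 3 and 6 are nearest to their centroids
```

Rows 3 and 6 sit in the middle of their groups, so they are the ones nearest to the centroids [1 1/3] and [11 10 1/3].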


只为守护你 2024-10-13 12:30:25

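The brute-force approach is to run k-means, then compare each data point in a cluster with that cluster's centroid and find the one closest to it. This is easy to do in MATLAB.

On the other hand, you may want to try the k-medoids clustering algorithm, which gives you an actual data point as the "center" of each cluster. Here is a MATLAB implementation.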
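
For readers who have the Statistics and Machine Learning Toolbox, a minimal sketch of the k-medoids route is shown below. It assumes the toolbox's built-in kmedoids function (available in R2014b and later), which is not necessarily the implementation this answer originally pointed to; the random test data is made up for illustration.

```matlab
% Assumption: Statistics and Machine Learning Toolbox, R2014b or later.
rng(1);                                  % make the made-up data repeatable
X = [randn(30,2); randn(30,2) + 6];      % two well-separated blobs
k = 2;

% The fifth output, midx, holds the row numbers in X of the medoids.
[IDX,C,~,~,midx] = kmedoids(X,k);

% Each "center" is an actual data point, so C matches X(midx,:).
disp(midx)
disp(isequal(X(midx,:),C))
```

With this approach no separate nearest-point search is needed, because the cluster centers are rows of X by construction. Note that medoids are generally not the same points as the data points nearest to the k-means centroids.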


贪恋 2024-10-13 12:30:25

Actually, if I understand you correctly, kmeans already gives you the answer:

[IDX,C,~,D] = kmeans(X,k); % D(i,j) is the distance from data point i to centroid j
[minD,indMinD] = min(D,[],1); % indMinD(j) is the index (in X) of the point closest to the j-th centroid
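
One caveat, offered as a sketch rather than a correction: taking the minimum over a whole column of D searches all rows of X, so in principle the point found for a centroid could be one that was assigned to a different cluster. If the closest point must be a member of the cluster, the distances of non-members can be masked out first. The variable names member and Dmasked and the random test data below are made up for illustration.

```matlab
rng(0);                                   % make the made-up data repeatable
X = [randn(50,2); randn(50,2) + 5];       % two well-separated blobs
k = 2;
[IDX,C,~,D] = kmeans(X,k);                % D is n-by-k: point-to-centroid distances

n = size(X,1);
% member(i,j) is true when point i was assigned to cluster j
member = false(n,k);
member(sub2ind([n k],(1:n)',IDX)) = true;

% ignore points outside each cluster by setting their distance to Inf
Dmasked = D;
Dmasked(~member) = Inf;

% closestIdx(j) is the row in X of the member of cluster j nearest to centroid j
[~,closestIdx] = min(Dmasked,[],1);
disp(closestIdx)
```

The explicit dimension argument in min(Dmasked,[],1) keeps the result one index per centroid even in edge cases such as X having a single row.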
