How to calculate reconstruction error?
How do I calculate reconstruction error, and where can I find information about it? (I want to calculate the reconstruction error of my data after running the K-means algorithm.)
Comments (2)
You need to calculate the distance from every point to the center of the cluster it belongs to.
One way to calculate the reconstruction error of a given vector is to compute the Euclidean distance between it and its representation. In K-means, each vector is represented by its nearest center (centroid).
So after running K-means: for each vector, take its error to be the Euclidean distance between that vector and its centroid. Sum the errors over all vectors, and you have the error on your training set. For a fixed number of clusters, a lower error generally indicates a better clustering.
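The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `reconstruction_error`, the `squared` flag, and the toy data are all assumptions made for the example, and the centroids are taken as already computed by whatever K-means implementation you used.

```python
import math

def reconstruction_error(points, centroids, squared=False):
    """Sum, over all points, of the distance to the nearest centroid.

    `points` and `centroids` are sequences of equal-length coordinate tuples.
    With squared=True the squared distance is summed instead.
    (Hypothetical helper for illustration only.)
    """
    total = 0.0
    for p in points:
        # Each point is "reconstructed" by its nearest centroid.
        d = min(math.dist(p, c) for c in centroids)
        total += d * d if squared else d
    return total

# Toy data: two well-separated clusters and their centroids (assumed values).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
centroids = [(0.0, 1.0), (10.0, 2.0)]

total = reconstruction_error(points, centroids)
total_sq = reconstruction_error(points, centroids, squared=True)
print(total, total_sq)
```

If you are using scikit-learn, the fitted `KMeans` estimator exposes the squared version of this sum as its `inertia_` attribute, so you may not need to compute it by hand.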
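Indeed, the K-means algorithm itself tries to optimize essentially this metric. Strictly speaking, it minimizes the sum of squared Euclidean distances to the centroids, and if you let it run to convergence it will find a local minimum of that reconstruction error, which is not necessarily the global minimum.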
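Written out, the quantity K-means minimizes can be sketched as follows. The symbols are assumptions introduced here, not notation from the answer: $x_1, \dots, x_n$ are the data vectors and $\mu_1, \dots, \mu_k$ are the cluster centers.

```latex
E = \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert x_i - \mu_j \rVert_2^{2}
```

Dropping the square gives the plain sum of Euclidean distances described above. The two are closely related, but only the squared form is the one that the standard K-means update steps are guaranteed not to increase.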