What is the difference between K-means clustering and vector quantization?
They seem to be very similar.
I'm dealing with Hidden Markov Models and I need to extract symbols from feature vectors.
In order to extract symbols, do I do vector quantization or k-means clustering?
2 Answers
The way I understand it, K-means is one type of vector quantization.
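To make that concrete for the symbol-extraction question above, here is a minimal sketch (assuming scikit-learn; the feature matrix is synthetic stand-in data) of using a k-means model as a vector quantizer: the fitted centroids form the codebook, and mapping each new feature vector to the index of its nearest centroid yields the discrete symbol stream a discrete HMM consumes.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in data: 1000 training feature vectors of dimension 12.
features = np.random.randn(1000, 12)

# Fit k-means; the 32 fitted centroids act as the VQ codebook.
kmeans = KMeans(n_clusters=32, n_init=10, random_state=0).fit(features)
codebook = kmeans.cluster_centers_

# Quantization: each new feature vector is mapped to the index of its
# nearest centroid, i.e. a discrete symbol usable as an HMM observation.
new_vectors = np.random.randn(50, 12)
symbols = kmeans.predict(new_vectors)
print(symbols[:10])  # integers in [0, 32)
```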
The K-means algorithm is the specialization of the celebrated "Lloyd I" quantization algorithm to the case of empirical distributions. (cf. Lloyd)
The Lloyd I algorithm is proved to yield a sequence of quantizers with decreasing quadratic distortion. However, except in the special case of one-dimensional log-concave distributions, it does not always converge to a quadratically optimal quantizer. (The quantization error has local minima, especially when dealing with empirical distributions, i.e., in the clustering problem.)
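For illustration, here is a minimal NumPy sketch of one Lloyd I iteration on an empirical distribution (equivalently, one k-means step); the data and variable names are my own, not from the references. Running it shows the non-increasing distortion, while the limit may still be only a local minimum:

```python
import numpy as np

def lloyd_step(points, centroids):
    """One Lloyd I iteration on an empirical distribution (= k-means step)."""
    # Assignment step: nearest centroid for every point.
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    labels = d2.argmin(axis=1)
    distortion = d2[np.arange(len(points)), labels].mean()
    # Update step: each centroid moves to the mean of its cell.
    new_centroids = centroids.copy()
    for k in range(len(centroids)):
        cell = points[labels == k]
        if len(cell):
            new_centroids[k] = cell.mean(axis=0)
    return new_centroids, distortion

rng = np.random.default_rng(0)
points = rng.normal(size=(500, 2))                  # empirical distribution
centroids = points[rng.choice(len(points), 8, replace=False)]
for _ in range(20):
    centroids, distortion = lloyd_step(points, centroids)
    # `distortion` is non-increasing, but may stall at a local minimum.
```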
A method that (always) converges toward an optimal quantizer is the so-called CLVQ algorithm, which also generalizes to the more general problem of L^p quantization. It is a kind of stochastic gradient method. (cf. Pagès)
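Under my reading of the stochastic-gradient formulation (the step-size schedule below is an illustrative choice, not taken from Pagès), a CLVQ sketch looks like this: at each step one sample is drawn, and only the nearest codeword (the "winner") is moved toward it with a decreasing step size:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=(5000, 2))   # draws from the target distribution
codebook = samples[rng.choice(len(samples), 8, replace=False)].copy()

for t in range(1, 20001):
    x = samples[rng.integers(len(samples))]               # one random sample
    winner = ((codebook - x) ** 2).sum(axis=-1).argmin()  # competitive phase
    step = 1.0 / t                                        # decreasing step size
    # Learning phase: move only the winning codeword toward the sample.
    codebook[winner] += step * (x - codebook[winner])
```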
There are also some approaches based on genetic algorithms (cf. Hamida et al.), and classical optimization procedures for the one-dimensional case that converge faster (Pagès, Printems).