How can I implement K-Means clustering of MFCC features?
I extracted features from some sound samples with the MFCC algorithm, and I want to cluster them with K-Means. Each voice sample has 70 frames, and every frame has 9 cepstral coefficients, so each sample is a 70*9 matrix.
Let's assume that A, B and C are the voice records, so
A is:
List<List<Double>> -> 70*9 array (I can use Vector instead of List)
and B and C have the same dimensions.
I don't want to cluster individual frames; I want to cluster whole frame blocks (in my example, one block has 70 frames).
How can I implement this with K-Means in Java?
Comments (2)
Here's where your knowledge of the problem domain becomes crucial. You could just use a distance between the 70*9 matrices, but you can probably do better. I don't know the particular features you mention, but some generic examples would be the average and the standard deviation of the 70 values per feature. You're basically looking to reduce the number of dimensions, both to improve speed and to make the measure robust against simple transformations, like offsetting all values by one step.
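As a rough illustration of the mean and standard deviation idea, here is a minimal sketch that collapses one recording's frames-by-coefficients matrix into a single vector (18 values for a 70*9 input). The class name MfccSummary and the method summarize are made up for this example, and the choice of the population standard deviation is an assumption; whether these statistics suit MFCC data is something you would have to check on your own recordings.

```java
import java.util.Arrays;

// Hypothetical helper (not from the original post): collapses one recording's
// frames-by-coefficients MFCC matrix into a single fixed-length vector.
final class MfccSummary {

    /**
     * Returns [mean_0..mean_{d-1}, std_0..std_{d-1}], where d is the number of
     * coefficients per frame (9 in the question). Assumption: population
     * standard deviation, i.e. the variance is divided by the frame count.
     */
    static double[] summarize(double[][] frames) {
        if (frames.length == 0) throw new IllegalArgumentException("no frames");
        int n = frames.length;
        int d = frames[0].length;
        double[] out = new double[2 * d];

        // First pass: per-coefficient mean.
        for (double[] frame : frames) {
            if (frame.length != d) throw new IllegalArgumentException("ragged matrix");
            for (int j = 0; j < d; j++) out[j] += frame[j];
        }
        for (int j = 0; j < d; j++) out[j] /= n;

        // Second pass: per-coefficient variance, then its square root.
        for (double[] frame : frames) {
            for (int j = 0; j < d; j++) {
                double diff = frame[j] - out[j];
                out[d + j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) out[d + j] = Math.sqrt(out[d + j] / n);
        return out;
    }

    public static void main(String[] args) {
        // Tiny stand-in for a 70*9 matrix: 2 frames, 2 coefficients.
        double[][] a = { {1.0, 10.0}, {3.0, 10.0} };
        System.out.println(Arrays.toString(summarize(a))); // prints [2.0, 10.0, 1.0, 0.0]
    }
}
```

Each recording (A, B, C) would then be represented by one such vector, and those vectors, rather than the individual frames, are what gets clustered.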
K-Means makes some pretty tough assumptions about your data. I'm not convinced that your data is appropriate for running k-means on it.
Side note: keep away from Java generics over boxed primitive types such as Double. Boxing kills performance. Use double[][] instead.
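If you still want to try K-Means despite that caveat, the following is a minimal sketch of the standard Lloyd iteration over a double[][] with one row per recording (for instance a flattened 70*9 = 630-value row, or a shorter summary vector). The class name SimpleKMeans, the method names, the Euclidean distance, and the seeding from the first k rows are all assumptions made for illustration; they are not taken from the posts above or from any library.

```java
import java.util.Arrays;

// Minimal Lloyd-style k-means sketch; class and method names are hypothetical.
final class SimpleKMeans {

    /** Squared Euclidean distance between two equal-length vectors. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * Clusters points (one row per recording) into k groups and returns the
     * cluster index of each row. Assumption: the first k rows seed the
     * centroids, which is simplistic; random or k-means++ seeding is common.
     */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        if (k < 1 || k > points.length) throw new IllegalArgumentException("bad k");
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = points[c].clone();

        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points[i], centroids[c])
                            < squaredDistance(points[i], centroids[best])) {
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) break; // assignments are stable, so stop early

            // Update step: move each centroid to the mean of its members.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[labels[i]]++;
                for (int j = 0; j < dim; j++) sums[labels[i]][j] += points[i][j];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // empty cluster keeps its old centroid
                for (int j = 0; j < dim; j++) centroids[c][j] = sums[c][j] / counts[c];
            }
        }
        return labels;
    }
}
```

Calling SimpleKMeans.cluster(rows, k, 100) returns one cluster index per recording. Working on plain double[] rows avoids the boxing cost of List<List<Double>> mentioned in the side note.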