Adding noise for testing
I have written a few clustering algorithms in order to understand them. They run fine on clean data, but I would like to know how well they work when noise is added. I'm not really sure how to add noise to my data.
Is it enough to apply a small perturbation to each item, such as:
Original: 1, 2.34, 3.2346, 4.234, 5.235, 6.245, 7.45
2, 3.54, 4.2646, 2.24, 4.25, 6.25, 4.5 ....
The new function would find the variance of each column and then add that to each element of the column.
Or should I add a new set of items that lie far away from every cluster? If so, how would I do that?
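To make the first option concrete, here is a minimal sketch of per-column perturbation. It is one possible reading of the idea, not a settled answer: it assumes the data is a list of rows of floats, and the names `add_jitter`, `scale`, and `seed` are made up for illustration. Note that it draws a random value per element rather than adding the variance itself, because adding the same constant to every element of a column only shifts all points together and leaves the distances between them unchanged.

```python
import random
import statistics


def add_jitter(rows, scale=0.1, seed=None):
    """Return a copy of rows with zero-mean Gaussian noise added to every value.

    The noise for a column has standard deviation scale * (that column's
    standard deviation), so columns with a wider spread receive more noise.
    `scale` is an assumed tuning knob, not something from the question.
    """
    rng = random.Random(seed)
    columns = list(zip(*rows))  # transpose rows into columns
    col_std = [statistics.pstdev(col) for col in columns]
    return [
        [value + rng.gauss(0.0, scale * std) for value, std in zip(row, col_std)]
        for row in rows
    ]


# Small made-up data set: 3 rows, 3 columns.
data = [
    [1.0, 2.34, 3.2346],
    [2.0, 3.54, 4.2646],
    [1.5, 2.90, 3.7000],
]
noisy = add_jitter(data, scale=0.1, seed=0)
```

Increasing `scale` step by step and re-running the clustering would show at what noise level the results start to degrade.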
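For the second option, one way it might be done is rejection sampling: draw random points from a box somewhat larger than the data and keep only those that are at least some minimum distance from every existing point. The sketch below assumes list-of-rows data; `add_outliers`, `min_dist`, `margin`, and `max_tries` are hypothetical names chosen for this example, and the right `min_dist` depends on how spread out the clusters are.

```python
import math
import random


def add_outliers(rows, n_outliers, min_dist, margin=0.5, seed=None, max_tries=10000):
    """Return (combined, outliers), where outliers lie away from all rows.

    Candidates are drawn uniformly from the data's bounding box, padded on
    each side by margin * (column range). A candidate is kept only if its
    Euclidean distance to every original row is at least min_dist.
    Fewer than n_outliers points are returned if max_tries runs out.
    """
    rng = random.Random(seed)
    bounds = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        pad = margin * (hi - lo)
        bounds.append((lo - pad, hi + pad))

    outliers = []
    for _ in range(max_tries):
        if len(outliers) == n_outliers:
            break
        candidate = [rng.uniform(lo, hi) for lo, hi in bounds]
        if all(math.dist(candidate, row) >= min_dist for row in rows):
            outliers.append(candidate)

    combined = [list(row) for row in rows] + outliers
    return combined, outliers


# Two small made-up clusters in 2D, around (0, 0) and (10, 10).
clusters = [
    [0.0, 0.0], [0.5, 0.2], [0.1, 0.6],
    [10.0, 10.0], [9.6, 10.3], [10.2, 9.7],
]
combined, outliers = add_outliers(clusters, n_outliers=5, min_dist=2.0, seed=0)
```

The two kinds of noise test different things: jitter checks how stable the cluster boundaries are, while injected outliers check whether stray points get absorbed into a cluster or distort its centre.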