Dimensionality reduction for SVM in a sensor network
I am looking for suggestions on a problem that I am currently facing.
I have a set of sensors, say S1-S100, which are triggered when one of the events E1-E20 is performed. Assume that normally E1 triggers S1-S20, E2 triggers S15-S30, E3 triggers S20-S50, and so on, and that E1-E20 are completely independent events. Occasionally an event may also trigger some other, unrelated sensor.
I am using an ensemble of 20 SVMs to analyze each event separately. My features are the sensor frequencies F1-F100, the number of times each sensor is triggered, and a few other related features.
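For concreteness, the setup described so far could be laid out roughly as follows. This is only a toy sketch: the sensor ranges for E1-E3 are the ones given above, while the helper names (`simulate_counts`, `one_vs_rest_labels`), the count range, and the stray-trigger probability are made up for illustration and are not part of the actual system.

```python
import random

N_SENSORS = 100
# Typical sensor ranges from the question (1-based, inclusive).
# Only E1-E3 are spelled out there, so only those are used here.
TYPICAL = {"E1": (1, 20), "E2": (15, 30), "E3": (20, 50)}

def simulate_counts(event, rng, stray_prob=0.02):
    """Return a length-100 vector of trigger counts for one occurrence of `event`.

    The count range and stray_prob are assumptions for this toy example.
    """
    lo, hi = TYPICAL[event]
    counts = [0] * N_SENSORS
    for s in range(1, N_SENSORS + 1):
        if lo <= s <= hi:
            counts[s - 1] = rng.randint(1, 5)   # typical sensor fires at least once
        elif rng.random() < stray_prob:
            counts[s - 1] = 1                   # occasional unrelated sensor
    return counts

def one_vs_rest_labels(events, target):
    """Labels for the SVM dedicated to `target`: +1 for that event, -1 otherwise."""
    return [1 if e == target else -1 for e in events]

rng = random.Random(0)
events = [rng.choice(sorted(TYPICAL)) for _ in range(12)]
X = [simulate_counts(e, rng) for e in events]            # 12 samples x 100 features
labels = {t: one_vs_rest_labels(events, t) for t in TYPICAL}
```

Each row of `X` is one 100-dimensional feature vector, and each entry of `labels` is the target vector for one of the per-event classifiers.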
I am looking for a technique that can reduce the dimensionality of the sensor features (F1-F100), or a technique that takes all of the sensors into account and reduces the dimension as well (I have been looking at some information theory concepts for the last few days). I don't think averaging or taking the maximum is a good idea, as I risk losing information (it did not give me good results).
Can somebody please suggest what I am missing here? A paper or some starting idea would help.
Thanks in advance.
Comments (2)
Perhaps you might like to start with Linear Discriminant Analysis (LDA). It's a fairly simple algorithm and does more or less what you are looking for: dimensionality reduction and/or classification. It assumes each class is Gaussian distributed, with different means but the same covariance. It's probably a good idea to plot some of the data beforehand to make sure that this assumption is reasonable. I've used the LDA implementation in R before. That was with about a dozen features, however, so I'm not sure how it would scale to 100 dimensions.
It might also help to know why you want to reduce the dimension of the data. SVMs are commonly used with hundreds of thousands of (sparse) features, so what difficulty are you running into?
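To make the LDA idea concrete, here is a minimal sketch of the two-class case (Fisher's discriminant) in plain Python, matching the one-SVM-per-event setup: each event gets its own direction w = S_w^{-1} (m_pos - m_neg), and the feature vector is projected onto it. This is not the R implementation mentioned above. The function names, the small ridge term, and the 6-feature toy data are all assumptions for illustration, and in practice a library implementation would be the more sensible choice.

```python
import random

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter(rows, mean):
    """Sum of outer products of the centred rows (within-class scatter)."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fisher_direction(pos, neg, ridge=1e-3):
    """Direction w = S_w^{-1} (m_pos - m_neg); ridge keeps S_w invertible."""
    m_pos, m_neg = mean_vector(pos), mean_vector(neg)
    s_pos, s_neg = scatter(pos, m_pos), scatter(neg, m_neg)
    d = len(m_pos)
    s_w = [[s_pos[i][j] + s_neg[i][j] + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    return solve(s_w, [p - q for p, q in zip(m_pos, m_neg)])

def project(rows, w):
    """Reduce each feature vector to a single number along w."""
    return [sum(x * wi for x, wi in zip(row, w)) for row in rows]

# Toy data: 6 features; the event of interest raises the first three.
rng = random.Random(0)
pos = [[rng.gauss(5, 1) for _ in range(3)] + [rng.gauss(1, 1) for _ in range(3)]
       for _ in range(30)]
neg = [[rng.gauss(1, 1) for _ in range(3)] + [rng.gauss(5, 1) for _ in range(3)]
       for _ in range(30)]

w = fisher_direction(pos, neg)
z_pos, z_neg = project(pos, w), project(neg, w)
```

The one-dimensional values in `z_pos` and `z_neg` could then be fed to the per-event classifier in place of the full feature vector, or simply plotted to check whether the two groups separate. With two classes, LDA yields only a single direction; the multi-class version yields at most one fewer than the number of classes.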
This is a great article related to your question: http://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction
Also, as @StompChicken mentions, you shouldn't have any trouble making an SVM work with a few hundred features. You should only start seeing (operational) issues at tens of thousands of features.
Carlos