I want a list of prediction/forecasting algorithms I can try in a sample experiment
I would like a list of algorithms I can experiment with to predict the probability that a patient has cancer, a fever, or some other condition, based on a set of inputs. Please assume I have millions of records, so I want to try the best algorithms for this kind of prediction. I am really new to data mining and machine learning.
Comments (4)
One of the most popular current algorithms for prediction and classification is Random Forests (RF) by Leo Breiman. An implementation is also available in Weka.
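To give a feel for what a Random Forest does, here is a minimal sketch in plain Python: each tree is grown on a bootstrap sample and considers only a random subset of features at each split, and the forest combines the trees by voting. This is a hand-rolled illustration, not Breiman's reference implementation and not Weka's API. The function names (`train_forest`, `vote_share`, and so on) and the toy data are made up for the example, and the code assumes numeric features and integer class labels.

```python
import random
from collections import Counter

def gini(labels):
    # Gini impurity of a list of class labels.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, rng, n_features, depth=0, max_depth=4):
    # Pure node or depth limit reached: return a majority-vote leaf.
    if depth >= max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]
    best = None
    # Only a random subset of the features is considered at each split.
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split on the sampled features
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    left_node = build_tree([rows[i] for i in left], [labels[i] for i in left],
                           rng, n_features, depth + 1, max_depth)
    right_node = build_tree([rows[i] for i in right], [labels[i] for i in right],
                            rng, n_features, depth + 1, max_depth)
    return (f, t, left_node, right_node)

def predict_tree(node, x):
    # Internal nodes are tuples; leaves are plain integer labels.
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_features))
    return forest

def vote_share(forest, x, label=1):
    # Fraction of trees voting for `label`; usable as a rough score.
    return sum(1 for tree in forest if predict_tree(tree, x) == label) / len(forest)

# Made-up toy data: two numeric symptom scores per patient, label 1 = ill.
rows = [[1, 2], [2, 1], [2, 3], [3, 2], [1, 1], [3, 3],
        [7, 8], [8, 7], [8, 9], [9, 8], [7, 7], [9, 9]]
labels = [0] * 6 + [1] * 6
forest = train_forest(rows, labels)
share_low = vote_share(forest, [1, 0])
share_high = vote_share(forest, [10, 9])
```

`share_low` and `share_high` are the fractions of trees that vote "ill" for a patient with low scores and one with high scores. For millions of records you would use a tuned library implementation rather than a sketch like this.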
If you are looking specifically at estimating the probability of something, then you need a machine learning approach that outputs probabilities. Most only produce a class label: yes/no.
The best-known algorithm for estimating probabilities is Logistic Regression. An implementation is available in Weka.
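To make the difference between a probability and a yes/no label concrete, here is a minimal sketch of logistic regression in plain Python, trained by stochastic gradient descent on the log loss. It is an illustration under assumptions, not Weka's implementation. The function names, the learning rate, the number of epochs, and the toy fever data are all made up for the example.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function, maps any score into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.1, epochs=200, seed=0):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            err = p - y[i]  # gradient of the log loss w.r.t. the linear score
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b

def predict_proba(w, b, x):
    # Estimated probability that x belongs to class 1.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up toy data: [temperature in degrees C above 37, cough (0/1)] -> fever (0/1)
X = [[-0.5, 0], [-0.2, 0], [0.0, 1], [0.1, 0],
     [0.9, 1], [1.4, 1], [1.8, 0], [2.3, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(X, y)
p_low = predict_proba(w, b, [-0.3, 0])
p_high = predict_proba(w, b, [2.0, 1])
```

`predict_proba` returns a number between 0 and 1 rather than a label. Thresholding it (for example at 0.5) gives the yes/no answer if you need one.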
The question is a bit vague, so I can only give a vague answer: use the almighty SVM! Feed the SVM classifier your millions of input vectors, and it should then be able to give you state-of-the-art predictions.
If you're looking for an SVM implementation, have a look at libsvm, which has wrappers in almost every decent programming language.
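As a rough illustration of what an SVM classifier learns, here is a minimal linear soft-margin SVM in plain Python, trained by stochastic subgradient descent on the L2-regularised hinge loss. Note that this is not libsvm and does not use its API: libsvm also provides kernel SVMs, while this sketch covers only the linear case. The function names, hyperparameters, and toy data are assumptions made for the example, and labels must be encoded as -1 and +1.

```python
import random

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200, seed=0):
    """Linear SVM via SGD on the regularised hinge loss; y must be -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # The L2 penalty shrinks the weights a little on every step.
            w = [(1.0 - lr * lam) * wj for wj in w]
            if margin < 1.0:
                # Hinge loss is active: push the boundary away from this point.
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def decision_value(w, b, x):
    # Signed score; its sign gives the predicted class.
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def predict(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1

# Made-up toy data: two centred features, label +1 = ill, -1 = healthy.
X_neg = [[-2, -1], [-1, -2], [-2, -2], [-3, -1], [-1, -3], [-2, -3]]
X_pos = [[2, 1], [1, 2], [2, 2], [3, 1], [1, 3], [2, 3]]
X = X_neg + X_pos
y = [-1] * len(X_neg) + [1] * len(X_pos)
w, b = train_linear_svm(X, y)
label_neg = predict(w, b, [-3, -3])
label_pos = predict(w, b, [3, 3])
```

The decision value is a signed score, not a probability, so this kind of classifier gives a label by default.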
The most popular tool for starting to experiment with a large number of machine learning algorithms is Weka. You can load your data into it and try many algorithms. Its weakness is scalability, but that is not a problem when you are just playing with data.