Basic machine learning

Posted on 2024-10-14 13:27:51

Comments (9)

命比纸薄 2024-10-21 13:27:51

There is a good Stanford open course on machine learning, with video lectures and more.
Take a look here.

咽泪装欢 2024-10-21 13:27:51

If you want to begin with something simple, consider a quasi-linear model, such as logistic regression or linear discriminant analysis: they are easy to understand, and there is code for them all over the Internet. Also consider some of the simpler (single-node) neural models (perceptron, delta rule, etc.): they are very easy to program. If you want to pursue this, I suggest getting a book, such as "Computer Systems That Learn" by Weiss and Kulikowski.
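
As a rough illustration of how little code a single-node model needs, here is a minimal perceptron sketch in Python. The function names, the toy AND data set, the learning rate, and the epoch count are all assumptions chosen for the example; they do not come from the book mentioned above.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Train a single-node perceptron. Labels must be 0 or 1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # 0 when the prediction is correct
            # Perceptron rule: weights move only when the prediction is wrong.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return the class (0 or 1) for one input vector."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Toy data (an assumption for this sketch): logical AND, which is linearly separable.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]

weights, bias = train_perceptron(X, y)
print([predict(weights, bias, x) for x in X])  # prints [0, 0, 0, 1]
```

The perceptron only converges when the classes are linearly separable; on data that is not, the weights keep changing from epoch to epoch.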

我ぃ本無心為│何有愛 2024-10-21 13:27:51

Maybe you can start by searching Wikipedia for various classification algorithms, such as k-nearest neighbours, SVMs, or neural networks.

£烟消云散 2024-10-21 13:27:51

I'd also start with k-nearest neighbours: it is the simplest method, and you can experiment with different data preprocessing steps, distance measures, and so on. It also gives very good (although very, very slow) predictions.
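
A brute-force k-nearest-neighbours classifier can be sketched in a few lines of Python. Everything named here (the helper functions, the toy points, the default k=3) is an assumption made for illustration. The distance function is passed in as a parameter so that different distance measures can be swapped in, and the scan over every training point shows where the slowness at prediction time comes from.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points."""
    # Brute force: every prediction measures the distance to every training point.
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: distance(pair[0], query))
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data (an assumption for this sketch): two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))                      # prints a
print(knn_predict(points, labels, (5.0, 5.2), distance=manhattan))  # prints b
```

Because distances are sensitive to feature scale, preprocessing such as rescaling each feature to a comparable range is usually one of the first things worth trying.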

风轻花落早 2024-10-21 13:27:51

If the variable to be predicted is continuous, then regression models are the key.
There are many regression techniques, including least squares, polynomial models, ANNs, and SVMs.
Of course, every technique may have its own assumptions or parameters.

MATLAB is one of the well-documented computing environments.
I would advise visiting the following page of the MATLAB documentation on nonlinear regression:
http://www.mathworks.com/help/stats/nonlinear-regression-1.html#btcgzas-1

You may start by using a global search method, such as a genetic algorithm (GA), to tune the parameters of a given polynomial regression model.

For predicting discrete variables, the listed regression models can also be applied, given a threshold. Decision trees can be a good alternative.
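
To make the threshold idea concrete, here is a small Python sketch that fits a straight line by ordinary least squares and then turns the continuous output into a 0/1 class. The function names, the toy data, and the 0.5 threshold are assumptions for the example, and the closed-form line fit stands in for whichever regression model is actually used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_class(slope, intercept, x, threshold=0.5):
    """Turn the continuous regression output into a discrete 0/1 label."""
    return 1 if slope * x + intercept >= threshold else 0


# Toy data (an assumption for this sketch): class 1 for the larger x values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]

slope, intercept = fit_line(xs, ys)
print([predict_class(slope, intercept, x) for x in xs])  # prints [0, 0, 0, 1, 1, 1]
```

The threshold is itself a parameter to tune: moving it trades one kind of misclassification for the other.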

倾城°AllureLove 2024-10-21 13:27:51

Weka would fit your needs. It supports regression and is implemented in Java.

素手挽清风 2024-10-21 13:27:51

It sounds like multivariate linear regression would do the job.
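
For readers who have not seen it, linear regression with several input features can be sketched in plain Python by solving the normal equations. The helper names and the toy data (generated from y = 1 + 2*x1 + 3*x2) are assumptions for this example; a numerical library would normally do the linear algebra.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_linear_regression(rows, targets):
    """Return [intercept, w1, w2, ...] solving the normal equations X^T X w = X^T y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    n_cols = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(n_cols)]
           for i in range(n_cols)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(n_cols)]
    return solve_linear_system(xtx, xty)


# Toy data (an assumption for this sketch), generated from y = 1 + 2*x1 + 3*x2.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
targets = [9.0, 8.0, 19.0, 18.0, 26.0]

coefficients = fit_linear_regression(rows, targets)
print([round(c, 6) for c in coefficients])  # approximately [1.0, 2.0, 3.0]
```

The normal equations become ill-conditioned when features are strongly correlated, which is one reason libraries tend to use other solvers internally.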

铁憨憨 2024-10-21 13:27:51

Before diving into the code, since you are a beginner, I would suggest reading up on the fundamentals and getting a firm grasp of them. You needn't read a PhD thesis, but knowing at least the basic terminology of SVMs, logistic regression, and neural networks would help. There is plenty of material on the internet, via the Stanford and Coursera courses and the books suggested in other answers.

Even though ready-made code is available on the internet, the reason I say you need to read the fundamentals is that a typical classifier, such as an SVM, a neural network, or even logistic regression, has various parameters that you will need to tune. Without an understanding of the fundamentals, these packages would be difficult and confusing to use. I experienced the same when I was a beginner.

A firm grasp of how to handle a skewed data set with an SVM, how to tune the parameters of logistic regression, and even how to reduce the dimensionality of your data set will make your implementation faster and more efficient, and that way you can get better accuracy. Otherwise, diving straight into code may bring you back here with some basic questions again. I hope this was helpful!

偏爱自由 2024-10-21 13:27:51

If it's a regression problem, I would suggest starting with something like logistic or linear regression in Matlab. There are libraries for it, and you can find code for it all over. First, compare the in-sample error (on the data you used to build the model) with the out-of-sample error (on data that was not used for making those predictions) to find the number and order of features and the amount of training data you need. If you have little training data, use fewer features or regularization. If the number and order of features is very large and difficult to determine, move to neural networks or an SVM (see whether there is an SVM library for Java). Once you have a well-performing system in Matlab, deploy it in Java.
As far as I have seen, ML systems require a good bit of manual fine-tuning before they become fit for practical use, and environments like Matlab/Octave are the best platforms for this fine-tuning.
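
The in-sample versus out-of-sample comparison can be sketched in Python as follows. All names and numbers are assumptions for the example: the data is a noisy line with a fixed, hand-written noise sequence, the split simply alternates points, and the model is a one-feature line fit with an optional L2 (ridge) penalty on the slope.

```python
def fit_ridge_line(xs, ys, l2_penalty=0.0):
    """Least-squares line fit; l2_penalty > 0 shrinks the slope (ridge)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + l2_penalty)  # the intercept is left unpenalized
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(slope, intercept, xs, ys):
    """Average squared difference between predictions and targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: y = 2x + 1 plus a fixed noise sequence (not real measurements).
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.2]
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

# Alternate points between the training set and the held-out set.
train_x, train_y = xs[0::2], ys[0::2]
test_x, test_y = xs[1::2], ys[1::2]

for penalty in (0.0, 10.0):
    slope, intercept = fit_ridge_line(train_x, train_y, penalty)
    in_sample = mean_squared_error(slope, intercept, train_x, train_y)
    out_of_sample = mean_squared_error(slope, intercept, test_x, test_y)
    print(f"penalty={penalty}: in-sample={in_sample:.4f}, "
          f"out-of-sample={out_of_sample:.4f}")
```

A large gap between the two errors suggests the model is fitting noise in the training data, which is the situation where fewer features or a stronger penalty is worth trying.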
