Machine learning for class prediction
I have the following problem. I have a training dataset comprising a range of numbers. Each number belongs to a certain class. There are five classes.
Range:
1...10
Training Dataset:
{1,5,6,6,10,2,3,4,1,8,6,...}
Classes:
[1,2][3,4][5,6][7,8][9,10]
Is it possible to use a machine learning algorithm to find likelihoods for class prediction and what algorithm would be suited for this?
best, US
Comments (1)
As described in the question's comment,
"I want to calculate the likelihood of a certain class to appear based on the given distribution of the training set"
the problem is trivial and hardly a machine learning one:
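Simply count the number of occurrences of each class in the "training set": Count_12, Count_34, ... Count_910. The likelihood that a given class xy would appear is simply given by
P(xy) = Count_xy / (Count_12 + Count_34 + Count_56 + Count_78 + Count_910)
that is, the count for that class divided by the total number of elements in the training set.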
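The counting described above can be sketched in a few lines of Python. This is only an illustration: the names `class_of` and `class_priors` are made up for the example, and the list holds just the visible prefix of the training set from the question (the part elided with "..." is unknown and is not filled in).

```python
from collections import Counter

# Visible prefix of the question's training set; the elided remainder is unknown.
training = [1, 5, 6, 6, 10, 2, 3, 4, 1, 8, 6]

# The five classes from the question, each a pair of adjacent numbers.
CLASSES = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

def class_of(n):
    """Map a number in 1..10 to its class, e.g. 5 -> (5, 6) and 6 -> (5, 6)."""
    low = ((n - 1) // 2) * 2 + 1
    return (low, low + 1)

def class_priors(data):
    """Relative frequency of each class: count for the class / total items."""
    counts = Counter(class_of(n) for n in data)
    total = len(data)
    # Classes that never occur get probability 0.0 rather than being omitted.
    return {c: counts[c] / total for c in CLASSES}

priors = class_priors(training)
for cls, p in priors.items():
    print(cls, round(p, 3))
```

With more data the same function applies unchanged; if some classes are rare, one might also add smoothing (for instance adding 1 to every count), which is a design choice not discussed in the answer.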
A more interesting problem...
...would be to consider the training set as a sequence and to guess what the next item in that sequence would be. The probability that the next item comes from a given category would then not only be based on the prior for that category (the P(xy) computed above), but would also take into account the items which precede it in the sequence. One of the interesting parts of this problem would then be to figure out how "far back" to look and how much "weight" to give to the preceding sequences of items.
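One simple way to make the "how far back" question concrete is a fixed-order count table (an n-gram, or Markov-chain style, model). The answer does not prescribe this technique; it is swapped in here as a minimal sketch. The names `train_ngram`, `predict_next` and the parameter `order` are hypothetical, and the data is again only the visible prefix of the question's training set.

```python
from collections import Counter, defaultdict

def train_ngram(seq, order):
    """Count which item follows each context of `order` preceding items."""
    table = defaultdict(Counter)
    for i in range(order, len(seq)):
        context = tuple(seq[i - order:i])
        table[context][seq[i]] += 1
    return table

def predict_next(seq, table, order):
    """Return {item: probability} for the item following `seq`.

    Falls back to the overall item frequencies when the current
    context never appeared in the training data.
    """
    context = tuple(seq[-order:]) if order > 0 else ()
    counts = table.get(context) or Counter(seq)
    total = sum(counts.values())
    return {item: c / total for item, c in counts.items()}

# Visible prefix of the question's training set; the rest is unknown.
training = [1, 5, 6, 6, 10, 2, 3, 4, 1, 8, 6]

# order=1 looks back one item; larger values look further back but
# need far more data before the counts become meaningful.
table = train_ngram(training, order=1)
next_probs = predict_next(training, table, order=1)
print(next_probs)
```

Here `order` is the "how far back" knob. The "weight" question could be approached by blending the predictions of several orders, but how to choose those weights is left open, as it is in the answer.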
Edit (now that the OP has indicated his/her interest in the "more interesting problem").
This "prediction-given-preceding-sequence" problem maps almost directly to the
machine-learning-algorithm-for-predicting-order-of-events StackOverflow question.
The slight differences are that the alphabet here has 10 distinct codes (4 in the other question) and that here we try to predict a class of codes, rather than just the code itself. With regard to this aggregation of, here, 2 codes per class, we have several options, which differ in when the aggregation takes place.
My personal choice would be to try first with the code predictor (only aggregating at the very end), and maybe adapt from there if insight gained from this initial attempt were to tell us that the logic could be simplified, or its performance improved, by aggregating earlier. Indeed the very same predictor could be used to try both approaches: one would simply need to alter the input stream, replacing each even number by the odd number preceding it. I'm guessing that valuable information (for the purpose of guessing upcoming codes) is lost when we aggregate early.
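The two approaches can be compared with one and the same predictor, as suggested above. The sketch below uses a deliberately simple first-order count model as a stand-in for whatever predictor is actually chosen; `next_item_probs` and `to_odd` are hypothetical names, and the data is only the visible prefix of the question's training set.

```python
from collections import Counter, defaultdict

def next_item_probs(seq):
    """First-order model: distribution of the items that followed the
    last item of `seq` earlier in the sequence (overall frequencies
    if that item was never followed by anything)."""
    followers = defaultdict(Counter)
    for prev, nxt in zip(seq, seq[1:]):
        followers[prev][nxt] += 1
    counts = followers.get(seq[-1]) or Counter(seq)
    total = sum(counts.values())
    return {item: c / total for item, c in counts.items()}

def to_odd(n):
    """Represent each class by its odd member: 2 -> 1, 6 -> 5, 10 -> 9."""
    return n - 1 if n % 2 == 0 else n

# Visible prefix of the question's training set; the rest is unknown.
training = [1, 5, 6, 6, 10, 2, 3, 4, 1, 8, 6]

# Late aggregation: predict individual codes, then sum the
# probabilities of the two codes belonging to each class.
late = Counter()
for code, p in next_item_probs(training).items():
    late[to_odd(code)] += p

# Early aggregation: replace every even number by the preceding odd
# number first, then run the very same predictor on the altered stream.
early = next_item_probs([to_odd(n) for n in training])

print("late :", dict(late))
print("early:", early)
```

On this short prefix the two strategies already give different class probabilities, which is a small reminder that the choice matters; which one predicts better can only be judged on held-out data.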