Machine learning algorithm for predicting the order of events?
Simple machine learning question. Probably numerous ways to solve this:
There is an infinite stream of 4 possible events:
'event_1', 'event_2', 'event_3', 'event_4'
The events do not arrive in a completely random order. We will assume that there are some complex patterns in the order that most events arrive in, while the rest are just random. We do not know the patterns ahead of time, though.
After each event is received, I want to predict what the next event will be, based on the order in which events have arrived in the past. So my question is: what machine learning algorithm should I use for this predictor?
The predictor will then be told what the next event actually was:
Predictor = new_predictor()

prev_event = False
while True:
    event = get_event()
    if prev_event is not False:
        # tell the predictor which event actually arrived last round
        Predictor.last_event_was(prev_event)
    # predict what will follow the event we just received
    predicted_event = Predictor.predict_next_event(event)
    prev_event = event
The question arises of how long a history the predictor should maintain, since maintaining an infinite history will not be possible. I'll leave this up to you to answer. The answer can't be infinite, though, for practicality.
So I believe that the predictions will have to be done with some kind of rolling history. Adding a new event and expiring an old one should therefore be fairly efficient, and should not, for example, require rebuilding the entire predictor model.
Specific code, rather than research papers, would add immense value to your responses. Python or C libraries are nice, but anything will do.
Update: What if more than one event can happen simultaneously in each round? Does that change the solution?
This is essentially a sequence prediction problem, so you want recurrent neural networks (RNNs) or hidden Markov models (HMMs).
If you only have a fixed time to look back, time window approaches might suffice. You take the sequence data and split it into overlapping windows of length n (e.g. you split the sequence ABCDEFG into ABC, BCD, CDE, DEF, EFG). Then you train a function approximator (e.g. a neural network or linear regression) to map the first n-1 parts of each window onto the nth part.
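For instance, a minimal sketch of just the windowing step (the letter sequence is a stand-in for the real, numerically encoded event stream):

def sliding_windows(sequence, n):
    # yield (inputs, target) pairs: the first n-1 items of each
    # length-n window are the inputs, the last item is the target
    for i in range(len(sequence) - n + 1):
        window = sequence[i:i + n]
        yield window[:-1], window[-1]

# list(sliding_windows("ABCDEFG", 3))
# -> [('AB', 'C'), ('BC', 'D'), ('CD', 'E'), ('DE', 'F'), ('EF', 'G')]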
Your predictor will not be able to look back in time longer than the size of your window. RNNs and HMMs can do so in theory, but are hard to tune or sometimes just don't work.
State-of-the-art RNN implementations can be found in PyBrain (http://pybrain.org).
Update: Here is the pybrain code for your problem. (I haven't tested it, there might be some typos and stuff, but the overall structure should work.)
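Roughly, a sketch along these lines (assuming the four events are one-hot encoded, an LSTM hidden layer, and a placeholder history list standing in for the observed stream):

from pybrain.datasets import SequentialDataSet
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import LSTMLayer
from pybrain.supervised import RPropMinusTrainer

EVENTS = ['event_1', 'event_2', 'event_3', 'event_4']

def one_hot(event):
    # encode an event name as a 4-dimensional 0/1 vector
    vec = [0.0] * len(EVENTS)
    vec[EVENTS.index(event)] = 1.0
    return vec

# placeholder for the event stream observed so far
history = ['event_1', 'event_2', 'event_1', 'event_4', 'event_3', 'event_1']

# each sample maps the current event onto the event that followed it
ds = SequentialDataSet(len(EVENTS), len(EVENTS))
for current, following in zip(history, history[1:]):
    ds.addSample(one_hot(current), one_hot(following))

# small recurrent network: 4 inputs -> 8 LSTM units -> 4 outputs
net = buildNetwork(len(EVENTS), 8, len(EVENTS),
                   hiddenclass=LSTMLayer, recurrent=True)

trainer = RPropMinusTrainer(net, dataset=ds)
for epoch in range(1000):
    print(trainer.train())   # training error after each epoch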
This will train the recurrent network for 1000 epochs and print out the error after every epoch. Afterwards you can check for correct predictions like this:
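Continuing the sketch above, the check could threshold the network outputs at 0.5 and compare them with the one-hot encoding of the event that actually came next:

net.reset()  # clear the recurrent state before replaying the sequence
for current, following in zip(history, history[1:]):
    output = net.activate(one_hot(current))
    # True where the thresholded output agrees with the actual next event
    print([(o > 0.5) == (t > 0.5) for o, t in zip(output, one_hot(following))])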
This will print an array of booleans for every event.
Rather than keeping a full history, one can keep aggregated information about the past (along with a relatively short sliding history, to be used as input to the Predictor logic).
A tentative implementation could go like this:
In a nutshell: manage a set of Markov chains of increasing order, and grade and average their predictions.
The table starts empty. Upon observing the second event, we can store the first bigram with a count of 1. Upon the third event, the bigram made of the 2nd and 3rd events is "added" to the table: either incrementing the count of an existing bigram or adding it with an initial count of 1, as a new (never-seen-so-far) bigram. And so on.
In parallel, keep a total count of bigrams in the table.
This table and the total tally allow calculating the probability of a given event, based on the one preceding event.
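A minimal sketch of such a bigram table, matching the interface from the question (the class name is made up; trigram and longer N-gram tables would follow the same pattern with longer keys):

from collections import defaultdict

class BigramPredictor(object):
    # first-order table: counts of how often each event followed each other event
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))  # prev -> next -> count
        self.total = 0                                       # total bigrams seen
        self.prev = None

    def last_event_was(self, event):
        # record the bigram (previous event, this event), then slide forward
        if self.prev is not None:
            self.counts[self.prev][event] += 1
            self.total += 1
        self.prev = event

    def predict_next_event(self, current_event):
        # most frequent follower of current_event; None if it has never been seen
        followers = self.counts.get(current_event)
        if not followers:
            return None
        return max(followers, key=followers.get)

    def probability(self, next_event, current_event):
        # estimated P(next_event | current_event) from the counts
        followers = self.counts.get(current_event)
        if not followers:
            return 0.0
        return followers[next_event] / float(sum(followers.values()))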
There can be several variations on the general logic described above, in particular in the choice of the metric used to "grade" the quality of prediction of the individual N-gram lengths.
Other considerations apply to detecting and adapting to possible shifts in the event distribution (the above assumes a generally ergodic event source). One possible approach is to use two sets of tables (combining the probabilities accordingly) and to periodically drop the contents of all tables of one of the sets. Choosing the right period for these resets is a tricky business, essentially balancing the need for statistically significant volumes of history against the need for a period short enough not to miss the shorter modulations...
The only answer is "it depends".
It depends on how accurate this needs to be. I don't believe this strategy could ever be 100% accurate, even with an infinite history. Try out a history of 10 and you'll get x% accuracy, then try 100 and you'll get y% accuracy, and so on...
Eventually you should find that either the system is as accurate as you desire it to be, or that the rise in accuracy is not worth the increase in history length (and the increased memory usage, processing time, etc.). At that point, either the job is done, or you need to find a new strategy.
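As a rough sketch of that experiment (make_predictor(history_length) is a hypothetical factory for whatever predictor you settle on; get_event() is the source from the question):

def measure_accuracy(history_length, num_events=10000):
    # run the question's loop for num_events events and score the predictions
    predictor = make_predictor(history_length)   # hypothetical factory
    prev_event = None
    prediction = None
    correct = scored = 0
    for _ in range(num_events):
        event = get_event()
        if prediction is not None:
            scored += 1
            correct += (prediction == event)
        if prev_event is not None:
            predictor.last_event_was(prev_event)
        prediction = predictor.predict_next_event(event)
        prev_event = event
    return correct / float(scored)

for n in (10, 100, 1000):
    print(n, measure_accuracy(n))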
For what it is worth, I think looking into a simple "soft" neural net might be a better plan.
We just studied branch predictors in computer architecture (because the processor would take too long to actually evaluate a condition if(EXPRESSION), it tries to 'guess' and save some time that way). I am sure more research has been done in this area, but that's all I can think of at the moment.
I haven't seen a unique setup like yours, so I think you might need to do some preliminary experimentation on your own. Try running your solution for X seconds with a history of N slots; what is the correctness ratio? Then compare that with the same fixed X and varying N history slots to try to find the best memory-history ratio (graphing them out).
If more than one event can happen simultaneously... that's a little mind-bending; there have to be some constraints there: what if an infinite number of events happen at a time? Uh oh, that's computationally impossible for you. I'd try the same approach as with just one event at a time, except that the predictor is enabled to predict multiple events at a time.
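One hedged way to read that: treat each round as a set of events and, conditioned on the previous round's set, predict every event whose estimated probability clears a threshold (all names and the 0.5 threshold below are made up for illustration):

from collections import defaultdict

EVENTS = ['event_1', 'event_2', 'event_3', 'event_4']

counts = defaultdict(lambda: defaultdict(int))   # previous round's set -> event -> count
totals = defaultdict(int)                        # previous round's set -> rounds observed

def update(prev_round, this_round):
    # prev_round and this_round are the collections of events seen in those rounds
    key = frozenset(prev_round)
    totals[key] += 1
    for event in this_round:
        counts[key][event] += 1

def predict(prev_round, threshold=0.5):
    # predict every event whose estimated probability exceeds the threshold
    key = frozenset(prev_round)
    if totals[key] == 0:
        return set()
    return {e for e in EVENTS if counts[key][e] / float(totals[key]) > threshold}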
Processors use a few really lightweight tricks to predict whether a branch statement will branch or not. This helps them with efficient pipelining. They may not be as general as Markov models, for instance, but they are interesting because of their simplicity. Here is the Wikipedia article on branch prediction (https://en.wikipedia.org/wiki/Branch_predictor); see the saturating counter and the two-level adaptive predictor.
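As a rough illustration of the first of those, a two-bit saturating counter keeps four states and predicts "taken" in the upper two; each actual outcome nudges the state one step toward that outcome (this sketch is for a single binary outcome, not the four-event stream):

class SaturatingCounter(object):
    # 2-bit predictor: states 0-1 predict "not taken", states 2-3 predict "taken"
    def __init__(self):
        self.state = 2   # start in "weakly taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)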