Neural network input order
This may seem like a silly question.
I am running a neural network on some tennis data. The objective of the network is to determine the probability of each player winning the match. There are around 40 inputs and one output: the probability of player A winning (player B's probability is 1 minus that output).
The inputs are various statistics and performance measures of each player over the last n matches. I've written the code that extracts these numbers from my database of tennis match results, which are then fed into the neural network.
The problem I have is as follows:
In the training set, the input values relating to the winner of the match being analysed by the network will always be fed through the same input neurons. Because of this, the desired output will always be 1, because player A always wins (this is how my database is structured: player A is the winner of the match and player B is the loser).
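Roughly, my training-set construction looks like this (a simplified sketch only; `matches`, `recent_stats` and `n` stand in for my real database code):

```python
# Simplified sketch of how each training example is currently built.
# recent_stats() stands in for my real feature-extraction code.
X, y = [], []
for match in matches:                                # each record stores the winner and loser of one match
    winner_feats = recent_stats(match.winner, n)     # ~20 stats over the winner's last n matches
    loser_feats = recent_stats(match.loser, n)       # ~20 stats over the loser's last n matches
    X.append(winner_feats + loser_feats)             # the "player A" inputs are always the winner's
    y.append(1.0)                                    # so the desired output is always 1
```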
How can I overcome this issue? Is it simply a case of randomising the player A and player B orders?
Hope this question makes sense.
Many Thanks
4 Answers
I would train on every match twice: once with the inputs in the order Winner - Loser and the desired output '1', and once with the inputs in the order Loser - Winner and the desired output '0'.
(Oh, and I don't think a neural network output can be interpreted as a probability, in the sense that if the ANN predicts some outcome with output 0.9 it will be right 9 out of 10 times.)
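A rough sketch of what I mean (the feature values and the `matches_features` name are just placeholders for whatever per-player vectors you already compute):

```python
# Toy stand-in for the real per-match feature pairs (winner_feats, loser_feats).
matches_features = [
    ([0.62, 0.71], [0.48, 0.55]),
    ([0.58, 0.64], [0.51, 0.60]),
]

# Each match becomes two training examples with mirrored inputs and labels.
X, y = [], []
for winner_feats, loser_feats in matches_features:
    X.append(winner_feats + loser_feats)   # winner in the "player A" input slots
    y.append(1.0)
    X.append(loser_feats + winner_feats)   # same match with the players swapped
    y.append(0.0)
```

At prediction time you can also feed the pair in both orders and average the two mirrored outputs to get a symmetric estimate.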
Why don't you do a simple 50/50 split? Run half of the winners through the input neurons which you normally run them on and the other half of the winners through the other input neurons; that way you will have absolutely no bias. You can even stagger/stripe them by alternating the winner and loser on every single instance you train it on:
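(A rough sketch of that alternating assignment; `matches_features` here just stands for whatever per-match winner/loser feature vectors you already have.)

```python
# Alternate which input slots the winner occupies, so exactly half of the
# training examples have desired output 1 and the other half have 0.
X, y = [], []
for i, (winner_feats, loser_feats) in enumerate(matches_features):
    if i % 2 == 0:
        X.append(winner_feats + loser_feats)   # winner in the "player A" slots
        y.append(1.0)
    else:
        X.append(loser_feats + winner_feats)   # winner in the "player B" slots
        y.append(0.0)
```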
Randomization can help too, but I think it will introduce bias (although it will be a REALLY SMALL bias). At the end of the day you wouldn't know whether the neural network is learning to predict the randomization function or learning to predict the data, so just keep it simple and guarantee yourself that it will learn the right thing.
I think that some kind of shuffling (random or otherwise) makes sense.
If you're trying to train any kind of learner to pick the winner out of a pair of players, and you always present the first player as the winner, then it's entirely reasonable for it to learn that the first player is always the winner.
One simple way to fix this is to train on a double-sized data set: use both the pairs (A, B) and (B, A), where A is the winner.
In pairwise modeling such as you describe, usually either: 1. each event is shown to the network once in each order, or 2. each event is shown once, in some canonical order ("home", "away").
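A sketch of the canonical-order option, using world ranking as the fixed ordering (the `ranking` field and the `recent_stats` helper are assumptions for illustration):

```python
# Canonical order: "player A" is always, say, the higher-ranked player,
# so the desired output depends on the result instead of being constant.
X, y = [], []
for match in matches:
    a, b = sorted((match.winner, match.loser), key=lambda p: p.ranking)
    X.append(recent_stats(a, n) + recent_stats(b, n))
    y.append(1.0 if a is match.winner else 0.0)   # did the higher-ranked player win?
```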