Good implementations of reinforcement learning?
For an AI class project I need to implement a reinforcement learning algorithm that beats a simple game of Tetris. The game is written in Java and we have the source code. I know the basics of reinforcement learning theory, but was wondering if anyone in the SO community has hands-on experience with this type of thing.
- What would your recommended readings be for an implementation of reinforcement learning in a Tetris game?
- Are there any good open source projects that accomplish similar things that would be worth checking out?
Edit: The more specific the better, but general resources about the subject are welcomed.
Follow up:
Thought it would be nice if I posted a follow-up.
Here's the solution (code and writeup) I ended up with for any future students :).
Comments (9)
Take a look at the 2009 RL-competition. One of the problem domains is a Tetris game. There was a Tetris problem the year before too. Here's the 52-page final report from that year's fifth-place finalist, which goes into a lot of detail about how the agent worked.
The Heaton Research ebook is quite good at explaining neural network concepts (with code). Chapter 4 is dedicated to machine learning and the various training methods for your networks. There is a downloadable library and sample applications for you to look at.
Here is a good book on the subject:
Machine Learning and Data Mining: Introduction to Principles and Algorithms
by Igor Kononenko, Matjaz Kukar (June, 2007)
Also take a look at these open source projects:
TD-Gammon, gnubackgammon, and similar projects have been massive successes in games.
Sutton & Barto's book "Reinforcement Learning: An Introduction" also has some other case studies.
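To make the temporal-difference idea behind TD-Gammon a bit more concrete, here is a minimal sketch of a TD(0) update with a linear value function in Java. It is only an illustration under stated assumptions: the class name `LinearTdSketch`, the two-element feature vectors (think of hypothetical Tetris features such as aggregate column height and number of holes), and the step size and discount values are all made up for the example and do not come from TD-Gammon, gnubackgammon, or the book's case studies. TD-Gammon itself used a neural network with TD(λ), not this simplified linear form.

```java
import java.util.Arrays;

// Hypothetical sketch: TD(0) learning of a state-value function V(s) = w . phi(s),
// where phi(s) is a hand-crafted feature vector for a board state.
class LinearTdSketch {
    private final double[] weights; // one weight per feature, starts at zero
    private final double alpha;     // step size (assumed value, tune for your task)
    private final double gamma;     // discount factor

    LinearTdSketch(int numFeatures, double alpha, double gamma) {
        this.weights = new double[numFeatures];
        this.alpha = alpha;
        this.gamma = gamma;
    }

    // Estimated value of a state described by its feature vector.
    double value(double[] features) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    // One TD(0) step for the transition (features, reward, nextFeatures).
    // Returns the TD error so callers can monitor learning.
    double update(double[] features, double reward, double[] nextFeatures, boolean terminal) {
        // Bootstrapped target: no future value after a terminal state (game over).
        double target = reward + (terminal ? 0.0 : gamma * value(nextFeatures));
        double tdError = target - value(features);
        // For a linear value function the gradient w.r.t. the weights is the feature vector.
        for (int i = 0; i < weights.length; i++) {
            weights[i] += alpha * tdError * features[i];
        }
        return tdError;
    }

    double[] weights() {
        return weights.clone();
    }

    public static void main(String[] args) {
        LinearTdSketch td = new LinearTdSketch(2, 0.5, 0.9);
        // Made-up transition: reward 1.0 for clearing a line.
        double err = td.update(new double[] {1.0, 0.0}, 1.0, new double[] {0.0, 1.0}, false);
        System.out.println("TD error: " + err);
        System.out.println("Weights: " + Arrays.toString(td.weights()));
    }
}
```

In a Tetris agent you would typically compute the features of the board that results from each legal placement, pick the placement with the highest estimated value (with some exploration), and call `update` after each move. Which features to use is the main design choice, and the reports and book mentioned in the answers are better guides for that than this sketch.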
This question is really old, but for anyone reading this in 2018, I highly recommend OpenAI Baselines if you're interested in solid reference implementations of existing RL algorithms. These algorithms are implemented by a group of employees at OpenAI who really know this stuff, and they have been extensively fine-tuned and debugged.
To be fair, you don't need these for Tetris, but nowadays I suspect homework questions may involve some more sophisticated environments.
https://github.com/openai/baselines
UPDATE:
In 2019, I also recommend rlpyt:
https://github.com/astooke/rlpyt
This is not specific to reinforcement learning, but Stanford has a great series of lectures on machine learning on YouTube and iTunes.
The link is to the first lecture, which takes approximately 30 minutes to get into the content.
Burlap is a recent Java library that provides implementations of many common reinforcement learning algorithms as well as a few environments and useful tools.
I would suggest learning RL4J, which is Java-based.
I have been using it and was amazed at how smoothly things work. You can even train LSTM networks in a reinforcement learning setup with an actor-critic algorithm (called A3C).
Here is the link:
https://github.com/deeplearning4j/dl4j-examples/blob/master/rl4j-examples/
I noticed that this question is quite outdated (10 years old) and that a collection of modern RL frameworks and environments could be useful here. I created a GitHub repo for this and intend to update it regularly.
https://github.com/TheMTank/RL-code-resources