I can see that you are worried about how to train the ANN, but this project hides a complexity that you might not be aware of. Object/character recognition in computer games through image processing is a highly challenging task (not to say crazy for FPS and RPG games). I don't doubt your skills, and I'm also not saying it can't be done, but you can easily spend 10x more time on recognizing stuff than on implementing the ANN itself (assuming you already have experience with digital image processing techniques).
I think your idea is very interesting and also very ambitious. At this point you might want to reconsider it. I sense that this project is something you are planning for university, so if the focus of the work is really the ANN, you should probably pick another game, something simpler.
I remember that someone else came looking for tips on a different but somewhat similar project not too long ago. It's worth checking out.
On the other hand, if you're accepting suggestions, there might be better/easier approaches for identifying objects in-game. But first, let's call this project what you want it to be: a smart bot.
One method for implementing bots accesses the memory of the game client to find relevant information, such as the location of the character on the screen and its health. Reading computer memory is trivial, but figuring out exactly where in memory to look is not. Memory scanners like Cheat Engine can be very helpful for this.
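Conceptually, what a scanner like Cheat Engine automates is scan-and-narrow: find every address holding the current value, change the value in-game, then rescan among the candidates. A toy Python sketch (the "memory" here is just a list standing in for a process's address space; the layout is invented for illustration):

```python
# Toy sketch of the scan-and-narrow technique that memory scanners like
# Cheat Engine automate. "memory" stands in for a process's address
# space; the values and layout are invented for illustration.
def initial_scan(memory, value):
    """First scan: every address (index) currently holding `value`."""
    return [addr for addr, v in enumerate(memory) if v == value]

def narrow_scan(memory, candidates, value):
    """Rescan: keep only candidates that now hold the changed `value`."""
    return [addr for addr in candidates if memory[addr] == value]

# The character's health (100) lives somewhere among lookalike values.
memory = [7, 100, 42, 100, 3, 100, 99]
candidates = initial_scan(memory, 100)            # [1, 3, 5]

memory[3] = 87                                    # take damage in-game
candidates = narrow_scan(memory, candidates, 87)  # [3] -> found it
```

Real scanners do exactly this over gigabytes of process memory, which is why a couple of value changes usually narrow thousands of candidates down to one.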
Another method, which works beneath the game, involves manipulating rendering information. All of the game's objects must be rendered to the screen, which means that the locations of all 3D objects are eventually sent to the video card for processing. Be ready for some serious debugging.
In this answer I briefly described two methods to accomplish what you want through image processing. If you are interested, you can find more about them in Exploiting Online Games (Chapter 6), an excellent book on the subject.
UPDATE 2018-07-26: That's it! We are now approaching the point where this kind of game will be solvable! Using OpenAI and the game DotA 2, a team was able to make an AI that can beat semi-professional gamers in a 5v5 game. If you know DotA 2, you know this game is quite similar to Diablo-like games in terms of mechanics, but one could argue it is even more complicated because of the team play.
As expected, this was achieved thanks to the latest advances in reinforcement learning with deep learning, and to open game frameworks like OpenAI's, which ease the development of an AI since you get a neat API and can accelerate the game (the AI played the equivalent of 180 years of gameplay against itself every day!).
On the 5th of August 2018 (in 10 days!), it is planned to pit this AI against top DotA 2 gamers. If this works out, expect a big revolution, maybe not as mediatized as the solving of the game of Go, but it will nonetheless be a huge milestone for game AI!
UPDATE 2017-01: The field has been moving very fast since AlphaGo's success, and new frameworks to facilitate the development of machine learning algorithms on games appear almost every month. Here is a list of the latest ones I've found:
Very exciting times!
IMPORTANT UPDATE (2016-06): As noted by the OP, the problem of training artificial neural networks to play games using only visual inputs is now being tackled by several serious institutions, with quite promising results, such as DeepMind's Deep Q-Network (DQN).
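For intuition, the value-update rule that DQN approximates with a deep network is ordinary Q-learning. A minimal tabular sketch on a toy corridor world (the world and all parameters are invented for illustration; this is not the DQN architecture itself, which would replace the table with a network over raw pixels):

```python
import random

# Minimal tabular Q-learning on a toy 1-D corridor (states 0..4, reward 1
# for reaching state 4). DQN replaces this table with a deep network fed
# raw pixels; everything here is a toy stand-in.
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.3
ACTIONS = (-1, +1)                       # step left / step right
q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}

random.seed(0)
for _ in range(300):
    s = 0
    while s != 4:
        if random.random() < EPS:
            a = random.choice(ACTIONS)                   # explore
        else:
            a = max(ACTIONS, key=lambda b: q[(s, b)])    # exploit
        s2 = min(max(s + a, 0), 4)
        r = 1.0 if s2 == 4 else 0.0
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        best_next = 0.0 if s2 == 4 else max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
        s = s2

# The learned greedy policy should step right in every non-terminal state.
policy = {s: max(ACTIONS, key=lambda b: q[(s, b)]) for s in range(4)}
```

The hard part that DQN solved is making this update stable when Q is a deep network instead of a table.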
And now, if you want to take on the next-level challenge, you can use one of the various AI vision game development platforms, such as ViZDoom, a highly optimized platform (7000 fps) for training networks to play Doom using only visual inputs:
And the results are quite amazing: see the videos on their webpage and the nice tutorial (in Python) here!
There is also a similar project for Quake 3 Arena, called Quagents, which also provides easy API access to underlying game data; but you can scrap that and just use screenshots, keeping the API only to control your agent.
Why is such a platform useful if we only use screenshots? Even if you don't access underlying game data, such a platform provides:
To summarize, the great thing about these platforms is that they alleviate much of the previous technical issues you had to deal with (how to manipulate game inputs, how to setup scenarios, etc.) so that you just have to deal with the learning algorithm itself.
So now, get to work and make us the best AI visual bot ever ;)
Old post describing the technical issues of developing an AI relying only on visual inputs:
Contrary to some of my colleagues above, I do not think this problem is intractable. But it surely is a hella hard one!
The first problem, as pointed out above, is the representation of the state of the game: you can't represent the full state with just a single image; you need to maintain some kind of memory (health, but also objects equipped and items available to use, quests and goals, etc.). To fetch such information you have two ways: either directly access the game data, which is the most reliable and easiest; or create an abstract representation of this information by implementing some simple procedures (open inventory, take a screenshot, extract the data). Of course, extracting data from a screenshot requires either a supervised procedure (that you define completely) or an unsupervised one (via a machine learning algorithm, which will scale up the complexity a lot...). For unsupervised machine learning, you will need a quite recent kind of algorithm called structural learning algorithms (which learn the structure of data rather than how to classify it or predict a value). One such algorithm is the Recursive Neural Network (not to be confused with the Recurrent Neural Network) by Richard Socher: http://techtalks.tv/talks/54422/
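As an illustration of the fully supervised route, extracting a UI value from a fixed screen region can be as crude as counting colored pixels. A toy sketch (the health bar's position and colors are assumptions invented for the example, not real Diablo II UI coordinates):

```python
# Toy "screenshot" as a grid of (r, g, b) pixels. The health bar's
# location (row 0, columns 0..9) and its colors are invented for this
# sketch; a real procedure would read them from the actual game UI.
RED, BLACK = (255, 0, 0), (0, 0, 0)

def read_health(frame, row=0, width=10):
    """Supervised extraction: health fraction = filled pixels / bar width."""
    filled = sum(1 for px in frame[row][:width] if px == RED)
    return filled / width

frame = [[RED] * 6 + [BLACK] * 4]     # a health bar that is 60% full
health = read_health(frame)           # 0.6
```

This is exactly the kind of hand-defined procedure the paragraph above contrasts with learning the extraction unsupervised.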
Then, another problem is that even when you have fetched all the data you need, the game is only partially observable. Thus you need to inject an abstract model of the world and feed it with processed information from the game, for example the location of your avatar, but also the locations of quest items, goals and enemies outside the screen. You might look into the Mixture Particle Filters of Vermaak 2003 for this.
Also, you need an autonomous agent with dynamically generated goals. A well-known architecture you can try is the BDI agent, but you will probably have to tweak it for this architecture to work in your practical case. As an alternative, there is also the Recursive Petri Net, which you can probably combine with all kinds of variations of Petri nets to achieve what you want, since it is a very well studied and flexible framework with great formalization and proof procedures.
And at last, even if you do all of the above, you will need to find a way to emulate the game at accelerated speed (using a video may be nice, but the problem is that your algorithm would only spectate without control, and being able to try things for itself is very important for learning). Indeed, it is well known that current state-of-the-art algorithms take a lot more time to learn the same thing a human can learn (even more so with reinforcement learning), thus if you can't speed up the process (i.e., if you can't speed up the game time), your algorithm won't even converge in a single lifetime...
To conclude, what you want to achieve here is at the limit (and maybe a bit beyond) of current state-of-the-art algorithms. I think it may be possible, but even if it is, you are going to spend a hell of a lot of time, because this is not a theoretical problem but a practical one you are approaching here, and thus you need to implement and combine a lot of different AI approaches in order to solve it.
Several decades of research with a whole team working on it might not suffice, so if you are alone and working on it part-time (as you probably have a job for a living), you may spend a whole lifetime without getting anywhere near a working solution.
So my most important advice here would be to lower your expectations, and to reduce the complexity of your problem by using all the information you can and avoiding, as much as possible, relying on screenshots (i.e., try to hook directly into the game; look into DLL injection). Simplify some problems by implementing supervised procedures: do not let your algorithm learn everything. In other words, drop image processing for now as much as possible and rely on internal game information; later on, if your algorithm works well, you can replace parts of your AI program with image processing, thus gradually attaining your full goal. For example, if you get something working quite well, you can try to complexify your problem and replace supervised procedures and in-memory game data with unsupervised machine learning algorithms on screenshots.
Good luck, and if it works, make sure to publish an article, you can surely get renowned for solving such a hard practical problem!
The problem you are pursuing is intractable in the way you have defined it. It is usually a mistake to think that a neural network would "magically" learn a rich representation of a problem. A good fact to keep in mind when deciding whether an ANN is the right tool for a task is that it is an interpolation method. Think about whether you can frame your problem as finding an approximation of a function, where you have many points from this function and lots of time for designing the network and training it.
The problem you propose does not pass this test. Game control is not a function of the image on the screen; there is a lot of information the player has to keep in memory. For a simple example, it is often the case that every time you enter a shop in a game, the screen looks the same. However, what you buy depends on the circumstances. No matter how complicated the network, if the screen pixels are its input, it will always perform the same action upon entering the store.
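The shop example can be made concrete: any policy that is a pure function of the screen must return the same action for identical pixels, while a policy over (screen, memory) can differ. A toy sketch with invented observation names:

```python
# Two visits to the shop: the screen looks identical, but the right
# purchase differs. All names here are invented for illustration.
shop_screen = "shop_screen_pixels"    # stand-in for raw screen pixels

def pixel_policy(screen):
    # A pure function of the screen maps one screen to exactly one action.
    return {"shop_screen_pixels": "buy potion"}[screen]

def stateful_policy(screen, inventory):
    # Augmenting the input with remembered state lets identical screens
    # yield different actions.
    if screen == "shop_screen_pixels":
        return "buy potion" if inventory["potions"] == 0 else "buy arrows"
    return "do nothing"

first_visit = stateful_policy(shop_screen, {"potions": 0})    # "buy potion"
second_visit = stateful_policy(shop_screen, {"potions": 5})   # "buy arrows"
```

The pixel-only policy, however large the network behind it, cannot reproduce that second behavior.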
Besides, there is the problem of scale. The task you propose is simply too complicated to learn in any reasonable amount of time. You should see aigamedev.com for how game AI works. Artificial neural networks have been used successfully in some games, but in a very limited manner. Game AI is difficult and often expensive to develop. If there were a general approach to constructing functional neural networks, the industry would most likely have seized on it. I recommend that you begin with much, much simpler examples, like tic-tac-toe.
As a first step you might look at the difference between consecutive frames. You have to distinguish between the background and the actual monster sprites. I guess the world may also contain animations. In order to find those, I would have the character move around and collect everything that moves with the world into a big background image/animation.
You could detect and identify enemies with correlation (using an FFT). However, if the animations repeat pixel-exact, it will be faster to just look at a few pixel values. Your main task will be to write a robust system that identifies when a new object appears on the screen and gradually adds all frames of the sprite to a database. You will probably have to build models for weapon effects as well; those should be subtracted so that they don't clutter your opponent database.
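The frame-differencing step is simple to sketch; a toy version on grayscale frames (real captures would need a noise threshold tuned to the game, and the background/sprite separation described above on top of it):

```python
# Toy grayscale frames as 2-D lists of intensities. Pixels that change
# between consecutive frames are candidate sprite pixels; static pixels
# are background. Real captures would need threshold tuning for noise.
def moving_pixels(prev, curr, threshold=10):
    return [(y, x)
            for y, row in enumerate(curr)
            for x, v in enumerate(row)
            if abs(v - prev[y][x]) > threshold]

prev = [[0, 0, 0],
        [0, 200, 0],
        [0, 0, 0]]
curr = [[0, 0, 0],
        [0, 0, 200],     # the "monster" moved one pixel to the right
        [0, 0, 0]]
changed = moving_pixels(prev, curr)   # [(1, 1), (1, 2)]
```

The changed pixels mark both where the sprite was and where it is now, which is why a sprite database and background model are still needed on top.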
Well, assuming that at any time you could generate a set of 'outcomes' (which might involve probabilities) from the set of all possible 'moves', and that there is some notion of consistency in the game (e.g., you can play level X over and over again), you could start with N neural networks with random weights and have each of them play the game in the following way:
1) For every possible 'move', generate a list of possible 'outcomes' (with associated probabilities).
2) For each outcome, use your neural network to determine the associated 'worth' (score) of the 'outcome' (e.g., a number between -1 and 1, with 1 being the best possible outcome and -1 the worst).
3) Choose the 'move' leading to the highest probability * score.
4) If the move led to a 'win' or a 'loss', stop; otherwise go back to step 1.
After a certain amount of time (or a 'win'/'loss'), evaluate how close the neural network is to the 'goal' (this will probably involve some domain knowledge). Then throw out the 50% (or some other percentage) of NNs that were farthest from the goal, do crossover/mutation on the top 50%, and run the new set of NNs again. Continue until a satisfactory NN comes out.
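The select-and-mutate loop described above can be sketched like this; "playing the game" is replaced by a toy fitness function (closeness of a weight vector to an invented target), purely for illustration, and crossover is omitted in favor of mutation only:

```python
import random

# Sketch of the evolutionary loop: score each candidate, discard the
# worst 50%, refill the population with mutated survivors. The fitness
# function is a toy stand-in; a real run would score actual gameplay.
random.seed(1)
TARGET = [0.5, -0.3, 0.8]             # invented stand-in for "the goal"

def fitness(weights):
    # Higher is better: negative squared distance to the toy goal.
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

def mutate(weights, scale=0.1):
    return [w + random.gauss(0, scale) for w in weights]

population = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]                       # keep the best 50%
    offspring = [mutate(random.choice(survivors)) for _ in range(10)]
    population = survivors + offspring                # next generation

best = max(population, key=fitness)
```

Keeping the survivors unchanged (elitism) guarantees the best score never gets worse between generations.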
I think your best bet would be a complex architecture involving a few/many networks: one recognizing and responding to items, one for the shop, one for combat (maybe here you would need one for enemy recognition and one for attacks), etc.
Then try to think of the simplest possible Diablo II gameplay, probably a Barbarian. Then keep it simple at first, like Act I, first area only.
Then I guess valuable 'goals' would be the disappearance of enemy objects and the diminution of the health bar (scored inversely).
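Those two goals translate directly into a scalar reward signal; a toy sketch (the weights and the inputs are arbitrary and purely illustrative):

```python
# Toy reward built from the two proposed goals: enemies that disappeared
# from the screen, and the change in the health bar (scored inversely).
# The weights are arbitrary and would need tuning against real gameplay.
def reward(enemies_before, enemies_after, health_before, health_after,
           w_kill=1.0, w_health=2.0):
    kills = len(enemies_before - enemies_after)        # vanished enemies
    health_delta = health_after - health_before        # negative = damage
    return w_kill * kills + w_health * health_delta

# One enemy vanished, 10% health lost: 1.0 * 1 + 2.0 * -0.1 = 0.8
r = reward({"zombie", "imp"}, {"imp"}, 1.0, 0.9)
```

A single scalar like this is what the training signal in the reinforcement option below would optimize.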
Once you have these separate, 'simpler' tasks taken care of, you can use a 'master' ANN to decide which sub-ANN to activate.
As for training, I see only three options. You could use the evolutionary method described above, but then you need to manually select the 'winners', unless you code a whole separate program for that. You could have the networks 'watch' someone play; here they would learn to emulate the style of a player or group of players. The network tries to predict the player's next action, gets reinforced for a correct guess, and so on. If you actually get the ANN you want, this could be done with recorded video gameplay, with no need for actual live play. Finally, you could let the network play the game, with enemy deaths, level-ups, regained health, etc. as positive reinforcement, and player deaths, lost health, etc. as negative reinforcement. But seeing how even a simple network requires thousands of concrete training steps to learn even simple tasks, you would need a lot of patience for this one.
All in all, your project is very ambitious. But I for one think it could 'in theory be done', given enough time.