What is the correct observation shape for a 15x15 NumPy array in an OpenAI Gym environment?

Posted on 2025-02-06 14:37:52 | Word count: 2748 | Views: 3 | Comments: 0

I am creating a gym environment whose observation is just a 15x15 grid. The grid is initially filled with 0s, and as the game progresses its contents change to values between 0 and 255. There are 225 possible actions, each corresponding to a location on the grid. My current code in __init__ is:

 self.action_space = Discrete(225)
 self.observation_shape = Box(low=-1000,high=10000,shape=(15,15,),dtype=np.uint8) 
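
A side note on the snippet above: the declared bounds (-1000 and 10000) lie outside what np.uint8 can hold, while the grid is described as containing values between 0 and 255. The check below is a small sketch using only NumPy; bounds_fit_dtype is a hypothetical helper name, not part of gym or Stable Baselines3.

```python
import numpy as np


def bounds_fit_dtype(low, high, dtype):
    """Return True if both bounds are representable in the given integer dtype."""
    info = np.iinfo(dtype)  # min/max representable values for the dtype
    return info.min <= low and high <= info.max


# Bounds taken from the snippet above
print(bounds_fit_dtype(-1000, 10000, np.uint8))  # False

# Bounds matching the described grid contents
print(bounds_fit_dtype(0, 255, np.uint8))  # True
```

Whether a mismatch like this raises an error, warns, or is silently accepted depends on the installed gym version, so it may be worth keeping the bounds and dtype consistent regardless.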

When I then run the following Stable Baselines3 code:


import stable_baselines3
from stable_baselines3 import DQN

model = DQN("MultiInputPolicy", env, verbose=1)
model.learn(total_timesteps=10000, log_interval=4)

I get this error:
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-129-d01d55d2bc81> in <module>()
      2 from stable_baselines3 import DQN
      3 
----> 4 model = DQN("MultiInputPolicy", env, verbose=1)
      5 model.learn(total_timesteps=10000, log_interval=4)
      6 model.save("battleship_dqn")

5 frames
/usr/local/lib/python3.7/dist-packages/stable_baselines3/common/preprocessing.py in get_obs_shape(observation_space)
    156 
    157     else:
--> 158         raise NotImplementedError(f"{observation_space} observation space is not supported")
    159 
    160 

NotImplementedError: [[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]] observation space is not supported

This leads me to think that I have used the wrong OpenAI Gym space, or have defined shape incorrectly within Box. What would be the correct way to define a 15x15 numpy array?

Also, if it is useful, the context of the error message is


if isinstance(observation_space, spaces.Box):
    return observation_space.shape
elif isinstance(observation_space, spaces.Discrete):
    # Observation is an int
    return (1,)
elif isinstance(observation_space, spaces.MultiDiscrete):
    # Number of discrete features
    return (int(len(observation_space.nvec)),)
elif isinstance(observation_space, spaces.MultiBinary):
    # Number of binary features
    return (int(observation_space.n),)
elif isinstance(observation_space, spaces.Dict):
    return {key: get_obs_shape(subspace) for (key, subspace) in observation_space.spaces.items()}

else:
    raise NotImplementedError(f"{observation_space} observation space is not supported")

Edit: I couldn't find any solution to this, so I tried using Keras-rl2 instead. With Keras-rl2, I ended up using (1,15,15) instead of (15,15,), but I don't know whether that will work in Stable Baselines 2.
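
For reference, the two shapes mentioned in the edit differ only by a leading axis of length 1. The NumPy-only sketch below shows how one converts between them; the variable names are illustrative, and whether a given library expects the extra axis is a separate question that this does not answer.

```python
import numpy as np

grid = np.zeros((15, 15), dtype=np.uint8)

# Add a leading axis of length 1: (15, 15) -> (1, 15, 15)
with_leading_axis = grid[np.newaxis, ...]
print(with_leading_axis.shape)  # (1, 15, 15)

# Remove it again: (1, 15, 15) -> (15, 15)
restored = np.squeeze(with_leading_axis, axis=0)
print(restored.shape)  # (15, 15)
```

If the observation returned by the environment has a leading axis, the shape declared in the observation space would need to include it as well, so that the two stay consistent.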
