What is the correct observation shape for a 15x15 NumPy array in an OpenAI Gym environment?

Posted on 2025-02-06 14:37:52

I am creating a Gym environment whose observation is just a 15x15 grid. The grid is initially filled with 0s, and as the game progresses its cells take values between 0 and 255. There are 225 possible actions, each corresponding to a location on the grid. My current init code is:

 self.action_space = Discrete(225)
 self.observation_shape = Box(low=-1000,high=10000,shape=(15,15,),dtype=np.uint8)
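For readers following along, the correspondence between the 225 actions and grid locations can be sketched as below. This is a minimal illustration that assumes row-major numbering (action 0 is the top-left cell, action 224 the bottom-right); the post does not say how actions are numbered, and `GRID_SIZE`, `action_to_cell`, and `cell_to_action` are hypothetical names.

```python
from typing import Tuple

GRID_SIZE = 15  # side length of the square grid described in the question


def action_to_cell(action: int) -> Tuple[int, int]:
    """Convert a flat action index (0..224) into a (row, col) pair.

    Assumes row-major numbering, which is an assumption of this sketch.
    """
    if not 0 <= action < GRID_SIZE * GRID_SIZE:
        raise ValueError(f"action {action} is outside 0..{GRID_SIZE * GRID_SIZE - 1}")
    # divmod returns (quotient, remainder), i.e. (row, col) for row-major order.
    return divmod(action, GRID_SIZE)


def cell_to_action(row: int, col: int) -> int:
    """Convert a (row, col) pair back into the flat action index."""
    return row * GRID_SIZE + col
```

With this numbering, every value accepted by `Discrete(225)` maps to exactly one cell, and the two helpers are inverses of each other.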

When I run the following Stable Baselines3 code with this environment:


import stable_baselines3
from stable_baselines3 import DQN

model = DQN("MultiInputPolicy", env, verbose=1)
model.learn(total_timesteps=10000, log_interval=4)


I get the error:
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-129-d01d55d2bc81> in <module>()
      2 from stable_baselines3 import DQN
      3 
----> 4 model = DQN("MultiInputPolicy", env, verbose=1)
      5 model.learn(total_timesteps=10000, log_interval=4)
      6 model.save("battleship_dqn")

5 frames
/usr/local/lib/python3.7/dist-packages/stable_baselines3/common/preprocessing.py in get_obs_shape(observation_space)
    156 
    157     else:
--> 158         raise NotImplementedError(f"{observation_space} observation space is not supported")
    159 
    160 

NotImplementedError: [[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]] observation space is not supported

This leads me to think that I have used the wrong OpenAI Gym space, or have defined the shape within Box incorrectly. What would be the correct way to define a 15x15 NumPy array?

Also, in case it is useful, the context of the error message is:


    if isinstance(observation_space, spaces.Box):
        return observation_space.shape
    elif isinstance(observation_space, spaces.Discrete):
        # Observation is an int
        return (1,)
    elif isinstance(observation_space, spaces.MultiDiscrete):
        # Number of discrete features
        return (int(len(observation_space.nvec)),)
    elif isinstance(observation_space, spaces.MultiBinary):
        # Number of binary features
        return (int(observation_space.n),)
    elif isinstance(observation_space, spaces.Dict):
        return {key: get_obs_shape(subspace) for (key, subspace) in observation_space.spaces.items()}

    else:
        raise NotImplementedError(f"{observation_space} observation space is not supported")
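One detail in that excerpt is worth noting: the f-string in the final branch prints whatever object was passed in, and the traceback printed a 15x15 array of zeros. That suggests `env.observation_space` held the grid itself rather than a `Box`. It may be related to the init code assigning to `self.observation_shape` rather than `self.observation_space`, although the post does not show the rest of the environment. The sketch below imitates the dispatch with stand-ins so it runs without Gym or Stable Baselines3; `FakeBox` and `get_obs_shape_sketch` are hypothetical names, not part of either library.

```python
class FakeBox:
    """Stand-in for gym.spaces.Box: it only remembers its shape."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    def __repr__(self):
        return f"FakeBox(shape={self.shape})"


def get_obs_shape_sketch(observation_space):
    """Mimic the isinstance dispatch from the excerpt for one space type."""
    if isinstance(observation_space, FakeBox):
        return observation_space.shape
    # Anything that is not a recognised space type falls through to here,
    # and the message embeds the printed form of whatever was passed in.
    raise NotImplementedError(f"{observation_space} observation space is not supported")


# A space object is accepted and its shape is returned.
box_shape = get_obs_shape_sketch(FakeBox(shape=(15, 15)))

# Raw grid data (a nested list standing in for the NumPy array) is rejected.
grid = [[0] * 15 for _ in range(15)]
try:
    get_obs_shape_sketch(grid)
    error_message = None
except NotImplementedError as exc:
    error_message = str(exc)
```

If that is what happened here, the thing to try would be defining `self.observation_space` as a `Box` with `shape=(15, 15)`, `dtype=np.uint8`, and bounds that fit in `uint8` (0 to 255). Note also that `"MultiInputPolicy"` is intended for `Dict` observation spaces, so `"MlpPolicy"` may be the better match for a plain `Box`.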

Edit: I couldn't find a solution to this, so I tried Keras-rl2 instead. With Keras-rl2, I ended up using (1,15,15) instead of (15,15,), but I don't know whether that will work in Stable Baselines 2.
