What is the correct observation shape for a 15x15 NumPy array in an OpenAI Gym environment?
I am creating a Gym environment whose observation is just a 15x15 grid. The grid is initially filled with 0s, and as the game progresses its contents change to values between 0 and 255. There are 225 possible actions, each corresponding to a grid location. My current code in __init__ is:
self.action_space = Discrete(225)
self.observation_shape = Box(low=-1000,high=10000,shape=(15,15,),dtype=np.uint8)
However, when I run the following Stable Baselines3 code:
import stable_baselines3
from stable_baselines3 import DQN
model = DQN("MultiInputPolicy", env, verbose=1)
model.learn(total_timesteps=10000, log_interval=4)
I get the following error:
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-129-d01d55d2bc81> in <module>()
2 from stable_baselines3 import DQN
3
----> 4 model = DQN("MultiInputPolicy", env, verbose=1)
5 model.learn(total_timesteps=10000, log_interval=4)
6 model.save("battleship_dqn")
5 frames
/usr/local/lib/python3.7/dist-packages/stable_baselines3/common/preprocessing.py in get_obs_shape(observation_space)
156
157 else:
--> 158 raise NotImplementedError(f"{observation_space} observation space is not supported")
159
160
NotImplementedError: [[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]] observation space is not supported
This leads me to think that I have used the wrong OpenAI Gym space, or have defined the shape within Box incorrectly. What would be the correct way to define a 15x15 NumPy array?
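A side note on the bounds in the definition above: np.uint8 can only hold values from 0 to 255, so low=-1000 and high=10000 cannot be represented in that dtype. A quick check, assuming only NumPy is installed (the names `grid` and `fits` are just for illustration):

```python
import numpy as np

# Range of values that an unsigned 8-bit integer can represent.
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255

# A 15x15 grid of zeros, matching the initial state described in the question.
grid = np.zeros((15, 15), dtype=np.uint8)
print(grid.shape, grid.dtype)  # (15, 15) uint8

# The bounds used in the question do not fit inside the dtype's range.
low, high = -1000, 10000
fits = info.min <= low and high <= info.max
print(fits)  # False
```

Since the grid contents are described as staying between 0 and 255, bounds of 0 and 255 would be consistent with the dtype.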
Also, in case it is useful, the code surrounding the error is:
if isinstance(observation_space, spaces.Box):
    return observation_space.shape
elif isinstance(observation_space, spaces.Discrete):
    # Observation is an int
    return (1,)
elif isinstance(observation_space, spaces.MultiDiscrete):
    # Number of discrete features
    return (int(len(observation_space.nvec)),)
elif isinstance(observation_space, spaces.MultiBinary):
    # Number of binary features
    return (int(observation_space.n),)
elif isinstance(observation_space, spaces.Dict):
    return {key: get_obs_shape(subspace) for (key, subspace) in observation_space.spaces.items()}
else:
    raise NotImplementedError(f"{observation_space} observation space is not supported")
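The final else branch is the one being hit here. The error message prints a 15x15 array where a space would normally be printed, which suggests that the object stored in env.observation_space is the array itself rather than a Box. A dependency-free sketch of that dispatch, where `FakeBox` and `get_shape` are hypothetical stand-ins and not part of Stable Baselines3:

```python
class FakeBox:
    """Hypothetical stand-in for a Box space; it only stores a shape."""

    def __init__(self, shape):
        self.shape = shape


def get_shape(observation_space):
    """Mimic the isinstance dispatch shown above for a single space type."""
    if isinstance(observation_space, FakeBox):
        return observation_space.shape
    # Anything that is not a recognised space ends up here, and its
    # printed form becomes part of the error message.
    raise NotImplementedError(f"{observation_space} observation space is not supported")


# A proper space object yields its shape.
print(get_shape(FakeBox((15, 15))))

# A raw grid (a nested list here, standing in for a NumPy array) is rejected.
raw_grid = [[0] * 3 for _ in range(2)]
try:
    get_shape(raw_grid)
except NotImplementedError as err:
    message = str(err)
    print(message)
```

If that reading is right, the thing to check is what the environment assigns to observation_space, not the shape argument.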
Edit: I couldn't find any solution to this, so I tried Keras-rl2 instead. With Keras-rl2, I ended up using (1,15,15) instead of (15,15), but I don't know whether that will work in Stable Baselines 2.
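A minimal sketch of how the spaces could be declared, assuming the values really stay within 0 to 255. The attribute is named observation_space here (the snippet in the question assigns observation_shape), and the bounds are chosen to fit np.uint8. `GridEnv` and `action_to_cell` are hypothetical names, reset() and step() are omitted, and the import falls back from gymnasium to gym because which package is installed depends on the Stable Baselines3 version. It has not been run against the environment from the question:

```python
import numpy as np

try:
    import gymnasium as gym  # package name used by newer Stable Baselines3 releases
    from gymnasium import spaces
except ImportError:
    import gym  # older releases
    from gym import spaces

GRID_SIZE = 15


class GridEnv(gym.Env):
    """Hypothetical skeleton; reset() and step() are left out of this sketch."""

    def __init__(self):
        super().__init__()
        # One action per grid cell: 15 * 15 = 225.
        self.action_space = spaces.Discrete(GRID_SIZE * GRID_SIZE)
        # Bounds match the 0..255 range that np.uint8 can hold.
        self.observation_space = spaces.Box(
            low=0, high=255, shape=(GRID_SIZE, GRID_SIZE), dtype=np.uint8
        )


def action_to_cell(action):
    """Map a flat action index to a (row, column) pair in row-major order."""
    return divmod(action, GRID_SIZE)


env = GridEnv()
obs = np.zeros((GRID_SIZE, GRID_SIZE), dtype=np.uint8)
print(env.observation_space.shape)          # (15, 15)
print(env.observation_space.contains(obs))  # True
print(action_to_cell(224))                  # (14, 14)
```

With a single Box observation space like this, my understanding is that "MlpPolicy" is the usual choice in Stable Baselines3, since "MultiInputPolicy" is meant for Dict observation spaces.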