PyTorch: How do I specify the input layer? Is it included by default?
I am working on a reinforcement learning problem in Stable-Baselines3 (SB3), but I don't think that really matters for this question. SB3 is based on PyTorch.
I have 101 input features, and even though I designed a neural architecture whose first layer has only 64 nodes, the network still works. Below is a screenshot of my model architecture:
I am concerned because I thought the first layer of a neural network needed to have as many nodes as there are input features.
Does PyTorch include an input layer by default without displaying it? If so, how can I find out and control what the activation function etc. is for the input layer?
EDIT:
Here are my imports and basic code, in response to Michael's comment.
import gym
from gym import Env
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from gym import spaces
from gym.utils import seeding
from stable_baselines3.common.vec_env import DummyVecEnv, SubprocVecEnv
from stable_baselines3.common.utils import set_random_seed
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3 import PPO
import math
import random
import torch as th
from sb3_contrib.common.maskable.policies import MaskableActorCriticPolicy
from sb3_contrib.common.wrappers import ActionMasker
from sb3_contrib.ppo_mask import MaskablePPO
from sb3_contrib.common.envs import InvalidActionEnvDiscrete
from sb3_contrib.common.maskable.evaluation import evaluate_policy
from sb3_contrib.common.maskable.utils import get_action_masks
env = MyCustomEnv(....)
env = ActionMasker(env, mask_fn) # Wrap to enable masking
# Defining custom neural network architecture
mynetwork = dict(activation_fn=th.nn.LeakyReLU,
                 net_arch=[dict(pi=[64, 64], vf=[64, 64])])
# Maskable PPO behaves just like regular PPO
model = MaskablePPO(MaskableActorCriticPolicy, env, verbose=1, learning_rate=0.0005,
                    gamma=0.975, seed=10, batch_size=256, clip_range=0.2,
                    tensorboard_log="./log1/", policy_kwargs=mynetwork)
# To get the screenshot I gave
print(model.policy)
Comments (1)
We can do a little digging inside the Stable Baselines source code.
Looking inside MaskableActorCriticPolicy, we can see that it builds an MLP extractor by initializing an instance of MlpExtractor, which creates the policy_net sub-network; its layers are defined in a loop. Ultimately, the layer feature sizes are dictated by the "pi" and "vf" lists inside net_arch. If you trace this back, you will notice that those parameters can be modified through the net_arch argument in the MaskableActorCriticPolicy header.
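To make the sizing concrete, here is a minimal pure-Python sketch of how such a loop could turn the "pi" list into layer shapes. It is not SB3's actual code: layer_shapes and parameter_count are hypothetical helpers, and it assumes the 101 features reach the MLP as a flat vector. In PyTorch each (in, out) pair would correspond to torch.nn.Linear(in_features, out_features), so the 101 inputs are not a separate layer with its own nodes or activation; they are simply the in_features of the first layer, and the 64 is that layer's output size.

```python
def layer_shapes(feature_dim, hidden_sizes):
    """Return (in_features, out_features) for each fully connected layer."""
    shapes = []
    last_dim = feature_dim  # size of the observation vector fed to the first layer
    for size in hidden_sizes:
        shapes.append((last_dim, size))
        last_dim = size  # the next layer consumes this layer's output
    return shapes


def parameter_count(shapes):
    """Count weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


net_arch = dict(pi=[64, 64], vf=[64, 64])  # same sizes as in the question
pi_shapes = layer_shapes(101, net_arch["pi"])  # 101 = number of input features

print(pi_shapes)                   # [(101, 64), (64, 64)]
print(parameter_count(pi_shapes))  # 10688
```

If this matches what SB3 builds for your environment, the first Linear entry in the output of print(model.policy) should report in_features=101 and out_features=64, which would confirm that no hidden input layer is being added.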