parl.Algorithm
- class Algorithm(model=None)[source]
- alias: parl.Algorithm
- alias: parl.core.fluid.algorithm.Algorithm

Algorithm defines how to update the parameters of the Model. This is where we define the loss functions and the optimizer of the neural network. An Algorithm has at least a model.

PARL has implemented various algorithms (DQN/DDPG/PPO/A3C/IMPALA) that can be reused quickly; they can be accessed with parl.algorithms.

Example:

    import parl

    model = Model()  # Model is a user-defined subclass of parl.Model
    dqn = parl.algorithms.DQN(model, lr=1e-3)
- Variables: model (parl.Model) – a neural network that represents a policy or a Q-value function.
- Public Functions (a minimal subclass sketch follows this list):
  - get_weights: return a Python dictionary containing the parameters of the current model.
  - set_weights: copy parameters from get_weights() to the model.
  - sample: return a noisy action to perform exploration according to the policy.
  - predict: return an action given the current observation.
  - learn: define the loss function and create an optimizer to minimize the loss.
- __init__(model=None)[source]
- Parameters: model (parl.Model) – a neural network that represents a policy or a Q-value function.
- get_weights()[source]

Get weights of self.model.

- Returns: a Python dict containing the parameters of self.model.
- Return type: weights (dict)
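A typical use of get_weights together with set_weights is keeping a second instance of the same algorithm in sync. The snippet below is a hedged usage sketch: model and dqn come from the example above, while target_model and target_dqn are hypothetical second instances built the same way.

    import parl

    target_model = Model()
    target_dqn = parl.algorithms.DQN(target_model, lr=1e-3)

    weights = dqn.get_weights()      # dict: parameter name -> parameter value
    target_dqn.set_weights(weights)  # load the same parameters into the other model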
- predict(*args, **kwargs)[source]

Refine the predicting process, e.g., use the policy model to predict actions.
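As a hedged illustration of the call site, reusing the hypothetical ToyModel and MyAlgorithm from the earlier sketch: an Agent typically wraps a raw observation into a tensor and delegates greedy action selection to the algorithm's predict().

    import numpy as np
    import paddle

    alg = MyAlgorithm(ToyModel(obs_dim=4, act_dim=2))
    obs = paddle.to_tensor(np.zeros((1, 4), dtype='float32'))
    action = alg.predict(obs)  # index of the highest-scoring action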