TensorFlow apply_gradients() with multiple losses
I am training a model (VAEGAN) with intermediate outputs, and I have two losses:
- a KL divergence loss computed from the output layer, and
- a similarity (rec) loss computed from an intermediate layer.

Can I simply sum them up and apply the gradients as below?
    with tf.GradientTape() as tape:
        # Encode the real images; the encoder returns the latent statistics
        # and a sampled latent vector.
        z_mean, z_log_sigma, z_encoder_output = self.encoder(real_images, training=True)
        kl_loss = self.kl_loss_fn(z_mean, z_log_sigma) * kl_loss_coeff

        # Decode, then run fake and real images through the discriminator
        # to get its intermediate activations.
        fake_images = self.decoder(z_encoder_output)
        fake_inter_activations, logits_fake = self.discriminator(fake_images, training=True)
        real_inter_activations, logits_real = self.discriminator(real_images, training=True)
        rec_loss = self.rec_loss_fn(fake_inter_activations, real_inter_activations) * rec_loss_coeff

        # Combine both losses into a single scalar.
        total_encoder_loss = kl_loss + rec_loss

    grads = tape.gradient(total_encoder_loss, self.encoder.trainable_weights)
    self.e_optimizer.apply_gradients(zip(grads, self.encoder.trainable_weights))
Or do I need to separate them as below, while keeping the tape persistent?
    with tf.GradientTape(persistent=True) as tape:
        z_mean, z_log_sigma, z_encoder_output = self.encoder(real_images, training=True)
        kl_loss = self.kl_loss_fn(z_mean, z_log_sigma) * kl_loss_coeff

        fake_images = self.decoder(z_encoder_output)
        fake_inter_activations, logits_fake = self.discriminator(fake_images, training=True)
        real_inter_activations, logits_real = self.discriminator(real_images, training=True)
        rec_loss = self.rec_loss_fn(fake_inter_activations, real_inter_activations) * rec_loss_coeff

    # The persistent tape allows tape.gradient() to be called more than once.
    grads_kl_loss = tape.gradient(kl_loss, self.encoder.trainable_weights)
    self.e_optimizer.apply_gradients(zip(grads_kl_loss, self.encoder.trainable_weights))

    grads_rec_loss = tape.gradient(rec_loss, self.encoder.trainable_weights)
    self.e_optimizer.apply_gradients(zip(grads_rec_loss, self.encoder.trainable_weights))

    del tape  # free the resources held by the persistent tape
Comments (1)
Yes, you can generally sum the losses and compute a single gradient. The gradient of a sum is the sum of the respective gradients, so the step taken with the summed loss is the same as taking the two steps one after another.
Here's a simple example: say you have two weights and you are currently at the point (1, 3) (the "starting point"). The gradient for loss 1 is (2, -4) and the gradient for loss 2 is (1, 2), so the summed gradient is (3, -2). With a learning rate of 1 and both gradients evaluated at the starting point (as a persistent tape does), descending along loss 1 first moves you to (-1, 7) and descending along loss 2 then moves you to (-2, 5); a single step along the summed gradient takes you from (1, 3) directly to the same point, (-2, 5).
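To make this concrete in TensorFlow, here is a minimal, self-contained sketch (a toy variable and made-up losses, not the encoder from the question) verifying that the gradient of the summed loss equals the sum of the individually computed gradients:

    import tensorflow as tf

    # A toy weight vector standing in for the encoder's trainable variables,
    # initialized at the "starting point" (1, 3).
    w = tf.Variable([1.0, 3.0])

    with tf.GradientTape(persistent=True) as tape:
        loss1 = tf.reduce_sum(w ** 2)   # stand-in for kl_loss; gradient is 2*w = (2, 6)
        loss2 = tf.reduce_sum(3.0 * w)  # stand-in for rec_loss; gradient is (3, 3)
        total_loss = loss1 + loss2

    grad_total = tape.gradient(total_loss, [w])[0]
    grad_separate = tape.gradient(loss1, [w])[0] + tape.gradient(loss2, [w])[0]
    del tape  # release the persistent tape's resources

    print(grad_total.numpy())     # [5. 9.]
    print(grad_separate.numpy())  # [5. 9.]

Since the two gradients match element-wise, a single apply_gradients call on the summed loss takes the same step as the two sequential calls (assuming a plain SGD-style update), while only needing one non-persistent tape and one optimizer call.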