Changing the regularization factor during training

Posted 2025-01-24 04:08:11

I wonder, is there an easy way?

For example, changing the learning rate can easily be done using tf.keras.optimizers.schedules:

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(0.001)
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule)

Is there an easy way to do the same with the regularization factor? Like this:

r_schedule = tf.keras.optimizers.schedules.ExponentialDecay(0.1)
regularizer = tf.keras.regularizers.L2(l2=r_schedule)

If not, how can I gradually change the regularization factor with minimal effort?


Comments (1)

祁梦 2025-01-31 04:08:11

IIUC, I think you should be able to use a custom callback and implement the same/similar logic used by tf.keras.optimizers.schedules.ExponentialDecay (though it may take more than minimal effort):

import tensorflow as tf

class Decay(tf.keras.callbacks.Callback):
  """Decays the shared `l2` variable at the end of each epoch, mirroring the
  logic of tf.keras.optimizers.schedules.ExponentialDecay."""

  def __init__(self, l2, decay_steps, decay_rate, staircase):
    super().__init__()
    self.l2 = l2
    self.decay_steps = decay_steps
    self.decay_rate = decay_rate
    self.staircase = staircase

  def on_epoch_end(self, epoch, logs=None):
    # `steps` is the number of batches run in the epoch.
    global_step_recomp = self.params.get('steps')
    p = global_step_recomp / self.decay_steps
    if self.staircase:
      p = tf.floor(p)
    # l2 <- l2 * decay_rate ** p
    self.l2.assign(tf.multiply(
        self.l2, tf.pow(self.decay_rate, p)))

# Non-trainable variable holding the current regularization factor.
l2 = tf.Variable(initial_value=0.01, trainable=False)

def l2_regularizer(weights):
    # Custom regularizer that reads the variable on every call, so updates
    # made by the callback take effect immediately.
    tf.print(l2)
    loss = l2 * tf.reduce_sum(tf.square(weights))
    return loss

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(1, kernel_regularizer=l2_regularizer))
model.compile(optimizer='adam', loss='mse')
model.fit(tf.random.normal((50, 1)), tf.random.normal((50, 1)), batch_size=4,
          callbacks=[Decay(l2, decay_steps=100000, decay_rate=0.56, staircase=False)],
          epochs=3)
Epoch 1/3
0.01
 1/13 [=>............................] - ETA: 8s - loss: 0.63850.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
 9/13 [===================>..........] - ETA: 0s - loss: 2.13940.01
0.01
0.01
0.01
13/13 [==============================] - 1s 6ms/step - loss: 2.4884
Epoch 2/3
0.00999924541
 1/13 [=>............................] - ETA: 0s - loss: 1.97210.00999924541
0.00999924541
0.00999924541
0.00999924541
0.00999924541
0.00999924541
0.00999924541
0.00999924541
 9/13 [===================>..........] - ETA: 0s - loss: 2.37490.00999924541
0.00999924541
0.00999924541
0.00999924541
13/13 [==============================] - 0s 7ms/step - loss: 2.4541
Epoch 3/3
0.00999849103
 1/13 [=>............................] - ETA: 0s - loss: 0.81400.00999849103
0.00999849103
0.00999849103
0.00999849103
0.00999849103
0.00999849103
 7/13 [===============>..............] - ETA: 0s - loss: 2.71970.00999849103
0.00999849103
0.00999849103
0.00999849103
0.00999849103
0.00999849103
13/13 [==============================] - 0s 10ms/step - loss: 2.4195
<keras.callbacks.History at 0x7f7a5a4ff5d0>
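
Note that the callback above recomputes the factor from the per-epoch step count, so the printed value only changes between epochs, as the output shows. A related sketch, not part of the original answer and assuming a TF 2.x Keras setup (the names ScheduleToVariable, r_schedule and the per-batch update are illustrative): reuse an ExponentialDecay schedule object directly and copy its value into the l2 variable after every batch, driven by the optimizer's iteration counter:

import tensorflow as tf

# Illustrative sketch; assumes TF 2.x Keras, where optimizer.iterations
# counts completed training steps.
l2 = tf.Variable(0.1, trainable=False, dtype=tf.float32)
r_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,  # reused here as the initial l2 factor
    decay_steps=1000,
    decay_rate=0.9)

def l2_regularizer(weights):
    # Reads the variable each time the regularization loss is computed.
    return l2 * tf.reduce_sum(tf.square(weights))

class ScheduleToVariable(tf.keras.callbacks.Callback):
    # Copies the schedule's value at the current optimizer step into `l2`.
    def on_train_batch_end(self, batch, logs=None):
        step = self.model.optimizer.iterations
        l2.assign(r_schedule(step))

model = tf.keras.Sequential(
    [tf.keras.layers.Dense(1, kernel_regularizer=l2_regularizer)])
model.compile(optimizer='adam', loss='mse')
model.fit(tf.random.normal((50, 1)), tf.random.normal((50, 1)),
          batch_size=4, epochs=3, callbacks=[ScheduleToVariable()])

This way the factor follows the same curve the schedule would apply to a learning rate, at the cost of one extra assign per training step.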