How to interpret get_weights for a Keras GRU layer?

Published 2025-02-11 16:41:43


I am unable to interpret the results of get_weights from a GRU layer. Here's my code -

#Modified from - https://machinelearningmastery.com/understanding-simple-recurrent-neural-networks-in-keras/
from pandas import read_csv
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN, GRU
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import math
import matplotlib.pyplot as plt

model = Sequential()
model.add(GRU(units = 2, input_shape = (3,1), activation = 'linear'))
model.add(Dense(units = 1, activation = 'linear'))
model.compile(loss = 'mean_squared_error', optimizer = 'adam')

initial_weights = model.layers[0].get_weights()
print("Shape = ",initial_weights)

I am familiar with GRU concepts. In addition, I understand how get_weights works for the Keras SimpleRNN layer, where the first array represents the input weights, the second the recurrent (hidden-state) weights, and the third the bias. However, I am lost with the output for GRU, which is given below -

Shape =  [array([[-0.64266175, -0.0870676 , -0.25356603, -0.03685969,  0.22260845,
        -0.04923642]], dtype=float32), array([[ 0.01929092, -0.4932567 ,  0.3723044 , -0.6559699 , -0.33790302,
         0.27062896],
       [-0.4214194 ,  0.46456426,  0.27233726, -0.00461334, -0.6533575 ,
        -0.32483965]], dtype=float32), array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]], dtype=float32)]
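For comparison with the SimpleRNN case mentioned above, here is a minimal check (my own sketch, not part of the original question) of what the same call returns for a SimpleRNN layer, whose three arrays map directly to input weights, recurrent weights and bias:

rnn_model = Sequential()
rnn_model.add(SimpleRNN(units = 2, input_shape = (3,1), activation = 'linear'))
print([w.shape for w in rnn_model.layers[0].get_weights()])
# Expected: [(1, 2), (2, 2), (2,)] -> kernel, recurrent_kernel, bias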

I am assuming it has something to do with GRU gates.

Update 7/4: This page says that the Keras GRU has 3 gates: update, reset and output. However, based on this, a GRU shouldn't have an output gate.


木落 2025-02-18 16:41:43


Best way I know would be to track the add_weight() calls in the build() function of the GRUCell.

Let's take an example model,

model = tf.keras.models.Sequential(
    [
     tf.keras.layers.GRU(32, input_shape=(5, 10), name='gru'),
     tf.keras.layers.Dense(10)
    ]
)

Now we'll print some metadata about what's returned by weights = model.get_layer('gru').get_weights().
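One minimal way to do that (my own sketch; these exact print statements are not shown in the original answer):

weights = model.get_layer('gru').get_weights()
print("Number of arrays in weights:", len(weights))
print("Shape of each array in weights:", [w.shape for w in weights])

Running this gives,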

Number of arrays in weights: 3
Shape of each array in weights: [(10, 96), (32, 96), (2, 96)]

Let's go back to how the weights are defined in the GRUCell. There we have,

self.kernel = self.add_weight(
    shape=(input_dim, self.units * 3),
    ...
)
self.recurrent_kernel = self.add_weight(
    shape=(self.units, self.units * 3),
    ...
)

    ...
    bias_shape = (2, 3 * self.units)
    self.bias = self.add_weight(
        shape=bias_shape,
        ...
    )

This is what you're seeing as weights (in that order). Here's why they are shaped like this. GRU computations are outlined here.

[Figure: GRU computations]
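For reference, these are the standard GRU equations in the commonly used formulation (a sketch only; the exact activations and the reset_after variant in the Keras implementation may differ in detail):

z_t = \sigma(x_t W_z + h_{t-1} U_z + b_z)
r_t = \sigma(x_t W_r + h_{t-1} U_r + b_r)
\tilde{h}_t = \tanh(x_t W_h + (r_t \odot h_{t-1}) U_h + b_h)
h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t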

The first matrix in weights (of shape [10, 96]) is a concatenation of Wz|Wr|Wh (in that order). Each of these is a [10, 32] sized tensor. Concatenation gives a [10, 32*3=96] sized tensor.

Similarly, the second matrix is a concatenation of Uz|Ur|Uh. Each of these is a [32, 32] sized tensor, which becomes [32, 96] after concatenation.
You can see how they split this combined weight matrix into the z, r and h components here.
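A rough sketch of how those combined matrices can be sliced back into per-gate blocks (my own illustration; the actual splitting in the Keras source is organised slightly differently, especially around reset_after):

units = 32
kernel, recurrent_kernel, bias = model.get_layer('gru').get_weights()

# kernel is Wz|Wr|Wh concatenated along the last axis
W_z, W_r, W_h = kernel[:, :units], kernel[:, units:2*units], kernel[:, 2*units:]

# recurrent_kernel is Uz|Ur|Uh concatenated the same way
U_z = recurrent_kernel[:, :units]
U_r = recurrent_kernel[:, units:2*units]
U_h = recurrent_kernel[:, 2*units:]

print(W_z.shape, U_z.shape)  # (10, 32) (32, 32)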

Finally, the bias. It contains 2 biases, i.e. a [2, 96] sized tensor: input_bias and recurrent_bias. Again, the biases for all gates/weights are combined into a single tensor. Typically, only the input_bias is used. But if you have reset_after (which decides how the reset gate is applied) set to True, then the recurrent_bias gets used as well. It's an implementation detail.
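As a quick check of that last detail, a small sketch (my own; it assumes a TF2-era tf.keras where reset_after defaults to True, which is what produces the (2, 96) bias above):

import tensorflow as tf

# Continuing from the slicing sketch above: the two rows of the (2, 96) bias
input_bias, recurrent_bias = bias[0], bias[1]   # each of shape (96,)

# With reset_after=False the layer creates only a single (96,) bias vector
m2 = tf.keras.models.Sequential(
    [tf.keras.layers.GRU(32, input_shape=(5, 10), reset_after=False, name='gru_no_reset')]
)
print([w.shape for w in m2.get_layer('gru_no_reset').get_weights()])
# Expected: [(10, 96), (32, 96), (96,)]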
