How to model this function in PyTorch
I want to train a feed-forward neural network with a single hidden layer that models the equations below.
h = g(W1.input1 + V1.input2 + b)
output1 = f(W2.h + b_w)
output2 = f(V2.h + b_v)
f and g are activation functions, h is the hidden representation, W1, W2, V1, V2 are weight matrices, and b, b_w, b_v are the respective biases.
I can't concatenate the 2 inputs because that would result in a single weight matrix. I can't train two separate NNs because the latent representation would miss the interaction between the 2 inputs. Any help is much appreciated. I have also attached the NN diagram below.
3 Answers
PyTorch lets you define your forward implementation in functional form, so you could write the parameters and the forward pass yourself and then call that function on your two inputs.
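A minimal sketch of that functional approach, assuming hypothetical feature sizes (4 and 3 input features, hidden size 8, two outputs of size 2) and tanh/sigmoid as placeholder choices for g and f:

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: input1 has 4 features, input2 has 3, hidden size 8, both outputs size 2
W1 = torch.randn(8, 4, requires_grad=True)
V1 = torch.randn(8, 3, requires_grad=True)
b = torch.zeros(8, requires_grad=True)
W2 = torch.randn(2, 8, requires_grad=True)
b_w = torch.zeros(2, requires_grad=True)
V2 = torch.randn(2, 8, requires_grad=True)
b_v = torch.zeros(2, requires_grad=True)

def forward(input1, input2):
    # h = g(W1.input1 + V1.input2 + b), with g = tanh assumed
    h = torch.tanh(F.linear(input1, W1) + F.linear(input2, V1) + b)
    # output1 = f(W2.h + b_w) and output2 = f(V2.h + b_v), with f = sigmoid assumed
    return torch.sigmoid(F.linear(h, W2, b_w)), torch.sigmoid(F.linear(h, V2, b_v))

# Usage:
output1, output2 = forward(torch.randn(16, 4), torch.randn(16, 3))
```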
I decided to write my own Linear layer that calculates h = g(W1.input1 + V1.input2 + b). I do this by creating two parameters, W1 and V1, multiplying input1 and input2 by them, and then adding everything together. The code is given below; iterating over the model's parameters lists the 7 parameter tensors to be trained:
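A sketch along those lines, with hypothetical dimensions and tanh/sigmoid assumed for g and f:

```python
import torch
import torch.nn as nn

class TwoInputLinear(nn.Module):
    # Custom layer computing W1.input1 + V1.input2 + b with two separate weight matrices
    def __init__(self, in1_dim, in2_dim, hidden_dim):
        super().__init__()
        self.W1 = nn.Parameter(torch.randn(hidden_dim, in1_dim))
        self.V1 = nn.Parameter(torch.randn(hidden_dim, in2_dim))
        self.b = nn.Parameter(torch.zeros(hidden_dim))

    def forward(self, input1, input2):
        return input1 @ self.W1.T + input2 @ self.V1.T + self.b

class Net(nn.Module):
    def __init__(self, in1_dim, in2_dim, hidden_dim, out1_dim, out2_dim):
        super().__init__()
        self.hidden = TwoInputLinear(in1_dim, in2_dim, hidden_dim)
        self.out1 = nn.Linear(hidden_dim, out1_dim)  # holds W2 and b_w
        self.out2 = nn.Linear(hidden_dim, out2_dim)  # holds V2 and b_v

    def forward(self, input1, input2):
        h = torch.tanh(self.hidden(input1, input2))  # g = tanh assumed
        return torch.sigmoid(self.out1(h)), torch.sigmoid(self.out2(h))  # f = sigmoid assumed

model = Net(4, 3, 8, 2, 2)
for name, p in model.named_parameters():
    print(name, tuple(p.shape))  # 7 parameter tensors: W1, V1, b, W2, b_w, V2, b_v
```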
Thanks for all the inputs @aretor, @Ivan, and @DerekG
I will point out that you CAN combine W_1 and V_1, provided that you simply tile them diagonally in a larger matrix and set all other values to 0. You can then clip the weights after each optimization step to constrain these parameters to remain 0. Or you could use a sparse tensor to represent only the weights you care to change. In any case:
As in, let W1 be an m x n matrix and V1 be a q x p matrix. Tiling them diagonally gives the (m+q) x (n+p) matrix

C = [ W1   0  ]
    [ 0    V1 ]

and stacking the inputs gives

a = [ input1 ]
    [ input2 ]
(Please excuse dimension flip-flops if they exist; I didn't double-check.) The resulting multiplication gives you the intermediate value you care about:
h = g(Ca + b)
And likewise, I believe the second operation is identical to a normal fully connected layer. You can concatenate the final bias terms too, since the bias is already defined as one parameter per output: b_cat = [b_w, b_v]. That is, computing both outputs with a single fully connected layer yields the same result as computing them separately (see the sketch below). So the only novelty is that you need to constrain some of the parameters in the initial concatenated weight matrix C to be 0.
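A minimal sketch of this idea, with hypothetical sizes and tanh/sigmoid assumed for g and f: a single combined layer C acts on the concatenated input, an ordinary fully connected layer produces both outputs at once, and a fixed mask keeps the off-block entries of C at 0 after every optimizer step.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: W1 is m x n, V1 is q x p, each output has size 2
m, n, q, p = 8, 4, 8, 3
out1_dim, out2_dim = 2, 2

C = nn.Linear(n + p, m + q)                  # combined first layer; its bias plays the role of b
out = nn.Linear(m + q, out1_dim + out2_dim)  # combined second layer; its bias is b_cat = [b_w, b_v]

# Mask keeping only the diagonal W1 and V1 blocks of C's weight
mask = torch.zeros(m + q, n + p)
mask[:m, :n] = 1.0   # W1 block
mask[m:, n:] = 1.0   # V1 block

with torch.no_grad():
    C.weight.mul_(mask)  # start with the off-block entries at 0

opt = torch.optim.SGD(list(C.parameters()) + list(out.parameters()), lr=1e-2)

input1, input2 = torch.randn(16, n), torch.randn(16, p)
target = torch.rand(16, out1_dim + out2_dim)

for _ in range(10):
    a = torch.cat([input1, input2], dim=1)  # a = [input1; input2]
    h = torch.tanh(C(a))                    # h = g(Ca + b), g = tanh assumed
    pred = torch.sigmoid(out(h))            # both outputs from one FC layer, f = sigmoid assumed
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        C.weight.mul_(mask)                 # clip the off-block weights back to 0 after each step
```

Using a sparse tensor instead of the dense mask, as mentioned above, would avoid storing and updating the zero entries at all.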