A generalized back-propagation algorithm for neural networks?

Posted on 2025-02-06 16:28:21


I'm making a neural network program from scratch, and I'm trying to make a Generalized Gradient Descent and Back Propagation Algorithm with Python and numpy. Right now it looks like this:

def back_prop(y, layers, lr=10e-8):
    # `weights` and `biases` are global lists of per-layer numpy arrays;
    # `la` is a linear-algebra module such as numpy.linalg.
    for i in range(len(weights) - 1, -1, -1):
        cost = -1.0*(y - layers[i+1])
        for j in range(len(weights[i])):
            for k in range(len(weights[i][0])):
                weights[i][j][k] -= lr*2*cost[j]*layers[i][k]

        for j in range(len(biases[i])):
            biases[i][j] -= lr*2*cost[j]
        y = la.inv(weights[i].T @ weights[i]) @ weights[i].T @ (y - biases[i])
    return 0

Here, y represents the label (the actual y), and layers represents the layers of the neural network after forward propagation. This code seems to work for a 1-layer neural network with no activation function (or a linear activation function); a 1-layer neural network is just one weight matrix and one bias vector. If I add more layers, or if I include activation functions, it doesn't work.
The line I wrote:
y = la.inv(weights[i].T @ weights[i]) @ weights[i].T @ (y - biases[i])
is based on some math I wrote on a whiteboard, but now it seems to be wrong. I'm not sure how to fix this algorithm or how to make it work with activation functions other than the linear one. Does anyone have any advice?
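For reference, that line solves the least-squares problem y ≈ weights[i] @ x + biases[i] for the previous layer's activations: when W has full column rank, (WᵀW)⁻¹Wᵀ is exactly the Moore–Penrose pseudoinverse of W. A minimal sketch of that equivalence (the names W, b, and target are placeholders for illustration, not taken from the code above):

import numpy as np

W = np.random.randn(4, 3)       # placeholder layer weights (4 outputs, 3 inputs)
b = np.random.randn(4)          # placeholder biases
target = np.random.randn(4)     # a target for this layer's output

# The line from the question, written out explicitly:
x_normal_eq = np.linalg.inv(W.T @ W) @ W.T @ (target - b)

# The same computation via the pseudoinverse (valid when W has full column rank):
x_pinv = np.linalg.pinv(W) @ (target - b)

print(np.allclose(x_normal_eq, x_pinv))   # True

So the line tries to "invert" the layer and recover a target for the previous layer, rather than propagate a gradient.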

Edit: including some scratch work.
[Images from the original post: "Gradient of F", "Previous Layers"]


Answer (后来的我们, 2025-02-13 16:28:21):


I rewrote my math and figured out my issue. From that I rewrote the code so that it works now. Here's the new code:

def back_prop(y, layers, lr=10e-8):
    # `weights` and `biases` are global lists of per-layer numpy arrays;
    # `layers` holds the activations from forward propagation, with
    # layers[0] the input and layers[len(weights)] the network output.
    cost = -1.0 * (y - layers[len(weights)])    # error at the output layer
    for i in range(len(weights) - 1, -1, -1):
        newcost = 1.0 * weights[i].T @ cost     # propagate the error back through layer i
        for j in range(len(weights[i])):
            for k in range(len(weights[i][j])):
                weights[i][j][k] -= lr*2*cost[j]*layers[i][k]

        for j in range(len(biases[i])):
            biases[i][j] -= lr*2*cost[j]
        cost = newcost                          # reuse the propagated error for the next (earlier) layer
    return 0
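For context, here is a minimal usage sketch that exercises the back_prop above; the two-layer shapes and the forward helper are assumptions for illustration only, not part of the original post. The final comment notes where an activation derivative would enter if the network used nonlinear activations.

import numpy as np

# Hypothetical 2-layer linear network: 3 inputs -> 4 hidden -> 2 outputs.
weights = [np.random.randn(4, 3), np.random.randn(2, 4)]
biases = [np.random.randn(4), np.random.randn(2)]

def forward(x):
    # Build the `layers` list the way back_prop expects:
    # layers[0] is the input, layers[i+1] = weights[i] @ layers[i] + biases[i].
    layers = [x]
    for W, b in zip(weights, biases):
        layers.append(W @ layers[-1] + b)   # linear activation only
    return layers

x = np.random.randn(3)
y = np.random.randn(2)
layers = forward(x)
back_prop(y, layers, lr=1e-3)

# With a nonlinear activation f, the usual chain rule would also multiply the
# propagated error by f'(pre-activation) at each layer, e.g.
#   newcost = (weights[i].T @ cost) * f_prime(pre_activations[i])
# (f_prime and pre_activations are hypothetical names), which this linear
# version omits.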