ValueError: shapes (784,32) and (10,784) not aligned: 32 (dim 1) != 10 (dim 0) in neural network

Posted 2025-02-08 17:58:11


I'm trying to build a simple Neural Network library from scratch similar to Keras, but I'm having issues getting the training to work properly. It's been a while since I've written a NN from scratch instead of using a library, so I thought it would be good practice.

I'm not quite sure I have the constructor set up properly for the case where no input shape is given, and I keep running into the "ValueError: shapes X and Y not aligned" issue regardless of how many neurons I pass to the layer or what input shape I use.
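
For context, the behavior I'm aiming for is something like Keras's deferred build, where a layer without an explicit input shape only allocates its weights on the first forward call, once the incoming shape is known. A hypothetical sketch of that pattern (LazyDense is an illustrative name, not part of my code):

import numpy as np

class LazyDense:
    def __init__(self, neurons):
        self.neurons = neurons
        self.weights = None  # deferred: allocated on the first forward pass

    def forward(self, inputs):
        if self.weights is None:
            # inputs is (batch, features): allocate (features, neurons)
            self.weights = np.random.randn(inputs.shape[1], self.neurons) * 0.01
            self.biases = np.zeros((1, self.neurons))
        return np.dot(inputs, self.weights) + self.biases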
Here's the traceback:

Traceback (most recent call last):
  File "NNfromScratch.py", line 551, in <module>
    model.train(X_train, y_train, epochs=100, batch_size=10, verbose=True)
  File "NNfromScratch.py", line 427, in train
    self.forward(batch_inputs)
  File "NNfromScratch.py", line 395, in forward
    self.outputs = layer.forward(self.outputs)
  File "NNfromScratch.py", line 153, in forward
    self.outputs = np.dot(self.weights.T, inputs) + self.biases
  File "<__array_function__ internals>", line 6, in dot
ValueError: shapes (784,32) and (10,784) not aligned: 32 (dim 1) != 10 (dim 0)

The error is thrown from the forward function of the Dense layer.
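
For reference, the mismatch can be reproduced in isolation. With the first layer built as Dense(32, ..., inputs=784), weights.T has shape (784, 32), while a batch of 10 flattened MNIST images has shape (10, 784), so the inner dimensions (32 vs. 10) can never match. A minimal standalone repro with the shapes from the traceback:

import numpy as np

batch = np.random.randn(10, 784)    # batch_size=10 flattened 28x28 images
weights = np.random.randn(32, 784)  # Dense(32, ..., inputs=784)

# weights.T has shape (784, 32); batch has shape (10, 784).
# Inner dimensions 32 != 10, so this raises the ValueError above:
np.dot(weights.T, batch)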

The full (reproducible) code can be seen here.

Here's a snippet with the most important parts, though:

import time
import numpy as np
import pandas as pd
import pickle as pkl
import matplotlib.pyplot as plt
import tensorflow.keras.datasets.mnist as mnist

...

class Layers:
    class Dense:
        def __init__(self, neurons=0, activation=Activations.ReLU, inputs=0, dropout_rate=1):
            # Initialize weights and biases
            self.weights = np.random.randn(neurons, inputs)
            self.biases = np.random.randn(1, neurons)
            self.activation = activation
            self.dropout_rate = dropout_rate
        
        # Forward-Propagation
        def forward(self, inputs):
            self.inputs = inputs
            self.outputs = np.dot(self.weights.T, inputs) + self.biases
            self.outputs = self.activation(self.outputs)
            self.outputs = self.dropout(self.outputs)
            return self.outputs
        
        # Backward-Propagation
        def backward(self, error, learning_rate):
            self.error = error
            self.delta = self.error * self.activation(self.outputs)
            self.delta = self.dropout(self.delta, derivative=True)
            self.weights -= learning_rate * np.dot(self.delta, self.inputs.T)
            self.biases -= learning_rate * np.sum(self.delta, axis=0, keepdims=True)
            return self.delta
        
        # Dropout
        def dropout(self, x, derivative=False):
            if derivative:
                return self.dropout_rate * (1 - self.dropout_rate) * x
            return self.dropout_rate * x


class NeuralNetwork:
    """..."""

    
    def forward(self, inputs):
        # Forward-Propagation
        self.inputs = inputs
        self.outputs = self.inputs
        for layer in self.layers:
            self.outputs = layer.forward(self.outputs)
        return self.outputs
    
    def backward(self, targets):
        # Backward-Propagation
        self.targets = targets
        self.error = self.loss(self.outputs, self.targets)
        self.delta = self.error
        for layer in reversed(self.layers):
            self.delta = layer.backward(self.delta, self.optimizer_kwargs)
        return self.delta
    
    def update_weights(self):
        # Update weights and biases
        for layer in self.layers:
            layer.update_weights(self.optimizer_kwargs)
    
    def train(self, inputs, targets, epochs=1, batch_size=1, verbose=False):
        self.epochs = epochs
        self.epoch_errors = []
        self.epoch_losses = []
        self.epoch_accuracies = []
        self.epoch_times = []
        start = time.time()
        for epoch in range(self.epochs):
            epoch_start = time.time()
            epoch_error = 0
            epoch_loss = 0
            epoch_accuracy = 0
            for i in range(0, inputs.shape[0], batch_size):
                batch_inputs = inputs[i:i+batch_size]
                batch_targets = targets[i:i+batch_size]
                self.forward(batch_inputs)
                self.backward(batch_targets)
                self.update_weights()
                epoch_error += self.error.sum()
                epoch_loss += self.loss(self.outputs, self.targets).sum()
                epoch_accuracy += self.accuracy(self.outputs, self.targets)
            epoch_time = time.time() - epoch_start
            self.epoch_errors.append(epoch_error)
            self.epoch_losses.append(epoch_loss)
            self.epoch_accuracies.append(epoch_accuracy)
            self.epoch_times.append(epoch_time)
            if verbose:
                print('Epoch: {}, Error: {}, Loss: {}, Accuracy: {}, Time: {}'.format(epoch, epoch_error, epoch_loss, epoch_accuracy, epoch_time))
        self.train_time = time.time() - start
        return self.epoch_errors, self.epoch_losses, self.epoch_accuracies, self.epoch_times



# Load and flatten data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape((X_train.shape[0], -1))
X_test = X_test.reshape((X_test.shape[0], -1))
# Build model
model = NeuralNetwork([
    Layers.Dense(32, Activations.ReLU, inputs=X_train.shape[1]),
    Layers.Dense(10, Activations.ReLU),
    Layers.Dense(1, Activations.Softmax)
], Losses.Categorical_Cross_Entropy, Optimizers.SGD, learning_rate=0.01)
model.train(X_train, y_train, epochs=100, batch_size=10, verbose=True)
model.evaluate(X_test, y_test)


Comments (1)

时常饿 2025-02-15 17:58:11


Change this line:

self.outputs = np.dot(self.weights.T, inputs) + self.biases

to

self.outputs = np.dot(inputs, self.weights.T) + self.biases

The reason is that the inner dimensions of the matrix product need to align. Your inputs are of shape [B, 784] (where B is the batch size) and your weights are of shape [32, 784], so np.dot(inputs, weights.T) multiplies (B, 784) by (784, 32), contracting over the shared 784 dimension and yielding a (B, 32) output.
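
As a quick sanity check (a standalone sketch with the same shapes, not your full classes), the corrected operand order produces a (batch, neurons) output that the next layer can consume:

import numpy as np

inputs = np.random.randn(10, 784)   # (B, 784): a batch of flattened images
weights = np.random.randn(32, 784)  # (neurons, inputs), as in the constructor
biases = np.random.randn(1, 32)     # (1, neurons)

# (10, 784) @ (784, 32) -> (10, 32); biases broadcast across the batch
outputs = np.dot(inputs, weights.T) + biases
print(outputs.shape)  # (10, 32)

Equivalently, you could initialize the weights with shape (inputs, neurons) and drop the transpose entirely; either way the batch dimension has to come first.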
