Object localization MNIST TensorFlow to PyTorch: loss not decreasing

Posted on 2025-02-03 21:15:28


I am trying to convert a TensorFlow object localization code into PyTorch. In the original code, the author uses model.compile / model.fit to train the model, so I don't understand how the classification loss on the MNIST digits and the box regression loss work together. Still, I'm trying to implement my own training loop in PyTorch.
The goal here is, after some preprocessing, to paste the MNIST digits at random positions onto a black square image and then classify and localize (bounding box) the digit.
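For reference, here is a minimal sketch of that preprocessing step, assuming 28x28 MNIST digits pasted onto the 75x75 canvas the TensorFlow code uses (the function name and the normalized-corners bounding-box convention are my own assumptions, not the original author's):

import torch

def paste_digit(digit, canvas_size=75):
    # Paste a 28x28 digit at a random position on a black canvas and
    # return the canvas plus the (xmin, ymin, xmax, ymax) box,
    # normalized to [0, 1] (an assumed convention).
    h, w = digit.shape
    canvas = torch.zeros(canvas_size, canvas_size)
    y0 = torch.randint(0, canvas_size - h + 1, (1,)).item()
    x0 = torch.randint(0, canvas_size - w + 1, (1,)).item()
    canvas[y0:y0 + h, x0:x0 + w] = digit
    bbox = torch.tensor([x0, y0, x0 + w, y0 + h], dtype=torch.float32) / canvas_size
    return canvas.unsqueeze(0), bbox  # add the channel dim nn.Conv2d expects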

I set two losses: nn.CrossEntropyLoss and nn.MSELoss, and I do (loss_1 + loss_2).backward() to compute the gradients. I know it's the right way to compute gradients with two losses, e.g. from here (https://blog.paperspace.com/object-localization-pytorch-2/) and elsewhere.
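As a quick self-contained check of that claim (variable names are my own), a single backward on the summed loss is equivalent to backpropagating each loss separately, since gradients accumulate in .grad:

import torch

w = torch.randn(3, requires_grad=True)
x = torch.randn(3)

# Single backward on the summed loss...
(w @ x).pow(2).add((w.sum() - 1.0).pow(2)).backward()
grad_summed = w.grad.clone()

# ...matches two separate backward passes, whose gradients accumulate.
w.grad = None
(w @ x).pow(2).backward()
(w.sum() - 1.0).pow(2).backward()
assert torch.allclose(grad_summed, w.grad)  # same gradients either way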

But still, my loss doesn't decrease, whereas it collapses quasi-immediately with the TensorFlow code. I checked the model with torchinfo.summary and it seems to behave the same way as the TensorFlow implementation.

EDIT:
I looked at the predicted labels of my model and they don't seem to change at all.
This line of code label_preds, bbox_coords_preds = model(digits) always returns the same values

label_preds[0] = tensor([[0.0156, 0.0156, 0.0156, 0.0156, 0.0156, 0.0156, 0.0156, 0.0156, 0.0156, 0.0156]], device='cuda:0', grad_fn=<SliceBackward0>)

Here are my questions:

  • Is my custom network set up correctly?
  • Are my losses set correctly?
  • Why don't my label predictions change?
  • Does my training loop work as well as the .compile and .fit TensorFlow methods?

Thanks a lot!

PYTORCH CODE

class ConvNetwork(nn.Module):
    def __init__(self):
        super(ConvNetwork, self).__init__()
        self.conv2d_1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3)
        self.conv2d_2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3)
        self.conv2d_3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3)
        self.avgPooling2D = nn.AvgPool2d((2,2))
        self.dense_1 = nn.Linear(in_features=3136, out_features=128)
        
        self.dense_classifier = nn.Linear(in_features=128, out_features=10)
        self.softmax = nn.Softmax(dim=0)
        self.dense_regression = nn.Linear(in_features=128, out_features=4)


    def forward(self, input):
        x = self.avgPooling2D(F.relu(self.conv2d_1(input)))
        x = self.avgPooling2D(F.relu(self.conv2d_2(x)))
        x = self.avgPooling2D(F.relu(self.conv2d_3(x)))
        x = nn.Flatten()(x)
        x = F.relu(self.dense_1(x))

        output_classifier = self.softmax(self.dense_classifier(x))
        output_regression = self.dense_regression(x)
        return [output_classifier, output_regression]

######################################################

learning_rate = 0.1
EPOCHS = 1
BATCH_SIZE = 64

model = ConvNetwork()
model = model.to(device)
optimizer = torch.optim.Adam(params=model.parameters(), lr=learning_rate)
classification_loss = nn.CrossEntropyLoss()
regression_loss = nn.MSELoss()

######################################################

begin_time = time.time()
for epoch in range(EPOCHS) : 
    tot_loss = 0
    train_start = time.time()
    training_losses = []
    
    print("-"*20)
    print(" "*5 + f"EPOCH {epoch+1}/{EPOCHS}")
    print("-"*20)

    model.train()
    for batch, (digits, labels, bbox_coords) in enumerate(training_dataset):
        digits, labels, bbox_coords = digits.to(device), labels.to(device), bbox_coords.to(device)
        optimizer.zero_grad()
        
        [label_preds, bbox_coords_preds] = model(digits)
        
        class_loss = classification_loss(label_preds, labels)
        box_loss = regression_loss(bbox_coords_preds, bbox_coords)

        training_loss = class_loss + box_loss
        training_loss.backward()
        
        optimizer.step()
        
        ######### print part #######################
        training_losses.append(training_loss.item())
        if batch+1 <= len_training_ds//BATCH_SIZE:
            current_training_sample = (batch+1)*BATCH_SIZE
        else:
            current_training_sample = (batch)*BATCH_SIZE + len_training_ds%BATCH_SIZE
        
        if (batch+1) == 1 or (batch+1)%100 == 0 or (batch+1) == len_training_ds//BATCH_SIZE +1:
            print(f"Elapsed time : {(time.time()-train_start)/60:.3f}",\
                  f" --- Digit : {current_training_sample}/{len_training_ds}",\
                  f" : loss = {training_loss:.5f}")
            if batch+1 == (len_training_ds//BATCH_SIZE)+1:
                print(f"Total elapsed time for training : {(time.time()-begin_time)/60:.3f}")

ORIGINAL TENSORFLOW CODE

def feature_extractor(inputs):
    x = tf.keras.layers.Conv2D(16, activation='relu', kernel_size=3, input_shape=(75, 75, 1))(inputs)
    x = tf.keras.layers.AveragePooling2D((2, 2))(x)
    x = tf.keras.layers.Conv2D(32,kernel_size=3,activation='relu')(x)
    x = tf.keras.layers.AveragePooling2D((2, 2))(x)
    x = tf.keras.layers.Conv2D(64,kernel_size=3,activation='relu')(x)
    x = tf.keras.layers.AveragePooling2D((2, 2))(x)
    return x

def dense_layers(inputs):
  x = tf.keras.layers.Flatten()(inputs)
  x = tf.keras.layers.Dense(128, activation='relu')(x)
  return x

def classifier(inputs):

  classification_output = tf.keras.layers.Dense(10, activation='softmax', name = 'classification')(inputs)
  return classification_output


def bounding_box_regression(inputs):
    bounding_box_regression_output = tf.keras.layers.Dense(units = '4', name = 'bounding_box')(inputs)
    return bounding_box_regression_output


def final_model(inputs):
    feature_cnn = feature_extractor(inputs)
    dense_output = dense_layers(feature_cnn)

    classification_output = classifier(dense_output)
    bounding_box_output = bounding_box_regression(dense_output)

    model = tf.keras.Model(inputs = inputs, outputs = [classification_output,bounding_box_output])
    return model
  
def define_and_compile_model(inputs):
  model = final_model(inputs)
  model.compile(optimizer='adam', 
              loss = {'classification' : 'categorical_crossentropy',
                      'bounding_box' : 'mse'
                     },
              metrics = {'classification' : 'accuracy',
                         'bounding_box' : 'mse'
                        })
  return model

    

inputs = tf.keras.layers.Input(shape=(75, 75, 1,))
model = define_and_compile_model(inputs)


EPOCHS = 10 # 45
steps_per_epoch = 60000//BATCH_SIZE  # 60,000 items in this dataset
validation_steps = 1

history = model.fit(training_dataset,
                    steps_per_epoch=steps_per_epoch, 
                    validation_data=validation_dataset, 
                    validation_steps=validation_steps, epochs=EPOCHS)

loss, classification_loss, bounding_box_loss, classification_accuracy, bounding_box_mse = model.evaluate(validation_dataset, steps=1)
print("Validation accuracy: ", classification_accuracy)


Comments (1)

撑一把青伞 2025-02-10 21:15:28


Answering my own question about this bug:

What I found:
I figured out that I was applying a Softmax layer in my model while also using nn.CrossEntropyLoss() as the loss.

What this problem was causing:

  • This loss already applies a softmax internally (doc)
  • Applying a softmax twice squashes the outputs toward a uniform distribution, which shrinks the gradients and prevents convergence (see the sketch after this list)
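A quick self-contained check of both points (names are my own, nothing from the original code):

import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)              # raw linear-layer outputs
targets = torch.randint(0, 10, (4,))

# nn.CrossEntropyLoss is LogSoftmax + NLLLoss applied to raw logits...
ce = F.cross_entropy(logits, targets)
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)
assert torch.allclose(ce, nll)

# ...so feeding already-softmaxed outputs applies softmax twice,
# which squashes the values toward uniform and shrinks the gradients.
double_softmax_ce = F.cross_entropy(logits.softmax(dim=1), targets)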

What I did:
One should leave a plain linear layer (raw logits) as the output of the classification head.
Another way is to use NLLLoss (doc) instead and keep a LogSoftmax layer in the model class, since NLLLoss expects log-probabilities rather than softmax probabilities.
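Here is a sketch of both options as drop-in classification heads for the model above (the class names are my own; note also that the original nn.Softmax(dim=0) normalizes over the batch dimension instead of the classes, which is why every prediction collapsed to 1/64 ≈ 0.0156 with a batch size of 64):

import torch.nn as nn

class LogitsHead(nn.Module):
    # Option 1: emit raw logits and pair with nn.CrossEntropyLoss.
    def __init__(self):
        super().__init__()
        self.dense_classifier = nn.Linear(128, 10)

    def forward(self, x):
        return self.dense_classifier(x)  # no softmax here

class LogSoftmaxHead(nn.Module):
    # Option 2: emit log-probabilities and pair with nn.NLLLoss.
    def __init__(self):
        super().__init__()
        self.dense_classifier = nn.Linear(128, 10)
        self.log_softmax = nn.LogSoftmax(dim=1)  # dim=1: over classes, not the batch

    def forward(self, x):
        return self.log_softmax(self.dense_classifier(x))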

Also:
I don't fully understand how the .compile() and .fit() TensorFlow methods work, but I had to decrease the learning rate to 0.001 in PyTorch to "unstick" the loss and make it decrease. This is consistent with Keras's defaults: its Adam optimizer uses a learning rate of 0.001 out of the box, whereas my PyTorch code used 0.1.
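The corresponding one-line change in the training setup above:

optimizer = torch.optim.Adam(params=model.parameters(), lr=0.001)  # matches Keras's Adam default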
