DENOISE AUTOCONEDER不正确培训

发布于 2025-01-24 11:05:59 字数 7155 浏览 5 评论 0原文

我正在尝试制作一个denoise自动编码器，其中编码器部分是vgg16 ，而解码器与VGG16（encoder）网络相反。我的数据集由灰度中的5K图像组成，这些是我遵循的步骤准备和归一化的步骤：

input_X = [], (list having 5K (noisy)images of dims 224,224,1)
input_Y = [], (list having 5K (ground truth)images of dims 224,224,1)
test_X = [] (list having 1K (noisy)images for testing of dims 224,224,1)

input_X = input_X/255.0
input_Y = input_Y/255.0
test_X = test_X/255.0

print(input_X.shape)
print(input_Y.shape)
print(test_X.shape)

input_X = np.repeat(input_X[..., np.newaxis], 3, -1) #creating 3 channels
input_Y = np.repeat(input_Y[..., np.newaxis], 3, -1)
test_X = np.repeat(test_X[..., np.newaxis], 3, -1)

print(input_X.shape) (5000,224,224,3)
print(input_Y.shape)
print(test_X.shape)

编码器网络：

vggmodel = keras.applications.vgg16.VGG16()
model_encoder = Sequential() 
num = 0
for i, layer in enumerate(vggmodel.layers):
    if i<19:
      model_encoder.add(layer)

model_encoder.summary()
for layer in model_encoder.layers:
  layer.trainable=False

输出：

Metal device set to: Apple M1 Pro
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

解码器网络：

encoder_output = Input(shape=(7, 7, 512,))

x = Conv2D(512, (3, 3), activation='relu', padding='same')(encoder_output)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)

# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)

# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)     

# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)        

# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
x = UpSampling2D((2, 2))(x)
model_decoder = Model(inputs=encoder_output, outputs=x)

训练：

model_decoder.compile(optimizer='Adam', loss='mean_squared_error' , metrics=['accuracy'])
model_decoder.fit(trainx_encoded, train_Y,
                epochs=no_epocs,
                batch_size=batch_size,
         validation_split=validation_split)

现在训练时，损失和准确性不会改变。我可以想到减少最初解码器层中的过滤器，但我担心这会影响自动编码器。在这里，我对遵循哪种方法确实毫无意义。

训练输出：

Epoch 1/50
2022-04-27 22:10:05.336322: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
27/27 [==============================] - ETA: 0s - loss: 0.1678 - accuracy: 0.9689
2022-04-27 22:11:34.044172: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
27/27 [==============================] - 97s 4s/step - loss: 0.1678 - accuracy: 0.9689 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 2/50
27/27 [==============================] - 87s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 3/50
27/27 [==============================] - 86s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 4/50
27/27 [==============================] - 86s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 5/50
27/27 [==============================] - 91s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 6/50
27/27 [==============================] - 85s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 7/50
27/27 [==============================] - 87s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 8/50
27/27 [==============================] - 91s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 9/50
27/27 [==============================] - 85s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 10/50
27/27 [==============================] - ETA: 0s - loss: 0.1645 - accuracy: 1.0000

原文

I'm trying to make a denoise autoencoder wherein the encoder part is vgg16 and decoder is opposite of vgg16(encoder) network. My dataset consists of 5K images in grayscale and these are the steps i've followed to prepare and normalize:

input_X = [], (list having 5K (noisy)images of dims 224,224,1)
input_Y = [], (list having 5K (ground truth)images of dims 224,224,1)
test_X = [] (list having 1K (noisy)images for testing of dims 224,224,1)

input_X = input_X/255.0
input_Y = input_Y/255.0
test_X = test_X/255.0

print(input_X.shape)
print(input_Y.shape)
print(test_X.shape)

input_X = np.repeat(input_X[..., np.newaxis], 3, -1) #creating 3 channels
input_Y = np.repeat(input_Y[..., np.newaxis], 3, -1)
test_X = np.repeat(test_X[..., np.newaxis], 3, -1)

print(input_X.shape) (5000,224,224,3)
print(input_Y.shape)
print(test_X.shape)

encoder network:

vggmodel = keras.applications.vgg16.VGG16()
model_encoder = Sequential() 
num = 0
for i, layer in enumerate(vggmodel.layers):
    if i<19:
      model_encoder.add(layer)

model_encoder.summary()
for layer in model_encoder.layers:
  layer.trainable=False

output:

Metal device set to: Apple M1 Pro
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

decoder network:

encoder_output = Input(shape=(7, 7, 512,))

x = Conv2D(512, (3, 3), activation='relu', padding='same')(encoder_output)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)

# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)

# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)     

# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2,2))(x)        

# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
x = UpSampling2D((2, 2))(x)
model_decoder = Model(inputs=encoder_output, outputs=x)

training:

model_decoder.compile(optimizer='Adam', loss='mean_squared_error' , metrics=['accuracy'])
model_decoder.fit(trainx_encoded, train_Y,
                epochs=no_epocs,
                batch_size=batch_size,
         validation_split=validation_split)

Now while training, the loss and accuracy doesn't changes. I can think of reducing filters in the initial decoder layers but i fear that's going to affect the autoencoder.
Here, i'm really clueless about what approach to follow.

training output:

Epoch 1/50
2022-04-27 22:10:05.336322: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
27/27 [==============================] - ETA: 0s - loss: 0.1678 - accuracy: 0.9689
2022-04-27 22:11:34.044172: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
27/27 [==============================] - 97s 4s/step - loss: 0.1678 - accuracy: 0.9689 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 2/50
27/27 [==============================] - 87s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 3/50
27/27 [==============================] - 86s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 4/50
27/27 [==============================] - 86s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 5/50
27/27 [==============================] - 91s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 6/50
27/27 [==============================] - 85s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 7/50
27/27 [==============================] - 87s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 8/50
27/27 [==============================] - 91s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 9/50
27/27 [==============================] - 85s 3s/step - loss: 0.1645 - accuracy: 1.0000 - val_loss: 0.1732 - val_accuracy: 1.0000
Epoch 10/50
27/27 [==============================] - ETA: 0s - loss: 0.1645 - accuracy: 1.0000

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

野味少女 2025-01-31 11:05:59

VGG16未经训练可以用作图像重建的编码器，它经过训练可以从图像中提取功能，我们可以在图像上执行分类任务。
这就是为什么，您不能将vgg16用作Denoise AutoCoder的编码部分。
但是，如果需要，您可以将VGG16的架构用作自动编码器的编码部分，只需重新训练VGG16的那些层，

vggmodel = keras.applications.vgg16.VGG16()
model_encoder = Sequential() 
num = 0
for i, layer in enumerate(vggmodel.layers):
    if i<19:
      model_encoder.add(layer)

model_encoder.summary()
for layer in model_encoder.layers:
  layer.trainable=True # Set encoder to trainable, and your autoencoder should work.

您当前的自动编码器的当前体系结构就会出现问题 - 它降低了太多，导致了一个大量数据丢失。较小的架构会更好。例如：

model = Sequential()
model.add(Input(shape=(224,224,3)))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(MaxPool2D((2,2)))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(MaxPool2D((2,2)))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(UpSampling2D((2,2)))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(UpSampling2D((2,2)))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(Conv2D(3, (3,3), activation='relu', padding='same'))

VGG16 is not trained to be used as an encoder for image reconstruction, it is trained to extract features from an image using which we can do classification task on the image.
This is why, you cannot use VGG16 as the encoder part of your denoise autoencoder.
However, if you want, you can use that architecture of VGG16 as the encoder part of your autoencoder, by just retraining those layers of VGG16,

vggmodel = keras.applications.vgg16.VGG16()
model_encoder = Sequential() 
num = 0
for i, layer in enumerate(vggmodel.layers):
    if i<19:
      model_encoder.add(layer)

model_encoder.summary()
for layer in model_encoder.layers:
  layer.trainable=True # Set encoder to trainable, and your autoencoder should work.

Again, there is a problem with your current architecture of autoencoder -it downsamples too much, leading to a significant loss of data. A smaller architecture would work better. For example:

model = Sequential()
model.add(Input(shape=(224,224,3)))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(MaxPool2D((2,2)))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(MaxPool2D((2,2)))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(UpSampling2D((2,2)))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(UpSampling2D((2,2)))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(Conv2D(3, (3,3), activation='relu', padding='same'))

回复收藏 0 原文

~没有更多了~