Transfer learning fails with invalid JPEG data

Posted on 2025-02-06 00:54:35

I am trying to fine-tune a model in Keras, using tf.keras.utils.image_dataset_from_directory to load the training and validation data.

This is the code I have written:

import tensorflow as tf
from tensorflow import keras

from keras.applications.mobilenet_v2 import MobileNetV2
from keras.models import Model

import numpy as np
import matplotlib.pyplot as plt

from glob import glob
import pathlib
from PIL import ImageFile

ImageFile.LOAD_TRUNCATED_IMAGES = True

img_height = 180
img_width = 180

epochs = 20
batch_size = 32

train_dir = '<TRAIN/DIR>'
valid_dir = '<VALIDATION/DIR>'

image_files = glob(train_dir + '/*/*.jp*g')
valid_image_files = glob(valid_dir + '/*/*.jp*g')

folders = glob(train_dir + '/*')

train_ds = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    validation_split=None,
    subset=None,
    image_size=(img_height, img_width),
    batch_size=batch_size)

valid_ds = tf.keras.utils.image_dataset_from_directory(
    valid_dir,
    validation_split=None,
    subset=None,
    image_size=(img_height, img_width),
    batch_size=batch_size)

normalization_layer = tf.keras.layers.Rescaling(1./255)

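# Note: normalized_ds is only used for this sanity check;
# model.fit below still receives the unnormalized train_ds.
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))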

mobilenet = MobileNetV2(input_shape=[img_height, img_width] + [3],
            weights='imagenet',
            include_top=False)

mobilenet.trainable = True

fine_tune_at = 80

for layer in mobilenet.layers[:fine_tune_at]:
    layer.trainable = False

x = keras.layers.Flatten()(mobilenet.output)
x = keras.layers.Dense(1000, activation='relu')(x)
prediction = keras.layers.Dense(len(folders), activation='softmax')(x)

model = Model(inputs=mobilenet.input, outputs=prediction)

model.summary()

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='rmsprop',
    metrics=['accuracy']
)

checkpoint_path = "<CHECKPOINT/PATH>"
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                monitor='val_accuracy',
                                                mode='max',
                                                save_best_only=True,)
early_stopping_callback = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=3)

r = model.fit(
  train_ds,
  validation_data=valid_ds,
  epochs=1,
  steps_per_epoch=len(image_files) // batch_size,
  validation_steps=len(valid_image_files) // batch_size,
  callbacks=[cp_callback, early_stopping_callback],
)

When I run it, training starts normally, but at step 676 it raises the following error. I ran it three times, and it always fails at step 676.

2022-06-10 01:25:12.458342: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-10 01:25:12.458469: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-06-10 01:25:22.297942: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-10 01:25:22.298779: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2022-06-10 01:25:22.299615: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2022-06-10 01:25:22.300362: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2022-06-10 01:25:22.301098: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2022-06-10 01:25:22.301832: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2022-06-10 01:25:22.302592: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2022-06-10 01:25:22.303313: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2022-06-10 01:25:22.303504: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2022-06-10 01:25:22.304168: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-10 02:46:09.613723: E tensorflow/core/lib/jpeg/jpeg_mem.cc:324] Premature end of JPEG data. Stopped at line 176/800
2022-06-10 03:45:17.995942: E tensorflow/core/lib/jpeg/jpeg_mem.cc:324] Premature end of JPEG data. Stopped at line 176/800
 676/1923 [=========>....................] - ETA: 20:12 - loss: 0.8916 - accuracy: 0.7308
Traceback (most recent call last):

  Input In [40] in <cell line: 1>
    model.fit(

  File ~\anaconda3\lib\site-packages\keras\utils\traceback_utils.py:67 in error_handler
    raise e.with_traceback(filtered_tb) from None

  File ~\anaconda3\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,

InvalidArgumentError: Graph execution error:

jpeg::Uncompress failed. Invalid JPEG data or crop window.
     [[{{node decode_image/DecodeImage}}]]
     [[IteratorGetNext]] [Op:__inference_train_function_13724]

Any help? Thank you!
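
One thing that may be worth checking: ImageFile.LOAD_TRUNCATED_IMAGES is a Pillow setting, while image_dataset_from_directory decodes files with TensorFlow's own decoder (the decode_image/DecodeImage node in the traceback), so that flag does not help here. The "Premature end of JPEG data" lines suggest that at least one file in the dataset is truncated. Below is a minimal sketch for locating candidates before training. The helper names looks_truncated and find_suspect_jpegs are hypothetical, and the check is only a heuristic: it looks for the JPEG start and end markers, so it can miss other kinds of corruption and may flag valid files that carry trailing bytes after the end marker. A full decode of each file (for example with tf.io.decode_image) would be a stricter test.

```python
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # start-of-image marker, first two bytes of a JPEG
JPEG_EOI = b"\xff\xd9"  # end-of-image marker, normally the last two bytes


def looks_truncated(path):
    """Heuristic: True if the file lacks the JPEG start or end marker."""
    data = Path(path).read_bytes()
    return not (data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI))


def find_suspect_jpegs(root):
    """Return sorted paths of .jpg/.jpeg files under root that look truncated."""
    suspects = []
    for path in sorted(Path(root).rglob("*")):
        is_jpeg = path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}
        if is_jpeg and looks_truncated(path):
            suspects.append(path)
    return suspects


# Example (directory placeholder taken from the question):
# for p in find_suspect_jpegs('<TRAIN/DIR>'):
#     print(p)
```

Any file this reports could then be re-downloaded, re-encoded, or moved out of the class folders, for both the training and the validation directory, before calling model.fit again.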
