Transfer learning not working
I am trying to build a fine-tuned model in Keras, using the tf.keras.utils.image_dataset_from_directory method to load the training and validation data.
This is the code I have written:
import tensorflow as tf
from tensorflow import keras
from keras.applications.mobilenet_v2 import MobileNetV2
from keras.models import Model
import numpy as np
import matplotlib.pyplot as plt
from glob import glob
import pathlib
from PIL import ImageFile

ImageFile.LOAD_TRUNCATED_IMAGES = True

img_height = 180
img_width = 180
epochs = 20
batch_size = 32

train_dir = '<TRAIN/DIR>'
valid_dir = '<VALIDATION/DIR>'

# Count the images and the class folders (one sub-folder per class)
image_files = glob(train_dir + '/*/*.jp*g')
valid_image_files = glob(valid_dir + '/*/*.jp*g')
folders = glob(train_dir + '/*')

train_ds = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    validation_split=None,
    subset=None,
    image_size=(img_height, img_width),
    batch_size=batch_size)

valid_ds = tf.keras.utils.image_dataset_from_directory(
    valid_dir,
    validation_split=None,
    subset=None,
    image_size=(img_height, img_width),
    batch_size=batch_size)

normalization_layer = tf.keras.layers.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))

mobilenet = MobileNetV2(input_shape=[img_height, img_width] + [3],
                        weights='imagenet',
                        include_top=False)

# Freeze the first 80 layers and fine-tune the rest
mobilenet.trainable = True
fine_tune_at = 80
for layer in mobilenet.layers[:fine_tune_at]:
    layer.trainable = False

# Classification head on top of the MobileNetV2 features
x = keras.layers.Flatten()(mobilenet.output)
x = keras.layers.Dense(1000, activation='relu')(x)
prediction = keras.layers.Dense(len(folders), activation='softmax')(x)

model = Model(inputs=mobilenet.input, outputs=prediction)
model.summary()

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='rmsprop',
    metrics=['accuracy']
)

checkpoint_path = "<CHECKPOINT/PATH>"
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 monitor='val_accuracy',
                                                 mode='max',
                                                 save_best_only=True,)
early_stopping_callback = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=3)

r = model.fit(
    train_ds,
    validation_data=valid_ds,
    epochs=1,
    steps_per_epoch=len(image_files) // batch_size,
    validation_steps=len(valid_image_files) // batch_size,
    callbacks=[cp_callback, early_stopping_callback],
)
When I run it, training starts normally, but at step 676 it raises the error below. I have run it 3 times and it always fails at step 676.
2022-06-10 01:25:12.458342: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-10 01:25:12.458469: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-06-10 01:25:22.297942: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-10 01:25:22.298779: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2022-06-10 01:25:22.299615: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2022-06-10 01:25:22.300362: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2022-06-10 01:25:22.301098: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2022-06-10 01:25:22.301832: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2022-06-10 01:25:22.302592: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2022-06-10 01:25:22.303313: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2022-06-10 01:25:22.303504: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2022-06-10 01:25:22.304168: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-10 02:46:09.613723: E tensorflow/core/lib/jpeg/jpeg_mem.cc:324] Premature end of JPEG data. Stopped at line 176/800
2022-06-10 03:45:17.995942: E tensorflow/core/lib/jpeg/jpeg_mem.cc:324] Premature end of JPEG data. Stopped at line 176/800
676/1923 [=========>....................] - ETA: 20:12 - loss: 0.8916 - accuracy: 0.7308
Traceback (most recent call last):
  Input In [40] in <cell line: 1>
    model.fit(
  File ~\anaconda3\lib\site-packages\keras\utils\traceback_utils.py:67 in error_handler
    raise e.with_traceback(filtered_tb) from None
  File ~\anaconda3\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
InvalidArgumentError: Graph execution error:
jpeg::Uncompress failed. Invalid JPEG data or crop window.
  [[{{node decode_image/DecodeImage}}]]
  [[IteratorGetNext]] [Op:__inference_train_function_13724]
Any help? Thank you!
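One thing that may be worth checking: the "Premature end of JPEG data" and "jpeg::Uncompress failed" messages come from TensorFlow's own JPEG decoder, so the PIL setting ImageFile.LOAD_TRUNCATED_IMAGES probably has no effect on the tf.data pipeline. The following is a minimal sketch, not a definitive fix, for locating suspicious files before training. It assumes the problem is a truncated file and only checks for the JPEG start and end markers; the helper names looks_truncated and find_suspect_jpegs are hypothetical. Valid files with extra bytes after the end marker will be flagged as false positives, and a stricter check would be to decode every file (for example with tf.io.decode_jpeg).

```python
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def looks_truncated(path):
    """Heuristic: True if the file does not start with SOI or does not end with EOI."""
    data = Path(path).read_bytes()
    return not (data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI))


def find_suspect_jpegs(root):
    """Return sorted paths of .jpg/.jpeg files under root that fail the marker check."""
    suspects = []
    for path in sorted(Path(root).rglob("*")):
        is_jpeg = path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}
        if is_jpeg and looks_truncated(path):
            suspects.append(path)
    return suspects
```

Calling find_suspect_jpegs(train_dir) and find_suspect_jpegs(valid_dir) would list candidate files to inspect, re-download, or move out of the dataset folders before calling model.fit again.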