TensorFlow | Creating an image dataset labeled by filename



I'm trying to create a TensorFlow dataset to train my model. I have a folder full of tagged photos, where the tag is part of the file name.

Is there a reasonable way to load the dataset for training without splitting the files into different directories?

Example, for the files:

  • ./dataset/path/img0_cat.bmp
  • ./dataset/path/img1_dog.bmp
  • ./dataset/path/img2_horse.bmp
  • ./dataset/path/img3_cat.bmp
  • ./dataset/path/img4_dog.bmp
  • ./dataset/path/img5_horse.bmp
  • ./dataset/path/img6_dog.bmp
  • ./dataset/path/img7_cat.bmp
  • ./dataset/path/img8_horse.bmp
  • ./dataset/path/img9_cat.bmp
  • ./dataset/path/img10_dog.bmp

Expected output: a tf.data.Dataset with one-hot labels for (cat, dog, horse).

Comments (3)

望喜 2025-01-25 14:56:17


Thanks all for your responses

The way I solved it is as follows:

import os
import numpy as np
import tensorflow as tf

classNames = ['dog', 'cat', 'horse']

def getLabel(file_path):
    # Convert the path to a list of path components
    fileName = tf.strings.split(file_path, os.path.sep)[-1]
    # get label name from filename
    className = tf.strings.split(fileName, '_')[1]
    className = tf.strings.split(className, '.')[0]
    # get one_hot vector boolean
    one_hot = className == classNames
    # cast vector type to integer 
    return tf.cast(one_hot, dtype=tf.int8, name=None)

def getImage(file_path):
    # Load the raw data from the file as a string
    img = tf.io.read_file(file_path)
    # Convert the compressed string to a 3D uint8 tensor
    img = tf.io.decode_bmp(img, channels=3)
    # cast tf.Tensor type to uint8 
    return tf.cast(img, dtype=tf.uint8, name=None)

def process_path(file_path):
    label = getLabel(file_path)
    img = getImage(file_path)
    return img, label

path = './dataset/path/*.bmp'
ds = tf.data.Dataset.list_files(path)
ds = ds.map(process_path)

At the end of this process you get a trainable TensorFlow dataset with labels as one-hot vectors (batching requires additional configuration; see the reference).

When running:

for image, label in ds.take(5):
    imageShape = image.numpy().shape
    label = label.numpy()
    labelName = classNames[np.argmax(label)]
    print('Image Shape: {}, Label: {}, LabelName: {}'.format(imageShape, label, labelName))

you get:

Image Shape: (180, 180, 3), Label: [1 0 0], LabelName: dog
Image Shape: (180, 180, 3), Label: [0 1 0], LabelName: cat
Image Shape: (180, 180, 3), Label: [0 0 1], LabelName: horse
Image Shape: (180, 180, 3), Label: [1 0 0], LabelName: dog
Image Shape: (180, 180, 3), Label: [0 1 0], LabelName: cat
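
As noted above, batching needs additional configuration before training. A minimal sketch of that step (not part of the original answer; the shuffle buffer size and the batch size of 32 are arbitrary example values, and model stands for a hypothetical compiled Keras model):

train_ds = ds.shuffle(buffer_size=1000)         # shuffle with an example buffer size
train_ds = train_ds.batch(32)                   # example batch size
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)  # overlap preprocessing with training
# model.fit(train_ds, epochs=10)                # 'model' is a hypothetical compiled Keras model

Keeping the batched pipeline in a separate train_ds leaves the unbatched ds usable for the per-example inspection loop above.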

Reference:
https://www.tensorflow.org/tutorials/load_data/images

深海少女心 2025-01-25 14:56:17


You can try assigning an ID to each path and gathering paths based on whatever IDs you're using for your training set.

If you're using TensorFlow, the tf.data.Dataset documentation has informative methods for loading data. Specifically:

dataset_dog = tf.data.Dataset.list_files("./dataset/path/*dog.bmp")
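
Building on that, one possible way to turn such per-class file patterns into a single labeled dataset (a sketch, not from the original answer; the class list, one-hot encoding and variable names are illustrative) is to pair each per-class file list with a constant label and concatenate the results:

import tensorflow as tf

class_names = ['cat', 'dog', 'horse']
per_class = []
for i, name in enumerate(class_names):
    # one file-pattern dataset per class, e.g. ./dataset/path/*dog.bmp
    files = tf.data.Dataset.list_files('./dataset/path/*{}.bmp'.format(name))
    # constant one-hot label shared by every file of this class
    label = tf.one_hot(i, depth=len(class_names))
    per_class.append(files.map(lambda path, label=label: (path, label)))

# merge the per-class datasets and shuffle, giving (file_path, one_hot_label) pairs
ds = per_class[0]
for d in per_class[1:]:
    ds = ds.concatenate(d)
ds = ds.shuffle(buffer_size=100)

From there, each (file_path, one_hot_label) pair can be mapped through an image-decoding function like the getImage in the first answer.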

尽揽少女心 2025-01-25 14:56:17


Here is the whole enchilada. I prefer to use a pandas dataframe along with ImageDataGenerator.flow_from_dataframe because it is flexible. I created a directory, single, with 10 images of cranes and 10 images of albatrosses. Filenames are of the form 0_crane.jpg, 1_crane.jpg, etc. ... 10_albatross, 11_albatross ..... The code below processes this directory: it creates a dataframe df, then splits it into a train_df, a valid_df and a test_df. Then 3 image data generators are created: train_gen, test_gen and valid_gen. I used a standard model I usually use and trained it, then evaluated the test set with 100% accuracy. The code is below.

import numpy as np
import pandas as pd
import os
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Activation,Dropout,Conv2D, MaxPooling2D,BatchNormalization
from tensorflow.keras.optimizers import Adam, Adamax
from tensorflow.keras.metrics import categorical_crossentropy
from tensorflow.keras import regularizers
from tensorflow.keras.models import Model

sdir = r'c:\temp\single'
filepaths=[]
labels=[]
flist=os.listdir(sdir)
for f in flist:
    fpath=os.path.join(sdir,f)   
    filepaths.append(fpath)
    index1=f.rfind('_')+1
    index2=f.rfind('.')    
    klass=f[index1:index2]     
    labels.append(klass)
Fseries=pd.Series(filepaths, name='filepaths')
Lseries=pd.Series(labels, name='labels')
df=pd.concat([Fseries, Lseries], axis=1)
# now you can use train_test_split to create a train_df, a test_df and a valid_df
trsplit=.8
vsplit=.1
dsplit=vsplit/(1-trsplit)
train_df, dummy_df=train_test_split (df, train_size=trsplit, shuffle=True, random_state=123, stratify=df['labels'])
valid_df, test_df= train_test_split(dummy_df, train_size=dsplit, shuffle=True, random_state=123, stratify =dummy_df['labels'])
print ('train_df length: ', len(train_df), ' test_df length: ', len(test_df) ,'  valid_df length: ', len(valid_df))

# create a train_gen,  a valid_gen and a test_gen
# for trgen you can specify augmentations like horizontal_flip, vertical_flip etc
img_size=(224,224)
batch_size=16
trgen=ImageDataGenerator(horizontal_flip=True,rotation_range=20, width_shift_range=.2,
                                  height_shift_range=.2, zoom_range=.2   )
tvgen=ImageDataGenerator()
train_gen=trgen.flow_from_dataframe(train_df, x_col='filepaths', y_col='labels',target_size=img_size,
                                   class_mode='categorical', color_mode='rgb', shuffle=True, batch_size=batch_size)
valid_gen=tvgen.flow_from_dataframe(valid_df, x_col='filepaths', y_col='labels',target_size=img_size,
                                   class_mode='categorical', color_mode='rgb', shuffle=False, batch_size=2)
test_gen=tvgen.flow_from_dataframe(test_df, x_col='filepaths', y_col='labels',target_size=img_size,
                                   class_mode='categorical', color_mode='rgb', shuffle=False, batch_size=2)
train_files=train_gen.filenames
classes=list(train_gen.class_indices.keys())
class_count=len(classes)
labels=train_gen.labels
images, labels=next(train_gen)
plt.figure(figsize=(20, 12))
for i in range (len(images)):
    plt.subplot(4,4, i+1) 
    index=np.argmax(labels[i])
    class_name=classes[index]
    plt.title(class_name, color='yellow', fontsize=18)   
    plt.axis('off')
    plt.imshow(images[i]/255)
plt.show()
# build a model
img_shape=(img_size[0], img_size[1], 3)
model_name='EfficientNetB5'
base_model=tf.keras.applications.efficientnet.EfficientNetB5(include_top=False, weights="imagenet",input_shape=img_shape, pooling='max') 
# Note you are always told NOT to make the base model trainable initially- that is WRONG you get better results leaving it trainable
base_model.trainable=True
x=base_model.output
x=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
x = Dense(1024, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
                bias_regularizer=regularizers.l1(0.006) ,activation='relu')(x)
x=Dropout(rate=.3, seed=123)(x)
x = Dense(128, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
                bias_regularizer=regularizers.l1(0.006) ,activation='relu')(x)
x=Dropout(rate=.45, seed=123)(x)        
output=Dense(class_count, activation='softmax')(x)
model=Model(inputs=base_model.input, outputs=output)
lr=.001 # start with this learning rate
model.compile(Adamax(learning_rate=lr), loss='categorical_crossentropy', metrics=['accuracy']) 

epochs=5
history=model.fit(x=train_gen,  epochs=epochs, verbose=1,  validation_data=valid_gen,
               validation_steps=None,  shuffle=False,  initial_epoch=0)

loss, acc =model.evaluate(test_gen)
print (' accuracy on test set is: ', acc* 100, '%')
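
As a possible follow-up (a sketch, not part of the original answer), per-image predictions on the test set can be mapped back to class names through the classes list built earlier; this relies on test_gen being created with shuffle=False so its filenames stay aligned with the prediction order:

preds = model.predict(test_gen)                  # one row of class probabilities per test image
pred_indices = np.argmax(preds, axis=1)          # index of the highest-probability class
pred_names = [classes[i] for i in pred_indices]  # 'classes' was built from train_gen.class_indices above
for fname, pred_name in zip(test_gen.filenames, pred_names):
    print(fname, ' predicted as ', pred_name)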