TensorFlow-DirectML vs TensorFlow-CPU
I'm currently starting to study CNNs in Python with TensorFlow. I understand that stock TensorFlow uses CUDA for GPU acceleration, so I tried tensorflow-directml instead, because I'm using an AMD GPU (RX 580, with an i3 10100F CPU). I tried to build a basic model for classifying objects in the CIFAR-10 dataset:
from tensorflow.keras import layers, models
from tensorflow.keras.layers import Dropout

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.1))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.1))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.1))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(10))
Note: trained with the Adam optimizer at its default learning rate.
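For completeness, a minimal sketch of how the model above might be compiled and trained; the data loading and the from_logits loss are assumptions, inferred from the note above and from the final Dense(10) layer having no softmax:

import tensorflow as tf
from tensorflow.keras import datasets

# Load CIFAR-10 and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Adam with its default learning rate (0.001); from_logits=True because
# the final Dense(10) layer has no softmax activation
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))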
My question is not about what the proper model is, but why there is such a significant performance gap, with tensorflow-CPU running faster than tensorflow-directml. One epoch over the 50,000 training images takes ~3 minutes on my CPU, whereas with DirectML the same epoch takes ~13 minutes. What causes this performance difference, and in which cases should I use my GPU rather than my CPU?
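For reference, a quick way to confirm which devices TensorFlow actually sees; the device names mentioned in the comment are assumptions, since the TF 1.15-based DirectML fork exposes a DML device while the TF2 plugin exposes the card as a GPU device:

import tensorflow as tf

# The AMD card should appear here as a DML (TF1 fork) or GPU (TF2 plugin)
# device alongside the CPU; if it doesn't, training silently runs on the CPU.
print(tf.config.experimental.list_physical_devices())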
1 Answer
It's probably because you have a more modern CPU than GPU. I ran a CNN on my CPU (R5 5500) and it was only a little slower than on my GPU (Radeon RX 5600 XT).
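One way to sanity-check this is to time a single epoch with ops explicitly pinned to each device. This is only a rough sketch: the '/DML:0' name assumes the TF 1.15-based tensorflow-directml package (the TF2 plugin exposes the card as '/GPU:0'), and the model here is a trimmed-down stand-in for the one in the question.

import time
import tensorflow as tf
from tensorflow.keras import datasets, layers, models

(x_train, y_train), _ = datasets.cifar10.load_data()
x_train = x_train / 255.0

def build_model():
    # Trimmed-down stand-in for the model in the question
    m = models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(10),
    ])
    m.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    return m

def time_one_epoch(device_name):
    # Pin model creation and training to one device and time a full epoch
    with tf.device(device_name):
        model = build_model()
        start = time.time()
        model.fit(x_train, y_train, epochs=1, batch_size=64, verbose=0)
    return time.time() - start

print('CPU:     ', time_one_epoch('/CPU:0'))
print('DirectML:', time_one_epoch('/DML:0'))  # or '/GPU:0' with the TF2 plugin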