TensorFlow-DirectML vs TensorFlow-CPU
I'm currently starting to study CNNs in Python with TensorFlow. I understand that stock TensorFlow uses CUDA for GPU acceleration, so I tried tensorflow-directml instead, because I'm using an AMD GPU (RX 580, with an i3 10100F CPU). I tried to build a basic model for classifying objects in the CIFAR-10 dataset:
from tensorflow.keras import layers, models
from tensorflow.keras.layers import Dropout

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.1))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.1))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.1))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(10))
Note: trained with the Adam optimizer at its default learning rate.
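For completeness, a minimal sketch of how the model above might be compiled and trained; the data loading and the from_logits loss are assumptions, inferred from the note above and from the final Dense(10) layer having no softmax:

import tensorflow as tf
from tensorflow.keras import datasets

# Load CIFAR-10 and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Adam with its default learning rate (0.001); from_logits=True because
# the final Dense(10) layer has no softmax activation
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))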
My question is not about what the proper model is, but why there is such a significant performance gap, with tensorflow-CPU running faster than tensorflow-directml. One epoch over the 50,000 training images takes ~3 minutes on my CPU, whereas with DirectML the same epoch takes ~13 minutes. What causes this performance difference, and in which cases should I use my GPU rather than my CPU?
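For reference, a quick way to confirm which devices TensorFlow actually sees; the device names mentioned in the comment are assumptions, since the TF 1.15-based DirectML fork exposes a DML device while the TF2 plugin exposes the card as a GPU device:

import tensorflow as tf

# The AMD card should appear here as a DML (TF1 fork) or GPU (TF2 plugin)
# device alongside the CPU; if it doesn't, training silently runs on the CPU.
print(tf.config.experimental.list_physical_devices())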
1 Answer
It's probably because you have a more modern CPU than GPU. I ran a CNN on my CPU (R5 5500) and it was only a little slower than on my GPU (Radeon RX 5600 XT).
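One way to sanity-check this is to time a single epoch with ops explicitly pinned to each device. This is only a rough sketch: the '/DML:0' name assumes the TF 1.15-based tensorflow-directml package (the TF2 plugin exposes the card as '/GPU:0'), and the model here is a trimmed-down stand-in for the one in the question.

import time
import tensorflow as tf
from tensorflow.keras import datasets, layers, models

(x_train, y_train), _ = datasets.cifar10.load_data()
x_train = x_train / 255.0

def build_model():
    # Trimmed-down stand-in for the model in the question
    m = models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(10),
    ])
    m.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    return m

def time_one_epoch(device_name):
    # Pin model creation and training to one device and time a full epoch
    with tf.device(device_name):
        model = build_model()
        start = time.time()
        model.fit(x_train, y_train, epochs=1, batch_size=64, verbose=0)
    return time.time() - start

print('CPU:     ', time_one_epoch('/CPU:0'))
print('DirectML:', time_one_epoch('/DML:0'))  # or '/GPU:0' with the TF2 plugin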