I am currently working on an audio classification task using YAMNet, a pretrained model from TF Hub. I use it to extract embeddings from audio clips, and then I use a second, simple classification model made of two dense layers, which takes the YAMNet embeddings as input and does the classification.
The problem is that the embeddings given by YAMNet always lead to the third class having the highest value, so it is always the predicted class.
If anyone has worked on this kind of issue, I would appreciate your help. Thanks in advance.
I followed this tutorial: https://blog.tensorflow.org/2021/03/transfer-learning-for-audio-data-with-yamnet.html
Comments (1)
It sounds like your data is not evenly distributed across the classes, so your model is overfitting to the "third class" in your dataset. I would look into splitting the data into training, validation, and test sets with a stratified method, so that every class is represented in each split.
Here is a resource on stratified K-fold:
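A quick way to check whether the classes really are unbalanced is to count the labels before training. This is only a minimal sketch: the `labels` list is made-up illustration data, and `class_distribution` is a hypothetical helper, not something from the tutorial.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: fraction of samples} for a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical labels (assumption): class 2, the "third class", dominates.
labels = [0] * 10 + [1] * 15 + [2] * 75

distribution = class_distribution(labels)
majority_class = max(distribution, key=distribution.get)

print(distribution)
print("majority class:", majority_class)
```

If one class holds most of the samples, a classifier can reach a low loss just by always predicting that class, which would match the behaviour described in the question.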
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html
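To illustrate what a stratified split does, here is a rough standard-library sketch that splits sample indices class by class. The `stratified_split` function, the split fractions, and the `labels` list are all assumptions made for illustration. In practice scikit-learn's `StratifiedKFold`, or `train_test_split` with its `stratify` argument, would normally be used instead.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Split sample indices into train/val/test, one class at a time.

    Every class contributes roughly the same fraction of its samples to
    each split. Classes with fewer than three samples can leave the
    training split without that class.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = max(1, round(len(indices) * val_fraction))
        n_test = max(1, round(len(indices) * test_fraction))
        val.extend(indices[:n_val])
        test.extend(indices[n_val:n_val + n_test])
        train.extend(indices[n_val + n_test:])
    return train, val, test


# Hypothetical, unbalanced labels (assumption): 10 / 15 / 75 samples.
labels = [0] * 10 + [1] * 15 + [2] * 75

train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

The returned index lists can then be used to select the matching embeddings and labels for each split. Note that stratifying keeps every class present in every split but does not remove the imbalance itself.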