How to fine-tune a T5 model properly
I'm fine-tuning a t5-base model following this notebook.
However, the loss on both the validation set and the training set decreases very slowly. I changed the learning_rate to a larger number, but it did not help. Eventually, the BLEU score on the validation set was low (around 13.7), and the translation quality was poor as well.
***** Running Evaluation *****
Num examples = 1000
Batch size = 32
{'eval_loss': 1.06500244140625, 'eval_bleu': 13.7229, 'eval_gen_len': 17.564, 'eval_runtime': 16.7915, 'eval_samples_per_second': 59.554, 'eval_steps_per_second': 1.906, 'epoch': 5.0}
If I use the "Helsinki-NLP/opus-mt-en-ro" model instead, the loss decreases properly, and in the end the fine-tuned model works pretty well.
How do I fine-tune t5-base properly? Did I miss something?
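For reference, my setup is essentially the notebook's; a rough sketch of it is below. tokenized_datasets and compute_metrics are built exactly as in the notebook, and the learning rate shown is the notebook's default, which I experimented with raising:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-finetuned-en-to-ro",
    evaluation_strategy="epoch",
    learning_rate=2e-5,              # notebook default; I also tried larger values
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,   # matches "Batch size = 32" in the log above
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=5,              # the log above is from epoch 5.0
    predict_with_generate=True,
)

# tokenized_datasets / compute_metrics come from the notebook's
# preprocessing and metric cells.
trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```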
Comments (1)
I think the metrics shown in the tutorial are for the already trained EN>RO opus-mt model, which was then fine-tuned. I don't see a before-and-after comparison of its metrics, so it is hard to tell how much of a difference that fine-tuning really made.
You generally shouldn't expect the same results from fine-tuning T5, which is not a (pure) machine translation model. More important is the difference in metrics before and after the fine-tuning.
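A minimal sketch of how to capture that difference, assuming a Seq2SeqTrainer built as in the notebook (the trainer variable below stands in for that setup):

```python
# Evaluate the pretrained checkpoint first to record the "before" BLEU,
# then fine-tune and evaluate again to get the "after" BLEU.
before = trainer.evaluate()
trainer.train()
after = trainer.evaluate()
print("BLEU before:", before["eval_bleu"], "-> after:", after["eval_bleu"])
```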
Two things I could imagine having gone wrong with your training:

First, did you use the task prefix ("translate English to Romanian: ") for both your training and your evaluation? If you did not, you might have been training a new task from scratch rather than using the bit of pre-training the model did on MT to Romanian (and German, and perhaps some other languages). You can see how that affects the model behavior in this inference demo: Language used during pretraining and Language not used during pretraining.
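Concretely, the prefixing belongs in the preprocessing step. A minimal sketch, assuming the wmt16 en-ro layout used in the notebook, where every example carries a "translation" dict with "en" and "ro" keys:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")

# T5 was pretrained with task prefixes, so the model relies on this
# string to know which of its tasks it should perform.
prefix = "translate English to Romanian: "

def preprocess_function(examples):
    inputs = [prefix + ex["en"] for ex in examples["translation"]]
    targets = [ex["ro"] for ex in examples["translation"]]
    model_inputs = tokenizer(inputs, max_length=128, truncation=True)
    # On older transformers versions, tokenize the targets inside
    # "with tokenizer.as_target_tokenizer():" instead of text_target=.
    labels = tokenizer(text_target=targets, max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

The same prefix has to be prepended for evaluation and inference as well, otherwise the model is again asked for a task it was never told about.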
Second, t5-base is a relatively small model, but if you stuck with the num_train_epochs=1 from the tutorial, your number of training epochs is probably far too low to make a noticeable difference. Try increasing the epochs for as long as you get significant performance boosts from it; in the example, this is probably the case for at least the first 5 to 10 epochs.
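The only change needed relative to the tutorial's arguments is the epoch count; something like this (a sketch, with all other arguments left as you have them):

```python
from transformers import Seq2SeqTrainingArguments

# Same setup as before, only with more epochs; evaluating every epoch
# shows at which point the BLEU gains flatten out.
args = Seq2SeqTrainingArguments(
    output_dir="t5-base-finetuned-en-to-ro",
    evaluation_strategy="epoch",
    num_train_epochs=10,   # instead of the tutorial's num_train_epochs=1
    predict_with_generate=True,
    # ...keep your remaining arguments (learning rate, batch sizes, etc.)
)
```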
I actually did something very similar before, for EN>DE (German): I fine-tuned both opus-mt-en-de and t5-base on a custom dataset of 30,000 samples for 10 epochs. BLEU for opus-mt-en-de increased from 0.256 to 0.388, and for t5-base from 0.166 to 0.340, just to give you an idea of what to expect. Romanian, or the dataset you use, might be more of a challenge for the model and result in different scores, though.