How to fine-tune a T5 model properly

Posted 2025-01-16 18:58:23


I'm fine-tuning a t5-base model following this notebook. However, the loss on both the training and validation sets decreases very slowly. I changed learning_rate to a larger value, but it did not help. In the end, the BLEU score on the validation set was low (around 13.7), and the translation quality was poor as well.

***** Running Evaluation *****
  Num examples = 1000
  Batch size = 32
{'eval_loss': 1.06500244140625, 'eval_bleu': 13.7229, 'eval_gen_len': 17.564, 'eval_runtime': 16.7915, 'eval_samples_per_second': 59.554, 'eval_steps_per_second': 1.906, 'epoch': 5.0}

If I use the "Helsinki-NLP/opus-mt-en-ro" model instead, the loss decreases properly, and at the end the fine-tuned model works pretty well.

How do I fine-tune t5-base properly? Am I missing something?
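
For reference, my training arguments follow the notebook's pattern, roughly like this sketch (values illustrative rather than my exact script):

```python
# Rough sketch of the training arguments I'm using (illustrative values).
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-finetuned-en-ro",
    learning_rate=2e-5,              # I also tried larger values here
    per_device_train_batch_size=32,  # matches the eval batch size in the log
    per_device_eval_batch_size=32,
    num_train_epochs=5,              # the log above is from epoch 5.0
    predict_with_generate=True,      # so evaluation decodes and scores BLEU
    evaluation_strategy="epoch",     # eval_strategy= in newer transformers releases
)
```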


Comments (1)

給妳壹絲溫柔 2025-01-23 18:58:23


I think the metrics shown in the tutorial are for the already-trained EN>RO opus-mt model, which was then fine-tuned. I don't see a before-and-after comparison of its metrics, so it is hard to tell how much of a difference the fine-tuning really made.

You generally shouldn't expect the same results from fine-tuning T5, which is not a (pure) machine-translation model. What matters more is the difference in metrics before and after fine-tuning.
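
To make that comparison concrete, something like the following sketch (my own illustration, with an assumed list of (english, romanian) validation pairs) scores the unmodified checkpoint before any fine-tuning, so the post-fine-tuning BLEU has a baseline:

```python
# Sketch: baseline BLEU of the *unmodified* t5-base checkpoint, so the
# fine-tuned score has something to be compared against.
# `pairs` is an assumed list of (english, romanian) validation tuples.
import evaluate
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
bleu = evaluate.load("sacrebleu")

def baseline_bleu(pairs):
    preds, refs = [], []
    for en, ro in pairs:
        inputs = tokenizer("translate English to Romanian: " + en,
                           return_tensors="pt", truncation=True)
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=128)
        preds.append(tokenizer.decode(out[0], skip_special_tokens=True))
        refs.append([ro])  # sacrebleu expects a list of references per prediction
    return bleu.compute(predictions=preds, references=refs)["score"]
```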

Two things I could imagine having gone wrong with your training:

  1. Did you add the proper T5 prefix ("translate English to Romanian: ") to the input sequences, for both your training and your evaluation? If you did not, you may have been training a new task from scratch instead of benefiting from the pre-training the model did on MT to Romanian (and German, and perhaps a few other languages). You can see how the prefix affects model behavior in this inference demo: Language used during pretraining vs. Language not used during pretraining. (See the sketch after this list.)
  2. If you chose a relatively small model like t5-base but stuck with the tutorial's num_train_epochs=1, your epoch count is probably far too low to make a noticeable difference. Keep increasing the epochs for as long as you get significant performance gains from it; in this example that is probably the case for at least the first 5 to 10 epochs. (Also shown in the sketch below.)
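
A minimal sketch that puts both points together (the dataset, hyperparameters, and helper names here are my own illustration, not taken from the tutorial):

```python
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

# Point 1: prepend T5's pretraining task prefix to every source sentence,
# in *both* training and evaluation preprocessing.
prefix = "translate English to Romanian: "

def preprocess(batch):
    # wmt16 "ro-en" rows look like {"translation": {"en": ..., "ro": ...}}
    inputs = [prefix + ex["en"] for ex in batch["translation"]]
    targets = [ex["ro"] for ex in batch["translation"]]
    enc = tokenizer(inputs, max_length=128, truncation=True)
    # text_target= applies target-side tokenization (transformers >= 4.22)
    enc["labels"] = tokenizer(text_target=targets, max_length=128,
                              truncation=True)["input_ids"]
    return enc

raw = load_dataset("wmt16", "ro-en")
tokenized = raw.map(preprocess, batched=True,
                    remove_columns=raw["train"].column_names)

# Point 2: give the model more than one pass over the data.
args = Seq2SeqTrainingArguments(
    output_dir="t5-base-en-ro",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    num_train_epochs=10,  # instead of the tutorial's 1; stop when BLEU plateaus
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```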

I actually did something very similar to what you are doing, for EN>DE (German). I fine-tuned both opus-mt-en-de and t5-base on a custom dataset of 30,000 samples for 10 epochs. opus-mt-en-de BLEU increased from 0.256 to 0.388 and t5-base from 0.166 to 0.340, just to give you an idea of what to expect. Romanian (and the dataset you use) might be more of a challenge for the model and result in different scores, though.
