This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 4 months ago.
With all GPT models you can specify the "max_length" parameter during generation. This caps the total number of tokens the model produces (it may still stop earlier at an end-of-sequence token; pair it with "min_length" if you need a floor). You could also play with num_return_sequences and use a helper function to choose the shortest sequence.
Example:
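A minimal sketch of that idea using the Hugging Face transformers pipeline API; the model name ("gpt2") and the sampling settings are illustrative stand-ins, and shortest_sequence is a hypothetical helper, not part of the library:

```python
def shortest_sequence(sequences):
    """Helper: pick the shortest of several generated texts."""
    return min(sequences, key=len)

def generate_short_answer(prompt, max_length=40, num_return_sequences=5):
    # Heavy dependency imported inside the function so the helper above
    # stays usable without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")  # any GPT-style model
    outputs = generator(
        prompt,
        max_length=max_length,                    # hard cap on total tokens
        num_return_sequences=num_return_sequences,
        do_sample=True,                           # sampling so candidates differ
    )
    return shortest_sequence(o["generated_text"] for o in outputs)
```

Calling `generate_short_answer("The quickest fix is")` would generate five candidate completions capped at 40 tokens and return the shortest one.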
These large language models are trained on massive amounts of data, and fine-tuning them can take patience as they learn to adapt to what you're feeding them. Try different things - adjust your training data format, try different samples, use a pre-prompt during generation to guide the model, etc. A model like GPT-J does a mind-numbingly large amount of calculations just to spit out a single word, so it is hard to predict what exactly is causing it to say one thing over another.