I have a few questions about training and evaluating AllenNLP's coreference resolution model.
-
Are there any constraints/specifications on what GPUs should be used for training? I get an OOM issue midway through training on a Titan RTX GPU with 24220 MiB memory. Are there any parameters I can change that might help (note: I am using the BERT instead of the SpanBERT version)?
-
I noticed that the model usage examples use an already trained and stored model. Can we instead specify a model path from a model we have trained?
-
Can we substitute roberta-base with bert-base-uncased in the coref_bert-lstm.jsonnet file, or are other modifications necessary to make this change?
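For context, the transformer model name is typically declared near the top of the config and referenced by the token indexer. A minimal sketch of the swap, with field names assumed from AllenNLP's standard coref configs (check them against your version of coref_bert-lstm.jsonnet):

```jsonnet
// Sketch only: key names are assumptions based on typical AllenNLP coref configs.
local transformer_model = "bert-base-uncased";  // swapped in for "roberta-base"

{
  "dataset_reader": {
    "token_indexers": {
      "tokens": {
        // "mismatched" variant aligns wordpieces with the reader's own tokens
        "type": "pretrained_transformer_mismatched",
        "model_name": transformer_model,
        "max_length": 512
      }
    }
  }
}
```

Note that bert-base-uncased and roberta-base use different tokenizers and vocabularies, so any serialization directory or cached vocabulary built with the old model should not be reused after the swap.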
Answers (1)
The max_length parameter makes the biggest difference to memory usage. If you can get away with a max length that's shorter than 512, try that first.
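Concretely, max_length usually appears in two places in the config, the token indexer and the matching token embedder, and both should be lowered together. A hedged sketch, with key names assumed from the standard AllenNLP coref configs:

```jsonnet
// Sketch only: key names are assumptions; check your coref_bert-lstm.jsonnet.
"token_indexers": {
  "tokens": {
    "type": "pretrained_transformer_mismatched",
    "model_name": "bert-base-uncased",
    "max_length": 256  // was 512; longer documents get split into shorter segments
  }
},
"text_field_embedder": {
  "token_embedders": {
    "tokens": {
      "type": "pretrained_transformer_mismatched",
      "model_name": "bert-base-uncased",
      "max_length": 256  // must match the indexer's value
    }
  }
}
```

A shorter max_length shrinks the transformer's attention matrices (memory in self-attention grows quadratically with sequence length), which is why it is the first knob to try for OOM errors.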