Transformer fine-tuning error: forward() got an unexpected keyword argument 'labels'
I am trying to ensemble five transformers with the following code:
from transformers import (BertForSequenceClassification, DistilBertForSequenceClassification,
                          ElectraForSequenceClassification, RobertaForSequenceClassification,
                          XLNetForSequenceClassification)

bert_model_1 = XLNetForSequenceClassification.from_pretrained('xlnet-base-cased', num_labels=2)
bert_model_2 = ElectraForSequenceClassification.from_pretrained('google/electra-base-discriminator', num_labels=2)
bert_model_3 = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)
bert_model_4 = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
bert_model_5 = RobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=2)
ensemble_model = concatenate(bert_model_1, bert_model_2, bert_model_3, bert_model_4, bert_model_5)
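To be clear, concatenate is not a Hugging Face or NumPy call; it stands for my own ensembling wrapper. A rough sketch of the kind of module I mean (the names are placeholders, and here I simply average the five sets of logits) looks like this:

import torch
import torch.nn as nn

class EnsembleClassifier(nn.Module):
    # Hypothetical wrapper that averages the logits of several sequence classifiers.
    def __init__(self, models):
        super().__init__()
        self.models = nn.ModuleList(models)

    def forward(self, input_ids, attention_mask):
        # Note: this forward() only accepts input_ids and attention_mask;
        # there is no labels keyword in the signature.
        all_logits = [m(input_ids=input_ids, attention_mask=attention_mask).logits
                      for m in self.models]
        return torch.stack(all_logits, dim=0).mean(dim=0)

def concatenate(*models):
    return EnsembleClassifier(models)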
When I run each of the above models individually with the Hugging Face Trainer class, I do not get any errors; they return the expected logits.
My Trainer setup is as follows:
trainer = Trainer(
    model=ensemble_model,
    args=training_args,
    train_dataset=train_set_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer
)
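For completeness, training_args is an ordinary TrainingArguments object; the exact hyperparameters below are placeholders rather than the values I actually used:

from transformers import TrainingArguments

# Placeholder settings; the specific values are not relevant to the error.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    evaluation_strategy="epoch",
)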
When I pass ensemble_model to the Trainer class, I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-24-ae887539b4ab> in <module>()
11 )
12
---> 13 trainer.train()
3 frames
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1398 tr_loss_step = self.training_step(model, inputs)
1399 else:
-> 1400 tr_loss_step = self.training_step(model, inputs)
1401
1402 if (
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
1982
1983 with self.autocast_smart_context_manager():
-> 1984 loss = self.compute_loss(model, inputs)
1985
1986 if self.args.n_gpu > 1:
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
2014 else:
2015 labels = None
-> 2016 outputs = model(**inputs)
2017 # Save past state if it exists
2018 # TODO: this needs to be fixed and made cleaner later.
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
TypeError: forward() got an unexpected keyword argument 'labels'
When I train the individual models I pass in their respective tokenizers; however, since I cannot ensemble tokenizers, I used a single BertTokenizer when defining the tokenizer for the Trainer. So there is only one.
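Concretely, the one tokenizer I pass to the Trainer is created like this (the checkpoint name is assumed to match bert_model_4):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')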
I have created the dataset using the following class:
import torch

class TheDataset(torch.utils.data.Dataset):
    def __init__(self, tweets, labels, tokenizer):
        self.tweets = tweets
        self.labels = labels
        self.tokenizer = tokenizer
        self.max_len = MAX_LEN  # MAX_LEN is a global constant defined earlier

    def __len__(self):
        return len(self.tweets)

    def __getitem__(self, index):
        tweet = str(self.tweets[index])
        label = self.labels[index]
        encoded_tweet = self.tokenizer.encode_plus(
            tweet,
            add_special_tokens=True,
            max_length=self.max_len,
            return_token_type_ids=False,
            return_attention_mask=True,
            return_tensors="pt",
            padding="max_length",
            truncation=True
        )
        return {
            'input_ids': encoded_tweet['input_ids'][0],
            'attention_mask': encoded_tweet['attention_mask'][0],
            'labels': torch.tensor(label, dtype=torch.long)
        }
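The train and eval datasets handed to the Trainer are then built from the raw tweet and label lists, roughly like this (the variable names and the MAX_LEN value are illustrative):

MAX_LEN = 128  # assumed maximum sequence length

train_set_dataset = TheDataset(train_tweets, train_labels, tokenizer)
val_dataset = TheDataset(val_tweets, val_labels, tokenizer)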
As you can see, the returned labels are already tensors. The suggested accepted answer (this solution) explains that BertModel does not have a labels argument, but all the models I am using are XForSequenceClassification classes, which do accept labels. I suspect the pretrained model names could be the issue, or is there something else I am missing here?
Your answers and suggestions would be highly appreciated.