Transformers fine-tuning error: forward() got an unexpected keyword argument 'labels'



I am trying to ensemble five transformers with the following code:

from transformers import (XLNetForSequenceClassification, ElectraForSequenceClassification,
                          DistilBertForSequenceClassification, BertForSequenceClassification,
                          RobertaForSequenceClassification)

bert_model_1 = XLNetForSequenceClassification.from_pretrained('xlnet-base-cased', num_labels=2)
bert_model_2 = ElectraForSequenceClassification.from_pretrained('google/electra-base-discriminator', num_labels=2)
bert_model_3 = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)
bert_model_4 = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
bert_model_5 = RobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=2)

ensemble_model = concatenate(bert_model_1, bert_model_2, bert_model_3, bert_model_4, bert_model_5)
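The concatenate call above is pseudocode for a custom wrapper module that combines the five classifiers. Roughly, the wrapper I have in mind looks like the sketch below (class and variable names are illustrative; it simply averages the five models' logits, and note that its forward() only accepts input_ids and attention_mask):

import torch
import torch.nn as nn

class EnsembleModel(nn.Module):
    # Illustrative wrapper: average the logits of the five sequence classifiers.
    def __init__(self, *models):
        super().__init__()
        self.models = nn.ModuleList(models)

    def forward(self, input_ids, attention_mask):
        all_logits = torch.stack(
            [m(input_ids=input_ids, attention_mask=attention_mask).logits
             for m in self.models],
            dim=0,
        )
        return all_logits.mean(dim=0)

ensemble_model = EnsembleModel(bert_model_1, bert_model_2, bert_model_3,
                               bert_model_4, bert_model_5)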

When I run each of the above models individually with the Hugging Face Trainer class, I do not get any error; each returns the expected logits.
My Trainer setup is as follows:

trainer = Trainer(
    model=ensemble_model,
    args=training_args,
    train_dataset=train_set_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
)
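For completeness, training_args is a standard TrainingArguments object along these lines (the values shown are illustrative, not the exact ones from my run):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # where checkpoints are written
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    evaluation_strategy="epoch",     # evaluate at the end of each epoch
    logging_steps=50,
)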

When I pass ensemble_model to the Trainer, I get the following error:

TypeError                                 Traceback (most recent call last)

<ipython-input-24-ae887539b4ab> in <module>()
     11 )
     12 
---> 13 trainer.train()

3 frames

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1398                         tr_loss_step = self.training_step(model, inputs)
   1399                 else:
-> 1400                     tr_loss_step = self.training_step(model, inputs)
   1401 
   1402                 if (

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
   1982 
   1983         with self.autocast_smart_context_manager():
-> 1984             loss = self.compute_loss(model, inputs)
   1985 
   1986         if self.args.n_gpu > 1:

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
   2014         else:
   2015             labels = None
-> 2016         outputs = model(**inputs)
   2017         # Save past state if it exists
   2018         # TODO: this needs to be fixed and made cleaner later.

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

TypeError: forward() got an unexpected keyword argument 'labels'

When I train the individual models, I pass in their respective tokenizers; however, since I cannot ensemble tokenizers, I used BertTokenizer when defining the tokenizer in the Trainer. So there is only one.
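Concretely, the single tokenizer passed to the Trainer is created like this (illustrative):

from transformers import BertTokenizer

# One tokenizer shared by all five models, even though each model normally has its own.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')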

I have created the dataset using the following class:

import torch

class TheDataset(torch.utils.data.Dataset):

    def __init__(self, tweets, labels, tokenizer):
        self.tweets    = tweets
        self.labels    = labels
        self.tokenizer = tokenizer
        self.max_len   = MAX_LEN  # MAX_LEN is set elsewhere in the notebook
  
    def __len__(self):
        return len(self.tweets)
  
    def __getitem__(self, index):
        Tweet = str(self.tweets[index])
        Label = self.labels[index]

        encoded_tweet = self.tokenizer.encode_plus(
            Tweet,
            add_special_tokens    = True,
            max_length            = self.max_len,
            return_token_type_ids = False,
            return_attention_mask = True,
            return_tensors        = "pt",
            padding               = "max_length",
            truncation            = True
        )

        return {
            'input_ids': encoded_tweet['input_ids'][0],
            'attention_mask': encoded_tweet['attention_mask'][0],
            'labels': torch.tensor(Label, dtype=torch.long)
        }
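The datasets passed to the Trainer are instances of this class, built roughly as follows (variable names are illustrative):

train_set_dataset = TheDataset(train_tweets, train_labels, tokenizer)
val_dataset       = TheDataset(val_tweets, val_labels, tokenizer)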

As you can see, the returned labels are already tensors. The suggested accepted solution (this solution) explains that BertModel does not have a labels argument, but all of the models I am using are the respective XForSequenceClassification classes. I suspect the pretrained model names could be an issue, or what am I missing here?

Your answers and suggestions will be highly appreciated.
