Transformers fine-tuning error: forward() got an unexpected keyword argument 'labels'



I am trying to ensemble five transformers with the following code:

from transformers import (XLNetForSequenceClassification, ElectraForSequenceClassification,
                          DistilBertForSequenceClassification, BertForSequenceClassification,
                          RobertaForSequenceClassification)

bert_model_1 = XLNetForSequenceClassification.from_pretrained('xlnet-base-cased', num_labels=2)
bert_model_2 = ElectraForSequenceClassification.from_pretrained('google/electra-base-discriminator', num_labels=2)
bert_model_3 = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)
bert_model_4 = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
bert_model_5 = RobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=2)

ensemble_model = concatenate(bert_model_1, bert_model_2, bert_model_3, bert_model_4, bert_model_5)
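The concatenate call above is pseudocode for a custom wrapper module that combines the five classifiers. Roughly, the wrapper I have in mind looks like the sketch below (class and variable names are illustrative; it simply averages the five models' logits, and note that its forward() only accepts input_ids and attention_mask):

import torch
import torch.nn as nn

class EnsembleModel(nn.Module):
    # Illustrative wrapper: average the logits of the five sequence classifiers.
    def __init__(self, *models):
        super().__init__()
        self.models = nn.ModuleList(models)

    def forward(self, input_ids, attention_mask):
        all_logits = torch.stack(
            [m(input_ids=input_ids, attention_mask=attention_mask).logits
             for m in self.models],
            dim=0,
        )
        return all_logits.mean(dim=0)

ensemble_model = EnsembleModel(bert_model_1, bert_model_2, bert_model_3,
                               bert_model_4, bert_model_5)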

When I run each of the above models individually with the Hugging Face Trainer class, I do not get any error; each returns the expected logits.
My Trainer setup is as follows:

trainer = Trainer(
    model=ensemble_model,
    args=training_args,
    train_dataset=train_set_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
)
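For completeness, training_args is a standard TrainingArguments object along these lines (the values shown are illustrative, not the exact ones from my run):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # where checkpoints are written
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    evaluation_strategy="epoch",     # evaluate at the end of each epoch
    logging_steps=50,
)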

When I pass ensemble_model to the Trainer, I get the following error:

TypeError                                 Traceback (most recent call last)

<ipython-input-24-ae887539b4ab> in <module>()
     11 )
     12 
---> 13 trainer.train()

3 frames

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1398                         tr_loss_step = self.training_step(model, inputs)
   1399                 else:
-> 1400                     tr_loss_step = self.training_step(model, inputs)
   1401 
   1402                 if (

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
   1982 
   1983         with self.autocast_smart_context_manager():
-> 1984             loss = self.compute_loss(model, inputs)
   1985 
   1986         if self.args.n_gpu > 1:

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
   2014         else:
   2015             labels = None
-> 2016         outputs = model(**inputs)
   2017         # Save past state if it exists
   2018         # TODO: this needs to be fixed and made cleaner later.

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

TypeError: forward() got an unexpected keyword argument 'labels'

When I train the individual models, I pass in their respective tokenizers; however, since I cannot ensemble tokenizers, I used BertTokenizer when defining the tokenizer in the Trainer. So there is only one.
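Concretely, the single tokenizer passed to the Trainer is created like this (illustrative):

from transformers import BertTokenizer

# One tokenizer shared by all five models, even though each model normally has its own.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')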

I have created the dataset using the following class:

import torch

class TheDataset(torch.utils.data.Dataset):

    def __init__(self, tweets, labels, tokenizer):
        self.tweets    = tweets
        self.labels    = labels
        self.tokenizer = tokenizer
        self.max_len   = MAX_LEN  # MAX_LEN is set elsewhere in the notebook
  
    def __len__(self):
        return len(self.tweets)
  
    def __getitem__(self, index):
        Tweet = str(self.tweets[index])
        Label = self.labels[index]

        encoded_tweet = self.tokenizer.encode_plus(
            Tweet,
            add_special_tokens    = True,
            max_length            = self.max_len,
            return_token_type_ids = False,
            return_attention_mask = True,
            return_tensors        = "pt",
            padding               = "max_length",
            truncation            = True
        )

        return {
            'input_ids': encoded_tweet['input_ids'][0],
            'attention_mask': encoded_tweet['attention_mask'][0],
            'labels': torch.tensor(Label, dtype=torch.long)
        }
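The datasets passed to the Trainer are instances of this class, built roughly as follows (variable names are illustrative):

train_set_dataset = TheDataset(train_tweets, train_labels, tokenizer)
val_dataset       = TheDataset(val_tweets, val_labels, tokenizer)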

As you can see, the returned labels are already tensors. The suggested accepted solution (this solution) explains that BertModel does not have a labels argument, but all of the models I am using are the respective XForSequenceClassification classes. I suspect the pretrained model names could be an issue, or what am I missing here?

Your answers and suggestions will be highly appreciated.
