I am trying to use BertForSequenceClassification for a binary sentiment analysis task, but during evaluation all the logits are the same



Everything seems fine during training, but every 10 steps, when I evaluate the model on the dev set, it returns the same logits for every sample in the batch.

This is my code:

<!-- language: lang-python -->

    def train(self):
        params = [{'params': self.model.parameters()}]
        optimizer = torch.optim.Adam(params, lr=self.args.lr, weight_decay=self.args.weight_decay)
        my_lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer=optimizer, gamma=self.args.decay_rate)
        best_eval_loss = 1e9
        for epoch in range(int(self.args.epochs)):
            total_train_loss = 0
            step = 0
            train_batch_num = 0
            for batch_index, batch in tqdm(enumerate(self.train_loader)):
                self.model.train()
                step += 1
                train_batch_num += 1
                # optimizer.zero_grad()
                sources = list(batch[0])
                label = batch[1]

                inputs = self.tokenizer(sources, return_tensors='pt', padding=True).to(self.device)
                labels = torch.tensor(label).to(self.device)
                labels = labels.squeeze(0)

                logits = self.model(**inputs, labels=labels).logits

                loss = self.model(**inputs, labels=labels).loss

                # logits = outputs.logits
                # loss = self.loss(logits , labels)

                # print('train loss per step:', loss.item())
                total_train_loss += loss.item()

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

                if step % 10 == 0:
                    total_dev_loss = 0
                    self.model.eval()
                    dev_num = 0
                    with torch.no_grad():
                        for batch_index, batch in tqdm(enumerate(self.dev_loader)):
                            sources = list(batch[0])
                            label = batch[1]
                            dev_num += 1

                            inputs = self.tokenizer(sources, return_tensors='pt', padding=True).to(self.device)
                            labels = torch.tensor(label).to(self.device)
                            labels = labels.squeeze(0)

                            logits = self.model(**inputs, labels=labels).logits

                            loss = self.model(**inputs, labels=labels).loss
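
A side note on the loop above: the model is called twice per batch, once for `.logits` and once for `.loss`, even though a single forward pass of BertForSequenceClassification returns both fields when labels are passed. A minimal, self-contained sketch of the single-call pattern (the checkpoint name and the two example sentences below are just placeholders, not from my setup):

<!-- language: lang-python -->

    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    # placeholder checkpoint and sentences, only to illustrate one forward pass
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

    sources = ['a really good movie', 'a really bad movie']
    labels = torch.tensor([1, 0])

    inputs = tokenizer(sources, return_tensors='pt', padding=True)
    outputs = model(**inputs, labels=labels)   # one call returns both fields
    logits = outputs.logits                    # shape (batch_size, 2)
    loss = outputs.loss                        # cross-entropy averaged over the batch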

During training the logits look fine, for example:

  • training logits: tensor([[-0.3385, -1.0671],
    [-0.1526, -1.0956]], device='cuda:0', grad_fn=)

But once execution enters the branch if step % 10 == 0: and evaluates on the dev set, the logits become (almost) identical for every sample, no matter what the input data is. For two different dev batches (input screenshots omitted), the eval logits are:

  • eval logits: tensor([[-0.4534, -0.9629],
    [-0.4533, -0.9628]], device='cuda:0')

  • eval logits: tensor([[-0.4534, -0.9629],
    [-0.4534, -0.9630]], device='cuda:0')
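
For reference, this is the kind of check I can add inside the eval loop to see whether the collapse comes from the inputs or from switching to eval mode; it reuses self.model, self.tokenizer, self.device and one batch from self.dev_loader from the code above, so treat it only as a sketch:

<!-- language: lang-python -->

    # diagnostic sketch: feed the SAME dev batch in train mode and in eval mode
    sources = list(batch[0])                  # one batch taken from self.dev_loader
    inputs = self.tokenizer(sources, return_tensors='pt', padding=True).to(self.device)
    print(inputs['input_ids'])                # confirm the tokenized rows really differ

    with torch.no_grad():
        self.model.train()                    # dropout active
        train_mode_logits = self.model(**inputs).logits
        self.model.eval()                     # dropout disabled
        eval_mode_logits = self.model(**inputs).logits

    print('train-mode logits:', train_mode_logits)
    print('eval-mode  logits:', eval_mode_logits)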
