How to write a PyTorch Dataset class and DataLoader for a multi-output regression problem?

Posted on 2025-01-16 03:08:36

I am working with event-based data (consisting of timestamps, x and y coordinates, and polarity) on a neuromorphic problem. However, my question is how do I create the Dataset and proper DataLoader when I am trying to predict multiple outputs? Specifically, I am trying to predict the x-component, y-component, and z-component of a velocity vector. This is what my current custom Dataset class looks like:

import os
import ast
import torch
import tonic
import torchvision
import numpy as np 
import pandas as pd
import tonic.transforms as transforms
from torch.utils.data import DataLoader

class SyntheticRecording(tonic.Dataset):
    """
        Synthetic event camera recordings dataset.
    """
    def __init__(self, csv_file):
        super(SyntheticRecording, self).__init__()
        self.csv_file = csv_file
        df = pd.read_csv(self.csv_file, index_col = False)
        self.events = df['Events'] # Select only last column of dataframe
        self.target = df[['Vel_x', 'Vel_y', 'Vel_z']] # Select every column except last column of dataframe
        assert(self.target.shape[0] == len(self.events))
        self.sensor_size = (1920, 1080, 2)
    
    """
        Retrieve the index i to get the ith sample from the dataset. Apply the appropriate transformations.
    """
    def __getitem__(self, index):
      list_ = ast.literal_eval(self.events[index])
      t = []
      x = []
      y = []
      p = []
      for e in list_:
        t.append(e[0] * 1e6) # Convert to microseconds
        x.append(e[1])
        y.append(e[2])
        p.append(e[3])
      events = tonic.io.make_structured_array(x, y, t, p) # Ordering is xytp now
      
      # Denoise removes isolated, one-off events
      frame_transform = transforms.Compose([transforms.Denoise(filter_time = 10000), 
                                            transforms.ToFrame(sensor_size = self.sensor_size, 
                                                         time_window = 1000)
                                            ])
      transformed_frames = frame_transform(events)
      vel_x = np.array(self.target.loc[index][0]).astype('float')
      vel_y = np.array(self.target.loc[index][1]).astype('float')
      vel_z = np.array(self.target.loc[index][2]).astype('float')
      
      sample = {'frames': transformed_frames,
                'vel_x': vel_x,
                'vel_y': vel_y,
                'vel_z': vel_z}

      return sample

This is how I try to create the DataLoader:

batch_size = 16
trainloader = DataLoader(sr, batch_size = batch_size, collate_fn = tonic.collation.PadTensors(), shuffle = True, drop_last = True)

Whenever I try to iterate over the frames and target values (3 values), I receive the following error:

for frames, targets in trainloader:
  print(frames.shape)
  print(targets.shape)
frames, targets = next(iter(trainloader))

results in: ValueError: too many values to unpack (expected 2)

What is the right way to create a DataLoader that can handle a multi-prediction regression problem?

EDIT:

I am using this as my resource to try to handle the multi output part: https://medium.com/jdsc-tech-blog/multioutput-cnn-in-pytorch-c5f702d4915f

Comments (2)

似梦非梦 2025-01-23 03:08:36

Maybe try this:

for sample in trainloader:
   print(sample) 

This way you can check from the printed output how many values each batch actually contains.

私野 2025-01-23 03:08:36

It's a bit late, but I also stumbled on the same issue and here is how I fixed it:

The problem is not in iterating over the dataloader but in the collate_fn you pass to the DataLoader, which assumes that each sample is a (data, target) pair with a single target value. The error comes from tonic's PadTensors collate function, where it iterates over the batch:

for sample, target in batch:
    ...

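It assumes that each item is not a dictionary but a tuple that can be unpacked into two values. Iterating over your dictionary yields its four keys instead, hence "too many values to unpack (expected 2)". Later, it also assumes that the target is just one value.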
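The unpacking failure can be reproduced without tonic or PyTorch. The following is only a minimal sketch: the placeholder values and the helper name `try_unpack` are made up for illustration, and the loop merely mimics the `for sample, target in batch` pattern quoted above.

```python
# One dataset item shaped like the dictionary returned by __getitem__ in the
# question (placeholder values stand in for the real frames and velocities).
sample = {"frames": "frames-placeholder", "vel_x": 0.1, "vel_y": 0.2, "vel_z": 0.3}
batch = [sample, sample]  # what the DataLoader hands to collate_fn


def try_unpack(batch):
    """Mimic the `for sample, target in batch` loop and report what happens."""
    try:
        for frames, target in batch:
            pass
    except ValueError as err:
        return str(err)
    return "ok"


# Unpacking a dict iterates over its four keys, so two names are not enough.
dict_result = try_unpack(batch)

# The same data as (frames, targets) tuples unpacks cleanly.
tuple_batch = [(s["frames"], [s["vel_x"], s["vel_y"], s["vel_z"]]) for s in batch]
tuple_result = try_unpack(tuple_batch)

print(dict_result)
print(tuple_result)
```

The dictionary batch fails in the same way as the DataLoader in the question, while the tuple batch passes, which is why the fix involves changing what `__getitem__` returns.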

How to fix

  1. Modify the collate function:

Create a new class where you remove the assumption of the target being just one value, like this:

class MultiTargetPadTensors:
    def __init__(self, batch_first: bool = True):
        self.batch_first = batch_first

    def __call__(self, batch):
        samples_output = []
        targets_output = []

        max_length = max([sample.shape[0] for sample, target in batch])

        for sample, target in batch:
            if not isinstance(sample, torch.Tensor):
                sample = torch.tensor(sample)

            sample = torch.cat(
                (
                    sample,
                    torch.zeros(
                        max_length - sample.shape[0],
                        *sample.shape[1:],
                        device=sample.device
                    ),
                )
            )
            samples_output.append(sample)
            targets_output.append(target)
        
        samples_output = torch.stack(samples_output, 0 if self.batch_first else 1)
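        # One option (an assumption; adapt to your use case): stack the targets
        # into a float tensor of shape (batch, 3) for [vel_x, vel_y, vel_z].
        targets_output = torch.tensor(
            [[float(v) for v in target] for target in targets_output],
            dtype=torch.float32,
        )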
        return (samples_output, targets_output)

The DataLoader should now look like this: DataLoader(sr, batch_size = batch_size, collate_fn = MultiTargetPadTensors(), shuffle = True, drop_last = True)

  2. Adjust the Dataset:

Instead of a dictionary, return a tuple like this:

    ...
    sample = (transformed_frames, [vel_x, vel_y, vel_z])
    return sample
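
Putting both steps together, here is a minimal end-to-end sketch. It does not use tonic: `ToyVelocityDataset` and `pad_and_stack` are hypothetical names, the frames are random tensors with tiny made-up dimensions, and the targets are returned as one tensor per sample rather than a list. It only illustrates the shape of the solution (tuple from the dataset, padding plus target stacking in the collate function).

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyVelocityDataset(Dataset):
    """Hypothetical stand-in for SyntheticRecording: random frames, 3 targets."""

    def __init__(self, lengths):
        self.lengths = lengths  # number of time bins per recording

    def __len__(self):
        return len(self.lengths)

    def __getitem__(self, index):
        # Shape (time, polarity, height, width); tiny sizes keep the example fast.
        frames = torch.rand(self.lengths[index], 2, 4, 6)
        # All three velocity components in one tensor: [vel_x, vel_y, vel_z].
        target = torch.tensor([0.1, 0.2, 0.3]) * index
        return frames, target  # a tuple, not a dictionary


def pad_and_stack(batch):
    """Zero-pad frames along the time axis, then stack frames and targets."""
    max_length = max(frames.shape[0] for frames, _ in batch)
    padded = []
    for frames, _ in batch:
        padding = torch.zeros(
            max_length - frames.shape[0], *frames.shape[1:], dtype=frames.dtype
        )
        padded.append(torch.cat((frames, padding)))
    targets = torch.stack(
        [torch.as_tensor(t, dtype=torch.float32) for _, t in batch]
    )
    return torch.stack(padded), targets


dataset = ToyVelocityDataset([3, 5, 2, 4])
loader = DataLoader(dataset, batch_size=2, collate_fn=pad_and_stack)

frames, targets = next(iter(loader))
print(frames.shape)   # batch, padded time, polarity, height, width
print(targets.shape)  # batch, 3 velocity components
```

With the targets stacked into a single (batch, 3) tensor, a model with a three-unit output layer can be trained against them directly, for example with a mean squared error loss.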