Saving tensors to a .pt file to create a dataset

Posted on 2025-01-10 02:05:15

I was tasked with the creation of a dataset to test the functionality of the code we're working on.

The dataset must have a group of tensors that will be used later on in a generative model.

I'm trying to save the tensors to a .pt file, but each save overwrites the previous one, so I end up with a file that contains only one tensor. I've read about torch.utils.data.Dataset, but I haven't been able to figure out on my own how to use it.

Here is my code:

import torch
import numpy as np

from torch.utils.data import Dataset

#variables that will be used to create the size of the tensors:
num_jets, num_particles, num_features = 1, 30, 3


for i in range(100):
    #tensor from a gaussian dist with mean=5,std=1 and shape=size:
    tensor = torch.normal(5,1,size=(num_jets, num_particles, num_features)) 

    #We will need the tensors to be of the cpu type
    tensor = tensor.cpu()

    #save the tensor to 'tensor_dataset.pt'
    torch.save(tensor,'tensor_dataset.pt')


#open the recently created .pt file inside a list
tensor_list = torch.load('tensor_dataset.pt')

#prints the list. Just one tensor inside .pt file
print(tensor_list)

说好的呢 2025-01-17 02:05:15

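Reason: every iteration of the loop reassigns the variable tensor and calls torch.save on the same path, which overwrites the file. Nothing is ever collected into a list, so only the tensor from the last iteration is left at the end.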
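To see the overwrite in isolation, here is a minimal sketch. The file name demo.pt and the temporary directory are made up for the illustration and are not part of the original code.

```python
import os
import tempfile

import torch

# Hypothetical path in a temporary directory, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), "demo.pt")

for i in range(3):
    # Each call replaces the whole file; nothing is appended.
    torch.save(torch.full((2,), float(i)), path)

loaded = torch.load(path)
print(loaded)  # only the tensor written in the last iteration (i == 2) is left
```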

Solution: since the shape of every tensor is known in advance, you can preallocate one tensor with a leading dimension of 100, fill it one slice at a time, and save it once after the loop:

import torch

num_jets, num_particles, num_features = 1, 30, 3

# Preallocate one tensor that holds all 100 samples (created on the CPU by default)
lst_tensors = torch.empty(size=(100, num_jets, num_particles, num_features))

for i in range(100):
    # Fill slot i with a sample from a Gaussian with mean=5, std=1
    lst_tensors[i] = torch.normal(5, 1, size=(num_jets, num_particles, num_features))

# Save once, after the loop, so nothing gets overwritten
torch.save(lst_tensors, 'tensor_dataset.pt')

tensor_list = torch.load('tensor_dataset.pt')

print(tensor_list.shape)   # torch.Size([100, 1, 30, 3])
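Since the question also asks about torch.utils.data.Dataset, here is one possible way to feed the saved tensor to a model later on. This is only a sketch: it uses TensorDataset and DataLoader from torch.utils.data, draws all samples in a single torch.normal call instead of a loop, and writes to a temporary path. The batch size of 10 is an arbitrary choice for the example.

```python
import os
import tempfile

import torch
from torch.utils.data import DataLoader, TensorDataset

num_jets, num_particles, num_features = 1, 30, 3

# Draw all 100 samples in one call instead of filling a buffer in a loop.
tensors = torch.normal(5.0, 1.0, size=(100, num_jets, num_particles, num_features))

# Hypothetical path in a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "tensor_dataset.pt")
torch.save(tensors, path)

# TensorDataset indexes along the first dimension, so each item is one
# (num_jets, num_particles, num_features) tensor wrapped in a 1-tuple.
dataset = TensorDataset(torch.load(path))
loader = DataLoader(dataset, batch_size=10, shuffle=True)

(first_batch,) = next(iter(loader))
print(len(dataset), first_batch.shape)
```

Each batch from the loader then has shape (batch_size, num_jets, num_particles, num_features). A custom Dataset subclass would only be needed if the samples require extra processing when they are loaded.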
