Saving tensors to a .pt file in order to create a dataset

Posted on 2025-01-10 02:05:15

I was tasked with the creation of a dataset to test the functionality of the code we're working on.

The dataset must have a group of tensors that will be used later on in a generative model.

I'm trying to save the tensors to a .pt file, but each save overwrites the previous one, so I end up with a file that contains only one tensor. I've read about torch.utils.data.Dataset, but I can't figure out on my own how to use it.

Here is my code:

import torch
import numpy as np

from torch.utils.data import Dataset

#variables that will be used to create the size of the tensors:
num_jets, num_particles, num_features = 1, 30, 3


for i in range(100):
    #tensor from a gaussian dist with mean=5,std=1 and shape=size:
    tensor = torch.normal(5,1,size=(num_jets, num_particles, num_features)) 

    #We will need the tensors to be of the cpu type
    tensor = tensor.cpu()

    #save the tensor to 'tensor_dataset.pt'
    torch.save(tensor,'tensor_dataset.pt')


#open the recently created .pt file inside a list
tensor_list = torch.load('tensor_dataset.pt')

#prints the list. Just one tensor inside .pt file
print(tensor_list)

Comments (1)

说好的呢 2025-01-17 02:05:15

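Reason: each pass through the loop assigns a new value to tensor and then calls torch.save on the same path. torch.save replaces the file instead of appending to it, so only the tensor from the last iteration is left in tensor_dataset.pt.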
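
To see the overwrite in isolation, here is a minimal sketch. The temporary directory and the file name demo.pt are illustrative and not part of the question: two successive torch.save calls to the same path leave only the second tensor on disk.

```python
import os
import tempfile

import torch

# Hypothetical scratch location, used only for this illustration
path = os.path.join(tempfile.mkdtemp(), "demo.pt")

first = torch.zeros(2, 3)
second = torch.ones(2, 3)

torch.save(first, path)   # writes the file
torch.save(second, path)  # replaces the file; nothing is appended

loaded = torch.load(path)
print(torch.equal(loaded, second))  # True: only the last tensor survives
```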

Solution: since you already know the size of each tensor, you can preallocate one tensor with an extra leading dimension, fill it by index inside the loop, and save it once after the loop:

import torch

num_jets, num_particles, num_features = 1, 30, 3
num_samples = 100

# Preallocate one tensor that holds every sample (created on the CPU by default)
lst_tensors = torch.empty(size=(num_samples, num_jets, num_particles, num_features))

for i in range(num_samples):
    # Fill slot i with a sample from a Gaussian with mean=5, std=1
    lst_tensors[i] = torch.normal(5, 1, size=(num_jets, num_particles, num_features))

# Save once, after the loop, so the file contains all samples
torch.save(lst_tensors, 'tensor_dataset.pt')

tensor_list = torch.load('tensor_dataset.pt')

print(tensor_list.shape)   # torch.Size([100, 1, 30, 3])
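
Since the question also asks about torch.utils.data.Dataset, here is one possible sketch of wrapping the stacked tensor so a DataLoader can batch it. The class name JetTensorDataset and the batch size are assumptions made for illustration, and a random tensor stands in for the result of torch.load('tensor_dataset.pt') so the snippet runs on its own.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class JetTensorDataset(Dataset):
    """Hypothetical wrapper: serves one sample per index from a stacked tensor."""

    def __init__(self, data):
        # data has shape (num_samples, num_jets, num_particles, num_features)
        self.data = data

    def __len__(self):
        # Number of samples is the size of the leading dimension
        return self.data.shape[0]

    def __getitem__(self, idx):
        # One sample of shape (num_jets, num_particles, num_features)
        return self.data[idx]


# Stand-in for torch.load('tensor_dataset.pt'), so the sketch is self-contained
data = 5 + torch.randn(100, 1, 30, 3)

dataset = JetTensorDataset(data)
loader = DataLoader(dataset, batch_size=10, shuffle=False)

batch = next(iter(loader))
print(len(dataset), tuple(batch.shape))  # 100 (10, 1, 30, 3)
```

If no custom logic is needed, the built-in torch.utils.data.TensorDataset can play the same role; note that it returns each sample as a tuple.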