ImageFolder does not work on a remote machine

Posted on 2025-01-11 16:22:51

I need to access a remote machine where ImageNet is stored. I have already downloaded to my local computer the tars I need to train my neural network, and I used random_split() to split each tar into an 80% training set, a 10% validation set and a 10% test set. The names of the images in each set were appended to a txt file, which I want to use as a lookup list.

I then wrote a Python script to launch on the remote machine. The script has to walk the folders of the ImageNet database and, based on the txt files I sent to the machine, include only the pictures that are named in those files.
Here's the script:

import os
import torch
from torch.utils.data import random_split
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
import time
import torch.optim as optim
from torchvision import datasets, models, transforms
from torch import nn
import copy

def set_seed(seed):
    """ Random seed generation for PyTorch. See https://pytorch.org/docs/stable/notes/randomness.html
        for further details.
    Args:
        seed (int): the seed for pseudonumber generation.
    """
    import random
    import numpy as np
    import torch

    if seed is not None:
        random.seed(seed)

        np.random.seed(seed)

        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

set_seed(42)

def is_valid_trfile(fname):
    with open('/home/fdalligna/hopelessly_trying/training.txt', 'r') as f:
        text = f.read().split()
    if fname.endswith('.JPEG'):
        return True
    elif fname in text:
        return True
    else: 
        return False


trainset = ImageFolder(root = '/media/data/Datasets/imagenet/winter21_whole', is_valid_file = is_valid_trfile)


def is_valid_vlfile(fname):
    with open('/home/fdalligna/hopelessly_trying/validating.txt', 'r') as f:
        text = f.read().split()
    if fname.endswith('.JPEG'):
        return True
    elif fname in text:
        return True
    else:
        return False


valset = ImageFolder(root = '/media/data/Datasets/imagenet/winter21_whole', is_valid_file = is_valid_vlfile)


def is_valid_tsfile(fname):
    with open('/home/fdalligna/hopelessly_trying/testing.txt', 'r') as f:
        text = f.read().split()
    if fname.endswith('.JPEG'):
        return True
    elif fname in text:
        return True
    else:
        return False


testset = ImageFolder(root = '/media/data/Datasets/imagenet/winter21_whole', is_valid_file = is_valid_tsfile)

assert trainset == 32298
assert valset == 4023
assert testset == 4064

print('everything ok') 

I inserted the asserts because I wanted to check that the sizes of the sets are correct.
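A side note on those asserts: a dataset object is not equal to an integer, so `assert trainset == 32298` would fail even with a correct split; the size comes from `len()`. A minimal sketch, using a hypothetical stand-in class `TinyDataset` (not a real ImageFolder) so that it runs without torchvision:

```python
class TinyDataset:
    """Hypothetical stand-in for a dataset; only __len__ matters here."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __len__(self):
        return len(self.samples)


def check_size(dataset, expected):
    """Raise AssertionError with a readable message if the size is off."""
    actual = len(dataset)
    assert actual == expected, f"expected {expected} samples, found {actual}"
    return actual


trainset = TinyDataset(range(32298))
print(trainset == 32298)            # an object compared with an int
print(check_size(trainset, 32298))  # the size compared with an int
```

With a real ImageFolder the same idea would read `assert len(trainset) == 32298`.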

However, I get this error:

    trainset = ImageFolder(root = '/media/data/Datasets/imagenet/winter21_whole', is_valid_file = is_valid_trfile)
  File "/opt/anaconda/lib/python3.7/site-packages/torchvision/datasets/folder.py", line 209, in __init__
    is_valid_file=is_valid_file)
  File "/opt/anaconda/lib/python3.7/site-packages/torchvision/datasets/folder.py", line 97, in __init__
    "Supported extensions are: " + ",".join(extensions)))
TypeError: can only join an iterable
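The last frame is building an error message rather than loading images. A plausible reading, which I have not verified against that exact torchvision version, is that the library found no valid files and then tried to list the supported extensions, which are None when is_valid_file is passed. The Python behavior itself can be reproduced in isolation; the function name `build_message` and the first half of the message text are assumptions made for illustration:

```python
def build_message(root, extensions):
    # Same shape as the expression in the traceback: join the extensions
    # into the tail of an error message.
    return ("Found 0 files in subfolders of: " + root + "\n"
            "Supported extensions are: " + ",".join(extensions))


caught = None
try:
    build_message("/some/root", None)  # extensions is None
except TypeError as err:
    caught = str(err)
    print(caught)

# With a real tuple of extensions the message builds without error.
ok = build_message("/some/root", (".jpg", ".jpeg"))
print(ok)
```

If that reading is right, the TypeError hides the real problem: no file under the given root passed the filter.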

I don't know what to do and how to solve the problem. Does anyone know?

Thank you!

Comments (1)

倾城°AllureLove 2025-01-18 16:22:51

Based on your comments, my first guess is that you should pass the root directory, not the path to a subfolder. In your code, that would look like:

trainset = ImageFolder(root = '/media/data/Datasets/imagenet/')

You can read more here about how to structure the folders and set root when you instantiate an ImageFolder object.
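To see how a candidate root is laid out before handing it to ImageFolder, a small sketch like the following lists the immediate subfolders and counts the .JPEG files under each. The helper name `summarize_root` and the folder names in the demo are hypothetical:

```python
import os
import tempfile


def summarize_root(root, suffix=".JPEG"):
    """Map each immediate subfolder of root to its count of files with suffix."""
    counts = {}
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir():
            n = 0
            for _dirpath, _dirs, files in os.walk(entry.path):
                n += sum(1 for f in files if f.endswith(suffix))
            counts[entry.name] = n
    return counts


# Demo on a throwaway tree with two hypothetical class folders.
layout = {"n01440764": ["a.JPEG", "b.JPEG"], "n01443537": ["c.JPEG"]}
with tempfile.TemporaryDirectory() as root:
    for cls, names in layout.items():
        os.makedirs(os.path.join(root, cls))
        for name in names:
            open(os.path.join(root, cls, name), "w").close()
    summary = summarize_root(root)
    print(summary)
```

An empty result, or subfolders that all report zero files, would suggest that the path is one level too high or too low, or that the images use a different extension.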

The root folder should contain one folder per class, and each class folder contains its own images.
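Separately from the root, the filter functions in the question return True for every path ending in .JPEG, so the txt lists are never consulted, and they reread the list on every call. A minimal sketch of a filter factory follows, assuming the txt files hold whitespace-separated file names rather than full paths; `make_filter` and the sample names are hypothetical:

```python
import os
import tempfile


def make_filter(list_path, suffix=".JPEG"):
    """Return a predicate that can be passed as is_valid_file.

    Assumes the txt file holds whitespace-separated file names
    (not full paths); adjust the basename step if it holds paths.
    """
    with open(list_path, "r") as f:
        wanted = set(f.read().split())  # read once, constant-time lookups

    def is_valid(path):
        # The predicate receives the full path of each candidate file.
        return path.endswith(suffix) and os.path.basename(path) in wanted

    return is_valid


# Demo with a temporary list file standing in for training.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("n01440764_18.JPEG\nn01440764_36.JPEG\n")
is_valid_trfile = make_filter(tmp.name)
os.remove(tmp.name)

print(is_valid_trfile("/data/n01440764/n01440764_18.JPEG"))
print(is_valid_trfile("/data/n01440764/n01440764_99.JPEG"))
```

The returned function would then be passed through the `is_valid_file` argument that the question already uses, one filter per txt file.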
