RuntimeError: shape '[32, 3, 224, 224]' is invalid for input of size 50176

Posted 2025-02-01 05:40:49 · 3377 characters · 2 views · 0 comments

First, I trained a model on 224x224x3 images, and now I am working on a visualization taken from an MNIST codebase. The code below works fine on grayscale images, but when I use it for color images it does not work.

Code that works fine:

with torch.no_grad():
    while True:
        image = cv2.imread("example.png", flags=cv2.IMREAD_GRAYSCALE)
        print(image.shape)
        input_img_h, input_img_w = image.shape
        image = scale_transformation(image, scale_factor=scale_factors[scale_idx_factor])
        image = rotation_transformation(image, angle=rotation_factors[rotation_idx_factor])
        scale_idx_factor = (scale_idx_factor + 1) % len(scale_factors)
        rotation_idx_factor = (rotation_idx_factor + 1) % len(rotation_factors)

        image_tensor = torch.from_numpy(image) / 255.
        print("image_tensor.shape:", image_tensor.shape)

        image_tensor = image_tensor.view(1, 1, input_img_h, input_img_w)

        image_tensor = T.Normalize((0.1307,), (0.3081,))(image_tensor)
        image_tensor = image_tensor.to(device)

        out = model(image_tensor)

        image = np.repeat(image[..., np.newaxis], 3, axis=-1)
        roi_y, roi_x = input_img_h // 2, input_img_w // 2
        plot_offsets(image, save_output, roi_x=roi_x, roi_y=roi_y)

        save_output.clear()
        image = cv2.resize(image, dsize=(224, 224))
        cv2.imshow("image", image)
        key = cv2.waitKey(30)
        if key == 27:
            break

Code with the problem (I have only changed the image size):

with torch.no_grad():
    while True:
        image = cv2.imread("image_06764.jpg")
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        print('Original Dimensions : ', image.shape)

        width = 224
        height = 224
        dim = (width, height)
        image = cv2.resize(image, dim, interpolation=cv2.INTER_AREA)
        # print(resized.shape[0])
        input_img_h = image.shape[0]
        input_img_w = image.shape[1]

        image = scale_transformation(image, scale_factor=scale_factors[scale_idx_factor])
        print("dfdf", image.shape)
        image = rotation_transformation(image, angle=rotation_factors[rotation_idx_factor])
        scale_idx_factor = (scale_idx_factor + 1) % len(scale_factors)
        rotation_idx_factor = (rotation_idx_factor + 1) % len(rotation_factors)

        image_tensor = torch.from_numpy(image) / 255.
        print("ggggggggggg", image_tensor.size())

        image_tensor = image_tensor.view(32, 3, input_img_h, input_img_w)
        print("image_tensor.shape:", image_tensor.shape)
        image_tensor = T.Normalize((0.1307,), (0.3081,))(image_tensor)
        image_tensor = image_tensor.to(device)
        out = model(image_tensor)

        image = np.repeat(image[..., np.newaxis], 3, axis=-1)
        roi_y, roi_x = input_img_h // 2, input_img_w // 2
        plot_offsets(image, save_output, roi_x=roi_x, roi_y=roi_y)

        save_output.clear()
        image = cv2.resize(image, dsize=(224, 224))
        cv2.imshow("image", image)
        key = cv2.waitKey(30)
        if key == 27:
            break

Traceback

Traceback (most recent call last):
  File "/media/cvpr/CM_1/tutorials/Deformable_Convolutionv_V2/offset_visualization.py", line 184, in <module>
    image_tensor = image_tensor.view(32, 3, input_img_h, input_img_w)
RuntimeError: shape '[32, 3, 224, 224]' is invalid for input of size 50176

Comments (1)

吃不饱 2025-02-08 05:40:49

image_tensor holds 50176 elements (224 x 224), so it can be viewed as 224x224. However, you are trying to view it as 32x3x224x224, which would need 4,816,896 elements, and view() cannot change the number of elements.
Try this:

image_tensor = image_tensor.view(1, 1, input_img_h, input_img_w).repeat(1, 3, 1, 1)

The code above copies the grayscale image three times along the channel dimension, resulting in a tensor of size 1x3x224x224.
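The size mismatch can be checked with plain arithmetic before calling view(). This is a minimal sketch; the helper name can_view is hypothetical and not part of PyTorch, it only compares element counts.

```python
from math import prod


def can_view(numel, shape):
    """Return True if `numel` elements can fill `shape` exactly (hypothetical helper)."""
    return prod(shape) == numel


# One 224x224 grayscale image holds 224 * 224 = 50176 values.
numel = 224 * 224

print(can_view(numel, (1, 1, 224, 224)))   # True: same element count
print(can_view(numel, (32, 3, 224, 224)))  # False: needs 4816896 elements
```

A view only reinterprets the existing values, so going from one channel to three requires actually copying data, which is what repeat(1, 3, 1, 1) does.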

Additionally, why are you converting the color image to grayscale with image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)? If you remove that line, the image keeps its three channels and there will be no channel problem.
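If the grayscale conversion is removed, the image has shape (H, W, 3), so the channel axis has to be moved to the front rather than reshaped with view(). The following is a rough sketch under that assumption; a random tensor stands in for the array that cv2.imread and cv2.resize would produce, and with a real NumPy image you would start from torch.from_numpy(image).

```python
import torch

# Stand-in for a 224x224 color image in OpenCV layout: uint8, shape (H, W, 3).
image = torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8)

# Move channels first (C, H, W), add a batch dimension, and scale to [0, 1].
image_tensor = image.permute(2, 0, 1).unsqueeze(0).float() / 255.0

print(image_tensor.shape)  # torch.Size([1, 3, 224, 224])
```

Calling view(1, 3, H, W) directly on the (H, W, 3) data would produce the right shape but scramble the pixels, because the channel values are interleaved in memory. OpenCV also loads images in BGR order, so if the model was trained on RGB inputs the channels may need to be swapped first, for example with cv2.cvtColor(image, cv2.COLOR_BGR2RGB).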

Any advice on or corrections to this answer are welcome.
