
Fastest algorithm for downscaling a 32-bit RGB image

Posted on 2024-08-10 17:20:38

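Which algorithm should I use to scale a 32-bit RGB image down to a custom resolution? The algorithm should average pixels.

For example, say I have a 100x100 image and I want a new image of size 20x50. The average of the first five pixels of the first source row gives the first pixel of the destination, and the average of the first two pixels of the first source column gives the first pixel of the destination column.

Currently I first scale down in the X direction and then in the Y direction. This method needs a temporary buffer.

Is there an optimized method that you know of?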


我做我的改变 2024-08-17 17:20:38

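The term you are looking for is "resampling". In your case, you want image resampling. You seem to be doing linear interpolation already, which should be the fastest. There are about six basic algorithms. If you really want to dig into the subject, look into "resampling kernels".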


梦年海沫深 2024-08-17 17:20:38

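Beyond the standard C optimizations (pointer arithmetic, fixed-point math, etc.), there are also some cleverer optimizations. A (very) long time ago I saw a scaler implementation that scaled in the X direction first. While writing out the horizontally scaled image, it rotated the image 90 degrees in memory. That way, when it came time to do the reads for the Y-direction scale, the data in memory was laid out in a more cache-friendly way.

This technique depends heavily on the processor it will run on.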
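The trick described above can be sketched roughly as follows. This is only an illustration under assumptions not stated in the answer: pixels are 4 bytes each, rows are tightly packed (no padding), the horizontal factor is an integer of at least 1, and the function name `hscale_transposed` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: average every `factor` horizontally adjacent 4-byte
 * pixels and store the result transposed, so the output image has width
 * src_h and height src_w / factor. Assumes factor >= 1 and packed rows. */
void hscale_transposed(const uint8_t *src, size_t src_w, size_t src_h,
                       size_t factor, uint8_t *dst)
{
    size_t dst_w = src_w / factor;
    for (size_t y = 0; y < src_h; ++y) {
        const uint8_t *row = src + y * src_w * 4;
        for (size_t x = 0; x < dst_w; ++x) {
            unsigned sum[4] = {0, 0, 0, 0};
            for (size_t i = 0; i < factor; ++i)
                for (size_t c = 0; c < 4; ++c)
                    sum[c] += row[(x * factor + i) * 4 + c];
            /* transposed store: output row x, output column y */
            uint8_t *out = dst + (x * src_h + y) * 4;
            for (size_t c = 0; c < 4; ++c)
                out[c] = (uint8_t)(sum[c] / factor);
        }
    }
}
```

Calling it a second time on the intermediate image (with width and height swapped and the vertical factor) transposes the data back, so both passes read their input row by row. Whether this beats a single pass depends on the processor, as the answer says.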


混浊又暗下来 2024-08-17 17:20:38

This averages the appropriate pixels.

 w_ratio = src.w / dest.w
 h_ratio = src.h / dest.h

 dest[x,y] = 
    AVG( src[x * w_ratio + xi, y * h_ratio + yi] ) 
      where
           xi in range (0, w_ratio - 1), inc by 1
           yi in range (0, h_ratio - 1), inc by 1

For boundary conditions, do a separate loop (no ifs inside the loop).

Here is more C-like code.

src and dest are bitmaps with:
* property src[x,y] for the pixel at (x, y)
* property src.w for width
* property src.h for height

The pixel type has been defined so that

adding

p1 = p1 + p2     
is the same as
p1.r = p1.r + p2.r
p1.g = p1.g + p2.g
...

division

p1 = p1 / c
is the same as
p1.r = p1.r / c
p1.g = p1.g / c
...

assigning the constant 0

p1 = 0
is the same as
p1.r = 0
p1.g = 0
...

For simplicity's sake, I won't consider what happens when a pixel component overflows its integer type...

// cast before dividing, otherwise integer division truncates the ratio
float w_ratio = (float)src.w / dest.w;
float h_ratio = (float)src.h / dest.h;
int w_ratio_i = floor(w_ratio);
int h_ratio_i = floor(h_ratio);

// number of source pixels actually summed for each destination pixel
int wxh = w_ratio_i * h_ratio_i;

for (y = 0; y < dest.h; y++)
for (x = 0; x < dest.w; x++){
    pixel temp = 0;

    int srcx, srcy;
    // we have to use the floating point values w_ratio, h_ratio here,
    // otherwise towards the end it can get a little wrong
    // this multiplication can be optimized similarly to Bresenham's line
    srcx = floor(x * w_ratio);
    srcy = floor(y * h_ratio);

    // here we use the floored values, otherwise we might read past the src bitmap
    for(yi = 0; yi < h_ratio_i; yi++)
    for(xi = 0; xi < w_ratio_i; xi++)
            temp += src[srcx + xi, srcy + yi];
    dest[x,y] = temp / wxh;
}

See Bresenham's line algorithm for the incremental technique mentioned in the code comment.
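The code comment suggests replacing the per-pixel multiplication `floor(x * w_ratio)` with something Bresenham-like. One way this could look, as a sketch only: keep an integer position plus an error accumulator and advance both with additions. The function name `source_starts` and the idea of precomputing a table of start offsets are assumptions made for this example, not part of the answer.

```c
#include <stddef.h>

/* Fill start[x] with floor(x * src_w / dst_w) for x in [0, dst_w),
 * using only additions and comparisons inside the loop.
 * Assumes dst_w >= 1. */
void source_starts(size_t src_w, size_t dst_w, size_t *start)
{
    size_t step = src_w / dst_w;  /* whole source pixels per destination pixel */
    size_t rem  = src_w % dst_w;  /* leftover fraction, in units of 1/dst_w */
    size_t pos = 0, err = 0;

    for (size_t x = 0; x < dst_w; ++x) {
        start[x] = pos;
        pos += step;
        err += rem;
        if (err >= dst_w) {       /* accumulated fraction reached a whole pixel */
            err -= dst_w;
            ++pos;
        }
    }
}
```

Because everything stays in integers, this also avoids the rounding drift that the comment about "towards the end it can get a little wrong" warns about.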


烟酉 2024-08-17 17:20:38

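You forgot to mention the most important aspect of the question: how much you care about quality. If you don't care exactly how the values of the source pixels are combined to create the destination pixel, the fastest algorithm is (at least in almost all cases) the one that produces the worst quality.

If you are tempted to answer "the fastest algorithm that still produces very good quality", you have essentially covered the entire field of algorithms that deal with image sampling and resizing.

You have already outlined your initial idea for the algorithm:

"The average of the first five pixels of the first source row gives the first pixel of the destination,"

Calculating the average value of each channel over the source pixels could be considered trivial. Are you looking for example code that does that?

Or are you looking for someone to challenge your first draft of the algorithm with something even faster?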


念﹏祤嫣 2024-08-17 17:20:38

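If you are looking for a wordy explanation, I found this article to be helpful. If, on the other hand, you deal more in mathematical formulae, there is a method of fast image downscaling explained here.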


南烟 2024-08-17 17:20:38

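This really is a speed/quality trade-off.

First of all, you are right that doing one dimension and then the other is slower than it has to be. There are too many memory reads and writes.

Your big choice is whether to support fractional pixels. Your example is 100x100 to 20x50, so 10 source pixels map to 1 destination pixel. What if you are going from 100x100 to 21x49? Are you willing to operate at source pixel boundaries, or do you want to pull in fractional pixels? What would you do for 100x100 to 99x99?

You have to tell us what you are willing to accept before we can say what is fastest.

Also tell us the possible extremes of the shrinkage. How many orders of magnitude might the difference between source and destination be? At some point, sampling representative pixels inside the source will not be much worse than averaging all of them. But you have to be careful about choosing the representative pixels, or you will get aliasing with many common patterns.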
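For the fractional-pixel case raised above (for example 100 pixels down to 21), one possible approach is to weight each source pixel by how much of it falls inside the destination pixel. The sketch below handles a single row of a single 8-bit channel to keep it short; the function name `downscale_row_area`, the rounding, and the restriction to 1 <= dst_w <= src_w are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Area-weighted downscale of one row of one 8-bit channel.
 * Positions are measured in units of 1/dst_w of a source pixel, so source
 * pixel i covers [i*dst_w, (i+1)*dst_w) and destination pixel j covers
 * [j*src_w, (j+1)*src_w). Assumes 1 <= dst_w <= src_w. */
void downscale_row_area(const uint8_t *src, size_t src_w,
                        uint8_t *dst, size_t dst_w)
{
    for (size_t j = 0; j < dst_w; ++j) {
        size_t lo = j * src_w;
        size_t hi = lo + src_w;
        unsigned long sum = 0;
        for (size_t i = lo / dst_w; i * dst_w < hi; ++i) {
            size_t a = i * dst_w;             /* span of source pixel i */
            size_t b = a + dst_w;
            size_t from = a > lo ? a : lo;    /* overlap with destination span */
            size_t to   = b < hi ? b : hi;
            sum += (unsigned long)src[i] * (to - from);
        }
        /* the weights add up to src_w; add half of that to round */
        dst[j] = (uint8_t)((sum + src_w / 2) / src_w);
    }
}
```

When src_w is a multiple of dst_w, every weight is the same and this reduces to the plain block average from the question.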


冬天的雪花 2024-08-17 17:20:38

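What you are doing is the optimized method. The only faster one is called nearest neighbor, where you simply grab the middle pixel of the range without trying to average any of them. The quality is significantly worse if there is any detail in the original image, although it may be acceptable if the original is simple.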
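A minimal sketch of the nearest-neighbor variant described above, assuming each pixel is stored as one packed `uint32_t` and rows have no padding. The function name `downscale_nearest` and the choice of mapping each destination pixel's center into the source are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Nearest-neighbor downscale: each destination pixel copies the source
 * pixel under the center of the range it covers. No averaging is done.
 * Assumes dst_w >= 1, dst_h >= 1 and tightly packed rows. */
void downscale_nearest(const uint32_t *src, size_t src_w, size_t src_h,
                       uint32_t *dst, size_t dst_w, size_t dst_h)
{
    for (size_t y = 0; y < dst_h; ++y) {
        /* center of destination row y, mapped to a source row */
        size_t sy = (2 * y + 1) * src_h / (2 * dst_h);
        for (size_t x = 0; x < dst_w; ++x) {
            size_t sx = (2 * x + 1) * src_w / (2 * dst_w);
            dst[y * dst_w + x] = src[sy * src_w + sx];
        }
    }
}
```

Each destination pixel costs one read and one write, which is why it is hard to beat on speed, but any detail that falls between the sampled pixels is simply lost.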


甜｀诱少女 2024-08-17 17:20:38

This is what you are looking for, in C. It is Egon's approach implemented in C and optimized for speed. The alpha channel is ignored and set to 0, but this can easily be changed. Wrapping the two inner loops in Duff's device is only for performance; the Duff's loops can be replaced by normal for loops if desired.

Parameters: dst and src are pointers to the 32-bit pixel data, dst_pitch and src_pitch are the lengths of one scanline in bytes, src_width and src_height are the width and height of the source image in pixels, and factor_x and factor_y are the scaling denominators in the x and y directions (the destination is src_width / factor_x by src_height / factor_y pixels).

Returns 0 on success and -1 on failure.

/* Duff's device: runs pixel_copy_increment exactly `width` times
   (width must be >= 1), unrolled eight times per loop iteration. */
#define DUFFS_LOOP(pixel_copy_increment, width) \
{ int n = ((width)+7)/8;                        \
    switch ((width) & 7) {                      \
    case 0: do {    pixel_copy_increment;       \
    case 7:     pixel_copy_increment;           \
    case 6:     pixel_copy_increment;           \
    case 5:     pixel_copy_increment;           \
    case 4:     pixel_copy_increment;           \
    case 3:     pixel_copy_increment;           \
    case 2:     pixel_copy_increment;           \
    case 1:     pixel_copy_increment;           \
        } while ( --n > 0 );                    \
    }                                           \
}

int fastscale(unsigned char *dst, int dst_pitch, const unsigned char *src, int src_width, int src_height, int src_pitch, int factor_x, int factor_y)
{
    if (factor_x < 1 || factor_y < 1) return -1;

    int temp_r, temp_g, temp_b;
    int i1,i2;

    int dst_width = src_width / factor_x;
    int dst_height = src_height / factor_y;
    if (!dst_height || !dst_width) return -1;
    int factors_mul = factor_x * factor_y;   /* source pixels per destination pixel */
    int factorx_mul4 = factor_x << 2;        /* bytes in one row of a source block */
    /* from the end of a block row to the start of the next block row */
    int src_skip1 = src_pitch - factorx_mul4;
    /* from below a finished block to the top of the block to its right */
    int src_skip2 = factorx_mul4 - factor_y * src_pitch;
    /* from the end of a row of blocks to the start of the next row of blocks */
    int src_skip3 = src_pitch * factor_y - dst_width * factorx_mul4;
    /* padding at the end of each destination scanline */
    int dst_skip = dst_pitch - (dst_width << 2);

    for (i1 = 0; i1 < dst_height; ++i1)
    {
        for (i2 = 0; i2 < dst_width; ++i2)
        {
            temp_r = temp_g = temp_b = 0;
            /* sum a factor_x by factor_y block of source pixels */
            DUFFS_LOOP ({
                DUFFS_LOOP ({
                    src++; // alpha
                    temp_r += *(src++);
                    temp_g += *(src++);
                    temp_b += *(src++);
                }, factor_x);
                src += src_skip1;
            }, factor_y);
            *(dst++) = 0; // alpha
            *(dst++) = temp_r / factors_mul;
            *(dst++) = temp_g / factors_mul;
            *(dst++) = temp_b / factors_mul;
            src += src_skip2;
        }
        dst += dst_skip;
        src += src_skip3;
    }
    return 0;
}
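The answer notes that the Duff's loops can be replaced by normal for loops. A sketch of what that variant might look like, using index arithmetic instead of running pointers and skip offsets. It assumes the same byte order the code above implies (alpha first, then the three color bytes); the name `fastscale_plain` is made up for this example.

```c
#include <stddef.h>

/* Same contract as fastscale(): average factor_x by factor_y blocks of
 * 4-byte pixels; alpha is ignored on input and written as 0.
 * Returns 0 on success and -1 on failure. */
int fastscale_plain(unsigned char *dst, int dst_pitch,
                    const unsigned char *src, int src_width, int src_height,
                    int src_pitch, int factor_x, int factor_y)
{
    if (factor_x < 1 || factor_y < 1) return -1;

    int dst_width = src_width / factor_x;
    int dst_height = src_height / factor_y;
    if (!dst_width || !dst_height) return -1;
    unsigned count = (unsigned)(factor_x * factor_y);

    for (int dy = 0; dy < dst_height; ++dy) {
        unsigned char *out = dst + (size_t)dy * dst_pitch;
        for (int dx = 0; dx < dst_width; ++dx) {
            unsigned r = 0, g = 0, b = 0;
            for (int sy = 0; sy < factor_y; ++sy) {
                /* first byte of row sy of the current source block */
                const unsigned char *p = src
                    + (size_t)(dy * factor_y + sy) * src_pitch
                    + (size_t)dx * factor_x * 4;
                for (int sx = 0; sx < factor_x; ++sx, p += 4) {
                    r += p[1];   /* p[0] is alpha and is skipped */
                    g += p[2];
                    b += p[3];
                }
            }
            out[0] = 0;          /* alpha, as in the original */
            out[1] = (unsigned char)(r / count);
            out[2] = (unsigned char)(g / count);
            out[3] = (unsigned char)(b / count);
            out += 4;
        }
    }
    return 0;
}
```

A modern compiler will often unroll these loops on its own, so it is worth measuring both versions before assuming the Duff's device variant is faster.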
