Fastest algorithm for scaling down a 32-bit RGB image
Which algorithm should I use to scale down a 32-bit RGB image to a custom resolution? The algorithm should average pixels.
For example, if I have a 100x100 image and I want a new image of size 20x50, the average of the first five pixels of the first source row gives the first pixel of the destination row, and the average of the first two pixels of the first source column gives the first pixel of the destination column.
Currently I first scale down in the X direction and then scale down in the Y direction. This method needs one temporary buffer.
Is there any optimized method that you know of?
Comments (8)
The term you are looking for is "Resampling." In your case you want image resampling. You seem to already be doing linear interpolation, which should be the fastest. Here are ~6 base algorithms. If you really want to delve into the subject look into "resampling kernels."
After you do the standard C optimizations (pointer arithmetic, fixed-point math, etc.), there are also some more clever optimizations to be had.
A (very) long time ago, I saw a scaler implementation that scaled the X direction first. In the process of writing out the horizontally scaled image, it rotated the image 90 degrees in memory, so that when it came time to do the reads for the Y-direction scale, the data in memory would be better cache-aligned.
This technique depends heavily on the processor that it will run on.
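A minimal sketch of that idea, under assumptions that are not in the answer: pixels are packed as 0x00RRGGBB, the source width is an exact multiple of the factor, and the helper name `shrink_x_transposed` is hypothetical. It stores a transposed result (one simple way to get the 90-degree reorientation described above), so a second call on the intermediate buffer reads along rows again and also restores the normal orientation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: average each run of `fx` horizontally adjacent pixels
 * and store the result transposed, so dst has `src_h` columns and
 * `src_w / fx` rows. Assumes 0x00RRGGBB pixels and src_w % fx == 0. */
static void shrink_x_transposed(uint32_t *dst, const uint32_t *src,
                                size_t src_w, size_t src_h, unsigned fx)
{
    size_t out_w = src_w / fx;
    for (size_t y = 0; y < src_h; y++) {
        const uint32_t *row = src + y * src_w;
        for (size_t x = 0; x < out_w; x++) {
            uint32_t r = 0, g = 0, b = 0;
            for (unsigned i = 0; i < fx; i++) {
                uint32_t p = row[x * fx + i];
                r += (p >> 16) & 0xFF;
                g += (p >> 8) & 0xFF;
                b += p & 0xFF;
            }
            /* Transposed store: output pixel (x, y) lands at row x, column y. */
            dst[x * src_h + y] = ((r / fx) << 16) | ((g / fx) << 8) | (b / fx);
        }
    }
}
```

To scale a W x H image by fx and fy, one would call it as `shrink_x_transposed(tmp, src, W, H, fx)` and then `shrink_x_transposed(out, tmp, H, W / fx, fy)`. Whether this beats a plain two-pass loop depends on the cache behavior of the target processor, so it is worth measuring.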
This averages the appropriate pixels.
For boundary conditions, do a separate loop (no ifs in the loop).
Here's more C-like code:
src and dest are bitmaps with:
* property src[x,y] for pixel
* property src.w for width
* property src.h for height
pixel has been defined so that these operations are available:
* adding
* division
* evaluation with a constant 0
For simplicity's sake, I won't consider the problem of pixel component integer overflow...
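A minimal sketch of single-pass block averaging along these lines. The types `pixel` and `bitmap`, the accessor `at`, and the function `shrink_average` are assumptions made for illustration, not the answer's own code. It also assumes the source dimensions are exact multiples of the destination dimensions, so no separate boundary loop is needed, and it uses wide components so the sums do not overflow for ordinary image sizes.

```c
#include <stddef.h>

typedef struct { unsigned r, g, b; } pixel;           /* wide components hold block sums */
typedef struct { size_t w, h; pixel *data; } bitmap;  /* row-major, w * h pixels */

/* Stand-in for the src[x,y] property described above. */
static pixel *at(const bitmap *bm, size_t x, size_t y)
{
    return &bm->data[y * bm->w + x];
}

/* Average each fx-by-fy block of src into one dest pixel in a single pass.
 * Assumes src->w % dest->w == 0 and src->h % dest->h == 0. */
static void shrink_average(bitmap *dest, const bitmap *src)
{
    size_t fx = src->w / dest->w;
    size_t fy = src->h / dest->h;
    unsigned n = (unsigned)(fx * fy);                  /* pixels per block */

    for (size_t dy = 0; dy < dest->h; dy++) {
        for (size_t dx = 0; dx < dest->w; dx++) {
            pixel sum = {0, 0, 0};                     /* "evaluation with a constant 0" */
            for (size_t sy = dy * fy; sy < (dy + 1) * fy; sy++) {
                for (size_t sx = dx * fx; sx < (dx + 1) * fx; sx++) {
                    const pixel *p = at(src, sx, sy);  /* "adding" */
                    sum.r += p->r;
                    sum.g += p->g;
                    sum.b += p->b;
                }
            }
            pixel *out = at(dest, dx, dy);             /* "division" */
            out->r = sum.r / n;
            out->g = sum.g / n;
            out->b = sum.b / n;
        }
    }
}
```

For the 100x100 to 20x50 case from the question, fx is 5 and fy is 2, so each destination pixel is the average of a 5x2 block and no temporary buffer is involved.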
Bresenham's line optimization
You forgot to mention the most important aspect of the question: how much you care about quality. If you don't care exactly how the values of the source pixels are smashed together to create the destination pixel, the fastest algorithm is (at least in almost all cases) the one that produces the worst quality.
If you're tempted to respond with "the fastest algorithm that still yields very good quality", you have essentially covered the entire field of algorithms that deal with image sampling/resizing.
And you already outlined your initial idea of the algorithm:
Calculating the average value for each channel over the source pixels could be seen as trivial. Are you looking for example code that does that?
Or are you looking for someone to challenge your initial draft of the algorithm with something even faster?
If you're looking for a wordy explanation, I've found this article to be helpful. If on the other hand you deal more in mathematical formulae, there is a method of fast image downscaling explained here.
It really is a speed/quality trade-off.
First of all, you're correct that doing one dimension then the other is slower than it has to be. Way too many memory reads and writes.
Your big choice is whether to support fractional pixels or not. Your example is 100x100 to 20x50, so 10 pixels (a 5x2 block) map to 1. What if you're going from 100x100 to 21x49? Are you willing to operate at source pixel boundaries, or do you want to pull fractional pixels in? What would you do for 100x100 to 99x99?
You have to tell us what you're willing to accept before we can say what's fastest.
Also tell us the possible extremes of the shrinkage. How many orders of magnitude might the difference between the source and destination be? At some point, sampling representative pixels inside the source area won't be much worse than averaging all the pixels. But you'll have to be careful in choosing representative pixels, or you'll get aliasing with many common patterns.
What you're doing is the optimized method. The only faster one is called nearest neighbor, where you simply grab the middle pixel of the range without trying to average any of them. The quality is significantly worse if there is any detail in the original image, although it might be acceptable if the original is simple.
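A minimal sketch of nearest-neighbor shrinking as described, assuming tightly packed rows of 32-bit pixels; the function name `shrink_nearest` and its parameter order are hypothetical. It picks the pixel at the centre of each destination pixel's source range using integer arithmetic only, and it also works when the ratio is not a whole number.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the middle source pixel of each destination pixel's range.
 * Assumes rows are stored back to back with no padding. */
static void shrink_nearest(uint32_t *dst, size_t dst_w, size_t dst_h,
                           const uint32_t *src, size_t src_w, size_t src_h)
{
    for (size_t dy = 0; dy < dst_h; dy++) {
        /* Centre of the source rows covered by destination row dy. */
        size_t sy = (2 * dy + 1) * src_h / (2 * dst_h);
        const uint32_t *row = src + sy * src_w;
        for (size_t dx = 0; dx < dst_w; dx++) {
            /* Centre of the source columns covered by destination column dx. */
            size_t sx = (2 * dx + 1) * src_w / (2 * dst_w);
            dst[dy * dst_w + dx] = row[sx];
        }
    }
}
```

With a 100-pixel-wide source and a 20-pixel-wide destination, the first destination pixel takes source column 2, the middle of columns 0 to 4. No averaging happens, which is where both the speed and the aliasing come from.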
This is what you are looking for in C. It is Egon's approach implemented in C and optimized for speed. The alpha channel is ignored and set to 0, but this can easily be changed. Wrapping the two inner loops in a Duff's loop is only for performance; the Duff's loops can be replaced by normal for-loops if desired.
Parameters: dst and src are pointers to the 32-bit pixel data, dst_pitch and src_pitch are the lengths of one scanline in bytes, src_width and src_height are the width and height of the source image in pixels, and factor_x and factor_y are the scaling denominators in the x- and y-directions.
Returns 0 on success and -1 on failure.
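A minimal sketch of a function with the parameters described above, written with plain for-loops instead of Duff's loops. This is not the answer's original code: the name `shrink_rgb32`, the parameter order, the 0x00RRGGBB channel layout, and the exact failure conditions are all assumptions. Source columns and rows left over when the size is not a multiple of the factor are simply dropped, and both pitches are assumed to keep rows 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Average factor_x-by-factor_y blocks of 32-bit pixels; alpha is set to 0.
 * Returns 0 on success, -1 on invalid arguments (assumed failure rules). */
int shrink_rgb32(void *dst, size_t dst_pitch,
                 const void *src, size_t src_pitch,
                 unsigned src_width, unsigned src_height,
                 unsigned factor_x, unsigned factor_y)
{
    if (dst == NULL || src == NULL || factor_x == 0 || factor_y == 0)
        return -1;

    unsigned dst_width = src_width / factor_x;
    unsigned dst_height = src_height / factor_y;
    uint64_t count = (uint64_t)factor_x * factor_y;   /* pixels per block */

    /* Reject empty output and blocks whose channel sums could overflow 32 bits. */
    if (dst_width == 0 || dst_height == 0 || count > UINT32_MAX / 255u)
        return -1;
    if (src_pitch < (size_t)src_width * 4 || dst_pitch < (size_t)dst_width * 4)
        return -1;

    uint32_t n = (uint32_t)count;
    for (unsigned dy = 0; dy < dst_height; dy++) {
        uint32_t *out = (uint32_t *)((uint8_t *)dst + (size_t)dy * dst_pitch);
        for (unsigned dx = 0; dx < dst_width; dx++) {
            uint32_t r = 0, g = 0, b = 0;
            for (unsigned j = 0; j < factor_y; j++) {
                /* Start of this block's row j inside the source scanline. */
                const uint32_t *in = (const uint32_t *)((const uint8_t *)src +
                    ((size_t)dy * factor_y + j) * src_pitch) + (size_t)dx * factor_x;
                for (unsigned i = 0; i < factor_x; i++) {
                    uint32_t p = in[i];
                    r += (p >> 16) & 0xFF;
                    g += (p >> 8) & 0xFF;
                    b += p & 0xFF;
                }
            }
            out[dx] = ((r / n) << 16) | ((g / n) << 8) | (b / n);
        }
    }
    return 0;
}
```

For the question's example one would call `shrink_rgb32(dst, 20 * 4, src, 100 * 4, 100, 100, 5, 2)` to turn a tightly packed 100x100 image into 20x50.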