How to use pre-multiplied alpha during image convolution to solve the alpha bleed problem?
I'm trying to apply a box blur to a transparent image, and I'm getting a "dark halo" around the edges.
Jerry Huxtable has a short mention of the problem, and a very good demonstration showing the problem happening:
But I, for the life of me, cannot understand how "pre-multiplied alpha" can fix the problem. Now for a very simple example. I have a 3x3 image, containing one red and one green pixel:
In reality the remaining pixels are transparent:
Now we will apply a 3x3 Box Blur to the image. For simplicity's sake, we'll only calculate the new value of the center pixel. The way a box blur works is that, since we have a 9-position square (3x3, called the kernel), we take 1/9th of each pixel in the kernel and add them up:
So
finalRed = 1/9 * red1 + 1/9 * red2 + 1/9 * red3+ ... + 1/9 * red9
finalGreen = 1/9*green1 + 1/9*green2 + 1/9*green3+ ... + 1/9*green9
finalBlue = 1/9* blue1 + 1/9* blue2 + 1/9* blue3+ ... + 1/9* blue9
finalAlpha = 1/9*alpha1 + 1/9*alpha2 + 1/9*alpha3+ ... + 1/9*alpha9
In this very simplified example, the calculations become very simple:
finalRed = 1/9 * 255
finalGreen = 1/9 * 255
finalBlue = 0
finalAlpha = 1/9*255 + 1/9*255
This gives me a final color value of:
finalRed = 28
finalGreen = 28
finalBlue = 0
finalAlpha = 56 (22.2%)
This color is too dark. When I perform a 3px Box Blur on the same 3x3 pixel image in Photoshop, I get what I expect:
Which is clearer when displayed over white:
In reality I'm performing a box blur on a bitmap containing transparent text, and the text gains the tell-tale dark fringe around its edges:
I'm starting with a GDI+ Bitmap that is in PixelFormat32bppARGB format.
How do I use "pre-multiplied alpha" when applying a 3x3 convolution kernel?
Any answer will have to include a new formula, since:
final = 1/9*(pixel1+pixel2+pixel3...+pixel9)
is getting me the wrong answer.
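To make the numbers above easy to reproduce, here is a minimal Python sketch of the naive per-channel average. The pixel positions in `image` are an assumption (the original picture is not available), but they do not affect the center-pixel result, and integer division is assumed because it matches the truncated values 28 and 56 shown in the question.

```python
# Straight (non-premultiplied) pixels as (R, G, B, A) in the 0..255 range.
CLEAR = (0, 0, 0, 0)
RED = (255, 0, 0, 255)
GREEN = (0, 255, 0, 255)

# Hypothetical layout: where the red and green pixels sit is an assumption.
image = [
    [RED, CLEAR, CLEAR],
    [CLEAR, CLEAR, CLEAR],
    [CLEAR, CLEAR, GREEN],
]

def naive_box_blur_center(img):
    """Average every channel of the 3x3 neighbourhood independently."""
    pixels = [p for row in img for p in row]
    return tuple(sum(p[c] for p in pixels) // len(pixels) for c in range(4))

naive = naive_box_blur_center(image)
print(naive)  # (28, 28, 0, 56)
```

Read as straight ARGB, (28, 28, 0) is a very dark olive at 22% alpha, which is the dark halo.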
Edit: A simpler example is:
I'll perform this math with color and alpha values in the range 0..1, writing each pixel as (R, G, B, A):
I'm going to apply the box blur convolution filter to the middle pixel:
RGBA'
= 1/9 * (0,1,0,1) + 1/9 * (0,0,0,0) + 1/9 * (0,0,0,0) +
1/9 * (0,1,0,1) + 1/9 * (0,0,0,0) + 1/9 * (0,0,0,0) +
1/9 * (0,1,0,1) + 1/9 * (0,0,0,0) + 1/9 * (0,0,0,0);
= (0, 0.11, 0, 0.11) + (0,0,0,0) + (0,0,0,0) +
(0, 0.11, 0, 0.11) + (0,0,0,0) + (0,0,0,0) +
(0, 0.11, 0, 0.11) + (0,0,0,0) + (0,0,0,0)
= (0, 0.33, 0, 0.33)
This gives a fairly transparent dark green, which is not what I expect to see. By comparison, Photoshop's Box Blur is:
If I assume (0, 0.33, 0, 0.33) is pre-multiplied alpha, and un-multiply it, I get:
(0, 1, 0, 0.33)
This looks right for my all-opaque example, but I don't know what to do when I begin to involve partially transparent pixels.
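See also
- Texture filtering: alpha cutouts
- Premultiplied alpha (http://blogs.msdn.com/b/shawnhar/archive/2009/11/06/premultiplied-alpha.aspx)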
tkerwin has already provided the correct answer, but it seems to need further explanation.
The math you've shown in your question is absolutely correct, right up until the end. It is there that you're missing a step: the results are still in pre-multiplied alpha form, and must be "unmultiplied" back to the PixelFormat32bppARGB format. The opposite of a multiply is a divide, so each blurred color channel is divided by the blurred alpha (and scaled by 255 when working with 0..255 byte values).
You've expressed a concern that the divide might create a result that is wildly out of range, but that won't happen. If you trace the math, you'll notice that the red, green, and blue values can't be greater than the alpha value, because of the pre-multiplication step. If you were using a more complicated filter than a simple box blur it might be a possibility, but that would be the case even if you weren't using alpha! The correct response is to clamp the result, turning negative numbers into 0 and anything greater than 255 into 255.
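The un-multiply and clamp steps this answer describes could look like the Python sketch below. It assumes 0..255 integer channels; the function names, the rounding choice, and the handling of zero alpha (returning a fully transparent black pixel to avoid dividing by zero) are assumptions, not part of the original answer.

```python
def clamp_byte(value):
    """Clamp to the 0..255 range, as the answer recommends."""
    return max(0, min(255, value))

def unpremultiply(r, g, b, a):
    """Convert a premultiplied 0..255 pixel back to straight alpha."""
    if a == 0:
        # Assumption: a fully transparent pixel carries no color at all.
        return (0, 0, 0, 0)
    # Rounded integer division: color * 255 / alpha.
    return tuple(clamp_byte((c * 255 + a // 2) // a) for c in (r, g, b)) + (a,)

# The blurred center pixel from the question, read as premultiplied.
result = unpremultiply(28, 28, 0, 56)
print(result)  # (128, 128, 0, 56)
```

The color is now an even mix of red and green at full intensity, with only the alpha reduced.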
Following the advice in your link, you pre-multiply before blurring and un-pre-multiply after blurring. In your example, pre-multiplying actually does nothing, since there are no semi-transparent pixels. You did the blur; then you need to un-pre-multiply by dividing each color channel by the alpha wherever alpha is greater than 0 (assuming normalized color values from 0 to 1), leaving alpha itself unchanged.
This will get you a non-pre-multiplied blurred final image.
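As a rough sketch of the whole pipeline (pre-multiply, blur, un-pre-multiply), the Python below works on (R, G, B, A) tuples in the 0..1 range and only computes one output pixel from its nine neighbours. The helper names and the half-transparent red test pixel are assumptions added for illustration; the second case shows what happens once partially transparent pixels are involved.

```python
def premultiply(p):
    r, g, b, a = p
    return (r * a, g * a, b * a, a)

def unpremultiply(p):
    r, g, b, a = p
    if a == 0:
        return (0.0, 0.0, 0.0, 0.0)  # assumption: no color without alpha
    return (r / a, g / a, b / a, a)

def box_blur(pixels):
    """Plain per-channel average of the pixels under the kernel."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(4))

def blur_straight_alpha(pixels):
    """Pre-multiply, blur, then un-pre-multiply."""
    return unpremultiply(box_blur([premultiply(p) for p in pixels]))

CLEAR = (0.0, 0.0, 0.0, 0.0)
GREEN = (0.0, 1.0, 0.0, 1.0)
HALF_RED = (1.0, 0.0, 0.0, 0.5)  # hypothetical partially transparent pixel

# The question's simpler example: three opaque green pixels, six clear ones.
opaque_case = blur_straight_alpha([GREEN] * 3 + [CLEAR] * 6)

# One half-transparent red and one opaque green pixel.
mixed_case = blur_straight_alpha([HALF_RED, GREEN] + [CLEAR] * 7)

print(opaque_case)  # pure green at about 33% alpha
print(mixed_case)   # about 1/3 red, 2/3 green at about 17% alpha
```

In the mixed case the red pixel contributes half as much color as the green one because it has half the alpha, which is the weighting pre-multiplication provides.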
Correct Math, Wrong Operation
Both answers appear wrong here, and you got the math right in your example, just not the over operation.
Use the proper blending formula, which according to Porter and Duff is:
FG.RGB + (1.0 - FG.A) * BG.RGB
Not sure where the rest of the answers are coming from, but wow.
The alpha encoding dictates the over operation.
When Properly Encoded, RGBA Represents Emission and Occlusion
The reason that you must have associated alpha in an image is that the RGB values always represent the emission. In this case, the RGB represents the emission of the pixel, while the alpha represents the degree of occlusion.
That is, if you linearly interpolate, much like your box blur, the values must be indicative of the emissions. In the most simple case, consider a pixel emitting 100% and we linearly interpolate down by 25% at each increment.
Here, the emission for each RGB should decrease by 25%, such that we'd have 100%, 75%, 50%, 25%, and finally 0%. Note that at each increment the emission would scale with the degree of occlusion, such that the alpha would remain perfectly in sync at the same values. Simply apply your convolution across RGBA as one unit of emission and occlusion. Most importantly, the blending formula must be the first one in this answer.
Now consider the unassociated alpha case, which isn't actually encoded at all. Here the emission is completely unassociated with the degree of occlusion, so the over operation bundles that scaling of the emission up into the first step, where the unassociated emission is multiplied by the degree of occlusion.
FG.RGB * FG.A + (1.0 - FG.A) * BG.RGB
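A small Python sketch may help show that the two over formulas agree when each is fed the encoding it expects. The associated form, FG.RGB + (1.0 - FG.A) * BG.RGB, is the standard Porter-Duff over for premultiplied data; the function names, the white background, and the 50% green pixel are assumptions chosen for illustration.

```python
def over_associated(fg, bg_rgb):
    """Over for associated (premultiplied) alpha: FG.RGB + (1 - FG.A) * BG.RGB."""
    r, g, b, a = fg
    return tuple(c + (1.0 - a) * k for c, k in zip((r, g, b), bg_rgb))

def over_unassociated(fg, bg_rgb):
    """Over for unassociated (straight) alpha: FG.RGB * FG.A + (1 - FG.A) * BG.RGB."""
    r, g, b, a = fg
    return tuple(c * a + (1.0 - a) * k for c, k in zip((r, g, b), bg_rgb))

WHITE = (1.0, 1.0, 1.0)

# The same 50% green pixel in both encodings.
straight_green = (0.0, 1.0, 0.0, 0.5)
associated_green = (0.0, 0.5, 0.0, 0.5)

from_associated = over_associated(associated_green, WHITE)
from_straight = over_unassociated(straight_green, WHITE)

print(from_associated)  # (0.5, 1.0, 0.5)
print(from_straight)    # (0.5, 1.0, 0.5)
```

The formulas only differ in where the multiplication by alpha happens, so mixing up the encoding and the formula scales the emission twice or not at all.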
But what happens if someone foolishly tries to perform the exact same linear interpolation, rotation, blur, smudge, or any other manipulation on such completely not-encoded values? Well let's see...
In the unassociated alpha example, our emission is 100%, yet our degree of occlusion is 75%. Now interpolate down by 25%. Notice how the fundamental math of light transport fails here. We would end up with the alpha representation completely out-of-sync with the emission. 100%RGB-75%A, 75%RGB-56.25%A, 50%RGB-28.125%A, etc.
But Wait, There's More Darkness...
The final piece of the puzzle is that you are using nonlinearly encoded display referred values. That is, when using values encoded for an sRGB display, they are nonlinearly compressed, which in turn is undone by the display's EOTF back to linear light output from the display. That means that our code values do not represent the radiometric-like ratios that are emitted from the display. As such, the linear math of emission and occlusion will fail, in much the same way performing audio math on compressed MP3 values will fail.
So in our simple linear interpolation example, we can see that the math fundamentally falls apart quickly.
Our first increment would be 100% emission, and our second would be 75%, and so on. But if we linearly interpolate down from 100% to 25% using sRGB nonlinearly encoded values, the actual emitted amount of light is disconnected from even this simple math, and our alpha will fringe in a very similar manner to how the unassociated alpha encoding fell apart. The only way around this is to make our process three steps: decode the sRGB code values to linear light, perform the blur or composite on those linear values, and re-encode the result for the display.
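As an illustration of doing the math in linear light, the Python sketch below decodes sRGB code values, averages them, and re-encodes the result. It uses the standard sRGB piecewise transfer function; the function names and the two-value average are assumptions for illustration, and real code would apply this to the color channels only, never to alpha.

```python
def srgb_to_linear(c):
    """Decode an sRGB code value (0..1) to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c):
    """Encode a linear-light value (0..1) as an sRGB code value."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1.0 / 2.4) - 0.055

def average_in_linear_light(code_values):
    """Decode, average, re-encode: the three steps in order."""
    linear = [srgb_to_linear(v) for v in code_values]
    return linear_to_srgb(sum(linear) / len(linear))

# Averaging full emission with no emission.
naive_mean = (1.0 + 0.0) / 2.0
linear_mean = average_in_linear_light([1.0, 0.0])

print(naive_mean)             # 0.5
print(round(linear_mean, 3))  # about 0.735
```

Averaging the code values directly lands on 0.5, which the display shows as noticeably darker than half the light; that gap is the extra darkening described above.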
Right Answer, Wrong Over Formula
Again though, your box blur math was correct enough in the first instance, with your semi-transparent green being at 33% emission and the alpha occlusion being 33%. The fundamental problem was the over operation, which should be the first formula in this answer. That is, in the case of your green over a full-emission background, the green's RGB is added as-is and only the background is scaled by one minus the green's alpha.
Note, however, that if you were to composite your example over a solid 100% sRGB red emission, you'd get a darker result even with the proper math, due to the non-radiometric encoded values. Your choice of sRGB green and red are perfect examples that will exacerbate the darkening caused by the nonlinear math, especially if you perform the blur on the red object on top of a fully emissive green background. The following is composited entirely correctly, albeit using the nonlinearly compressed sRGB code values for the math:
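To put numbers on this, here is a hedged Python sketch that composites the blurred green from the question, (0, 0.33, 0, 0.33), over a background. A white background is assumed for "full emission", and both function names are hypothetical. The second function applies the unassociated formula to the same data, which is the mistake being described.

```python
def over(fg, bg_rgb):
    """Correct over for associated alpha: FG.RGB + (1 - FG.A) * BG.RGB."""
    r, g, b, a = fg
    return tuple(c + (1.0 - a) * k for c, k in zip((r, g, b), bg_rgb))

def over_scaled_twice(fg, bg_rgb):
    """The mistake: multiplying already-associated RGB by alpha again."""
    r, g, b, a = fg
    return tuple(c * a + (1.0 - a) * k for c, k in zip((r, g, b), bg_rgb))

WHITE = (1.0, 1.0, 1.0)  # assumption: the full-emission background is white
blurred_green = (0.0, 1.0 / 3.0, 0.0, 1.0 / 3.0)

correct = over(blurred_green, WHITE)
too_dark = over_scaled_twice(blurred_green, WHITE)

print([round(c, 3) for c in correct])   # [0.667, 1.0, 0.667]
print([round(c, 3) for c in too_dark])  # [0.667, 0.778, 0.667]
```

The correct composite keeps the green channel at full strength, while the doubly scaled version drops it to about 0.78 and produces the grey, dark fringe.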
Additional Reading
A few relevant quotes. First from Zap Andersson of Autodesk in the infamous Adobe Alpha thread, confirmed by Alvy Ray Smith, the creator of the alpha channel:
Academy Scientific and Technical Achievement winner Larry Gritz:
(An example of what Mr. Gritz is referring to here by his CANNOT statement is the case of something like a flame or a reflection, which has no occlusion and only emission, or RGB with zero Alpha. This is fundamentally impossible to express using an unassociated alpha operation.)
Academy Scientific and Technical Achievement winner Jeremy Selan:
To anyone in the future reading this awesome thread and trying to un-pre-multiply colors on multi-pass blur – do it only once, on the last pass.
Learned it after 4 hours of trial and error.
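One way this can go wrong is sketched below in Python, using a one-dimensional row and a 3-tap box blur applied twice. The commenter did not say what their code looked like, so the helper names, the row contents, and the treatment of out-of-range neighbours as transparent are all assumptions.

```python
CLEAR = (0.0, 0.0, 0.0, 0.0)
GREEN = (0.0, 1.0, 0.0, 1.0)

def premultiply(p):
    r, g, b, a = p
    return (r * a, g * a, b * a, a)

def unpremultiply(p):
    r, g, b, a = p
    if a == 0:
        return CLEAR  # assumption: no color without alpha
    return (r / a, g / a, b / a, a)

def blur_row(row):
    """3-tap box blur; neighbours outside the row count as transparent."""
    padded = [CLEAR] + list(row) + [CLEAR]
    return [
        tuple(sum(padded[i + k][c] for k in range(3)) / 3 for c in range(4))
        for i in range(len(row))
    ]

row = [premultiply(p) for p in (GREEN, CLEAR, CLEAR)]

# Correct: both passes run on premultiplied data, un-pre-multiply once at the end.
once = [unpremultiply(p) for p in blur_row(blur_row(row))]

# Mistake: un-pre-multiplying after every pass feeds straight-alpha data
# into a blur that expects premultiplied data.
between = [unpremultiply(p) for p in blur_row(row)]
twice = [unpremultiply(p) for p in blur_row(between)]

print([round(p[1], 3) for p in once])   # [1.0, 1.0, 1.0]
print([round(p[1], 3) for p in twice])  # [3.0, 3.0, 3.0]
```

With a single un-pre-multiply the green channel stays at 1.0 everywhere; dividing by alpha after each pass pushes it far out of range.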