Hi again! Today, we’re gonna take a look at the famous Fast Approximate Anti-Aliasing algorithm.
It is used to smooth out jagged edges, cheaply imitating a high-resolution render being downscaled (which is often too expensive for real-time applications).
The Algorithm
Here’s my implementation of the Geeks3D algorithm:
//Texel size (1/resolution)
uniform vec2 u_texel;
//Maximum texel span
#define SPAN_MAX (8.0)
//These are more technical and probably don't need changing:
//Minimum "dir" reciprocal
#define REDUCE_MIN (1.0/128.0)
//Luma multiplier for "dir" reciprocal
#define REDUCE_MUL (1.0/32.0)
vec4 textureFXAA(sampler2D tex, vec2 uv)
{
    //Sample center and 4 corners
    vec3 rgbCC = texture2D(tex, uv).rgb;
    vec3 rgb00 = texture2D(tex, uv+vec2(-0.5,-0.5)*u_texel).rgb;
    vec3 rgb10 = texture2D(tex, uv+vec2(+0.5,-0.5)*u_texel).rgb;
    vec3 rgb01 = texture2D(tex, uv+vec2(-0.5,+0.5)*u_texel).rgb;
    vec3 rgb11 = texture2D(tex, uv+vec2(+0.5,+0.5)*u_texel).rgb;

    //Luma coefficients
    const vec3 luma = vec3(0.299, 0.587, 0.114);
    //Get luma from the 5 samples
    float lumaCC = dot(rgbCC, luma);
    float luma00 = dot(rgb00, luma);
    float luma10 = dot(rgb10, luma);
    float luma01 = dot(rgb01, luma);
    float luma11 = dot(rgb11, luma);

    //Compute edge direction (perpendicular to the luma gradient)
    vec2 dir = vec2((luma01 + luma11) - (luma00 + luma10), (luma00 + luma01) - (luma10 + luma11));

    //Bias based on total luma, never smaller than REDUCE_MIN
    float dirReduce = max((luma00 + luma10 + luma01 + luma11) * REDUCE_MUL, REDUCE_MIN);
    //Reciprocal of dir's smaller component plus the dirReduce bias
    float rcpDir = 1.0 / (min(abs(dir.x), abs(dir.y)) + dirReduce);
    //Multiply by reciprocal, limit to texel span and convert to UV units
    dir = clamp(dir * rcpDir, -SPAN_MAX, SPAN_MAX) * u_texel.xy;

    //Average middle texels along dir line
    vec4 A = 0.5 * (
        texture2D(tex, uv - dir * (1.0/6.0)) +
        texture2D(tex, uv + dir * (1.0/6.0)));

    //Average with outer texels along dir line
    vec4 B = A * 0.5 + 0.25 * (
        texture2D(tex, uv - dir * (0.5)) +
        texture2D(tex, uv + dir * (0.5)));

    //Get lowest and highest luma values
    float lumaMin = min(lumaCC, min(min(luma00, luma10), min(luma01, luma11)));
    float lumaMax = max(lumaCC, max(max(luma00, luma10), max(luma01, luma11)));

    //Get luma of the four-sample average
    float lumaB = dot(B.rgb, luma);

    //If the average is outside the luma range, use the middle average instead
    return ((lumaB < lumaMin) || (lumaB > lumaMax)) ? A : B;
}
The original source code was a bit messy, but I did my best to clean it up and added some explanatory comments.
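To show where the function fits, here is a minimal fragment shader sketch that calls it. It assumes GameMaker's default shader names (v_vTexcoord, v_vColour, gm_BaseTexture) and that GameMaker supplies the float precision for you; in another engine those names and the precision line will differ. It also assumes the textureFXAA listing above, with its uniform and defines, is pasted into the same shader file.

```glsl
// Fragment shader sketch (assumes GameMaker's default names)
varying vec2 v_vTexcoord;
varying vec4 v_vColour;

// Prototype only: the definition is the textureFXAA listing above,
// pasted into this same file along with u_texel and the #defines.
vec4 textureFXAA(sampler2D tex, vec2 uv);

void main()
{
    // Replace the usual texture2D lookup with the FXAA version
    gl_FragColor = v_vColour * textureFXAA(gm_BaseTexture, v_vTexcoord);
}
```

Remember that u_texel has to be set from the game side to 1/resolution of whatever texture is being drawn, otherwise the sample offsets will be the wrong size.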
Overview
Here’s a quick technical rundown of the process:
Sample 5 points: the center texel, plus 4 points half a texel away diagonally, where the neighboring texels meet. With interpolation enabled, each of those 4 samples is a blend of the texels around it. That’s what the “rgb” variables are for.
[Illustration: up-close view of the 5 sample points]
Compute the luma for all the sample points. As you might remember, it can be computed with a simple dot product:
float color_luma = dot(color.rgb, vec3(0.299, 0.587, 0.114));
Which is done for the 5 texel samples.
Compute gradient direction. This is what the “dir” vector is doing here:
vec2 dir = vec2((luma01 + luma11) - (luma00 + luma10), (luma00 + luma01) - (luma10 + luma11));
If you look closely, you’ll notice that dir’s x-component is computed from the y-axis luma derivative, and its y-component is computed from the negative x-axis derivative. This means it’s actually computing the direction perpendicular to the luma gradient, which is the direction the edge itself runs in.
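A quick worked example can make the perpendicular part easier to see. The helper name edgeDir below is made up for illustration; the formula inside it is the same one used in textureFXAA, and the luma values are just a convenient test case.

```glsl
// Hypothetical helper wrapping the same formula used in textureFXAA
vec2 edgeDir(float luma00, float luma10, float luma01, float luma11)
{
    return vec2((luma01 + luma11) - (luma00 + luma10),
                (luma00 + luma01) - (luma10 + luma11));
}

// Vertical edge, dark on the left (x-) and bright on the right (x+):
//   luma00 = 0.0, luma01 = 0.0   (left corners)
//   luma10 = 1.0, luma11 = 1.0   (right corners)
// dir.x = (0.0 + 1.0) - (0.0 + 1.0) =  0.0
// dir.y = (0.0 + 0.0) - (1.0 + 1.0) = -2.0
// The luma gradient points along +x, but dir = (0.0, -2.0) runs along y,
// which is the direction of the edge itself.
```

The sign of dir doesn't matter much here, since the blur samples both +dir and -dir.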
“dirReduce” is the sum of the 4 corner luma values, scaled down by REDUCE_MUL to keep it small and floored at REDUCE_MIN so it’s never 0 (preventing division by 0).
“rcpDir” is short for the reciprocal of dir. It takes the smaller of dir’s two components (the axis dir is least aligned with) and divides dir by it, so that component becomes roughly ±1 and the other one grows in proportion, effectively raytracing until dir crosses one texel along that axis. The dirReduce is added as a bias, preventing division by 0 and slightly distorting the distance (not sure if that’s intended?)
And finally, dir is limited to the max texel span and converted from texels to texture coordinates by multiplying with u_texel. Don’t worry if this bit was hard to follow, it’s quite obscure stuff!
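If it helps, here are the scaling steps with actual numbers plugged in. The helper name scaleDir and its parameters are hypothetical; the math inside is copied from textureFXAA, and the two input cases are just examples I picked.

```glsl
#define SPAN_MAX   (8.0)
#define REDUCE_MIN (1.0/128.0)
#define REDUCE_MUL (1.0/32.0)

// Hypothetical helper: turns the raw dir into a blur vector in UV units.
// lumaSum is the sum of the 4 corner lumas, texel is 1/resolution.
vec2 scaleDir(vec2 dir, float lumaSum, vec2 texel)
{
    float dirReduce = max(lumaSum * REDUCE_MUL, REDUCE_MIN);
    float rcpDir = 1.0 / (min(abs(dir.x), abs(dir.y)) + dirReduce);
    return clamp(dir * rcpDir, -SPAN_MAX, SPAN_MAX) * texel;
}

// Vertical edge: dir = (0.0, -2.0), lumaSum = 2.0
//   dirReduce    = max(2.0 / 32.0, 1.0 / 128.0) = 0.0625
//   rcpDir       = 1.0 / (0.0 + 0.0625)         = 16.0
//   dir * rcpDir = (0.0, -32.0), clamped to (0.0, -8.0) texels
// Diagonal edge: dir = (1.0, -1.0), lumaSum = 2.0
//   rcpDir       = 1.0 / (1.0 + 0.0625), about 0.94
//   dir * rcpDir is about (0.94, -0.94) texels, well inside the clamp
```

So in these two cases, the axis-aligned edge gets the full SPAN_MAX blur while the diagonal edge only reaches about one texel. That seems to fit the way long, shallow staircases need a wider blur than 45-degree ones, though that's my reading rather than anything documented.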
Blur along the “dir” axis. Now that we have dir, which points along the axis of the lowest luma contrast, we can sample 4 points along it and average them for a simple blur. For slightly smoother results, this algorithm averages the middle two samples first (“A”) and then averages that with two outer samples to get “B”. If B’s luma is outside the min/max range of the 5 original samples, we’ll output A instead. So it uses either a two-sample or four-sample blur depending on which fits in the expected range best.
So that’s the gist of the algorithm!
I’ve put together a little project to show how this can be implemented in GM. Download the source code here!
Conclusion
The key takeaway here is that FXAA is a cheap and effective algorithm for producing smoother images. It has some drawbacks: it can’t smooth edges between pixels of different colors with the same luma, and it has difficulties with some corners. Once you understand the uses and limitations of FXAA, you’ll probably start finding countless applications for it. It is famous for a reason!
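To make the same-luma drawback concrete, here's a small sketch. The helper lumaContrast is made up for illustration, and the two colors are just an example pair chosen so their lumas match.

```glsl
const vec3 LUMA = vec3(0.299, 0.587, 0.114);

// Hypothetical helper: luma difference between two colors
float lumaContrast(vec3 a, vec3 b)
{
    return abs(dot(a, LUMA) - dot(b, LUMA));
}

// Pure red against a dark gray of matching luma:
//   dot(vec3(1.0, 0.0, 0.0), LUMA) = 0.299
//   dot(vec3(0.299), LUMA) = 0.299 * (0.299 + 0.587 + 0.114) = 0.299
// lumaContrast is 0.0 (up to rounding), so all 5 luma samples match,
// dir comes out as (0.0, 0.0), and the edge is left unsmoothed.
```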
Anyway, that’s all I got for today. Paid supporters can help me pick my next topic!