Rendering arbitrary-scale emojis using the Slug algorithm

Original link: https://leduyquang753.name.vn/blog/2026/4/4/rendering-arbitrary-scale-emojis-using-the-slug-algorithm



With the recent (very generous) release of Eric Lengyel's Slug algorithm into the public domain, an open-source implementation has quickly made its way into the HarfBuzz repository as a new HarfBuzz GPU library. (Yes, HarfBuzz is no longer just a text shaping library; it also handles glyph rendering now.) However, currently the library only directly deals with rendering ordinary single-color glyphs. In this blog post, I will give an outline of how the Slug algorithm could be incorporated to render vector color fonts (commonly emojis) at arbitrary scales.

Firstly, here's a bit of background information for those who aren't too familiar with the topic. Those who are may skip right to the next section.

Text has traditionally been drawn on-screen by first rasterizing glyphs on the CPU as bitmaps, then pasting them onto the display buffer. The text only looks good if the bitmaps are not scaled up or down significantly; to render text nicely at a different size, you need to rasterize new bitmaps. But that is not always feasible, for example when the text is placed in a 3D environment.

An alternative approach, commonly used in games, is to generate a signed distance field (SDF) for each glyph and save it into a bitmap. The signed distance can then be interpolated from the bitmap by a fragment shader program to determine whether a pixel is inside or outside the glyph. The main issue with this approach is that corners become rounded when the text is scaled up. Multi-channel SDFs (generated using msdfgen, for example) help with this problem to an extent.
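To make the SDF idea concrete, here is a minimal one-dimensional CPU sketch of the two steps the fragment shader performs (all names here are mine, purely for illustration): interpolate a signed distance from the stored texels, then map it to a coverage value.

```cpp
#include <algorithm>
#include <cmath>

// One-dimensional sketch of SDF text rendering. "texels" holds signed
// distances sampled into a bitmap: negative inside the glyph, positive
// outside.
float sampleSdf(const float *texels, int count, float x) {
	// Linearly interpolate between the two nearest texels, as the GPU's
	// texture filtering would.
	const float clamped = std::clamp(x, 0.f, float(count - 1));
	const int i0 = int(clamped);
	const int i1 = std::min(i0 + 1, count - 1);
	const float t = clamped - float(i0);
	return texels[i0] * (1.f - t) + texels[i1] * t;
}

// Map an interpolated distance to a coverage value, feathered over roughly
// one pixel's worth of distance for antialiasing.
float coverageFromSdf(float signedDistance, float pixelRange) {
	const float t = std::clamp(0.5f - signedDistance / pixelRange, 0.f, 1.f);
	return t * t * (3.f - 2.f * t); // smoothstep
}
```

The rounding problem comes from this very interpolation: near a sharp corner, the true distance field has a crease that linear filtering cannot reproduce.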

The Slug algorithm does away with prerendered bitmaps altogether, and instead calculates glyphs' coverage of each pixel directly within the fragment shader. Glyphs' curves go through some CPU preprocessing into a data buffer, which is uploaded to the GPU and read by the fragment shader. The algorithm is both fast and robust, allowing text or any vector shape to be rendered pretty much perfectly at any scale, and under any transformation in 3D space. The original implementation, the Slug library, is not free and is patented, but the author Eric Lengyel has waived his exclusive rights to the patent, allowing anyone to implement the algorithm. Behdad Esfahbod then created an open-source implementation and added it to the HarfBuzz codebase under the name HarfBuzz GPU.

Rendering ordinary single-color glyphs is exceedingly simple using HarfBuzz GPU. An hb_gpu_draw_t instance is created to receive a glyph's outline data through calling hb_font_draw_glyph. Then the bounding box of the glyph is retrieved with hb_gpu_draw_get_extents, and the encoded outline curves are generated by calling hb_gpu_draw_encode. This encoded data is uploaded to a texture buffer.

struct GlyphData {
	GLuint dataOffset;
	float minX, minY, maxX, maxY;
};
GLuint nextGlyphDataOffset = 0;
hb_gpu_draw_t *hbGpuDraw = hb_gpu_draw_create_or_fail();

GlyphData generateGlyphData(
	hb_font_t *const hbFont, const hb_codepoint_t codepoint
) {
	hb_gpu_draw_reset(hbGpuDraw);
	hb_font_draw_glyph(hbFont, codepoint, hb_gpu_draw_get_funcs(), hbGpuDraw);

	GlyphData glyphData = {};
	hb_glyph_extents_t extents;
	hb_gpu_draw_get_extents(hbGpuDraw, &extents);

	hb_blob_t *const blob = hb_gpu_draw_encode(hbGpuDraw);
	unsigned encodedDataLength;
	const char *const encodedData = hb_blob_get_data(blob, &encodedDataLength);
	glBufferSubData(
		GL_TEXTURE_BUFFER, nextGlyphDataOffset, encodedDataLength, encodedData
	);
	
	
	// The shader indexes the texture buffer in texels rather than bytes;
	// convert the byte offset into a texel index (here, 8 bytes per texel).
	glyphData.dataOffset = nextGlyphDataOffset / 8;
	nextGlyphDataOffset += encodedDataLength;
	hb_gpu_draw_recycle_blob(hbGpuDraw, blob);

	glyphData.minX = extents.x_bearing;
	glyphData.minY = extents.y_bearing;
	glyphData.maxX = extents.x_bearing + extents.width;
	glyphData.maxY = extents.y_bearing + extents.height;

	return glyphData;
}

Each glyph is rendered as a single quad (rectangle). In the fragment shader, hb_gpu_render is called to read the outline data and compute the coverage of the glyph onto each pixel. For single-color glyphs, we just multiply the text color's alpha with the coverage to output as the fragment's output color.

in vec2 fragUv;
in vec4 fragColor;
flat in uint fragDataOffset;

out vec4 color;

void main() {
	float coverage = hb_gpu_render(fragUv, fragDataOffset);
	color = vec4(fragColor.rgb, fragColor.a * coverage);
}

This will serve as the foundation for the next sections, where we deal with rendering multicolor glyphs.

An example implementation can be found in HarfBuzz's repository, in the form of an interactive demo that is also hosted on Behdad's website.

Color fonts were born to allow emojis in text to be rendered in different colors, instead of having to replace their occurrences with inline images or leave them black and white. Over the years, multiple competing formats have appeared; some simply embed images into the fonts, while others are vector-based. One of these vector formats is literally SVG; the remaining ones are based on composing different glyphs together with various color patterns to create the desired glyph image. These are the COLRv0 and COLRv1 formats.

COLRv0 was created by Microsoft to implement Windows 10's flat emojis. The format is very simple: take ordinary glyphs, assign a solid color to each, then stack them on top of each other. The list of layers of a glyph can be retrieved by calling hb_ot_color_glyph_get_layers from HarfBuzz's hb-ot-color component. So you can just take the single-color glyph rendering code from the previous section, modify the processing logic to render multicolor glyphs as multiple individual glyphs at the same position with different colors, and you have already supported colorful emojis in your game or application. A GitHub repository provides a few emoji fonts of this format for you to use, including Twemoji, which you can most likely recognize from X or Discord.

Sample flat emojis from Windows 10's Segoe UI emoji and Twemoji fonts, along with their exploded views.

Unsatisfied with COLRv0's flatness, Google then created a successor, COLRv1. And it is way, way more complicated. What was a stack of glyphs is now a full-blown render tree, with transforms, clipping, and layer blending. You can take a look at its specification to see just how involved it is.

Luckily, HarfBuzz has taken away some of the burden from us in trying to parse such a wide range of drawing commands and options from the font data. The hb-paint component presents us with an interface that we can hook into to retrieve parsed drawing commands. These include:

  • Setting a clip mask from a rectangle, or a glyph's outline.
  • Filling the area inside the clip mask using a solid color or a gradient.
  • Pushing and popping affine transforms from the transform stack.
  • Pushing and popping layer groups, using different modes of blending.

Firstly, clip masks are set in order to define the area to be filled with a color or gradient. There is actually a stack of multiple clip masks; however, from what I see in Windows's Segoe UI emoji and Google's Noto color emoji fonts, they just use one clip mask at a time, so if we just target those, we can simplify our implementation. When we receive a push_clip_glyph command, we acquire the glyph's data and encode its outline into the texture buffer for the Slug algorithm.

Solid color fills are straightforward, but gradient fills require us to also encode their definition into the texture buffer, so that the fragment shader can calculate the color for each pixel depending on the position. The data includes the type of gradient (linear, radial, or sweep) and the list of color stops. We pass into the fragment shader the positions, within the texture buffer, of both the clip glyph's outline data and the gradient. The fragment shader first computes the gradient color at the pixel's position, then multiplies the alpha with the clip glyph's coverage. (I personally found the radial gradient quite tricky to implement; a guide and example are available for reference.)

Next are affine transforms, of which we need to maintain a stack in order to apply them to different parts of the render tree. The initial transform in the stack is the identity transform. When a push_transform command is received, we multiply the top (current) transformation matrix with the given transformation matrix, and push the result onto the stack. When pop_transform is received, we pop the top transformation matrix from the stack. When we receive a fill command, we need to transform both the clip glyph as well as the gradient (if present). We can encode the transformed glyph by making an intermediate hb_draw_funcs_t to intercept outline drawing commands from hb_font_draw_glyph, multiplying the current transformation matrix with each pair of coordinates, then calling the hb_gpu_draw_t's hb_draw_funcs_t with the transformed coordinates. This works because the Bezier curves that make up the outline can be affine-transformed by transforming their control points.

struct GlyphTransformer {
	hb_gpu_draw_t *hbGpuDraw;
	hb_draw_funcs_t *drawFuncs;
	hb_draw_funcs_t *hbGpuDrawFuncs = hb_gpu_draw_get_funcs();
	glm::mat3 transform;
	hb_draw_state_t drawState;

	GlyphTransformer(hb_gpu_draw_t *const hbGpuDraw): hbGpuDraw(hbGpuDraw) {
		drawFuncs = hb_draw_funcs_create();
		
		
		// For brevity, only the quadratic-curve callback is shown; the
		// move-to, line-to, cubic-to and close-path callbacks are set up
		// analogously.
		hb_draw_funcs_set_quadratic_to_func(
			drawFuncs,
			[](
				hb_draw_funcs_t *const funcs, void *const drawData,
				hb_draw_state_t *const originalDrawState,
				const float originalControlX, const float originalControlY,
				const float originalToX, const float originalToY,
				void *const userData
			) {
				auto &e = *static_cast<GlyphTransformer*>(drawData);
				float
					controlX = originalControlX,
					controlY = originalControlY,
					toX = originalToX,
					toY = originalToY;
				e.transformPoint(controlX, controlY);
				e.transformPoint(toX, toY);
				hb_draw_quadratic_to(
					e.hbGpuDrawFuncs, e.hbGpuDraw, &e.drawState,
					controlX, controlY, toX, toY
				);
			},
			nullptr, nullptr
		);
		
		hb_draw_funcs_make_immutable(drawFuncs);
	}

	void transformPoint(float &x, float &y) {
		const auto transformedPoint = transform * glm::vec3(x, y, 1.f);
		x = transformedPoint.x;
		y = transformedPoint.y;
	}

	void transformGlyph(
		hb_font_t *const hbFont, const hb_codepoint_t codepoint,
		const glm::mat3 transformIn
	) {
		transform = transformIn;
		drawState = {};
		hb_font_draw_glyph(hbFont, codepoint, drawFuncs, this);
	}
};

To correctly evaluate the gradient, we also encode the inverse transformation matrix into the texture buffer. The fragment shader then multiplies that with the original coordinates to get the coordinates within the untransformed gradient before evaluating the gradient's color.


float evaluateLinearGradient(ivec4 gradient, vec2 uv);
float evaluateRadialGradient(ivec4 gradient1, ivec4 gradient2, vec2 uv);
float evaluateSweepGradient(ivec4 gradient1, ivec4 gradient2, vec2 uv);

vec4 evaluateColorLine(int dataOffset, vec4 defaultColor, float t);

float readFixed(int integer, int fractional) {
	return integer + float(fractional) / (1 << 15);
}

vec4 getGradientColor(int dataOffset, vec4 defaultColor) {
	
	
	// Decode the inverse affine transform, stored as three texels of 16-bit
	// fixed-point halves, and map the fragment position back into the
	// gradient's untransformed coordinate space.
	ivec4 xxyx = hb_gpu_fetch(dataOffset);
	ivec4 xyyy = hb_gpu_fetch(dataOffset + 1);
	ivec4 dxdy = hb_gpu_fetch(dataOffset + 2);
	vec2 uv = (mat3(
		readFixed(xxyx.x, xxyx.y), readFixed(xxyx.z, xxyx.w), 0.,
		readFixed(xyyy.x, xyyy.y), readFixed(xyyy.z, xyyy.w), 0.,
		readFixed(dxdy.x, dxdy.y), readFixed(dxdy.z, dxdy.w), 1.
	) * vec3(fragUv, 1.)).xy;
	
	
	// Read the gradient definition; gradient1.x encodes the gradient type.
	ivec4 gradient1 = hb_gpu_fetch(dataOffset + 3);
	ivec4 gradient2 = hb_gpu_fetch(dataOffset + 4);
	return evaluateColorLine(
		dataOffset + 5, // the color stops follow the gradient definition
		defaultColor,
		gradient1.x == 0 ? evaluateLinearGradient(gradient2, uv)
		: gradient1.x == 1 ? evaluateRadialGradient(gradient1, gradient2, uv)
		: gradient1.x == 2 ? evaluateSweepGradient(gradient1, gradient2, uv)
		: 0.5
	);
}
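The encoding side has to match readFixed exactly. Here is one possible CPU-side encoder, mirroring the 15 fractional bits the shader divides by; encodeFixed is a hypothetical name, and the exact split just has to agree between the CPU and the shader.

```cpp
#include <cmath>
#include <cstdint>

// Split a value into an integer part and a 15-bit fraction, matching what
// the shader's readFixed reconstructs from two integer channels.
void encodeFixed(float value, std::int32_t &integer, std::int32_t &fractional) {
	integer = static_cast<std::int32_t>(std::floor(value));
	fractional = static_cast<std::int32_t>(
		std::lround((value - float(integer)) * float(1 << 15))
	);
}

// CPU mirror of the GLSL readFixed above.
float readFixed(std::int32_t integer, std::int32_t fractional) {
	return float(integer) + float(fractional) / float(1 << 15);
}
```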

Finally, there are grouping commands. When push_group is received, we create a new color layer to apply drawing operations on; and when pop_group is received, we blend the color of the layer to that of the previous one using the given blending operation. If no groups are used, the default blending operation is normal alpha blending, so we can just emit each drawing operation as one quad. If they are used, however, blending has to be done in the fragment shader to handle different blending equations. To accomplish this, we can also encode into the texture buffer the sequence of drawing commands to be done, including pushing a group, filling a glyph, and popping a group. The fragment shader creates an array storing the layers' current colors, iterates over the command sequence and performs each command. When the last group is popped, the color of the last layer is the blending result.

in vec2 fragUv;
in vec4 fragColor;
flat in uint fragDataOffset;

out vec4 color;

vec4 getGradientColor(int dataOffset, vec4 defaultColor);
vec4 blend(vec4 source, vec4 destination, int mode);

vec4 getGroupColor() {
	int groupIndex = -1;
	vec4 colorStack[8];
	int cursor = int(fragDataOffset);
	for (int i = 0; i < 64; ++i) {
		ivec4 command = hb_gpu_fetch(cursor);
		++cursor;
		
		switch (command.x) {
			case 0: { // Fill a glyph with a solid color or gradient.
				// The fill color is either the foreground text color or an
				// RGBA value packed into two 16-bit integer channels.
				uint rg = uint(command.z) & 0xFFFFu;
				uint ba = uint(command.w) & 0xFFFFu;
				vec4 workingColor = command.y == 1 ? fragColor : vec4(
					(rg >> 8) / 255.,
					(rg & 0xFFu) / 255.,
					(ba >> 8) / 255.,
					(ba & 0xFFu) / 255.
				);
				
				// Fetch the texel offsets of the clip glyph's outline data
				// and of the gradient data (if any).
				ivec4 offsets = hb_gpu_fetch(cursor);
				++cursor;
				workingColor = getGradientColor(
					int(uint(offsets.z) << 16
					| uint(offsets.w)), workingColor
				);
				workingColor.a *= hb_gpu_render(
					fragUv, uint(offsets.x) << 16 | uint(offsets.y)
				);
				colorStack[groupIndex]
					= blend(colorStack[groupIndex], workingColor, 3);
				break;
			}
			case 1: { // Push a new, fully transparent layer.
				++groupIndex;
				colorStack[groupIndex] = vec4(0., 0., 0., 0.);
				break;
			}
			case 2: { // Pop the current layer.
				--groupIndex;
				// Blend the popped layer into the one below, using the
				// blend mode carried in command.y.
				if (groupIndex != -1) colorStack[groupIndex] = blend(
					colorStack[groupIndex],
					colorStack[groupIndex + 1],
					command.y
				);
				break;
			}
		}
		if (groupIndex == -1) break;
	}
	return colorStack[0];
}
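The control flow of getGroupColor can be modeled on the CPU to reason about it. This hypothetical interpreter uses the same push/fill/pop scheme with a simplified command encoding of my own; only two blend modes are modeled, with mode 3 being normal alpha blending as in the shader.

```cpp
#include <vector>

// op 0 fills the current layer, op 1 pushes a transparent layer, op 2 pops
// the layer and blends it into the one below.
struct Rgba { float r, g, b, a; };
struct Command { int op; Rgba color; float coverage; int blendMode; };

Rgba blend(Rgba dst, Rgba src, int mode) {
	if (mode == 1) // a multiply-style mode, as one example of group blending
		return {dst.r * src.r, dst.g * src.g, dst.b * src.b, dst.a * src.a};
	const float a = src.a + dst.a * (1.f - src.a); // mode 3: normal blending
	auto mix = [&](float s, float d) {
		return a == 0.f ? 0.f : (s * src.a + d * dst.a * (1.f - src.a)) / a;
	};
	return {mix(src.r, dst.r), mix(src.g, dst.g), mix(src.b, dst.b), a};
}

Rgba runCommands(const std::vector<Command> &commands) {
	std::vector<Rgba> stack;
	for (const Command &command : commands) {
		switch (command.op) {
			case 1: // push group: start a fully transparent layer
				stack.push_back({0.f, 0.f, 0.f, 0.f});
				break;
			case 2: { // pop group: blend the layer into the one below
				const Rgba top = stack.back();
				stack.pop_back();
				if (stack.empty()) return top; // outermost group popped
				stack.back() = blend(stack.back(), top, command.blendMode);
				break;
			}
			default: { // fill: coverage-weighted color into the current layer
				Rgba src = command.color;
				src.a *= command.coverage;
				stack.back() = blend(stack.back(), src, 3);
			}
		}
	}
	return stack.empty() ? Rgba{0.f, 0.f, 0.f, 0.f} : stack.back();
}
```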

And with that, we can now render colorful gradient emojis of Windows 11, for example, at any scale.

Sample gradient emojis from Windows 11's Segoe UI emoji font, along with their exploded views.

An assortment of emojis from Windows 11's Segoe UI emoji font.

It is worth noting that this approach can be applied to rasterized monochrome bitmaps and SDFs as well, but if you've gone to this length to implement data-driven blending in the fragment shader, I don't see much reason to not also use Slug.

And that concludes the outline of how you can implement support for displaying colorful, arbitrary-scale emojis using HarfBuzz's Slug algorithm implementation. I may make a library for this at some point, but I figure someone with this information might be able to do it sooner and/or integrate it into existing text rendering libraries. I look forward to seeing nice, crisp text and emojis in more 3D games thanks to this incredible technology that Eric has generously set free for the benefit of everyone.
