TL;DR
- Our servers have no GPU. WebGL still has to render somewhere.
- Chrome's default software path (SwiftShader) took ~24s per 3D page.
- Pointing ANGLE at Mesa llvmpipe (
--use-angle=gl) dropped it to ~6s. - The one-line flag is the easy part. The display, the from-source Mesa, and proving it stays on the fast path are the rest of the story.
Our browser fleet runs on commodity Linux nodes with no graphics card and no /dev/dri. Cheaper, simpler, fewer drivers to babysit. But WebGL is a GPU API, so without one, something has to emulate it on the CPU. Which something was the difference between a 24-second screenshot and a 6-second one.
SwiftShader emulates the whole pipeline conservatively, optimizing for "draws correctly anywhere." A heavy 3D scene takes ~24s; the 2D pages next to it, 2-3s.
llvmpipe is built differently:
- It JITs to native code. LLVM compiles the live shader and GL state into real x86-64. No interpreter loop.
- It is tiled and multi-threaded. It actually uses every core.
Several times faster, same output.
- '--use-angle=swiftshader',
+ '--use-angle=gl',What you must not add back:
--disable-gpusilently forces SwiftShader on again. It is the most-copied flag in every headless tutorial.--in-process-gpukills the GL surface ANGLE needs.
--use-angle=gl has to bind a GL surface, which needs an X display, even headless. No display, and WebGL silently degrades to a flat 2D fallback: the screenshot still succeeds, the request still returns 200, and the output is wrong but plausible.
LIBGL_ALWAYS_SOFTWARE=1 pinning Mesa to llvmpipe.Ubuntu jammy's Mesa is too old for this, and the PPAs that used to backport it are gone. So the base image compiles its own:
meson setup build \
-Dbuildtype=release -Dgallium-drivers=llvmpipe -Dvulkan-drivers= \
-Dllvm=enabled -Dshared-llvm=enabledllvmpipe only, no Vulkan, shared LLVM (that's where the JIT speed lives). The build toolchain is huge (LLVM, clang, Rust, ~160 -dev packages), so the Dockerfile is multi-stage: compile Mesa, then COPY only the artifacts into a clean image. 2.65GB instead of 4.5GB.
apt list lies (we side-load Mesa over the package), and the real answer lives inside the page. So asks the live GL context directly:const browserless = require('browserless')
const report = await browserless.report()
console.log(report)browserless.report() from a production node. Expand gpu and cpu for the full picture.
The gpu block is the whole story:
typeissoftware/llvmpipehere.swiftshaderwould mean we fell back;hardwarewould mean a GPU appeared.mesais read from the loadedlibgallium-<ver>.so, not dpkg, which reports the stale package version under our side-load.simdWidth: 256means llvmpipe is using AVX2, which is most of why it's fast.
report({ benchmark: true }) adds a deterministic shader benchmark (~300ms on llvmpipe) for comparing nodes.
The flat-2D fallback is dangerous because it looks like success. So CI asserts on report(): gpu.type must be software, gpu.device must be llvmpipe. Any drift fails the build instead of shipping flat 3D. The same call runs against production pods.
The diff is one line. Proving it was the right line took weeks of measurement, mostly fighting two traps:
- Dev machines lie. A real GPU renders pages that come back black on prod. Every number had to come from prod-shaped hardware.
- Single runs lie. Cold JIT, first-paint races, shared cores. The fastest-looking result was sometimes the wrong one: the flat fallback shipped ~1s quicker.
That's why the deterministic benchmark exists: fixed shader, forced frames, one stable number. With it, the comparison stopped being anecdotal. SwiftShader landed at ~24-31s; llvmpipe at ~6s warm and correct.
Same 3D chart, same GPU-less hardware, measured on production:
| SwiftShader (before) | Mesa llvmpipe (after) | |
|---|---|---|
| Render time (isolated) | ~24s | ~6s (~4×) |
| Render time (under load) | ~24s | 7–14s (~2×) |
| Failed requests | timed out → errors | none |
| Active renderer | SwiftShader | llvmpipe (asserted in CI) |
Isolated, the chart finishes in ~6s. Under real traffic, where captures share cores, expect ~2×. Either way, the requests that used to time out now finish.
The following examples show how to use the Microlink API with CLI, cURL, JavaScript, Python, Ruby, PHP & Golang, targeting 'https://threejs.org/examples/webgl_animation_skinning_blending' URL with 'screenshot' API parameter:
CLI Microlink API example
microlink https://threejs.org/examples/webgl_animation_skinning_blending&screenshot.animatedcURL Microlink API example
curl -G "https://api.microlink.io" \
-d "url=https://threejs.org/examples/webgl_animation_skinning_blending" \
-d "screenshot.animated=true"JavaScript Microlink API example
import mql from '@microlink/mql'
const { data } = await mql('https://threejs.org/examples/webgl_animation_skinning_blending', {
screenshot: {
animated: true
}
})Python Microlink API example
import requests
url = "https://api.microlink.io/"
querystring = {
"url": "https://threejs.org/examples/webgl_animation_skinning_blending",
"screenshot.animated": "true"
}
response = requests.get(url, params=querystring)
print(response.json())Ruby Microlink API example
require 'uri'
require 'net/http'
base_url = "https://api.microlink.io/"
params = {
url: "https://threejs.org/examples/webgl_animation_skinning_blending",
screenshot.animated: "true"
}
uri = URI(base_url)
uri.query = URI.encode_www_form(params)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Get.new(uri)
response = http.request(request)
puts response.bodyPHP Microlink API example
<?php
$baseUrl = "https://api.microlink.io/";
$params = [
"url" => "https://threejs.org/examples/webgl_animation_skinning_blending",
"screenshot.animated" => "true"
];
$query = http_build_query($params);
$url = $baseUrl . '?' . $query;
$curl = curl_init();
curl_setopt_array($curl, [
CURLOPT_URL => $url,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => "",
CURLOPT_MAXREDIRS => 10,
CURLOPT_TIMEOUT => 30,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_CUSTOMREQUEST => "GET"
]);
$response = curl_exec($curl);
$err = curl_error($curl);
curl_close($curl);
if ($err) {
echo "cURL Error #: " . $err;
} else {
echo $response;
}Golang Microlink API example
package main
import (
"fmt"
"net/http"
"net/url"
"io"
)
func main() {
baseURL := "https://api.microlink.io"
u, err := url.Parse(baseURL)
if err != nil {
panic(err)
}
q := u.Query()
q.Set("url", "https://threejs.org/examples/webgl_animation_skinning_blending")
q.Set("screenshot.animated", "true")
u.RawQuery = q.Encode()
req, err := http.NewRequest("GET", u.String(), nil)
if err != nil {
panic(err)
}
client := &http.Client{}
resp, err := client.Do(req)
if err != nil {
panic(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
panic(err)
}
fmt.Println(string(body))
}import mql from '@microlink/mql'
const { data } = await mql('https://threejs.org/examples/webgl_animation_skinning_blending', {
screenshot: {
animated: true
}
})Software GL closes most of the gap, not all. Heavy fragment-shader heroes can still come back black, because the canvas hasn't painted by capture time. That's a first-paint race, not a renderer problem, and no flag fixes it. The real fixes are gating capture on first paint, or real GPUs. We're doing the first.
For everything else, moving from SwiftShader to llvmpipe turned our slowest, flakiest requests into ordinary ones.