Understanding Volume Shader Performance: A Deep Dive
Volume shader benchmarking is about measuring the real rendering work your GPU performs while ray marching a 3D field. In this deep dive we explain how our WebGL volume shader benchmark renders the Mandelbulb fractal, what FPS and frame time actually tell you, and how presets, parameters, and shareable links help produce fair, repeatable results.
What is a volume shader benchmark?
A volume shader benchmark pushes the GPU by evaluating a volumetric function for every pixel on the screen. Instead of rasterizing triangles, the shader ray marches: it steps along a view ray and evaluates a function to determine whether the ray intersects a surface. In our case, the function defines a Mandelbulb fractal, which is an ideal stress test because it mixes transcendental math, iterative loops, and branching: exactly the ingredients that expose raw compute throughput, special-function performance, and branch-divergence handling on real hardware.
Our implementation runs entirely in a WebGL fragment shader to keep the pipeline simple and comparable across devices. Each frame computes the same orbiting camera path, so you can trust that differences in frame rate reflect GPU performance, driver behavior, or the parameters you choose—not scene variability.
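The fixed orbiting camera can be expressed as a pure function of elapsed time, so every run traces the same path. This is an illustrative JavaScript sketch, not the benchmark's actual code; the radius, height, and period values are invented:

```javascript
// Hypothetical deterministic orbit camera: position depends only on
// elapsed time, so two runs at the same timestamps see the same scene.
function orbitCamera(tSeconds, radius = 3.0, height = 1.2, period = 12.0) {
  const angle = (2 * Math.PI * tSeconds) / period; // one full orbit per `period` seconds
  return {
    position: [radius * Math.cos(angle), height, radius * Math.sin(angle)],
    target: [0, 0, 0], // always look at the fractal's origin
  };
}
```

Because the path is time-parameterized rather than frame-counted, a slow device and a fast device render the same viewpoints; only the number of frames in between differs.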
Architecture considerations and compiler effects
A volume shader benchmark magnifies architecture differences. GPUs with higher FP32 throughput and efficient transcendental units tend to excel at Mandelbulb math. Wider wavefronts or warps can hide latency as long as divergence stays modest; heavy branching inside the kernel, however, leaves SIMD lanes idle and erodes that latency hiding. If you compare a mobile GPU to a desktop GPU at the same preset, expect the desktop to win on both compute and memory bandwidth, and to sustain higher clocks for longer.
The driver compiler also matters. Seemingly small GLSL changes can shift register pressure or instruction selection. While our shader is intentionally stable and portable, two browsers may JIT slightly different code paths. That is another reason we recommend using the same browser for A/B comparisons and focusing on relative differences across presets rather than a single absolute FPS value.
A simple, repeatable methodology
To build a trustworthy dataset, pick one preset per class of device—for example, Ultra Low for tablets, Balanced for gaming laptops, and High for desktops. Warm up for one minute, then record metrics over another minute. Export the results after each run, and label your CSVs with device, browser, driver, and date. Finally, publish a share link for each run so colleagues can reproduce your settings and verify outcomes. This lightweight process yields clean, comparable data without custom tooling.
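The warm-up-then-record protocol reduces to discarding an initial window of samples and aggregating the rest. A minimal sketch, assuming frame times are collected in milliseconds (the function name and aggregation choices here are our own, not the benchmark's internals):

```javascript
// Hypothetical run summary following the warm-up protocol: drop the first
// `warmupFrames` samples (turbo transients, shader compilation), then
// aggregate the measured window into the metrics the article discusses.
function summarizeRun(frameTimesMs, warmupFrames) {
  const measured = frameTimesMs.slice(warmupFrames);
  const avgMs = measured.reduce((a, b) => a + b, 0) / measured.length;
  return {
    avgFps: 1000 / avgMs,
    avgFrameTimeMs: avgMs,
    minFps: 1000 / Math.max(...measured), // slowest frame bounds min FPS
    maxFps: 1000 / Math.min(...measured),
  };
}
```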
How ray marching the Mandelbulb works
The Mandelbulb is evaluated by repeatedly transforming a 3D point using power, trigonometric, and inverse trigonometric operations. During ray marching, we start at a camera origin and step forward along the ray, evaluating the fractal function to detect sign changes and refine the hit. This approach is straightforward, visually rich, and GPU intensive. Key controls include:
- Kernel Iterations: how many times the fractal kernel iterates per evaluation. Higher values add detail but increase cost.
- Step Size: distance between samples along the ray. Smaller steps improve quality at the cost of more samples per pixel.
- Resolution Scale: renders at a scaled resolution (e.g., 0.6× to 1.6×) to tune work per frame across devices.
Because this is all in the fragment stage, performance correlates strongly with pixel count (resolution) and per-pixel work (iteration count and step size). The benchmark exposes these levers so you can profile low-end integrated GPUs and high-end discrete GPUs with the same code path.
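The march-and-refine loop described above can be sketched in JavaScript rather than GLSL, with a unit sphere standing in for the Mandelbulb; every name and constant here is illustrative, not our shader's actual code:

```javascript
// Illustrative ray march (the real work happens in a GLSL fragment shader).
// `field` returns a signed value; a sign change between consecutive samples
// brackets the surface, which is then refined by bisection.
const field = (p) => Math.hypot(p[0], p[1], p[2]) - 1.0; // unit sphere stand-in

function rayMarch(origin, dir, stepSize = 0.05, maxDist = 10) {
  let t = 0;
  let prev = field(origin);
  while (t < maxDist) {
    t += stepSize; // smaller stepSize = more samples per pixel = more cost
    const at = (d) => [origin[0] + dir[0] * d, origin[1] + dir[1] * d, origin[2] + dir[2] * d];
    const cur = field(at(t));
    if (prev > 0 && cur <= 0) {
      // Sign change detected: refine the hit with a few bisection steps.
      let lo = t - stepSize, hi = t;
      for (let i = 0; i < 8; i++) {
        const mid = 0.5 * (lo + hi);
        if (field(at(mid)) > 0) lo = mid; else hi = mid;
      }
      return 0.5 * (lo + hi); // distance along the ray to the surface
    }
    prev = cur;
  }
  return -1; // ray left the volume without hitting anything
}
```

The cost model is visible in the loop: the step size controls how many times `field` runs per ray, and in the real shader each `field` call additionally loops for the configured kernel iterations.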
Interpreting FPS and frame time
We report two primary metrics: FPS (frames per second) and frame time in milliseconds. Average FPS is easy to read—higher is better—but frame time is more sensitive. 16.67 ms corresponds to 60 FPS; 33.33 ms to 30 FPS. If your frame time spikes, you will feel stutter even if average FPS looks fine. We also show recent min/max FPS to indicate stability over a short window.
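The two metrics are reciprocals, and a tiny worked example shows why frame time exposes stutter that an average hides (the helper names here are our own):

```javascript
// FPS and frame time are reciprocals of each other.
const fpsToMs = (fps) => 1000 / fps; // 60 FPS -> 16.67 ms
const msToFps = (ms) => 1000 / ms;   // 33.33 ms -> ~30 FPS

// Two runs with identical average frame time, but the second one has a
// 34 ms frame you would feel as a hitch even though the average looks fine.
const smooth = [16, 16, 16, 16];
const spiky = [10, 10, 10, 34];
const avg = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
```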
For apples-to-apples comparisons, keep parameters stable. Two runs with different resolution scales, step sizes, or iteration counts are not directly comparable—one changes pixel count, the other changes the cost per pixel. Use the shareable link to embed your settings and let others re-run with the same configuration.
Presets, exportable results, and shareable links
To streamline testing, we include presets ranging from Ultra Low to Very High. A preset sets a trio of values—kernel iterations, step size, and resolution scale—so you can quickly gauge where your hardware sits. If a preset is too heavy, step down; if it is trivial for your GPU, step up until frame time becomes meaningful.
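Conceptually, a preset is just a named trio of values. As a hypothetical sketch (the names and numbers below are illustrative, not the benchmark's actual presets):

```javascript
// Hypothetical preset table: each entry fixes the same three knobs the
// article describes. Values are invented for illustration.
const PRESETS = {
  ultraLow: { iterations: 4, stepSize: 0.08, resolutionScale: 0.6 },
  balanced: { iterations: 8, stepSize: 0.04, resolutionScale: 1.0 },
  veryHigh: { iterations: 16, stepSize: 0.02, resolutionScale: 1.6 },
};

function applyPreset(name) {
  const p = PRESETS[name];
  if (!p) throw new Error(`unknown preset: ${name}`);
  return { ...p }; // copy, so tweaking a run never mutates the table
}
```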
When you finish a run, click Export Result to download a CSV snapshot of your current metrics (average FPS, frame time, min/max, and GPU model). Use it to track tuning changes over time or compare machines in a lab. If you want others to reproduce your exact run, use the Share section: it generates a URL with your parameters embedded. Anyone opening the link will auto-apply the same settings, creating a fair comparison baseline.
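Parameter-embedding links of this kind can be built with the standard `URL` and `URLSearchParams` APIs; this sketch assumes hypothetical query keys (`iterations`, `stepSize`, `resolutionScale`) that may not match the benchmark's real ones:

```javascript
// Hypothetical share-link round trip: serialize the run parameters into a
// query string, and parse them back when someone opens the link.
function buildShareUrl(base, params) {
  const qs = new URLSearchParams(
    Object.entries(params).map(([k, v]) => [k, String(v)])
  );
  return `${base}?${qs.toString()}`;
}

function parseShareUrl(url) {
  const qs = new URL(url).searchParams;
  return {
    iterations: Number(qs.get('iterations')),
    stepSize: Number(qs.get('stepSize')),
    resolutionScale: Number(qs.get('resolutionScale')),
  };
}
```

Because the settings live entirely in the URL, no server state is needed for someone else to reproduce the run.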
Practical tips for reliable comparisons
- Close background apps and overlays to reduce noise.
- Keep drivers current; GPU backends differ across browsers and OSes.
- Use the same browser across tests when possible.
- Warm up the GPU for a minute before recording results to avoid turbo transients.
- Record both FPS and frame time; the latter often reveals stutter faster.
The benchmark intentionally performs real shader work without vendor-specific extensions. That helps ensure your results are portable and meaningful across hardware generations and vendors.
Why WebGL for a volume shader benchmark?
WebGL is available in every major desktop and mobile browser, which makes a browser-based volume shader benchmark both accessible and reproducible. Our renderer targets WebGL 1 for maximum compatibility and uses careful shader code to avoid features that older devices lack. Despite its age, WebGL 1 can still push a surprising amount of work, and that helps highlight differences in GPU compute throughput and compiler quality.
On modern hardware you will see clear scaling with resolution and iteration count. That scaling is the point: it allows you to dial in a workload that sits around 10–25 ms per frame on your device, which is an informative range for everyday interactive workloads.
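Dialing a device into that band can even be automated with a simple feedback rule on the resolution scale, since per-frame cost grows roughly with pixel count. A hypothetical sketch (thresholds, clamp limits, and step factors are our own, with the clamps matching the 0.6×–1.6× range mentioned earlier):

```javascript
// Hypothetical frame-time controller: nudge the resolution scale until
// measured frame times land inside the target [lowMs, highMs] band.
function tuneResolutionScale(scale, frameTimeMs, lowMs = 10, highMs = 25) {
  if (frameTimeMs < lowMs) return Math.min(1.6, scale * 1.1); // too easy: add pixels
  if (frameTimeMs > highMs) return Math.max(0.6, scale * 0.9); // too heavy: shed pixels
  return scale; // already in the informative range
}
```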
Key takeaways
- Use presets to establish a baseline, then tweak iterations, step size, and resolution to target a useful frame time.
- Export results to CSV to track progress and changes over time.
- Share parameterized links so others can reproduce your exact run and compare GPUs fairly.
- Watch frame time as closely as FPS—stability matters as much as average throughput.
With a consistent workload and transparent parameters, a volume shader benchmark reveals how your GPU handles real math‑heavy fragment shading. For a step‑by‑step process, see our reproducible benchmarking guide.