Writing a C++20 Path Tracer From Scratch Without AI
A dependency-free renderer showcases the power of modern C++20, custom acceleration structures, and classic graphics engineering.
Writing a path tracer is a rite of passage for graphics programmers. Doing it in modern C++20 with absolutely zero third-party dependencies—and without leaning on AI assistants—is a masterclass in software craftsmanship.
That is exactly what developer themartiano delivered with Luz, a CPU-based Monte Carlo path tracer built entirely from the ground up. By eschewing external libraries for everything from vector math to image output, the project serves as a clean, highly readable blueprint for how modern C++ can handle complex mathematical simulations and heavy parallel workloads.
The Zero-Dependency Architecture
In modern software development, it is easy to fall into the trap of dependency bloat. A typical project might pull in dozens of libraries for windowing, image saving, math, and parsing. Luz takes the opposite path.
To keep the codebase entirely self-contained, the author implemented every core utility manually. This includes:
- Custom Math & Geometry: Vector operations, ray-object intersections, and probability density functions (PDFs) for importance sampling.
- File I/O: Custom parsers for .luzscene files and standard Wavefront OBJ meshes, alongside custom encoders for BMP and TIFF image outputs.
- Integration Tools: A custom exporter to convert .blend files from Blender directly into the native .luz format, bridging the gap between professional modeling tools and a custom engine.
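To give a sense of how little code a hand-rolled image encoder needs, here is a minimal sketch of an uncompressed 24-bit BMP writer that encodes into a byte buffer. It is not taken from Luz: the names encode_bmp24 and put_le, the in-memory output, and the 72 DPI resolution field are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append the low `bytes` bytes of v in little-endian order.
static void put_le(std::vector<std::uint8_t>& out, std::uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
}

// Encode tightly packed 8-bit RGB pixels (top row first) as an
// uncompressed 24-bit BMP held in memory.
std::vector<std::uint8_t> encode_bmp24(const std::vector<std::uint8_t>& rgb,
                                       std::uint32_t width,
                                       std::uint32_t height) {
    assert(rgb.size() == static_cast<std::size_t>(width) * height * 3);
    const std::uint32_t row_bytes = (width * 3 + 3) & ~3u;  // rows pad to 4 bytes
    const std::uint32_t data_size = row_bytes * height;
    const std::uint32_t header_size = 14 + 40;

    std::vector<std::uint8_t> out;
    out.reserve(header_size + data_size);

    // BITMAPFILEHEADER (14 bytes)
    out.push_back('B');
    out.push_back('M');
    put_le(out, header_size + data_size, 4);  // total file size
    put_le(out, 0, 4);                        // reserved
    put_le(out, header_size, 4);              // offset to pixel data

    // BITMAPINFOHEADER (40 bytes)
    put_le(out, 40, 4);         // size of this header
    put_le(out, width, 4);
    put_le(out, height, 4);     // positive height means bottom-up rows
    put_le(out, 1, 2);          // color planes
    put_le(out, 24, 2);         // bits per pixel
    put_le(out, 0, 4);          // BI_RGB: no compression
    put_le(out, data_size, 4);
    put_le(out, 2835, 4);       // about 72 DPI, in pixels per metre (assumed)
    put_le(out, 2835, 4);
    put_le(out, 0, 4);          // palette colors used
    put_le(out, 0, 4);          // important colors

    // Pixel data: bottom row first, each pixel stored as B, G, R.
    for (std::uint32_t y = 0; y < height; ++y) {
        const std::uint32_t src_row = height - 1 - y;
        for (std::uint32_t x = 0; x < width; ++x) {
            const std::size_t i =
                (static_cast<std::size_t>(src_row) * width + x) * 3;
            out.push_back(rgb[i + 2]);
            out.push_back(rgb[i + 1]);
            out.push_back(rgb[i]);
        }
        for (std::uint32_t pad = width * 3; pad < row_bytes; ++pad)
            out.push_back(0);
    }
    return out;
}
```

Writing the returned buffer to a file opened in binary mode yields an image that standard viewers can open. The only subtleties are the little-endian fields, the bottom-up row order, and the padding of each row to a multiple of four bytes.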
The project targets macOS, Linux, and Windows (via MSVC or MinGW), and supports building with either a standard Makefile or CMake. The only optional dependency is Python, which is used solely for helper scripts and tooling.
Acceleration and Sampling Under the Hood
Path tracing is computationally brutal. Without optimization, casting millions of rays into a scene with complex geometry will quickly grind any CPU to a halt. To achieve interactive rendering speeds, Luz implements several classic and advanced graphics algorithms.
Binned SAH BVH Acceleration
To handle complex OBJ meshes, Luz uses a Bounding Volume Hierarchy (BVH). Specifically, it implements packed mesh BVHs constructed using a binned Surface Area Heuristic (SAH) and traversed using a near-first approach. Binning speeds up BVH construction by grouping primitives into a fixed number of spatial bins and evaluating the SAH cost only at the bin boundaries, which approximates the optimal split plane without testing every primitive position. Near-first traversal visits the child node closer to the ray origin first, so a hit found early can be used to skip nodes that lie farther away.
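The binning step can be sketched as follows. This is an illustrative version under stated assumptions, not Luz's code: the Aabb and Split types, the find_binned_sah_split function, and the bin count of 8 are hypothetical, and the cost is a relative one (primitive count times half surface area per side) that leaves out traversal cost and the comparison against making a leaf.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <limits>
#include <vector>

constexpr float kInf = std::numeric_limits<float>::infinity();
constexpr int kBins = 8;  // hypothetical bin count

struct Aabb {
    float lo[3] = { kInf,  kInf,  kInf};
    float hi[3] = {-kInf, -kInf, -kInf};

    void grow(const Aabb& b) {
        for (int a = 0; a < 3; ++a) {
            lo[a] = std::min(lo[a], b.lo[a]);
            hi[a] = std::max(hi[a], b.hi[a]);
        }
    }
    // Half the surface area; an empty box contributes zero.
    float half_area() const {
        if (hi[0] < lo[0]) return 0.0f;
        const float dx = hi[0] - lo[0], dy = hi[1] - lo[1], dz = hi[2] - lo[2];
        return dx * dy + dy * dz + dz * dx;
    }
    float centroid(int axis) const { return 0.5f * (lo[axis] + hi[axis]); }
};

// Primitives whose bin index is <= bin go to the left child.
struct Split { int axis = -1; int bin = -1; float cost = kInf; };

Split find_binned_sah_split(const std::vector<Aabb>& prims) {
    Split best;
    for (int axis = 0; axis < 3; ++axis) {
        // Bounds of the primitive centroids along this axis.
        float cmin = kInf, cmax = -kInf;
        for (const Aabb& p : prims) {
            cmin = std::min(cmin, p.centroid(axis));
            cmax = std::max(cmax, p.centroid(axis));
        }
        const float extent = cmax - cmin;
        if (!(extent > 0.0f)) continue;  // nothing to separate on this axis

        // Drop every primitive into the bin that contains its centroid.
        std::array<Aabb, kBins> box;
        std::array<int, kBins> count{};
        for (const Aabb& p : prims) {
            const int b = std::min(
                kBins - 1,
                static_cast<int>(kBins * (p.centroid(axis) - cmin) / extent));
            assert(b >= 0);
            box[b].grow(p);
            ++count[b];
        }

        // Sweep right to left to cache the cost of each right-hand side.
        std::array<float, kBins - 1> right_cost{};
        Aabb acc;
        int n = 0;
        for (int i = kBins - 1; i > 0; --i) {
            acc.grow(box[i]);
            n += count[i];
            right_cost[i - 1] = n * acc.half_area();
        }

        // Sweep left to right and keep the cheapest bin boundary.
        acc = Aabb{};
        n = 0;
        for (int i = 0; i < kBins - 1; ++i) {
            acc.grow(box[i]);
            n += count[i];
            const float cost = n * acc.half_area() + right_cost[i];
            if (cost < best.cost) best = Split{axis, i, cost};
        }
    }
    return best;  // axis stays -1 if no axis could be split
}
```

Because only kBins - 1 candidate planes are evaluated per axis, the work per node is linear in the number of primitives rather than requiring a sort. A full builder would partition the primitives around the chosen boundary and recurse on both halves.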
Intelligent Adaptive Sampling
Instead of throwing a uniform number of samples at every pixel, Luz features an adaptive sampling engine. The user sets a maximum sample limit, and the renderer processes each pixel progressively. After a minimum threshold of samples is met, the engine periodically checks the luminance and RGB confidence intervals of the pixel.
If the pixel has converged (meaning the noise is below a specified threshold), rendering stops early for that coordinate. To prevent "fireflies" or missed light paths, very dark pixels use a conservative minimum sample count before they are allowed to stop, ensuring rare light contributions are not mistaken for converged black.
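A convergence test of this kind might look like the sketch below. It tracks luminance only (the article says Luz also checks RGB intervals) and uses a 95% confidence interval built from a running mean and variance. The type names, the has_converged function, and every threshold value are assumptions for illustration, not Luz's actual settings.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Running mean and variance of one pixel's luminance (Welford's method).
struct PixelStats {
    int n = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the mean

    void add(double luminance) {
        ++n;
        const double d = luminance - mean;
        mean += d / n;
        m2 += d * (luminance - mean);
    }
    double variance() const { return n > 1 ? m2 / (n - 1) : 0.0; }
};

// Hypothetical tuning knobs; the real renderer's values are not known here.
struct AdaptiveConfig {
    int min_samples = 16;          // checked before any early stop
    int dark_min_samples = 64;     // stricter floor for very dark pixels
    double dark_threshold = 0.01;  // mean luminance below this counts as dark
    double rel_tolerance = 0.05;   // allowed CI half-width relative to the mean
};

bool has_converged(const PixelStats& s, const AdaptiveConfig& cfg) {
    assert(cfg.min_samples >= 2);
    const bool dark = s.mean < cfg.dark_threshold;
    const int required = dark ? cfg.dark_min_samples : cfg.min_samples;
    if (s.n < required) return false;

    // Half-width of an approximate 95% confidence interval on the mean.
    const double half_width = 1.96 * std::sqrt(s.variance() / s.n);
    // Clamp the reference level so near-black pixels do not need zero error.
    const double scale = std::max(s.mean, cfg.dark_threshold);
    return half_width <= cfg.rel_tolerance * scale;
}
```

In a render loop, the per-pixel sampler would call add after each sample and poll has_converged every few samples, stopping at the user's maximum sample count if the test never passes.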
Materials, Mediums, and Post-Processing
Despite having zero external dependencies, Luz supports a surprisingly rich feature set that goes well beyond what a typical toy renderer offers:
- Materials & Lights: Support for Lambertian, metal, dielectric (glass), emissive, and isotropic materials. Light sources include area, point, sphere, and directional lights.
- Volumetrics & Atmosphere: Isotropic materials allow for rendering participating media (like fog or smoke), complemented by an atmospheric simulation that models Rayleigh and Mie scattering.
- Post-Processing Pipeline: Once the raw rays are traced, Luz applies a built-in post-processing stack. This includes depth of field, antialiasing, exposure compensation, contrast adjustment, tone mapping, gamma correction, and bloom.
- Denoising: To clean up Monte Carlo noise without waiting hours for renders to converge, Luz includes an integrated Non-Local Means (NFOR-style) denoiser that can output a clean companion image alongside the raw render.
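The last stages of the post-processing stack (exposure, tone mapping, gamma) reduce to a few lines per color channel. The sketch below assumes a Reinhard tone-mapping operator and a plain power-law gamma of 2.2. The article does not say which operators Luz uses, so both choices, along with the to_display name and its parameters, are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert one linear radiance channel to an 8-bit display value.
std::uint8_t to_display(float linear,
                        float exposure_stops = 0.0f,
                        float gamma = 2.2f) {
    assert(gamma > 0.0f);
    // Exposure compensation: each stop doubles or halves the signal.
    const float exposed = std::max(linear, 0.0f) * std::exp2(exposure_stops);
    // Reinhard operator (assumed): compresses [0, inf) into [0, 1).
    const float mapped = exposed / (1.0f + exposed);
    // Gamma encoding for display.
    const float encoded = std::pow(mapped, 1.0f / gamma);
    return static_cast<std::uint8_t>(
        std::lround(std::clamp(encoded, 0.0f, 1.0f) * 255.0f));
}
```

Applying the function to each of the R, G, and B channels of a pixel produces bytes ready for an 8-bit image encoder. The order matters: exposure and tone mapping operate on linear values, and gamma encoding comes last.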
Squeezing Performance Out of the CPU
Because Luz is a multithreaded CPU renderer, squeezing every drop of performance out of the hardware is critical. The build system is designed to compile highly optimized binaries by default, utilizing aggressive compiler flags:
- -O3: Enables high-level compiler optimizations.
- -march=native: Instructs the compiler to generate instructions specific to the host CPU (utilizing modern vector instruction sets like AVX if available).
- -flto: Enables Link-Time Optimization (or interprocedural optimization in CMake) to optimize across translation units.
- Fast Math: Enables fast floating-point modes where supported by the compiler and platform.
While these flags yield massive performance gains, they can occasionally cause illegal-instruction crashes on older CPUs or trigger toolchain-specific linker bugs. To address this, the build system allows developers to easily disable native tuning and LTO via command-line overrides (e.g., make NATIVE=0 LTO=0).
To help developers measure the impact of code changes, Luz includes a deterministic benchmarking harness. Running make benchmark generates detailed CSV reports breaking down performance across rendering, denoising, and post-processing, allowing for precise before-and-after comparisons during optimization passes.
Lenn writes about cloud platforms, Kubernetes internals, and the infrastructure decisions that quietly make or break engineering organizations. Based in Berlin's vibrant tech scene, they have a talent for turning dense platform-engineering topics into prose that people actually finish reading.