When triangle meshes are rendered by a GPU, there are pipeline stages that need to load and process vertex and index data. The efficiency of this process will depend on the layout of the data, and how the GPU is designed. There is an excellent library from Arseny Kapoulkine called meshoptimizer, which provides a variety of algorithms for optimizing geometry for the GPU.
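To make the effect of index ordering concrete, here is a minimal Rust sketch that estimates the average cache miss ratio (ACMR) of an index buffer, i.e. how many vertices must be transformed per triangle. It assumes a simple FIFO post-transform vertex cache, which is only a rough model of real hardware. The function name `acmr` and the `cache_size` parameter are hypothetical and not part of meshoptimizer's API.

```rust
use std::collections::VecDeque;

/// Estimates vertex transforms per triangle for an index buffer.
/// Assumption: the GPU behaves like a FIFO cache of `cache_size` vertices.
fn acmr(indices: &[u32], cache_size: usize) -> f32 {
    let mut cache: VecDeque<u32> = VecDeque::with_capacity(cache_size);
    let mut misses = 0usize;

    for &index in indices {
        if cache.contains(&index) {
            continue; // cache hit: the transformed vertex is reused
        }
        misses += 1;
        if cache_size > 0 {
            if cache.len() == cache_size {
                cache.pop_front(); // evict the oldest entry
            }
            cache.push_back(index);
        }
    }

    let triangles = indices.len() / 3;
    if triangles == 0 {
        0.0
    } else {
        misses as f32 / triangles as f32
    }
}
```

A lower value means more vertex reuse. Comparing the value before and after reordering an index buffer is one rough way to see whether an optimization pass helped under this simplified model.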
This library has seen healthy adoption, as earlier implementations were either standalone, rarely (if ever) updated, or part of huge monstrosities like assimp. Even AMD Compressonator v3 includes support for meshoptimizer.
As with other data processing operations, I have been pushing towards using Rust as my de facto systems language whenever possible. Unfortunately, meshoptimizer is only available as a C/C++ library.
An important aspect of shader compilation is the ability to include arbitrary graphs of shader files. Typically, this is performed with a callback supplied to a shader compiler invoked with a relative or absolute file path, and the callback returns the contents of the specified file, sourced from a file system or virtual file system.
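The callback pattern can be sketched in Rust as follows. This is a simplified illustration and not any particular compiler's API: `virtual_fs` and `collect_includes` are hypothetical names, only the quoted `#include "path"` form is recognized, and paths are treated as exact keys with no relative path resolution.

```rust
use std::collections::{BTreeSet, HashMap};

/// Builds a loader callback backed by an in-memory virtual file system.
fn virtual_fs(files: &[(&str, &str)]) -> impl Fn(&str) -> Option<String> {
    let map: HashMap<String, String> = files
        .iter()
        .map(|(path, contents)| (path.to_string(), contents.to_string()))
        .collect();
    move |path: &str| map.get(path).cloned()
}

/// Walks `#include "path"` directives starting at `entry`, asking the
/// `load` callback for each file, and returns every file that was reached.
fn collect_includes<F>(entry: &str, load: &F) -> Result<BTreeSet<String>, String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut visited = BTreeSet::new();
    let mut pending = vec![entry.to_string()];

    while let Some(path) = pending.pop() {
        if !visited.insert(path.clone()) {
            continue; // already processed; this also guards against cycles
        }
        let source = load(&path).ok_or_else(|| format!("missing include: {}", path))?;
        for line in source.lines() {
            if let Some(rest) = line.trim_start().strip_prefix("#include") {
                let quoted = rest
                    .trim()
                    .strip_prefix('"')
                    .and_then(|r| r.strip_suffix('"'));
                if let Some(name) = quoted {
                    pending.push(name.to_string());
                }
            }
        }
    }
    Ok(visited)
}
```

Because the loader is just a closure, the same traversal can be pointed at a real file system, an archive, or a test fixture without changing the walking logic.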
If a single shader entry point is being compiled, then the shader and all include dependencies will be evaluated, and compilation time for a single invocation is typically not a major concern. However, if rebuilding all shaders for a full game, the overall compilation time becomes a major concern, and eliminating redundant or unnecessary work is critical.
The common approach to eliminate redundant or unnecessary work is to generate an identity for each shader entry point, consisting of a hash representing the file contents, preprocessor definitions, compiler flags, and a hash of the compiler binary itself.
However, the include graph referenced from each shader must also be represented by the identity. Otherwise, changes to utility shader files would not cause a shader using those utility functions to pick up the changes and rebuild correctly.
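A minimal sketch of such an identity in Rust might look like the following. The `ShaderInputs` struct and `shader_identity` function are hypothetical names for illustration. The sketch uses the standard library's `DefaultHasher` for brevity; its algorithm is not guaranteed to stay the same across Rust releases, so a real pipeline with a persistent cache would more likely use a stable content hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Everything that can change the compiled output of one entry point.
struct ShaderInputs<'a> {
    entry_source: &'a str,
    /// (path, contents) for every file in the transitive include graph.
    include_sources: &'a [(&'a str, &'a str)],
    /// (name, value) preprocessor definitions.
    defines: &'a [(&'a str, &'a str)],
    flags: &'a [&'a str],
    /// Hash of the compiler binary, computed elsewhere.
    compiler_hash: u64,
}

fn shader_identity(inputs: &ShaderInputs<'_>) -> u64 {
    let mut hasher = DefaultHasher::new();
    inputs.entry_source.hash(&mut hasher);

    // Sort includes so the identity does not depend on traversal order.
    let mut includes = inputs.include_sources.to_vec();
    includes.sort();
    includes.hash(&mut hasher);

    // Sort defines for the same reason.
    let mut defines = inputs.defines.to_vec();
    defines.sort();
    defines.hash(&mut hasher);

    // Flags are hashed in the order given, since order may matter.
    inputs.flags.hash(&mut hasher);
    inputs.compiler_hash.hash(&mut hasher);
    hasher.finish()
}
```

With the include contents folded in, editing a shared utility file changes the identity of every shader that reaches it, while untouched shaders keep their identity and can be skipped.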
Microsoft’s DirectX shader compiler now compiles on Linux, and can generate both SPIR-V and DXIL from HLSL. However, in order to create shaders from DXIL in a running application without using development or experimental mode, the DXIL must be validated with the official dxil.dll binary releases from Microsoft.
The actual validation logic isn’t a secret, as it is public in the GitHub repository, but the official binaries are built at a specific revision so that hardware drivers have a well-known and deterministic set of rules to rely on.
When compilation is performed, there is a sneaky LoadLibrary call for dxil.dll, and if found, the shader byte code result will be officially signed. If the library is not found, then the shader byte code will be unsigned. In both situations, all validation rules are still evaluated.
Microsoft’s new DirectX shader compiler (DXC) is based on LLVM and Clang, which has traditionally been a cross-platform codebase, but DXC became Windows-centric through its use of COM, SAL, etc. to support DirectX 12 shaders. Recently, Google listened (thank you!) to a number of requests from me and others (here and here) to refactor the codebase to support Linux and macOS compilation.
I have been pursuing cloud-based shader compilation for a while, in order to scale out our very slow shader compilation pipelines and greatly improve developer iteration time. Running content pipelines on Windows-based virtual machines is not a feasible approach due to concerns of cost, maintainability, and robustness.
With Linux compilation support, it is now possible to run the DXC compiler within a Docker container, and scale it out in a Kubernetes cluster (like GKE).