During GDC 2018, SEED unveiled our real-time hybrid ray tracing experiment titled “PICA PICA”, and there have been a number of great presentations from colleagues about our work on this project, primarily from the standpoint of image quality.
In some of these talks, there were some brief references to our R&D framework titled “Halcyon”, but very few details were released about what Halcyon is, and how it is designed. We wanted to get more information out around Halcyon itself, so I did two presentations recently about the current rendering architecture and capabilities of Halcyon.
When triangle meshes are rendered by a GPU, there are pipeline stages that need to load and process vertex and index data. The efficiency of this process will depend on the layout of the data, and how the GPU is designed. There is an excellent library from Arseny Kapoulkine called meshoptimizer, which provides a variety of algorithms for optimizing geometry for the GPU.
This particular library has started to gain quite a nice adoption, as previous algorithms were either standalone, rarely updated - if ever, or part of huge monstrosities like assimp. Even AMD Compressonator v3 includes support for meshoptimizer:
As with other data processing operations, I have been pushing towards using rust as my defacto systems language whenever possible. Unfortunately, meshoptimizer is only available as a C/C++ library.
An important aspect of shader compilation is the ability to include arbitrary graphs of shader files. Typically, this is performed with a callback supplied to a shader compiler invoked with a relative or absolute file path, and the callback returns the contents of the specified file, sourced from a file system or virtual file system.
If a single shader entry point is being compiled, then the shader and all include dependencies will be evaluated, and compilation time for a single invocation is typically not a major concern. However, if rebuilding all shaders for a full game, the overall compilation time becomes a major concern, and eliminating redundant or unnecessary work is critical.
The common approach to eliminate redundant or unnecessary work is to generate an identity for each shader entry point, consistenting of a hash representing the file contents, preprocessor definitions, compiler flags, and a hash of the compiler binary itself.
However, the include graph referenced from each shader must also be represented by the identity, otherwise any changes to utility shader files would not cause a shader using the utility functions to pick up the changes and rebuild correctly.
Microsoft’s DirectX shader compiler now compiles on Linux, and can generate both SPIR-V and DXIL from HLSL. However, in order to create shaders from DXIL in a running application without using development or experimental mode, the DXIL must be validated with the official dxil.dll binary releases from Microsoft.
The actual validation logic isn’t a secret, as it is public in the GitHub repository, but the official binaries are built at a specific revision so that hardware drivers have a well-known and deterministic set of rules to rely on.
When compilation is performed, there is a sneakyLoadLibrary call for dxil.dll, and if found, the shader byte code result will be officially signed. If the library is not found, then the shader byte code will be unsigned. In both situations, all validation rules are still evaluated.