Halcyon Architecture - "Director's Cut"

During GDC 2018, SEED unveiled our real-time hybrid ray tracing experiment titled “PICA PICA”, and there have been a number of great presentations from colleagues about our work on this project, primarily from the standpoint of image quality.

In some of these talks, there were some brief references to our R&D framework titled “Halcyon”, but very few details were released about what Halcyon is, and how it is designed. We wanted to get more information out around Halcyon itself, so I did two presentations recently about the current rendering architecture and capabilities of Halcyon.

Rust Mesh Optimizer

When triangle meshes are rendered by a GPU, there are pipeline stages that need to load and process vertex and index data. The efficiency of this process will depend on the layout of the data, and how the GPU is designed. There is an excellent library from Arseny Kapoulkine called meshoptimizer, which provides a variety of algorithms for optimizing geometry for the GPU.

This particular library has started to gain quite a nice adoption, as previous algorithms were either standalone, rarely updated - if ever, or part of huge monstrosities like assimp. Even AMD Compressonator v3 includes support for meshoptimizer: Source

As with other data processing operations, I have been pushing towards using rust as my defacto systems language whenever possible. Unfortunately, meshoptimizer is only available as a C/C++ library.

Parsing Shader Includes

An important aspect of shader compilation is the ability to include arbitrary graphs of shader files. Typically, this is performed with a callback supplied to a shader compiler invoked with a relative or absolute file path, and the callback returns the contents of the specified file, sourced from a file system or virtual file system.

If a single shader entry point is being compiled, then the shader and all include dependencies will be evaluated, and compilation time for a single invocation is typically not a major concern. However, if rebuilding all shaders for a full game, the overall compilation time becomes a major concern, and eliminating redundant or unnecessary work is critical.

The common approach to eliminate redundant or unnecessary work is to generate an identity for each shader entry point, consistenting of a hash representing the file contents, preprocessor definitions, compiler flags, and a hash of the compiler binary itself.

However, the include graph referenced from each shader must also be represented by the identity, otherwise any changes to utility shader files would not cause a shader using the utility functions to pick up the changes and rebuild correctly.

Running Microsoft FXC in Docker

Microsoft DXC is the new shader compiler stack, but the FXC compiler is still the dominant HLSL compiler for a number of reasons:

  • Performance and correctness regressions of DXIL shaders compared to DXBC
  • Many cross compilers and custom toolchains still rely on DXBC
  • IHV drivers are still being adapted to consume DXIL, which is more low-level compared to DXBC
  • DXC is a complex codebase, as it is based on LLVM - difficult to build, and many components
  • DXIL is Direct3D 12 only, which makes it Windows 10 only

Therefore, it is still important to support shader compilation with FXC in some situations.

Signing DXIL Post-Compile

Microsoft’s DirectX shader compiler now compiles on Linux, and can generate both SPIR-V and DXIL from HLSL. However, in order to create shaders from DXIL in a running application without using development or experimental mode, the DXIL must be validated with the official dxil.dll binary releases from Microsoft.

The actual validation logic isn’t a secret, as it is public in the GitHub repository, but the official binaries are built at a specific revision so that hardware drivers have a well-known and deterministic set of rules to rely on.

When compilation is performed, there is a sneaky LoadLibrary call for dxil.dll, and if found, the shader byte code result will be officially signed. If the library is not found, then the shader byte code will be unsigned. In both situations, all validation rules are still evaluated.


© 2018. All rights reserved.