Using the DirectXShaderCompiler C++ API

04 Mar 2020 - Reading time: 18 mins

The old fxc.exe compiles to DXBC and only supports up to Shader Model 5.1. Microsoft has since introduced a new LLVM-based compiler, DirectXShaderCompiler (DXC), which compiles to DXIL instead. It’s completely open-source on GitHub. It provides both command line tools and a C++ API for compiling, validating and using shaders with SM 6.0 and up.

FXC has existed for quite a long time now and there are guides for it everywhere. DXC, however, is quite a bit younger and I didn’t find much about it. Its documentation is quite minimal and incomplete, which is surprising considering D3D is generally really well documented. Using DXC’s command line tools is quite straightforward and self-explanatory, but I’ve always liked to compile my shaders at runtime because it makes it easier to recompile on the fly. Unfortunately, I didn’t find much documentation on the C++ API.

I’ve found a few articles here and there about the very basic setup, but most of them were very minimal and didn’t really cover things like compile arguments, defines, include handling, validation, reflection, debug data, … which are quite important to know about.

EDIT (20/03/2020): DirectX Developer Day has happened, and one of the DXC developers gave a really good talk with a great walkthrough of the interface. I’d definitely recommend watching it. The general usage is the same, but with the latest few updates the API has become a bit easier to use and manipulate.

The main update is the introduction of IDxcCompiler3, which has new interfaces that are streamlined with the CLI. Secondly, it’s now a lot clearer what “parts” the compiler outputs, and it’s easier to isolate certain parts like reflection and PDBs and possibly strip them to be processed later.

Getting Started

At the time of writing, there is not yet an official release with the new DXC updates. However, the latest build binaries can be downloaded from the repo’s AppVeyor.

There are quite a lot of binaries in there but to use the C++ API you only need a few:

dxcompiler.lib - The import library for linking against the compiler DLL
dxcompiler.dll - The compiler’s frontend
dxcapi.h - The header with the interfaces. You only need this single header. (/include/dxc/dxcapi.h)

Compiling

The new update comes with a new interface, IDxcUtils, which essentially replaces IDxcLibrary. The ‘Utils’ interface provides all the functionality to create data blobs. Besides that, creating a blob works just like before:

ComPtr<IDxcUtils> pUtils;
DxcCreateInstance(CLSID_DxcUtils, IID_PPV_ARGS(pUtils.GetAddressOf()));
ComPtr<IDxcBlobEncoding> pSource;
pUtils->CreateBlob(pShaderSource, shaderSourceSize, CP_UTF8, pSource.GetAddressOf());
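The snippet above assumes the shader text is already in memory as pShaderSource and shaderSourceSize. As a minimal sketch of where those could come from, here is a plain C++ file reader. ReadTextFile is a hypothetical helper of my own and not part of DXC; IDxcUtils::LoadFile (used in the include handler further down) is another option.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not part of DXC): reads a whole file into memory as raw bytes.
// Returns false if the file could not be opened.
bool ReadTextFile(const std::string& path, std::string& outContents)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
        return false;

    // Copy every byte of the stream into the output string.
    outContents.assign(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
    return true;
}
```

With this, source.data() and source.size() would play the roles of pShaderSource and shaderSourceSize.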

With IDxcCompiler3, the Compile function no longer takes separate input arguments for defines. These now all need to be passed as compile arguments using the -D argument. Note that the use of std::vector is just for simplicity’s sake.

std::vector<LPCWSTR> arguments; // const pointers: string literals and c_str() results are read-only
// -E for the entry point (eg. 'main')
arguments.push_back(L"-E");
arguments.push_back(entryPoint);

// -T for the target profile (eg. 'ps_6_6')
arguments.push_back(L"-T");
arguments.push_back(target);

// Strip reflection data and pdbs (see later)
arguments.push_back(L"-Qstrip_debug");
arguments.push_back(L"-Qstrip_reflect");

arguments.push_back(DXC_ARG_WARNINGS_ARE_ERRORS); //-WX
arguments.push_back(DXC_ARG_DEBUG); //-Zi

for (const std::wstring& define : defines)
{
    arguments.push_back(L"-D");
    arguments.push_back(define.c_str());
}

DxcBuffer sourceBuffer;
sourceBuffer.Ptr = pSource->GetBufferPointer();
sourceBuffer.Size = pSource->GetBufferSize();
sourceBuffer.Encoding = DXC_CP_ACP; // DXC_CP_ACP is 0; DXC_CP_UTF8 states the encoding explicitly

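ComPtr<IDxcCompiler3> pCompiler;
HR(DxcCreateInstance(CLSID_DxcCompiler, IID_PPV_ARGS(pCompiler.GetAddressOf())));

ComPtr<IDxcResult> pCompileResult;
HR(pCompiler->Compile(&sourceBuffer, arguments.data(), (UINT32)arguments.size(), nullptr, IID_PPV_ARGS(pCompileResult.GetAddressOf())));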

// Error Handling. Note that this will also include warnings unless disabled.
ComPtr<IDxcBlobUtf8> pErrors;
pCompileResult->GetOutput(DXC_OUT_ERRORS, IID_PPV_ARGS(pErrors.GetAddressOf()), nullptr);
if (pErrors && pErrors->GetStringLength() > 0)
{
    MyLogFunction(Error, pErrors->GetStringPointer());
}
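One thing to watch out for with the argument list is lifetime: the compiler only receives raw pointers, so the strings behind them have to stay alive until Compile returns. Below is a rough sketch of a small wrapper that owns the strings and hands out the pointer array on demand. ShaderArguments and its methods are hypothetical names of my own, not part of DXC, and it uses const wchar_t* (which is what LPCWSTR is on Windows) so it builds without the Windows headers.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of DXC): owns the argument strings so the raw
// pointers handed to the compiler stay valid for the duration of the call.
class ShaderArguments
{
public:
    void Add(std::wstring argument) { m_Storage.push_back(std::move(argument)); }

    void AddEntryPoint(const std::wstring& name) { Add(L"-E"); Add(name); }
    void AddTarget(const std::wstring& profile) { Add(L"-T"); Add(profile); }

    // Defines are passed as "-D NAME" or "-D NAME=VALUE".
    void AddDefine(const std::wstring& name, const std::wstring& value = L"")
    {
        Add(L"-D");
        Add(value.empty() ? name : name + L"=" + value);
    }

    // Build the pointer array only once all arguments are added; adding more
    // afterwards may reallocate the storage and invalidate these pointers.
    std::vector<const wchar_t*> Pointers() const
    {
        std::vector<const wchar_t*> pointers;
        pointers.reserve(m_Storage.size());
        for (const std::wstring& argument : m_Storage)
            pointers.push_back(argument.c_str());
        return pointers;
    }

private:
    std::vector<std::wstring> m_Storage;
};
```

The result of Pointers() can then be passed to Compile through data() and size(), as long as the ShaderArguments object itself outlives the call.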

Now, before you stop reading, happy that you can compile a shader, wait! You’re missing out on the main reason why this update is such a big deal! You might have noticed the two arguments -Qstrip_debug and -Qstrip_reflect.

Stripping parts

I’ve found that previously, stripping debug data or reflection data didn’t always do what I expected. With the new updates, this becomes a lot easier.

The output of the compiler is now an IDxcResult (as opposed to IDxcOperationResult). It has a method GetOutput which allows you to extract a part of the output. This can be one of the following (defined in dxcapi.h):

DXC_OUT_OBJECT
DXC_OUT_ERRORS
DXC_OUT_PDB
DXC_OUT_SHADER_HASH
DXC_OUT_DISASSEMBLY
DXC_OUT_HLSL
DXC_OUT_TEXT
DXC_OUT_REFLECTION
DXC_OUT_ROOT_SIGNATURE

In the code example above, I added the two arguments -Qstrip_debug and -Qstrip_reflect. The effect is that the compiler strips both the shader PDBs and the reflection data from the object part. The important thing here is that both are still in the compile result and can be extracted using DXC_OUT_PDB and DXC_OUT_REFLECTION respectively. They are just no longer embedded in the DXC_OUT_OBJECT part, which keeps the actual shader object nice and slim. I would definitely advise to always use these flags. PIX also understands separate PDBs for shaders.

So now, it’s a lot clearer and easier to get specific parts of the shader. Getting shader PDBs is done as follows:

ComPtr<IDxcBlob> pDebugData;
ComPtr<IDxcBlobUtf16> pDebugDataPath;
pCompileResult->GetOutput(DXC_OUT_PDB, IID_PPV_ARGS(pDebugData.GetAddressOf()), pDebugDataPath.GetAddressOf());

The function has one extra output argument, which can be retrieved as an IDxcBlobUtf16. This contains the path that is baked into the shader object to refer to the part in question. So if you want to save the PDBs to a separate file, use this name so that PIX will know where to find it.
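Saving the PDB under that name is plain file I/O. Here is a minimal sketch in standard C++17. WriteBlobToFile is a hypothetical helper of my own (not part of DXC), and the output directory is whatever you pick for your PDBs.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Hypothetical helper (not part of DXC): writes `size` bytes to <directory>/<suggestedName>.
// Only the file name component of the suggested name is used.
bool WriteBlobToFile(const std::filesystem::path& directory, const std::wstring& suggestedName,
                     const void* data, std::size_t size)
{
    // Make sure the output directory exists; an already existing directory is not an error.
    std::error_code error;
    std::filesystem::create_directories(directory, error);
    if (error)
        return false;

    const std::filesystem::path target = directory / std::filesystem::path(suggestedName).filename();
    std::ofstream file(target, std::ios::binary | std::ios::trunc);
    if (!file)
        return false;

    file.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
    return file.good();
}
```

With the blobs from the snippet above, the name would come from pDebugDataPath->GetStringPointer() and the data from pDebugData->GetBufferPointer() and pDebugData->GetBufferSize().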

For reflection data: (library reflection is similar but with ID3D12LibraryReflection)

ComPtr<IDxcBlob> pReflectionData;
pCompileResult->GetOutput(DXC_OUT_REFLECTION, IID_PPV_ARGS(pReflectionData.GetAddressOf()), nullptr);
DxcBuffer reflectionBuffer;
reflectionBuffer.Ptr = pReflectionData->GetBufferPointer();
reflectionBuffer.Size = pReflectionData->GetBufferSize();
reflectionBuffer.Encoding = 0;
ComPtr<ID3D12ShaderReflection> pShaderReflection;
pUtils->CreateReflection(&reflectionBuffer, IID_PPV_ARGS(pShaderReflection.GetAddressOf()));

The hash:

ComPtr<IDxcBlob> pHash;
if (SUCCEEDED(pCompileResult->GetOutput(DXC_OUT_SHADER_HASH, IID_PPV_ARGS(pHash.GetAddressOf()), nullptr)))
{
    DxcShaderHash* pHashBuf = (DxcShaderHash*)pHash->GetBufferPointer();
}
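A common use for the hash is as a cache key or file name, which means turning the digest into text. The sketch below does that in standard C++. ShaderHash is a stand-in I declared with the same layout as DxcShaderHash in dxcapi.h (a 32-bit Flags field followed by a 16-byte HashDigest) so the example builds without the DXC headers, and HashToHex is a hypothetical helper name.

```cpp
#include <cstdint>
#include <string>

// Stand-in mirroring the layout of DxcShaderHash from dxcapi.h (assumption:
// UINT32 Flags followed by BYTE HashDigest[16]).
struct ShaderHash
{
    std::uint32_t Flags;
    std::uint8_t HashDigest[16];
};

// Hypothetical helper: formats the 16 digest bytes as 32 lowercase hex characters.
std::string HashToHex(const ShaderHash& hash)
{
    static const char digits[] = "0123456789abcdef";
    std::string text;
    text.reserve(sizeof(hash.HashDigest) * 2);
    for (std::uint8_t byte : hash.HashDigest)
    {
        text.push_back(digits[byte >> 4]);   // high nibble
        text.push_back(digits[byte & 0x0F]); // low nibble
    }
    return text;
}
```

In real code you would apply the same loop to pHashBuf->HashDigest directly.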

On a last note, as mentioned in the video, none of the DXC interfaces are thread-safe and it is advised to have an instance of each interface for each thread. DxcCreateInstance is an exception and is thread-safe.
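One way to get an instance per thread is thread_local storage. This is only a sketch of the pattern in plain C++. GetThreadLocalInstance is a hypothetical helper of my own, and the factory function stands in for whatever creates your DXC object (for example a function that calls DxcCreateInstance and returns a ComPtr).

```cpp
// Hypothetical helper: returns one lazily created instance of T per calling thread.
// `create` is only invoked the first time a given thread asks for its instance.
template <typename T>
T& GetThreadLocalInstance(T (*create)())
{
    thread_local T instance = create();
    return instance;
}
```

Note that there is one instance per type T and per thread, and each instance lives until its thread exits.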

Custom include handler

I’ve found no good examples of implementing a custom include handler and it’s really not straightforward at first. A custom handler is useful if you want to add some custom include logic or if you want to know which files were included to do some kind of shader hot-reloading. Here’s an example implementation that loads include sources and makes sure not to include the same file twice. To add include directories, use the -I argument, for example -I /Resources/Shaders.

static ComPtr<IDxcUtils> pUtils;
if(!pUtils)
    VERIFY_HR(DxcCreateInstance(CLSID_DxcUtils, IID_PPV_ARGS(pUtils.GetAddressOf())));

class CustomIncludeHandler : public IDxcIncludeHandler
{
public:
    HRESULT STDMETHODCALLTYPE LoadSource(_In_ LPCWSTR pFilename, _COM_Outptr_result_maybenull_ IDxcBlob** ppIncludeSource) override
    {
        ComPtr<IDxcBlobEncoding> pEncoding;
        std::string path = Paths::Normalize(UNICODE_TO_MULTIBYTE(pFilename));
        if (IncludedFiles.find(path) != IncludedFiles.end())
        {
            // Return a blank (single space) blob if this file has been included before
            static const char nullStr[] = " ";
            pUtils->CreateBlobFromPinned(nullStr, ARRAYSIZE(nullStr), DXC_CP_ACP, pEncoding.GetAddressOf());
            *ppIncludeSource = pEncoding.Detach();
            return S_OK;
        }

        HRESULT hr = pUtils->LoadFile(pFilename, nullptr, pEncoding.GetAddressOf());
        if (SUCCEEDED(hr))
        {
            IncludedFiles.insert(path);
            *ppIncludeSource = pEncoding.Detach();
        }
        return hr;
    }

    // Note: no reference counting here, so this handler must outlive the Compile call that uses it (e.g. live on the stack).
    HRESULT STDMETHODCALLTYPE QueryInterface(REFIID riid, _COM_Outptr_ void __RPC_FAR* __RPC_FAR* ppvObject) override { return E_NOINTERFACE; }
    ULONG STDMETHODCALLTYPE AddRef(void) override { return 0; }
    ULONG STDMETHODCALLTYPE Release(void) override { return 0; }

    std::unordered_set<std::string> IncludedFiles;
};
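The include-once bookkeeping in the handler can be tried out on its own, without DXC. The sketch below isolates it in standard C++17. IncludeTracker is a hypothetical class of my own, and std::filesystem’s lexically_normal is only my stand-in for Paths::Normalize, which may behave differently.

```cpp
#include <filesystem>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the include bookkeeping done by the handler.
class IncludeTracker
{
public:
    // Returns true the first time a file is seen, false for repeats.
    bool MarkIncluded(const std::string& path)
    {
        return m_Files.insert(Normalize(path)).second;
    }

    // Every file seen so far, e.g. to register file watchers for hot-reloading.
    const std::unordered_set<std::string>& Files() const { return m_Files; }

private:
    // Collapses "." and ".." so different spellings of one path compare equal.
    static std::string Normalize(const std::string& path)
    {
        return std::filesystem::path(path).lexically_normal().generic_string();
    }

    std::unordered_set<std::string> m_Files;
};
```

Because the set of seen files is state, it probably makes sense to use a fresh tracker (or a fresh handler) per compile. If the same instance is reused, the second compile would see every include as already included and get blank sources back.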

All compiler arguments

I’ve also found it pretty hard to find the compiler options for DXC. When using FXC, most of the compiler options were defines and easy to find because they were all defined together in a unified way. That doesn’t seem to be the case for DXC.

I recently saw a conversation on the DirectX Discord server about when to use “-” or “/” for compile arguments. If you look at HLSLOptions.td in the DirectXShaderCompiler repository, you’ll see that “-” works for all arguments while “/” only works for some. So it’s always best to use “-”.

When you execute dxc.exe -help, it shows you the most important options which I’ve formatted below, mostly for my own reference.

Version: dxcompiler.dll: 1.7 - 1.7.2212.12 (8c9d92be7); dxil.dll: 1.7(101.7.2212.14)

Common Options:
-help	Display available options
-Qunused-arguments	Don’t emit warning for unused driver arguments
--version	Display compiler version information

Compilation Options:
-all-resources-bound	Enables aggressive flattening
-auto-binding-space	Set auto binding space - enables auto resource binding in libraries
-Cc	Output color coded assembly listings
-default-linkage	Set default linkage for non-shader functions when compiling or linking to a library target (internal, external)
-denorm	select denormal value options (any, preserve, ftz). any is the default.
-disable-payload-qualifiers	Disables support for payload access qualifiers for raytracing payloads in SM 6.7.
-D	Define macro
-enable-16bit-types	Enable 16bit types and disable min precision types. Available in HLSL 2018 and shader model 6.2
-enable-lifetime-markers	Enable generation of lifetime markers
-enable-payload-qualifiers	Enables support for payload access qualifiers for raytracing payloads in SM 6.6.
-encoding	Set default encoding for source inputs and text outputs (utf8|utf16(win)|utf32(*nix)|wide) default=utf8
-export-shaders-only	Only export shaders when compiling a library
-exports	Specify exports when compiling a library: export1[[,export1_clone,…]=internal_name][;…]
-E	Entry point name
-Fc	Output assembly code listing file
-fdiagnostics-show-option	Print option name with mappable diagnostics
-fdisable-loc-tracking	Disable source location tracking in IR. This will break diagnostic generation for late validation. (Ignored if /Zi is passed)
-Fd	Write debug information to the given file, or automatically named file in directory when ending in '\'
-Fe	Output warnings and errors to the given file
-Fh	Output header file containing object code
-Fi	Set preprocess output file name (with /P)
-flegacy-macro-expansion	Expand the operands before performing token-pasting operation (fxc behavior)
-flegacy-resource-reservation	Reserve unused explicit register assignments for compatibility with shader model 5.0 and below
-fno-diagnostics-show-option	Do not print option name with mappable diagnostics
-force-rootsig-ver	force root signature version (rootsig_1_1 if omitted)
-Fo	Output object file
-Fre	Output reflection to the given file
-Frs	Output root signature to the given file
-Fsh	Output shader hash to the given file
-ftime-report	Print time report
-ftime-trace=	Print hierarchical time tracing to file
-ftime-trace	Print hierarchical time tracing to stdout
-Gec	Enable backward compatibility mode
-Ges	Enable strict mode
-Gfa	Avoid flow control constructs
-Gfp	Prefer flow control constructs
-Gis	Force IEEE strictness
-HV	HLSL version (2016, 2017, 2018, 2021). Default is 2018
-H	Show header includes and nesting depth
-ignore-line-directives	Ignore line directives
-I	Add directory to include search path
-Lx	Output hexadecimal literals
-Ni	Output instruction numbers in assembly listings
-no-legacy-cbuf-layout	Do not use legacy cbuffer load
-no-warnings	Suppress warnings
-No	Output instruction byte offsets in assembly listings
-Odump	Print the optimizer commands.
-Od	Disable optimizations
-pack-optimized	Optimize signature packing assuming identical signature provided for each connecting stage
-pack-prefix-stable	(default) Pack signatures preserving prefix-stable property - appended elements will not disturb placement of prior elements
-recompile	recompile from DXIL container with Debug Info or Debug Info bitcode file
-res-may-alias	Assume that UAVs/SRVs may alias
-rootsig-define	Read root signature from a #define
-T	Set target profile. : ps_6_0, ps_6_1, ps_6_2, ps_6_3, ps_6_4, ps_6_5, ps_6_6, ps_6_7, vs_6_0, vs_6_1, vs_6_2, vs_6_3, vs_6_4, vs_6_5, vs_6_6, vs_6_7, gs_6_0, gs_6_1, gs_6_2, gs_6_3, gs_6_4, gs_6_5, gs_6_6, gs_6_7, hs_6_0, hs_6_1, hs_6_2, hs_6_3, hs_6_4, hs_6_5, hs_6_6, hs_6_7, ds_6_0, ds_6_1, ds_6_2, ds_6_3, ds_6_4, ds_6_5, ds_6_6, ds_6_7, cs_6_0, cs_6_1, cs_6_2, cs_6_3, cs_6_4, cs_6_5, cs_6_6, cs_6_7, lib_6_1, lib_6_2, lib_6_3, lib_6_4, lib_6_5, lib_6_6, lib_6_7, ms_6_5, ms_6_6, ms_6_7, as_6_5, as_6_6, as_6_7
-Vd	Disable validation
-Vi	Display details about the include process.
-Vn	Use as variable name in header file
-WX	Treat warnings as errors
-Zi	Enable debug information. Cannot be used together with -Zs
-Zpc	Pack matrices in column-major order
-Zpr	Pack matrices in row-major order
-Zsb	Compute Shader Hash considering only output binary
-Zss	Compute Shader Hash considering source information
-Zs	Generate small PDB with just sources and compile options. Cannot be used together with -Zi

OPTIONS:
-MD	Write a file with .d extension that will contain the list of the compilation target dependencies.
-MF	Write the specified file that will contain the list of the compilation target dependencies.
-M	Dumps the list of the compilation target dependencies.

Optimization Options:
-O0	Optimization Level 0
-O1	Optimization Level 1
-O2	Optimization Level 2
-O3	Optimization Level 3 (Default)

Rewriter Options:
-decl-global-cb	Collect all global constants outside cbuffer declarations into cbuffer GlobalCB { … }. Still experimental, not all dependency scenarios handled.
-extract-entry-uniforms	Move uniform parameters from entry point to global scope
-global-extern-by-default	Set extern on non-static globals
-keep-user-macro	Write out user defines after rewritten HLSL
-line-directive	Add line directive
-remove-unused-functions	Remove unused functions and types
-remove-unused-globals	Remove unused static globals and functions
-skip-fn-body	Translate function definitions to declarations
-skip-static	Remove static functions and globals when used with -skip-fn-body
-unchanged	Rewrite HLSL, without changes.

SPIR-V CodeGen Options:
-fspv-debug=	Specify whitelist of debug info category (file -> source -> line, tool, vulkan-with-source)
-fspv-entrypoint-name=	Specify the SPIR-V entry point name. Defaults to the HLSL entry point name.
-fspv-extension=	Specify SPIR-V extension permitted to use
-fspv-flatten-resource-arrays	Flatten arrays of resources so each array element takes one binding number
-fspv-print-all	Print the SPIR-V module before each pass and after the last one. Useful for debugging SPIR-V legalization and optimization passes.
-fspv-reduce-load-size	Replaces loads of composite objects to reduce memory pressure for the loads
-fspv-reflect	Emit additional SPIR-V instructions to aid reflection
-fspv-target-env=	Specify the target environment: vulkan1.0 (default), vulkan1.1, vulkan1.1spirv1.4, vulkan1.2, vulkan1.3, or universal1.5
-fspv-use-legacy-buffer-matrix-order	Assume the legacy matrix order (row major) when accessing raw buffers (e.g., ByteAddressBuffer)
-fvk-auto-shift-bindings	Apply fvk-*-shift to resources without an explicit register assignment.
-fvk-b-shift	Specify Vulkan binding number shift for b-type register
-fvk-bind-globals	Specify Vulkan binding number and set number for the $Globals cbuffer
-fvk-bind-register	Specify Vulkan descriptor set and binding for a specific register
-fvk-invert-y	Negate SV_Position.y before writing to stage output in VS/DS/GS to accommodate Vulkan’s coordinate system
-fvk-s-shift	Specify Vulkan binding number shift for s-type register
-fvk-support-nonzero-base-instance	Follow Vulkan spec to use gl_BaseInstance as the first vertex instance, which makes SV_InstanceID = gl_InstanceIndex - gl_BaseInstance (without this option, SV_InstanceID = gl_InstanceIndex)
-fvk-t-shift	Specify Vulkan binding number shift for t-type register
-fvk-u-shift	Specify Vulkan binding number shift for u-type register
-fvk-use-dx-layout	Use DirectX memory layout for Vulkan resources
-fvk-use-dx-position-w	Reciprocate SV_Position.w after reading from stage input in PS to accommodate the difference between Vulkan and DirectX
-fvk-use-gl-layout	Use strict OpenGL std140/std430 memory layout for Vulkan resources
-fvk-use-scalar-layout	Use scalar memory layout for Vulkan resources
-Oconfig=	Specify a comma-separated list of SPIRV-Tools passes to customize optimization configuration (see http://khr.io/hlsl2spirv#optimization)
-spirv	Generate SPIR-V code

Utility Options:
-dumpbin	Load a binary file rather than compiling
-extractrootsignature	Extract root signature from shader bytecode (must be used with /Fo )
-getprivate	Save private data from shader blob
-link	Link list of libraries provided in argument separated by ';'
-P	Preprocess to file
-Qembed_debug	Embed PDB in shader container (must be used with /Zi)
-Qstrip_debug	Strip debug information from 4_0+ shader bytecode (must be used with /Fo )
-Qstrip_priv	Strip private data from shader bytecode (must be used with /Fo )
-Qstrip_reflect	Strip reflection data from shader bytecode (must be used with /Fo )
-Qstrip_rootsignature	Strip root signature data from shader bytecode (must be used with /Fo )
-setprivate	Private data to add to compiled shader blob
-setrootsignature	Attach root signature to shader bytecode
-verifyrootsignature	Verify shader bytecode with root signature

Warning Options:
-W[no-]	Enable/Disable the specified warning
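To show how a few of these options might be combined, here is a small sketch that picks debug and optimization flags per build type. ShaderBuild and OptimizationArguments are hypothetical names of my own, and the choice of flags per configuration is just one possible setup. It does respect the rule from the list that -Zi and -Zs cannot be used together.

```cpp
#include <string>
#include <vector>

// Hypothetical build types for this sketch.
enum class ShaderBuild { Debug, Release };

// Hypothetical helper: returns the debug/optimization flags for a build type.
std::vector<std::wstring> OptimizationArguments(ShaderBuild build)
{
    std::vector<std::wstring> arguments;
    if (build == ShaderBuild::Debug)
    {
        arguments.push_back(L"-Zi"); // enable debug information
        arguments.push_back(L"-Od"); // disable optimizations
    }
    else
    {
        arguments.push_back(L"-Zs"); // small PDB with just sources and compile options
        arguments.push_back(L"-O3"); // optimization level 3 (the default)
    }
    return arguments;
}
```

The returned strings can be appended to the rest of the argument list before the pointer array for Compile is built.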

Updates

15/12/2021: Update DXC cheat sheet with options from DXC 1.6.2112.12
27/04/2021: Update DXC cheat sheet with options from DXC 1.7 (1.6.0.3119) with SM6.6 support
14/10/2021: Add custom include handler example
01/02/2023: Update DXC cheat sheet with options from DXC 1.7 (1.7.2212.12)

