31. Device-Generated Commands
This chapter discusses the generation of command buffer content on the device, for which these principal steps are to be taken:
- Define via VkIndirectCommandsLayoutNV the sequence of commands which should be generated.
- Optionally make use of device-bindable Shader Groups.
- Retrieve device addresses by vkGetBufferDeviceAddressEXT for setting buffers on the device.
- Fill one or more VkBuffer with the appropriate content that gets interpreted by VkIndirectCommandsLayoutNV.
- Create a preprocess VkBuffer using the allocation information from vkGetGeneratedCommandsMemoryRequirementsNV.
- Optionally preprocess the input data using vkCmdPreprocessGeneratedCommandsNV in a separate action.
- Generate and execute the actual commands via vkCmdExecuteGeneratedCommandsNV passing all required data.
vkCmdPreprocessGeneratedCommandsNV executes in a separate logical pipeline from either graphics or compute. When commands are preprocessed in a separate step, the preprocessing must be explicitly synchronized against the command execution. When no separate preprocessing step is used, the preprocessing is automatically synchronized against the command execution.
31.1. Indirect Commands Layout
The device-side command generation happens through an iterative processing of an atomic sequence comprised of command tokens, which are represented by:
// Provided by VK_NV_device_generated_commands
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkIndirectCommandsLayoutNV)
31.1.1. Creation and Deletion
Indirect command layouts are created by:
// Provided by VK_NV_device_generated_commands
VkResult vkCreateIndirectCommandsLayoutNV(
VkDevice device,
const VkIndirectCommandsLayoutCreateInfoNV* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkIndirectCommandsLayoutNV* pIndirectCommandsLayout);
- device is the logical device that creates the indirect command layout.
- pCreateInfo is a pointer to a VkIndirectCommandsLayoutCreateInfoNV structure containing parameters affecting creation of the indirect command layout.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pIndirectCommandsLayout is a pointer to a VkIndirectCommandsLayoutNV handle in which the resulting indirect command layout is returned.
The VkIndirectCommandsLayoutCreateInfoNV structure is defined as:
// Provided by VK_NV_device_generated_commands
typedef struct VkIndirectCommandsLayoutCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkIndirectCommandsLayoutUsageFlagsNV flags;
VkPipelineBindPoint pipelineBindPoint;
uint32_t tokenCount;
const VkIndirectCommandsLayoutTokenNV* pTokens;
uint32_t streamCount;
const uint32_t* pStreamStrides;
} VkIndirectCommandsLayoutCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- pipelineBindPoint is the VkPipelineBindPoint that this layout targets.
- flags is a bitmask of VkIndirectCommandsLayoutUsageFlagBitsNV specifying usage hints of this layout.
- tokenCount is the length of the individual command sequence.
- pTokens is an array describing each command token in detail. See VkIndirectCommandsTokenTypeNV and VkIndirectCommandsLayoutTokenNV below for details.
- streamCount is the number of streams used to provide the token inputs.
- pStreamStrides is an array defining the byte stride for each input stream.
The following code illustrates some of the flags:
void cmdProcessAllSequences(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsTokens, sequencesCount, indexbuffer, indexbufferOffset)
{
    for (s = 0; s < sequencesCount; s++)
    {
        sUsed = s;
        if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV) {
            sUsed = indexbuffer.load_uint32( sUsed * sizeof(uint32_t) + indexbufferOffset);
        }
        if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV) {
            sUsed = incoherent_implementation_dependent_permutation[ sUsed ];
        }
        cmdProcessSequence( cmd, pipeline, indirectCommandsLayout, pIndirectCommandsTokens, sUsed );
    }
}
When tokens are consumed, an offset is computed based on token offset and
stream stride.
The resulting offset is required to be aligned.
The alignment for a specific token is equal to the scalar alignment of the
data type as defined in Alignment
Requirements, or
VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minIndirectCommandsBufferOffsetAlignment,
whichever is lower.
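The alignment rule can be sketched on the host as follows. This is a minimal illustration, not part of the API: the helper names `token_required_alignment` and `token_offset_is_aligned` are hypothetical, and the sketch assumes the offset being checked is the stream stride multiplied by the sequence index plus the token offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment required for one token's input data: the lower of the data
 * type's scalar alignment and the device's
 * minIndirectCommandsBufferOffsetAlignment (both passed in by the caller). */
static uint32_t token_required_alignment(uint32_t scalar_alignment,
                                         uint32_t min_offset_alignment)
{
    assert(scalar_alignment != 0 && min_offset_alignment != 0);
    return scalar_alignment < min_offset_alignment ? scalar_alignment
                                                   : min_offset_alignment;
}

/* True if the input of sequence `s` for a token starts at an offset that
 * is a multiple of `alignment`. Hypothetical helper for illustration. */
static bool token_offset_is_aligned(uint32_t stream_stride,
                                    uint32_t token_offset,
                                    uint32_t s,
                                    uint32_t alignment)
{
    assert(alignment != 0);
    uint64_t offset = (uint64_t)stream_stride * s + token_offset;
    return offset % alignment == 0;
}
```

For example, a token whose data begins with a VkDeviceAddress has a scalar alignment of 8; on a device that reports a minimum offset alignment of 4 (an assumed value), `token_required_alignment(8, 4)` yields 4.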
|
Note
A |
Bits which can be set in
VkIndirectCommandsLayoutCreateInfoNV::flags, specifying usage
hints of an indirect command layout, are:
// Provided by VK_NV_device_generated_commands
typedef enum VkIndirectCommandsLayoutUsageFlagBitsNV {
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV = 0x00000001,
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV = 0x00000002,
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV = 0x00000004,
} VkIndirectCommandsLayoutUsageFlagBitsNV;
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV specifies that the layout is always used with the manual preprocessing step through calling vkCmdPreprocessGeneratedCommandsNV and executed by vkCmdExecuteGeneratedCommandsNV with isPreprocessed set to VK_TRUE.
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV specifies that the input data for the sequences is not implicitly indexed from 0..sequencesUsed; instead, a user-provided VkBuffer encoding the index is used.
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV specifies that the processing of sequences can happen in an implementation-dependent order, which is not guaranteed to be coherent using the same input data.
// Provided by VK_NV_device_generated_commands
typedef VkFlags VkIndirectCommandsLayoutUsageFlagsNV;
VkIndirectCommandsLayoutUsageFlagsNV is a bitmask type for setting a
mask of zero or more VkIndirectCommandsLayoutUsageFlagBitsNV.
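The effect of VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV on which sequence is read can be modeled on the host. The sketch below is only an illustration: `resolve_sequence_index` and the `SKETCH_` macro are hypothetical names, `index_data` stands in for the contents of the index buffer starting at its offset, and the implementation-dependent reordering of the unordered flag is deliberately not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the INDEXED_SEQUENCES flag value from the enum listing. */
#define SKETCH_USAGE_INDEXED_SEQUENCES_BIT 0x00000002u

/* Returns the sequence whose input data is consumed for iteration `s`.
 * Without the flag the sequences are implicitly indexed; with it, the
 * index is looked up in the user-provided uint32_t array. */
static uint32_t resolve_sequence_index(uint32_t layout_flags,
                                       uint32_t s,
                                       const uint32_t *index_data)
{
    if (layout_flags & SKETCH_USAGE_INDEXED_SEQUENCES_BIT) {
        assert(index_data != NULL);
        return index_data[s];
    }
    return s;
}
```

With an index array of `{7, 2, 5}`, iteration 1 reads the input of sequence 2 when the flag is set, and of sequence 1 when it is not.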
Indirect command layouts are destroyed by:
// Provided by VK_NV_device_generated_commands
void vkDestroyIndirectCommandsLayoutNV(
VkDevice device,
VkIndirectCommandsLayoutNV indirectCommandsLayout,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the layout.
- indirectCommandsLayout is the layout to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
31.1.2. Token Input Streams
The VkIndirectCommandsStreamNV structure specifies the input data for
one or more tokens at processing time.
// Provided by VK_NV_device_generated_commands
typedef struct VkIndirectCommandsStreamNV {
VkBuffer buffer;
VkDeviceSize offset;
} VkIndirectCommandsStreamNV;
- buffer specifies the VkBuffer storing the functional arguments for each sequence. These arguments can be written by the device.
- offset specifies an offset into buffer where the arguments start.
The input streams can contain raw uint32_t values, existing indirect
commands such as:
- VkDrawIndirectCommand
- VkDrawIndexedIndirectCommand
- VkDrawMeshTasksIndirectCommandNV
or additional commands as listed below. How the data is used is described in the next section.
The VkBindShaderGroupIndirectCommandNV structure specifies the input
data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV token.
// Provided by VK_NV_device_generated_commands
typedef struct VkBindShaderGroupIndirectCommandNV {
uint32_t groupIndex;
} VkBindShaderGroupIndirectCommandNV;
- groupIndex specifies which shader group of the currently bound graphics pipeline is used.
The VkBindIndexBufferIndirectCommandNV structure specifies the input
data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV token.
// Provided by VK_NV_device_generated_commands
typedef struct VkBindIndexBufferIndirectCommandNV {
VkDeviceAddress bufferAddress;
uint32_t size;
VkIndexType indexType;
} VkBindIndexBufferIndirectCommandNV;
- bufferAddress specifies a physical address of the VkBuffer used as index buffer.
- size is the byte size range which is available for this operation from the provided address.
- indexType is a VkIndexType value specifying how indices are treated. Instead of the Vulkan enum values, a custom uint32_t value can be mapped to a VkIndexType by specifying the VkIndirectCommandsLayoutTokenNV::pIndexTypes and VkIndirectCommandsLayoutTokenNV::pIndexTypeValues arrays.
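The optional index type remapping can be illustrated with a small host-side lookup. This is a sketch under stated assumptions: `SketchIndexType` is a local stand-in for VkIndexType, `remap_index_type` is a hypothetical helper, the custom values 57 and 42 in the usage note are arbitrary, and reporting a failure for an unmatched value is a choice of this sketch rather than documented behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VkIndexType; only the two core values are mirrored. */
typedef enum SketchIndexType {
    SKETCH_INDEX_TYPE_UINT16 = 0,
    SKETCH_INDEX_TYPE_UINT32 = 1
} SketchIndexType;

/* Translates the raw uint32_t read from the input stream into an index
 * type, using the paired arrays registered on the token
 * (pIndexTypes / pIndexTypeValues, indexTypeCount entries each). */
static bool remap_index_type(uint32_t raw, uint32_t count,
                             const SketchIndexType *types,
                             const uint32_t *values,
                             SketchIndexType *out)
{
    assert(out != NULL);
    if (count == 0) {
        /* No remapping registered: the raw value is the enum value. */
        *out = (SketchIndexType)raw;
        return true;
    }
    assert(types != NULL && values != NULL);
    for (uint32_t i = 0; i < count; i++) {
        if (values[i] == raw) {
            *out = types[i];
            return true;
        }
    }
    /* Assumption of this sketch: unmatched values are reported, not guessed. */
    return false;
}
```

If an application registers the pairs (UINT16, 57) and (UINT32, 42), a stream value of 42 selects 32-bit indices.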
The VkBindVertexBufferIndirectCommandNV structure specifies the input
data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV token.
// Provided by VK_NV_device_generated_commands
typedef struct VkBindVertexBufferIndirectCommandNV {
VkDeviceAddress bufferAddress;
uint32_t size;
uint32_t stride;
} VkBindVertexBufferIndirectCommandNV;
- bufferAddress specifies a physical address of the VkBuffer used as vertex input binding.
- size is the byte size range which is available for this operation from the provided address.
- stride is the byte size stride for this vertex input binding as in VkVertexInputBindingDescription::stride. It is only used if VkIndirectCommandsLayoutTokenNV::vertexDynamicStride was set, otherwise the stride is inherited from the currently bound graphics pipeline.
The VkSetStateFlagsIndirectCommandNV structure specifies the input
data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV token.
Which state is changed depends on the VkIndirectStateFlagBitsNV
specified at VkIndirectCommandsLayoutNV creation time.
// Provided by VK_NV_device_generated_commands
typedef struct VkSetStateFlagsIndirectCommandNV {
uint32_t data;
} VkSetStateFlagsIndirectCommandNV;
- data encodes packed state that this command alters.
  - Bit 0: If set, represents VK_FRONT_FACE_CLOCKWISE, otherwise VK_FRONT_FACE_COUNTER_CLOCKWISE.
A subset of the graphics pipeline state can be altered using indirect state flags:
// Provided by VK_NV_device_generated_commands
typedef enum VkIndirectStateFlagBitsNV {
VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV = 0x00000001,
} VkIndirectStateFlagBitsNV;
- VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV allows toggling the VkFrontFace rasterization state for subsequent draw operations.
// Provided by VK_NV_device_generated_commands
typedef VkFlags VkIndirectStateFlagsNV;
VkIndirectStateFlagsNV is a bitmask type for setting a mask of zero or
more VkIndirectStateFlagBitsNV.
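The interplay between the indirect state flags chosen at layout creation and the packed data word read from the stream can be sketched as follows. The type and function names are hypothetical stand-ins; only the bit assignment (bit 0 selects the front face) is taken from the description of VkSetStateFlagsIndirectCommandNV.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV. */
#define SKETCH_STATE_FLAG_FRONTFACE_BIT 0x00000001u

/* Stand-in for VkFrontFace. */
typedef enum SketchFrontFace {
    SKETCH_FRONT_FACE_COUNTER_CLOCKWISE = 0,
    SKETCH_FRONT_FACE_CLOCKWISE = 1
} SketchFrontFace;

/* Packs a front face into the `data` word written to the input stream. */
static uint32_t encode_state_flags(SketchFrontFace face)
{
    assert(face == SKETCH_FRONT_FACE_CLOCKWISE ||
           face == SKETCH_FRONT_FACE_COUNTER_CLOCKWISE);
    return face == SKETCH_FRONT_FACE_CLOCKWISE ? 1u : 0u;
}

/* Returns the front face in effect after the state flags token ran.
 * Bit 0 of `data` is honored only if the layout token enabled the
 * front-face state; otherwise the current state is left untouched. */
static SketchFrontFace apply_state_flags(uint32_t token_state_flags,
                                         uint32_t data,
                                         SketchFrontFace current)
{
    if (token_state_flags & SKETCH_STATE_FLAG_FRONTFACE_BIT) {
        return (data & 1u) ? SKETCH_FRONT_FACE_CLOCKWISE
                           : SKETCH_FRONT_FACE_COUNTER_CLOCKWISE;
    }
    return current;
}
```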
31.1.3. Tokenized Command Processing
The processing is in principle illustrated below:
void cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s)
{
    for (t = 0; t < indirectCommandsLayout.tokenCount; t++)
    {
        uint32_t streamIndex = indirectCommandsLayout.pTokens[t].stream;
        uint32_t offset = indirectCommandsLayout.pTokens[t].offset;
        uint32_t stride = indirectCommandsLayout.pStreamStrides[streamIndex];
        stream = pIndirectCommandsStreams[streamIndex];
        const void* input = stream.buffer.pointer( stream.offset + stride * s + offset );
        // further details later
        indirectCommandsLayout.pTokens[t].command (cmd, pipeline, input, s);
    }
}
void cmdProcessAllSequences(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, sequencesCount)
{
    for (s = 0; s < sequencesCount; s++)
    {
        cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s);
    }
}
The processing of each sequence is considered stateless, therefore all state changes must occur before any work-provoking commands within the sequence. A single sequence strictly targets the VkPipelineBindPoint it was created with.
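The ordering requirement lends itself to a simple host-side check before a layout is created. The following is a sketch only: `SketchTokenType` mirrors the values of VkIndirectCommandsTokenTypeNV under local names, treating the three draw tokens as the work-provoking ones is an assumption based on their equivalent commands, and `state_precedes_work` is a hypothetical helper, not an API function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the VkIndirectCommandsTokenTypeNV values. */
typedef enum SketchTokenType {
    SKETCH_TOKEN_SHADER_GROUP = 0,
    SKETCH_TOKEN_STATE_FLAGS = 1,
    SKETCH_TOKEN_INDEX_BUFFER = 2,
    SKETCH_TOKEN_VERTEX_BUFFER = 3,
    SKETCH_TOKEN_PUSH_CONSTANT = 4,
    SKETCH_TOKEN_DRAW_INDEXED = 5,
    SKETCH_TOKEN_DRAW = 6,
    SKETCH_TOKEN_DRAW_TASKS = 7
} SketchTokenType;

/* Assumption: only the draw tokens provoke work. */
static bool is_work_provoking(SketchTokenType type)
{
    return type == SKETCH_TOKEN_DRAW_INDEXED ||
           type == SKETCH_TOKEN_DRAW ||
           type == SKETCH_TOKEN_DRAW_TASKS;
}

/* True if no state-changing token follows a work-provoking token. */
static bool state_precedes_work(const SketchTokenType *tokens, uint32_t count)
{
    assert(tokens != NULL || count == 0);
    bool seen_work = false;
    for (uint32_t i = 0; i < count; i++) {
        if (is_work_provoking(tokens[i])) {
            seen_work = true;
        } else if (seen_work) {
            return false; /* state change after a draw */
        }
    }
    return true;
}
```

A sequence of vertex buffer, push constant, draw passes this check, whereas vertex buffer, draw, push constant does not.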
The primary input data for each token is provided through VkBuffer
content at preprocessing using vkCmdPreprocessGeneratedCommandsNV or
execution time using vkCmdExecuteGeneratedCommandsNV, however some
functional arguments, for example binding sets, are specified at layout
creation time.
The input size is different for each token.
Possible values of those elements of the
VkIndirectCommandsLayoutCreateInfoNV::pTokens array specifying
command tokens (other elements of the array specify command parameters) are:
// Provided by VK_NV_device_generated_commands
typedef enum VkIndirectCommandsTokenTypeNV {
VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV = 0,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV = 1,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV = 2,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV = 3,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV = 4,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV = 5,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV = 6,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV = 7,
} VkIndirectCommandsTokenTypeNV;
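| Token type | Equivalent command |
|---|---|
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV | vkCmdBindPipelineShaderGroupNV |
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV | - |
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV | vkCmdBindIndexBuffer |
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV | vkCmdBindVertexBuffers |
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV | vkCmdPushConstants |
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV | vkCmdDrawIndexedIndirect |
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV | vkCmdDrawIndirect |
| VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV | vkCmdDrawMeshTasksIndirectNV |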
The VkIndirectCommandsLayoutTokenNV structure specifies details to the
function arguments that need to be known at layout creation time:
// Provided by VK_NV_device_generated_commands
typedef struct VkIndirectCommandsLayoutTokenNV {
VkStructureType sType;
const void* pNext;
VkIndirectCommandsTokenTypeNV tokenType;
uint32_t stream;
uint32_t offset;
uint32_t vertexBindingUnit;
VkBool32 vertexDynamicStride;
VkPipelineLayout pushconstantPipelineLayout;
VkShaderStageFlags pushconstantShaderStageFlags;
uint32_t pushconstantOffset;
uint32_t pushconstantSize;
VkIndirectStateFlagsNV indirectStateFlags;
uint32_t indexTypeCount;
const VkIndexType* pIndexTypes;
const uint32_t* pIndexTypeValues;
} VkIndirectCommandsLayoutTokenNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- tokenType specifies the token command type.
- stream is the index of the input stream containing the token argument data.
- offset is a relative starting offset within the input stream memory for the token argument data.
- vertexBindingUnit is used for the vertex buffer binding command.
- vertexDynamicStride sets if the vertex buffer stride is provided by the binding command rather than the currently bound graphics pipeline state.
- pushconstantPipelineLayout is the VkPipelineLayout used for the push constant command.
- pushconstantShaderStageFlags are the shader stage flags used for the push constant command.
- pushconstantOffset is the offset used for the push constant command.
- pushconstantSize is the size used for the push constant command.
- indirectStateFlags are the active states for the state flag command.
- indexTypeCount is the optional size of the pIndexTypes and pIndexTypeValues array pairings. If not zero, it allows registering a custom uint32_t value to be treated as a specific VkIndexType.
- pIndexTypes is the used VkIndexType for the corresponding uint32_t value entry in pIndexTypeValues.
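How stream, offset and the stream stride combine can be shown with one interleaved stream that stores a vertex buffer binding followed by a draw for each sequence. The sketch below is illustrative: the `Sketch*` structures are local stand-ins that mirror the member layout of VkBindVertexBufferIndirectCommandNV and VkDrawIndirectCommand, `token_input_offset` is a hypothetical helper, and the assertion that a token's data starts within one stride is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring VkBindVertexBufferIndirectCommandNV. */
typedef struct SketchBindVertexBuffer {
    uint64_t bufferAddress;
    uint32_t size;
    uint32_t stride;
} SketchBindVertexBuffer;

/* Stand-in mirroring VkDrawIndirectCommand. */
typedef struct SketchDraw {
    uint32_t vertexCount;
    uint32_t instanceCount;
    uint32_t firstVertex;
    uint32_t firstInstance;
} SketchDraw;

/* One sequence as laid out in a single interleaved input stream.
 * sizeof(SketchSequence) would be the stream stride, and
 * offsetof(SketchSequence, member) the offset of each token. */
typedef struct SketchSequence {
    SketchBindVertexBuffer vertexBuffer;
    SketchDraw draw;
} SketchSequence;

/* Byte position inside the stream buffer where the token input of
 * sequence `s` begins. */
static uint64_t token_input_offset(uint64_t stream_offset,
                                   uint32_t stream_stride,
                                   uint32_t s,
                                   uint32_t token_offset)
{
    /* Assumption of this sketch: token data starts within one stride. */
    assert(token_offset < stream_stride);
    return stream_offset + (uint64_t)stream_stride * s + token_offset;
}
```

With this layout the vertex buffer token would use offset 0 and the draw token `offsetof(SketchSequence, draw)`, both reading from stream 0 with a stride of `sizeof(SketchSequence)`.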
The following code provides detailed information on how an individual sequence is processed. For valid usage, all restrictions from the regular commands apply.
void cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s)
{
    for (uint32_t t = 0; t < indirectCommandsLayout.tokenCount; t++) {
        token = indirectCommandsLayout.pTokens[t];
        uint32_t stride = indirectCommandsLayout.pStreamStrides[token.stream];
        stream = pIndirectCommandsStreams[token.stream];
        VkDeviceSize offset = stream.offset + stride * s + token.offset;
        const void* input = stream.buffer.pointer( offset );

        // the command is selected by the token type fixed at layout creation
        switch (token.tokenType) {
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV: {
            VkBindShaderGroupIndirectCommandNV* bind = input;
            vkCmdBindPipelineShaderGroupNV(cmd, indirectCommandsLayout.pipelineBindPoint,
                pipeline, bind->groupIndex);
            break;
        }
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV: {
            VkSetStateFlagsIndirectCommandNV* state = input;
            if (token.indirectStateFlags & VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV) {
                if (state->data & (1 << 0)) {
                    set VK_FRONT_FACE_CLOCKWISE;
                } else {
                    set VK_FRONT_FACE_COUNTER_CLOCKWISE;
                }
            }
            break;
        }
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV: {
            uint32_t* data = input;
            vkCmdPushConstants(cmd,
                token.pushconstantPipelineLayout,
                token.pushconstantShaderStageFlags,
                token.pushconstantOffset,
                token.pushconstantSize, data);
            break;
        }
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV: {
            VkBindIndexBufferIndirectCommandNV* data = input;
            // the indexType may optionally be remapped
            // from a custom uint32_t value, via
            // VkIndirectCommandsLayoutTokenNV::pIndexTypeValues
            vkCmdBindIndexBuffer(cmd,
                deriveBuffer(data->bufferAddress),
                deriveOffset(data->bufferAddress),
                data->indexType);
            break;
        }
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV: {
            VkBindVertexBufferIndirectCommandNV* data = input;
            // if token.vertexDynamicStride is VK_TRUE
            // then the stride for this binding is set
            // using data->stride as well
            vkCmdBindVertexBuffers(cmd,
                token.vertexBindingUnit, 1,
                &deriveBuffer(data->bufferAddress),
                &deriveOffset(data->bufferAddress));
            break;
        }
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV:
            vkCmdDrawIndexedIndirect(cmd,
                stream.buffer, offset, 1, 0);
            break;
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV:
            vkCmdDrawIndirect(cmd,
                stream.buffer, offset, 1, 0);
            break;
        // only available if VK_NV_mesh_shader is supported
        case VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV:
            vkCmdDrawMeshTasksIndirectNV(cmd,
                stream.buffer, offset, 1, 0);
            break;
        }
    }
}
31.2. Indirect Commands Generation And Execution
The generation of commands on the device requires a preprocess buffer.
To retrieve the memory size and alignment requirements of a particular
execution state call:
// Provided by VK_NV_device_generated_commands
void vkGetGeneratedCommandsMemoryRequirementsNV(
VkDevice device,
const VkGeneratedCommandsMemoryRequirementsInfoNV* pInfo,
VkMemoryRequirements2* pMemoryRequirements);
- device is the logical device that owns the buffer.
- pInfo is a pointer to a VkGeneratedCommandsMemoryRequirementsInfoNV structure containing parameters required for the memory requirements query.
- pMemoryRequirements is a pointer to a VkMemoryRequirements2 structure in which the memory requirements of the buffer object are returned.
// Provided by VK_NV_device_generated_commands
typedef struct VkGeneratedCommandsMemoryRequirementsInfoNV {
VkStructureType sType;
const void* pNext;
VkPipelineBindPoint pipelineBindPoint;
VkPipeline pipeline;
VkIndirectCommandsLayoutNV indirectCommandsLayout;
uint32_t maxSequencesCount;
} VkGeneratedCommandsMemoryRequirementsInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- pipelineBindPoint is the VkPipelineBindPoint of the pipeline that this buffer memory is intended to be used with during the execution.
- pipeline is the VkPipeline that this buffer memory is intended to be used with during the execution.
- indirectCommandsLayout is the VkIndirectCommandsLayoutNV that this buffer memory is intended to be used with.
- maxSequencesCount is the maximum number of sequences that this buffer memory in combination with the other state provided can be used with.
The actual generation of commands as well as their execution on the device is handled as a single action with:
// Provided by VK_NV_device_generated_commands
void vkCmdExecuteGeneratedCommandsNV(
VkCommandBuffer commandBuffer,
VkBool32 isPreprocessed,
const VkGeneratedCommandsInfoNV* pGeneratedCommandsInfo);
- commandBuffer is the command buffer into which the command is recorded.
- isPreprocessed represents whether the input data has already been preprocessed on the device. If it is VK_FALSE this command will implicitly trigger the preprocessing step, otherwise not.
- pGeneratedCommandsInfo is a pointer to a VkGeneratedCommandsInfoNV structure containing parameters affecting the generation of commands.
// Provided by VK_NV_device_generated_commands
typedef struct VkGeneratedCommandsInfoNV {
VkStructureType sType;
const void* pNext;
VkPipelineBindPoint pipelineBindPoint;
VkPipeline pipeline;
VkIndirectCommandsLayoutNV indirectCommandsLayout;
uint32_t streamCount;
const VkIndirectCommandsStreamNV* pStreams;
uint32_t sequencesCount;
VkBuffer preprocessBuffer;
VkDeviceSize preprocessOffset;
VkDeviceSize preprocessSize;
VkBuffer sequencesCountBuffer;
VkDeviceSize sequencesCountOffset;
VkBuffer sequencesIndexBuffer;
VkDeviceSize sequencesIndexOffset;
} VkGeneratedCommandsInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- pipelineBindPoint is the VkPipelineBindPoint used for the pipeline.
- pipeline is the VkPipeline used in the generation and execution process.
- indirectCommandsLayout is the VkIndirectCommandsLayoutNV that provides the command sequence to generate.
- streamCount defines the number of input streams.
- pStreams is a pointer to an array of streamCount VkIndirectCommandsStreamNV structures providing the input data for the tokens used in indirectCommandsLayout.
- sequencesCount is the maximum number of sequences to reserve. If sequencesCountBuffer is VK_NULL_HANDLE, this is also the actual number of sequences generated.
- preprocessBuffer is the VkBuffer that is used for preprocessing the input data for execution. If this structure is used with vkCmdExecuteGeneratedCommandsNV with its isPreprocessed set to VK_TRUE, then the preprocessing step is skipped and data is only read from this buffer.
- preprocessOffset is the byte offset into preprocessBuffer where the preprocessed data is stored.
- preprocessSize is the maximum byte size within the preprocessBuffer after the preprocessOffset that is available for preprocessing.
- sequencesCountBuffer is a VkBuffer in which the actual number of sequences is provided as a single uint32_t value.
- sequencesCountOffset is the byte offset into sequencesCountBuffer where the count value is stored.
- sequencesIndexBuffer is a VkBuffer that encodes the used sequence indices as a uint32_t array.
- sequencesIndexOffset is the byte offset into sequencesIndexBuffer where the index values start.
Referencing the functions defined in Indirect Commands Layout,
vkCmdExecuteGeneratedCommandsNV behaves as:
uint32_t sequencesCount = sequencesCountBuffer ?
    min(maxSequencesCount, sequencesCountBuffer.load_uint32(sequencesCountOffset)) :
    maxSequencesCount;
cmdProcessAllSequences(commandBuffer, pipeline,
    indirectCommandsLayout, pIndirectCommandsStreams,
    sequencesCount,
    sequencesIndexBuffer, sequencesIndexOffset);
// The stateful commands within indirectCommandsLayout will not
// affect the state of subsequent commands in the target
// command buffer (cmd)
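The clamping of the sequence count can be reproduced on the host for testing input data. This is a minimal sketch: the byte array stands in for the mapped contents of sequencesCountBuffer, a NULL pointer stands in for VK_NULL_HANDLE, and `load_u32` and `effective_sequences_count` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads one uint32_t from a byte buffer without alignment assumptions. */
static uint32_t load_u32(const unsigned char *bytes, size_t offset)
{
    uint32_t value;
    memcpy(&value, bytes + offset, sizeof value);
    return value;
}

/* Number of sequences that are actually processed: the reserved maximum,
 * or the smaller of that maximum and the value stored in the count
 * buffer when such a buffer is provided. */
static uint32_t effective_sequences_count(uint32_t max_sequences,
                                          const unsigned char *count_buffer,
                                          size_t count_offset,
                                          size_t count_buffer_size)
{
    if (count_buffer == NULL) {
        return max_sequences;
    }
    assert(count_offset + sizeof(uint32_t) <= count_buffer_size);
    uint32_t stored = load_u32(count_buffer, count_offset);
    return stored < max_sequences ? stored : max_sequences;
}
```

A stored count larger than the reserved maximum is therefore clamped, so the device never processes more sequences than the preprocess buffer was sized for.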
|
Note
It is important to note that the values of all state related to the
|
Commands can be preprocessed prior to execution using the following command:
// Provided by VK_NV_device_generated_commands
void vkCmdPreprocessGeneratedCommandsNV(
VkCommandBuffer commandBuffer,
const VkGeneratedCommandsInfoNV* pGeneratedCommandsInfo);
- commandBuffer is the command buffer which does the preprocessing.
- pGeneratedCommandsInfo is a pointer to a VkGeneratedCommandsInfoNV structure containing parameters affecting the preprocessing step.