UE4 Graphics Programming

Getting Started

There is a lot of rendering code in Unreal Engine 4 (UE4), so it is hard to get a quick high-level view of what is going on. A good place to start reading through the code is FDeferredShadingSceneRenderer::Render, which is where a new frame is rendered on the rendering thread. It is also useful to run the 'profilegpu' command and look through the draw events. You can then use Find in Files in Visual Studio on a draw event name to find the corresponding C++ implementation.

Useful console commands when working on rendering (passing '?' as the parameter usually prints help, and passing no parameter prints the current state):

Console Command Description
'stat unit' Shows overall frame time as well as the game thread, rendering thread, and GPU times. Whichever is the longest is the bottleneck. However, GPU time includes idle time, so the GPU is only the bottleneck if its time is the longest and stands alone.
Ctrl+Shift+. or 'recompileshaders changed' Recompile shaders that have changed since you last saved the .usf file. This will automatically happen on load.
Ctrl+Shift+; or 'profilegpu' Measure GPU timings for the view being rendered. You can view the results in the UI that pops up or in the engine log.
'Vis' or 'VisualizeTexture' Visualize the contents of various render targets with the ability to save as bmp.
'show x' Toggles specified show flag. Use 'show' to get the list of showflags and their current state. In the editor, use the viewport UI instead.
'pause' Pauses the game, but continues rendering. Any simulation rendering work will stop.
'slomo x' Changes the game speed. Can be very useful for slowing down time without skipping simulation work, when profiling. For example 'slomo .01'
'debugcreateplayer 1' For testing splitscreen.
'r.CompositionGraphDebug' Execute to get a single-frame dump of the composition graph (post processing and lighting).
'r.DumpShaderDebugInfo' When set to 1, will cause any shaders that are then compiled to dump debug info to GameName/Saved/ShaderDebugInfo.
'r.RenderTargetPoolTest' Clears the texture returned by the rendertarget pool with a special color to track down color leaking bugs.
'r.SetRes' Set the display resolution for the current game view. Has no effect in the editor.
'r.ViewportTest' Allows testing different Viewport rectangle configurations (in game only) as they can happen when using Matinee/Editor.

Useful command lines when working on rendering:

Commandline Description
-d3ddebug Enables the D3D11 debug layer, useful for catching API errors.
-sm4 Forces Feature Level SM4 with the D3D11 RHI.
-opengl3 / -opengl4 Forces use of OpenGL RHI at the specified feature level.
-ddc=noshared Prevents the use of network (shared) Derived Data Cache. Can be useful when debugging shader caching issues.

Modules

The renderer code exists in its own module, which is compiled to a dll for non-monolithic builds. This allows faster iteration as we do not have to relink the entire application when rendering code changes. The Renderer module depends on Engine because it has many callbacks into Engine. However, when the Engine needs to call some code in the Renderer, this happens through an interface, usually IRendererModule or FSceneInterface.
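The dependency direction described above can be sketched in standalone C++. This is only an illustration of the interface pattern, not the real IRendererModule API: the names IRendererLike, FEngineLike, and FRecordingRenderer are hypothetical stand-ins.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an interface declared on the Engine side.
// Engine code only ever sees this abstract type.
class IRendererLike {
public:
    virtual ~IRendererLike() = default;
    virtual void RenderFrame(const std::string& ViewName) = 0;
};

// Hypothetical Engine-side object. It holds the renderer through the
// interface, so it never needs to link against the concrete renderer.
class FEngineLike {
public:
    void SetRenderer(std::unique_ptr<IRendererLike> InRenderer) {
        Renderer = std::move(InRenderer);
    }
    void Tick() {
        assert(Renderer && "a renderer must be registered before Tick");
        Renderer->RenderFrame("MainView");
    }
private:
    std::unique_ptr<IRendererLike> Renderer;
};

// Hypothetical Renderer-side implementation. In a non-monolithic build this
// would live in the separate renderer dll and include the Engine headers.
class FRecordingRenderer : public IRendererLike {
public:
    explicit FRecordingRenderer(std::vector<std::string>& InLog) : Log(InLog) {}
    void RenderFrame(const std::string& ViewName) override {
        Log.push_back("render:" + ViewName);
    }
private:
    std::vector<std::string>& Log;
};
```

Because FEngineLike only names the abstract type, changing FRecordingRenderer does not force the engine side to be rebuilt, which is the iteration benefit the paragraph describes.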

Scene representation

In UE4, the scene as the renderer sees it is defined by primitive components and the lists of various other structures stored in FScene. An Octree of primitives is maintained for accelerated spatial queries.

Primary scene classes

There is a Rendering Thread in UE4 which operates in parallel with the game thread. Most classes that bridge the gap between the game thread and rendering thread are split into two parts based on which thread has ownership of that state.

The primary classes are:

Class Description
UWorld A world contains a collection of Actors and Components that can interact with each other. Levels can be streamed in and out of the world, and multiple worlds can be active in the program.
ULevel Collection of Actors and Components that are loaded / unloaded together and saved in a single map file.
USceneComponent Base class of any object that needs to be added to an FScene, like lights, meshes, fog, etc.
UPrimitiveComponent Base class of anything that can be rendered or interact with physics. Also acts as the granularity of visibility culling and rendering property specification (casts shadows, etc). Just like all UObjects, the game thread owns all variables and state and the rendering thread should not access it directly.
ULightComponent Represents a light source. The Renderer is responsible for computing and adding its contribution to the scene.
FScene Renderer version of the UWorld. Objects only exist to the renderer once they are added to the FScene, which is called registering a component. The rendering thread owns all state in the FScene, so the game thread cannot modify it directly.
FPrimitiveSceneProxy Renderer version of UPrimitiveComponent, mirrors UPrimitiveComponent state for the rendering thread. Exists in the engine module and intended to be subclassed to support different types of primitives (skeletal, rigid, BSP, etc). Implements some very important functions like GetViewRelevance, DrawDynamicElements, etc.
FPrimitiveSceneInfo Internal renderer state (specific to the FRendererModule implementation) that corresponds to a UPrimitiveComponent and FPrimitiveSceneProxy. Exists in the renderer module, so the engine cannot see it.
FSceneView Engine representation of a single view into an FScene. A scene can be rendered with different views in different calls to FSceneRenderer::Render (multiple editor viewports) or with multiple views in the same call to FSceneRenderer::Render (splitscreen in game). A new View is constructed for each frame.
FViewInfo Internal renderer representation of a view, exists in the renderer module.
FSceneViewState The ViewState stores private renderer information about a view which is needed across frames. In game, there is one view state per ULocalPlayer.
FSceneRenderer A class created each frame to encapsulate inter-frame temporaries.

Here is a list of the primary classes arranged by which module they are in. This becomes important when you are trying to figure out how to solve linker issues.

Engine Module Renderer Module
UWorld FScene
UPrimitiveComponent / FPrimitiveSceneProxy FPrimitiveSceneInfo
FSceneView FViewInfo
ULocalPlayer FSceneViewState
ULightComponent / FLightSceneProxy FLightSceneInfo

And the same classes arranged by which thread has ownership of their state. It is important to always be mindful of what thread owns the state you are writing code for, to avoid causing race conditions.

Game Thread Rendering Thread
UWorld FScene
UPrimitiveComponent FPrimitiveSceneProxy / FPrimitiveSceneInfo
(none) FSceneView / FViewInfo
ULocalPlayer FSceneViewState
ULightComponent FLightSceneProxy / FLightSceneInfo
Material classes
Class Description
FMaterial An interface to a material used for rendering. Provides access to material properties (e.g. blend mode). Contains a shader map used by the renderer to retrieve individual shaders.
FMaterialResource UMaterial's implementation of the FMaterial interface.
FMaterialRenderProxy A material's representation on the rendering thread. Provides access to an FMaterial interface and the current value of each scalar, vector, and texture parameter.
UMaterialInterface [abstract] Game thread interface for material functionality. Used to retrieve the FMaterialRenderProxy used for rendering and the UMaterial that is used as the source.
UMaterial A material asset. Authored as a node graph. Computes material attributes used for shading, sets blend mode, etc.
UMaterialInstance [abstract] An instance of a UMaterial. Uses the node graph in the UMaterial but provides different parameters (scalars, vectors, textures, static switches). Each instance has a parent UMaterialInterface. Therefore a material instance's parent may be a UMaterial or another UMaterialInstance. This creates a chain that will eventually lead to a UMaterial.
UMaterialInstanceConstant A UMaterialInstance that may only be modified in the editor. May provide scalar, vector, texture, and static switch parameters.
UMaterialInstanceDynamic A UMaterialInstance that may be modified at runtime. May provide scalar, vector, and texture parameters. It cannot provide static switch parameters and it cannot be the parent of another UMaterialInstance.
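The parent chain described for material instances can be sketched as follows. This is a simplified model, not the engine's classes: FMaterialInterfaceLike, FMaterialLike, and FMaterialInstanceLike are hypothetical names, and the rule that an instance without its own override asks its parent is an assumption made for the illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

struct FMaterialLike;

// Hypothetical stand-in for the abstract material interface.
struct FMaterialInterfaceLike {
    virtual ~FMaterialInterfaceLike() = default;
    // Follows the parent chain until it reaches the root material.
    virtual const FMaterialLike* GetRootMaterial() const = 0;
    // Resolves a scalar parameter for this link of the chain.
    virtual float GetScalar(const std::string& Param) const = 0;
};

// Hypothetical root material: owns the default parameter values.
struct FMaterialLike : FMaterialInterfaceLike {
    std::map<std::string, float> Defaults;
    const FMaterialLike* GetRootMaterial() const override { return this; }
    float GetScalar(const std::string& Param) const override {
        auto It = Defaults.find(Param);
        return It != Defaults.end() ? It->second : 0.0f;
    }
};

// Hypothetical instance: the parent may be a material or another instance.
struct FMaterialInstanceLike : FMaterialInterfaceLike {
    const FMaterialInterfaceLike* Parent;
    std::map<std::string, float> Overrides;

    explicit FMaterialInstanceLike(const FMaterialInterfaceLike* InParent)
        : Parent(InParent) {
        assert(Parent != nullptr && "every instance needs a parent");
    }
    const FMaterialLike* GetRootMaterial() const override {
        return Parent->GetRootMaterial();
    }
    float GetScalar(const std::string& Param) const override {
        // Use the local override if present, otherwise defer to the parent.
        auto It = Overrides.find(Param);
        return It != Overrides.end() ? It->second : Parent->GetScalar(Param);
    }
};
```

However long the chain is, GetRootMaterial ends at the single FMaterialLike that holds the node graph in the real system.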
Primitive components and proxies

Primitive components are the basic unit of visibility and relevance determination. For example, occlusion and frustum culling happen on a per-primitive basis. Therefore it is important when designing a system to think about how big to make components. Each component has a bounds that is used for various operations like culling, shadow casting, and light influence determination.

Components only become visible to the scene (and therefore the renderer) when they are registered. Game thread code that changes a component's properties must call MarkRenderStateDirty() on the component to propagate the change to the rendering thread.
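The ownership rule behind MarkRenderStateDirty() can be illustrated with a small standalone model. The types below (FProxyLike, FRenderCommandQueue, FComponentLike) are hypothetical, and the queue is drained manually on one thread to keep the example deterministic. The real engine uses a separate rendering thread and its own mechanism for refreshing the proxy.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Render-thread mirror of the component state (hypothetical stand-in for a
// scene proxy). Only commands run by the render side may write to it.
struct FProxyLike {
    float Brightness = 0.0f;
};

// Minimal command queue standing in for the game-to-render-thread channel.
class FRenderCommandQueue {
public:
    void Enqueue(std::function<void()> Command) {
        Commands.push(std::move(Command));
    }
    // Runs every pending command in submission order.
    void Flush() {
        while (!Commands.empty()) {
            Commands.front()();
            Commands.pop();
        }
    }
private:
    std::queue<std::function<void()>> Commands;
};

// Game-thread object (hypothetical stand-in for a primitive component).
class FComponentLike {
public:
    FComponentLike(FRenderCommandQueue& InQueue, FProxyLike& InProxy)
        : Queue(InQueue), Proxy(InProxy) {}

    // Changing a property only touches game-thread state.
    void SetBrightness(float Value) { Brightness = Value; }

    // Similar in spirit to MarkRenderStateDirty(): snapshot the value and
    // hand it to the render side instead of writing the proxy directly.
    void MarkDirty() {
        const float Snapshot = Brightness;
        FProxyLike* Target = &Proxy;
        Queue.Enqueue([Target, Snapshot] { Target->Brightness = Snapshot; });
    }
private:
    FRenderCommandQueue& Queue;
    FProxyLike& Proxy;
    float Brightness = 0.0f;
};
```

Until MarkDirty is called and the queue is flushed, the proxy keeps its old value, which is why forgetting the call leaves the renderer showing stale state.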

FPrimitiveSceneProxy and FPrimitiveSceneInfo

FPrimitiveSceneProxy is the rendering thread version of UPrimitiveComponent that is intended to be subclassed depending on the component type. It lives in the Engine module and has functions called during rendering passes. FPrimitiveSceneInfo is the primitive component state that is private to the renderer module.

Important FPrimitiveSceneProxy methods
Function Description
GetViewRelevance Called from InitViews at the beginning of the frame, and returns a populated FPrimitiveViewRelevance.
DrawDynamicElements Called to draw the proxy in any passes which the proxy is relevant to. Only called if the proxy indicated it has dynamic relevance.
DrawStaticElements Called to submit StaticMesh elements for the proxy when the primitive is being attached on the game thread. Only called if the proxy indicated it has static relevance.
Scene rendering order

The renderer processes the scene in the order that it wants to composite data to the render targets. For example, the Depth only pass is rendered before the Base pass, so that Hierarchical Z (HiZ) will be populated to reduce shading cost in the base pass. This order is statically defined by the order in which the pass functions are called in C++.

Relevance

FPrimitiveViewRelevance is the information on what effects (and therefore passes) are relevant to the primitive. A primitive may have multiple elements with different relevance, so FPrimitiveViewRelevance is effectively a logical OR of the relevance of all its elements. This means that a primitive can have both opaque and translucent relevance, or dynamic and static relevance; they are not mutually exclusive.

FPrimitiveViewRelevance also indicates whether a primitive needs to use the dynamic and/or static rendering path with bStaticRelevance and bDynamicRelevance.
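A minimal sketch of the logical OR described above, using a reduced stand-in struct. Only bStaticRelevance and bDynamicRelevance are names taken from the text; FRelevanceLike, the other two flags, and CombineElementRelevance are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Reduced, hypothetical stand-in for FPrimitiveViewRelevance.
struct FRelevanceLike {
    bool bStaticRelevance = false;
    bool bDynamicRelevance = false;
    bool bOpaqueRelevance = false;       // illustrative flag name
    bool bTranslucentRelevance = false;  // illustrative flag name

    // Merges another element's relevance into this one, flag by flag.
    FRelevanceLike& operator|=(const FRelevanceLike& Other) {
        bStaticRelevance |= Other.bStaticRelevance;
        bDynamicRelevance |= Other.bDynamicRelevance;
        bOpaqueRelevance |= Other.bOpaqueRelevance;
        bTranslucentRelevance |= Other.bTranslucentRelevance;
        return *this;
    }
};

// The primitive's relevance is the OR of all of its elements' relevance.
FRelevanceLike CombineElementRelevance(const std::vector<FRelevanceLike>& Elements) {
    FRelevanceLike Result;
    for (const FRelevanceLike& Element : Elements) {
        Result |= Element;
    }
    return Result;
}
```

A primitive with one opaque static element and one translucent dynamic element ends up with all four flags set, so it takes part in both passes and both rendering paths.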

Drawing Policies

Drawing policies contain the logic to render meshes with pass specific shaders. They use the FVertexFactory interface to abstract the mesh type, and the FMaterial interface to abstract the material details. At the lowest level, a drawing policy takes a set of mesh material shaders and a vertex factory, binds the vertex factory's buffers to the Rendering Hardware Interface (RHI), binds the mesh material shaders to the RHI, sets the appropriate shader parameters, and issues the RHI draw call.

Drawing Policy methods
Function Description
Constructor Finds the appropriate shader from the given vertex factory and material shader map, stores these references.
CreateBoundShaderState Creates an RHI bound shader state for the drawing policy.
Matches/Compare Provides methods to sort the drawing policy with others in the static draw lists. Matches must compare on all the factors that DrawShared depends on.
DrawShared Sets RHI state that is constant between drawing policies that return true from Matches. For example, most drawing policies sort on material and vertex factory, so shader parameters depending only on the material can be set, and the vertex buffers specific to the vertex factory can be bound. State should always be set here if possible instead of SetMeshRenderState, since DrawShared is called fewer times in the static rendering path.
SetMeshRenderState Sets RHI state that is specific to this mesh, or anything not set in DrawShared. This is called many more times than DrawShared so performance is especially critical here.
DrawMesh Actually issues the RHI draw call.

Rendering paths

UE4 has a dynamic path which provides more control but is slower to traverse, and a static rendering path which caches scene traversal as close to the RHI level as possible. The difference is mostly high level, since they both use drawing policies at the lowest level. Each rendering pass (drawing policy) needs to be sure to handle both rendering paths if needed.

Dynamic rendering path

The dynamic rendering path uses TDynamicPrimitiveDrawer and calls DrawDynamicElements on each primitive scene proxy to render. The set of primitives that need to use the dynamic path to be rendered is tracked by FViewInfo::VisibleDynamicPrimitives. Each rendering pass needs to iterate over this array, and call DrawDynamicElements on each primitive's proxy. DrawDynamicElements of the proxy then needs to assemble as many FMeshElements as it needs and submit them with DrawRichMesh or TDynamicPrimitiveDrawer::DrawMesh. This ends up creating a new temporary drawing policy, calling CreateBoundShaderState, DrawShared, SetMeshRenderState, and finally DrawMesh.

The dynamic rendering path provides a lot of flexibility because each proxy has a callback in DrawDynamicElements where it can execute logic specific to that component type. It also has minimal insertion cost but high traversal cost, because there is no state sorting, and nothing is cached.
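The per-mesh call sequence of the dynamic path can be modelled with a toy policy that only records which methods were called. The method names follow the table of drawing policy methods above. Everything else (FMeshElementLike, FCallLog, FTempPolicyLike, DrawDynamicPath, the ShaderKey encoding) is hypothetical and stands in for real RHI work.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mesh element: material and vertex factory reduced to ids.
struct FMeshElementLike {
    int MaterialId;
    int VertexFactoryId;
};

// Records the order of calls so the traversal cost is visible.
struct FCallLog {
    std::vector<std::string> Calls;
    int Count(const std::string& Name) const {
        int Total = 0;
        for (const std::string& Call : Calls) {
            if (Call == Name) { ++Total; }
        }
        return Total;
    }
};

// Hypothetical temporary drawing policy, built fresh for every mesh.
class FTempPolicyLike {
public:
    FTempPolicyLike(const FMeshElementLike& Mesh, FCallLog& InLog)
        : ShaderKey(Mesh.MaterialId * 1000 + Mesh.VertexFactoryId), Log(InLog) {}
    void CreateBoundShaderState() { Log.Calls.push_back("CreateBoundShaderState"); }
    void DrawShared()             { Log.Calls.push_back("DrawShared"); }
    void SetMeshRenderState()     { Log.Calls.push_back("SetMeshRenderState"); }
    void DrawMesh()               { Log.Calls.push_back("DrawMesh"); }
    int GetShaderKey() const { return ShaderKey; }
private:
    int ShaderKey;  // stands in for the shader lookup done by the constructor
    FCallLog& Log;
};

// Dynamic path: nothing is sorted or cached, so every mesh pays for all steps.
void DrawDynamicPath(const std::vector<FMeshElementLike>& Meshes, FCallLog& Log) {
    for (const FMeshElementLike& Mesh : Meshes) {
        FTempPolicyLike Policy(Mesh, Log);
        Policy.CreateBoundShaderState();
        Policy.DrawShared();
        Policy.SetMeshRenderState();
        Policy.DrawMesh();
    }
}
```

Even when two meshes share a material and vertex factory, DrawShared runs once per mesh here, which is the traversal cost the paragraph refers to.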

Static rendering path

The static rendering path is implemented through static draw lists. Meshes are inserted into the draw lists when they are attached to the scene. During this insertion, DrawStaticElements on the proxy is called to collect the FStaticMeshElements. A drawing policy instance is then created and stored, along with the result of CreateBoundShaderState. The new drawing policy is sorted based on its Compare and Matches functions and inserted into the appropriate place in the draw list (see TStaticMeshDrawList::AddMesh). In InitViews, a bitarray containing visibility data for the static draw list is initialized and passed into TStaticMeshDrawList::DrawVisible where the draw list is actually drawn. DrawShared is only called once for all the drawing policies that match each other, while SetMeshRenderState and DrawMesh are called for each FStaticMeshElement (see TStaticMeshDrawList::DrawElement).

The static rendering path moves a lot of work to attach time, which significantly speeds up scene traversal at rendering time. Static draw list rendering is about 3x faster on the rendering thread for Static Meshes, which allows a lot more Static Meshes in the scene. Because static draw lists cache data at attach time, they can only cache view independent state. Primitives that are rarely reattached but often rendered are good candidates for the static draw lists.
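A simplified model of a static draw list shows where the savings come from. This is not TStaticMeshDrawList: FStaticDrawListLike and FDrawCounts are hypothetical, the bucket key is assumed to be a (material, vertex factory) pair, and the draw calls are replaced by counters.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Counts how often each drawing policy step would run.
struct FDrawCounts {
    int DrawShared = 0;
    int SetMeshRenderState = 0;
    int DrawMesh = 0;
};

// Hypothetical static draw list: meshes are bucketed at attach time.
class FStaticDrawListLike {
public:
    // Attach time: place the mesh in the bucket of matching policies and
    // return its index into the per-view visibility bit array.
    int AddMesh(int MaterialId, int VertexFactoryId) {
        const int ElementIndex = NumElements++;
        Buckets[std::make_pair(MaterialId, VertexFactoryId)].push_back(ElementIndex);
        return ElementIndex;
    }

    // Render time: shared state is set once per bucket that has at least
    // one visible element; per-mesh work runs for every visible element.
    FDrawCounts DrawVisible(const std::vector<bool>& Visible) const {
        assert(Visible.size() == static_cast<std::size_t>(NumElements));
        FDrawCounts Counts;
        for (const auto& Bucket : Buckets) {
            bool bSharedStateSet = false;
            for (int ElementIndex : Bucket.second) {
                if (!Visible[static_cast<std::size_t>(ElementIndex)]) { continue; }
                if (!bSharedStateSet) {
                    ++Counts.DrawShared;
                    bSharedStateSet = true;
                }
                ++Counts.SetMeshRenderState;
                ++Counts.DrawMesh;
            }
        }
        return Counts;
    }
private:
    std::map<std::pair<int, int>, std::vector<int>> Buckets;
    int NumElements = 0;
};
```

The bucketing cost is paid once in AddMesh, so a mesh that is attached once and drawn for many frames amortizes it, matching the guidance about rarely reattached primitives.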

The static rendering path can expose bugs because of the way it only calls DrawShared once per state bucket. These bugs can be difficult to detect, since they depend on the rendering order and the attach order of meshes in the scene. Special view modes such as lighting only and unlit force all primitives to use the dynamic path, so if a bug goes away when forcing the dynamic rendering path, there is a good chance it is due to an incorrect implementation of a drawing policy's DrawShared and/or Matches function.
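The class of bug described above can be reproduced in a toy model. Here DrawShared is assumed to bind a lightmap, and a buggy Matches forgets to compare it. FPolicyLike, LightmapId, MatchesCorrect, MatchesBuggy, and CountMisdrawnMeshes are all hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical policy: DrawShared is assumed to bind both fields.
struct FPolicyLike {
    int MaterialId;
    int LightmapId;
};

// Correct: compares every factor that DrawShared depends on.
bool MatchesCorrect(const FPolicyLike& A, const FPolicyLike& B) {
    return A.MaterialId == B.MaterialId && A.LightmapId == B.LightmapId;
}

// Buggy: forgets LightmapId, so different lightmaps share one bucket.
bool MatchesBuggy(const FPolicyLike& A, const FPolicyLike& B) {
    return A.MaterialId == B.MaterialId;
}

using FMatchFn = bool (*)(const FPolicyLike&, const FPolicyLike&);

// Buckets the policies with the given Matches function, then "draws" each
// bucket with shared state taken from its first policy only. Returns how
// many meshes were drawn with a lightmap other than their own.
int CountMisdrawnMeshes(const std::vector<FPolicyLike>& Policies, FMatchFn Matches) {
    std::vector<std::vector<FPolicyLike>> Buckets;
    for (const FPolicyLike& Policy : Policies) {
        bool bInserted = false;
        for (std::vector<FPolicyLike>& Bucket : Buckets) {
            if (Matches(Bucket.front(), Policy)) {
                Bucket.push_back(Policy);
                bInserted = true;
                break;
            }
        }
        if (!bInserted) {
            Buckets.push_back(std::vector<FPolicyLike>(1, Policy));
        }
    }

    int Misdrawn = 0;
    for (const std::vector<FPolicyLike>& Bucket : Buckets) {
        const int BoundLightmap = Bucket.front().LightmapId;  // DrawShared, once
        for (const FPolicyLike& Policy : Bucket) {
            if (Policy.LightmapId != BoundLightmap) { ++Misdrawn; }
        }
    }
    return Misdrawn;
}
```

With the buggy comparison, the number of wrong meshes changes when the same policies are attached in a different order, which is why such bugs appear intermittent.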

High level Rendering order

Here is a description of the control flow when rendering a frame starting from FDeferredShadingSceneRenderer::Render:

Operation Description
GSceneRenderTargets.Allocate Reallocates the global scene render targets to be large enough for the current view, if needed.
InitViews Initializes primitive visibility for the views through various culling methods, sets up dynamic shadows that are visible this frame, intersects shadow frustums with the world if necessary (for whole scene shadows or preshadows).
PrePass / Depth only pass RenderPrePass / FDepthDrawingPolicy. Renders occluders, outputting only depth to the depth buffer. This pass can operate in several modes: disabled, occlusion only, or complete depths, depending on what is needed by active features. The usual purpose of this pass is to initialize Hierarchical Z to reduce the shading cost of the Base pass, which has expensive pixel shaders.
Base pass RenderBasePass / TBasePassDrawingPolicy. Renders opaque and masked materials, outputting material attributes to the GBuffer. Lightmap contribution and sky lighting is also computed here and put in scene color.
Issue Occlusion Queries / BeginOcclusionTests Kicks off latent occlusion queries that will be used in the next frame's InitViews. These are done by rendering bounding boxes around the objects being queried, and sometimes grouping bounding boxes together to reduce draw calls.
Lighting Shadowmaps are rendered for each light and light contribution is accumulated to scene color, using a mix of standard deferred and tiled deferred shading. Light is also accumulated in the translucency lighting volumes.
Fog Fog and atmosphere are computed per-pixel for opaque surfaces in a deferred pass.
Translucency Translucency is accumulated into an offscreen render target where it has fogging applied per-vertex so it can integrate into the scene. Lit translucency computes final lighting in a single pass to blend correctly.
Post Processing Various post process effects are applied using the GBuffers. Translucency is composited into the scene.

This is a fairly simplified and high level view. To get more details, look through the relevant code or the log output of a 'profilegpu'.
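As a reading aid, the control flow in the table can be written as a function that calls one step after another in a fixed order. FFrameRendererLike is a hypothetical class and the steps only log their names. InitViews, RenderPrePass, RenderBasePass, and BeginOcclusionTests are names from the table, while the remaining method names are made up for the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical frame renderer whose passes only record that they ran.
class FFrameRendererLike {
public:
    // The pass order is fixed by the order of these calls.
    void Render() {
        AllocateRenderTargets();
        InitViews();
        RenderPrePass();
        RenderBasePass();
        BeginOcclusionTests();
        RenderLighting();
        RenderFog();
        RenderTranslucency();
        RenderPostProcessing();
    }
    const std::vector<std::string>& GetPassLog() const { return PassLog; }
private:
    void AllocateRenderTargets() { PassLog.push_back("AllocateRenderTargets"); }
    void InitViews()             { PassLog.push_back("InitViews"); }
    void RenderPrePass()         { PassLog.push_back("PrePass"); }
    void RenderBasePass()        { PassLog.push_back("BasePass"); }
    void BeginOcclusionTests()   { PassLog.push_back("OcclusionQueries"); }
    void RenderLighting()        { PassLog.push_back("Lighting"); }
    void RenderFog()             { PassLog.push_back("Fog"); }
    void RenderTranslucency()    { PassLog.push_back("Translucency"); }
    void RenderPostProcessing()  { PassLog.push_back("PostProcessing"); }

    std::vector<std::string> PassLog;
};
```

Reordering two calls inside Render is the only way to change the order in this model, which mirrors the statement that the order is statically defined in C++.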

Render Hardware Interface (RHI)

The RHI is a thin layer above the platform specific graphics API. The RHI abstraction level in UE4 is as low level as possible, with the intention that most features can be written in platform independent code and 'just work' on all platforms that support the required feature level.

Feature sets are quantized into ERHIFeatureLevel to keep the complexity low. If a platform cannot support all of the features required for a Feature Level, it must drop down in levels until it can.

Feature Level Description
SM5 Generally corresponds with D3D11 Shader Model 5, except only 16 textures can be used because of OpenGL 4.3 limits. Supports tessellation, compute shaders and cubemap arrays. The deferred shading path is supported.
SM4 Corresponds to D3D11 Shader Model 4, which is generally the same as SM5, except no tessellation, compute shaders or cubemap arrays. The deferred shading path is supported. Eye Adaptation is not supported as it uses compute shaders.
ES2 Corresponds to the features supported by most OpenGL ES2 mobile devices. Uses a pared down forward shading path.
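The rule that a platform drops down in levels until every requirement is met can be sketched as follows. The enum and capability struct are hypothetical simplifications, not the real ERHIFeatureLevel, and the requirements per level are taken only from the descriptions in the table above.

```cpp
#include <cassert>
#include <initializer_list>

// Illustrative levels, ordered from least to most capable.
enum class EFeatureLevelLike { ES2, SM4, SM5 };

// Hypothetical capability flags reported by a platform.
struct FPlatformCapsLike {
    bool bComputeShaders;
    bool bTessellation;
    bool bCubemapArrays;
    bool bDeferredShading;
};

// Returns true if the platform meets everything the level requires.
bool Supports(const FPlatformCapsLike& Caps, EFeatureLevelLike Level) {
    switch (Level) {
    case EFeatureLevelLike::SM5:
        return Caps.bDeferredShading && Caps.bComputeShaders &&
               Caps.bTessellation && Caps.bCubemapArrays;
    case EFeatureLevelLike::SM4:
        return Caps.bDeferredShading;
    case EFeatureLevelLike::ES2:
        return true;  // treated as the floor in this sketch
    }
    return false;
}

// Tries the highest level first and drops down until one is fully supported.
EFeatureLevelLike PickFeatureLevel(const FPlatformCapsLike& Caps) {
    for (EFeatureLevelLike Level : {EFeatureLevelLike::SM5,
                                    EFeatureLevelLike::SM4,
                                    EFeatureLevelLike::ES2}) {
        if (Supports(Caps, Level)) {
            return Level;
        }
    }
    return EFeatureLevelLike::ES2;
}
```

A platform that lacks only compute shaders lands on SM4 in this model, even though it supports tessellation, because a level is all or nothing.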
Rendering state grouping

Render states are grouped based on what part of the pipeline they affect. For example, RHISetDepthState sets all state relevant to depth buffering.

Rendering state defaults

Since there are so many rendering states, it is not practical to set them all every time we want to draw something. Instead, UE4 has an implicit set of states which are assumed to be set to the defaults (and therefore must be restored to those defaults after they are changed), and a much smaller set of states which have to be set explicitly. The set of states that do not have implicit defaults are:

  • RHISetRenderTargets

  • RHISetBoundShaderState

  • RHISetDepthState

  • RHISetBlendState

  • RHISetRasterizerState

  • Any dependencies of the shaders set by RHISetBoundShaderState

All other states are assumed to be at their defaults (as defined by the relevant TStaticState; for example, the default stencil state is set by RHISetStencilState(TStaticStencilState<>::GetRHI())).
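One way to honor the "restore the default after changing it" contract is a scoped guard. This is a sketch of the idea using RAII, which is a technique chosen here for illustration and not something the text says the engine does. FDeviceStateLike, FScopedStencilEnable, and PassCanRelyOnDefaults are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for one piece of implicit device state.
struct FDeviceStateLike {
    bool bStencilEnabled = false;  // the assumed default
};

// Enables stencil for the lifetime of the guard and puts the state back to
// its default (not to the previous value) when the scope ends.
class FScopedStencilEnable {
public:
    explicit FScopedStencilEnable(FDeviceStateLike& InState) : State(InState) {
        State.bStencilEnabled = true;
    }
    ~FScopedStencilEnable() { State.bStencilEnabled = false; }

    FScopedStencilEnable(const FScopedStencilEnable&) = delete;
    FScopedStencilEnable& operator=(const FScopedStencilEnable&) = delete;
private:
    FDeviceStateLike& State;
};

// A later pass that never touches stencil is only correct if the implicit
// state is still at its default when the pass starts.
bool PassCanRelyOnDefaults(const FDeviceStateLike& State) {
    return !State.bStencilEnabled;
}
```

The guard makes the restore impossible to forget on early returns, which is the failure mode that an implicit-default scheme is most exposed to.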

Getting Started

There is a lot of rendering code in Unreal Engine 4 (UE4) so it is hard to get a quick high level view of what is going on. A good place to start reading through the code is FDeferredShadingSceneRenderer::Render, which is where a new frame is rendered on the rendering thread. It is also useful to do a 'profilegpu' command and look through the draw events. You can then do a Find in Files in Visual Studio on the draw event name to find the corresponding C++ implementation.

Useful console commands when working on rendering (Usually you get help when using '?' as parameter and the current state with no parameters):

Console Command Description
'stat unit' Shows overall frame time as well as the game thread, rendering thread, and GPU times. Whichever is the longest is the bottleneck. However, GPU time contains idle time, so is only the bottleneck if it is the longest and stands alone.
Ctrl+Shift+. or 'recompileshaders changed' Recompile shaders that have changed since you last saved the .usf file. This will automatically happen on load.
Ctrl+Shift+; or 'profilegpu' Measure GPU timings for the view being rendered. You can view the results in the UI that pops up or in the engine log.
'Vis' or 'VisualizeTexture' Visualize the contents of various render targets with the ability to save as bmp.
'show x' Toggles specified show flag. Use 'show' to get the list of showflags and their current state. In the editor, use the viewport UI instead.
'pause' Pauses the game, but continues rendering. Any simulation rendering work will stop.
'slomo x' Changes the game speed. Can be very useful for slowing down time without skipping simulation work, when profiling. For example 'slomo .01'
'debugcreateplayer 1' For testing splitscreen.
'r.CompositionGraphDebug' Execute to get a single frame dump of the composition graph of one frame (post processing and lighting).
'r.DumpShaderDebugInfo' When set to 1, will cause any shaders that are then compiled to dump debug info to GameName/Saved/ShaderDebugInfo.
'r.RenderTargetPoolTest' Clears the texture returned by the rendertarget pool with a special color to track down color leaking bugs.
'r.SetRes' Set the display resolution for the current game view. Has no effect in the editor.
'r.ViewportTest' Allows to test different Viewport rectangle configuations (in game only) as they can happen when using Matinee/Editor.

Useful command lines when working on rendering:

Commandline Description
-d3ddebug Enables the D3D11 debug layer, useful for catching API errors.
-sm4 Forces Feature Level SM4 with the D3D11 RHI.
-opengl3 / -opengl4 Forces use of OpenGL RHI at the specified feature level.
-ddc=noshared Prevents the use of network (shared) Derived Data Cache. Can be useful when debugging shader caching issues.

Modules

The renderer code exists in its own module, which is compiled to a dll for non-monolithic builds. This allows faster iteration as we do not have to relink the entire application when rendering code changes. The Renderer module depends on Engine because it has many callbacks into Engine. However, when the Engine needs to call some code in the Renderer, this happens through an interface, usually IRendererModule or FSceneInterface.

Scene representation

In UE4, the scene as the renderer sees it is defined by primitive components and the lists of various other structures stored in FScene. An Octree of primitives is maintained for accelerated spatial queries.

Primary scene classes

There is a Rendering Thread in UE4 which operates in parallel with the game thread. Most classes that bridge the gap between the game thread and rendering thread are split into two parts based on which thread has ownership of that state.

The primary classes are:

Class Description
UWorld A world contains a collection of Actors and Components that can interact with each other. Levels can be streamed in and out of the world, and multiple worlds can be active in the program.
ULevel Collection of Actors and Components that are loaded / unloaded together and saved in a single map file.
USceneComponent Base class of any object that needs to be added to an FScene, like lights, meshes, fog, etc.
UPrimitiveComponent Base class of anything that can be rendered or interact with physics. Also acts as the granularity of visibility culling and rendering property specification (casts shadows, etc). Just like all UObjects, the game thread owns all variables and state and the rendering thread should not access it directly.
ULightComponent Represents a light source. The Renderer is responsible for computing and adding its contribution to the scene.
FScene Renderer version of the UWorld. Objects only exist to the renderer once they are added to the FScene, which is called registering a component. The rendering thread owns all states in the FScene -the game thread cannot modify it directly.
FPrimitiveSceneProxy Renderer version of UPrimitiveComponent, mirrors UPrimitiveComponent state for the rendering thread. Exists in the engine module and intended to be subclassed to support different types of primitives (skeletal, rigid, BSP, etc). Implements some very important functions like GetViewRelevance, DrawDynamicElements, etc.
FPrimitiveSceneInfo Internal renderer state (specific to the FRendererModule implementation) that corresponds to a UPrimitiveComponent and FPrimitiveSceneProxy. Exists in the renderer module, so the engine cannot see it.
FSceneView Engine representation of a single view into an FScene. A scene can be rendered with different views in different calls to FSceneRenderer::Render (multiple editor viewports) or with multiple views in the same call to FSceneRenderer::Render (splitscreen in game). A new View is constructed for each frame.
FViewInfo Internal renderer representation of a view, exists in the renderer module.
FSceneViewState The ViewState stores private renderer information about a view which is needed across frames. In game, there is one view state per ULocalPlayer.
FSceneRenderer A class created each frame to encapsulate inter-frame temporaries.

Here is a list of the primary classes arranged by which module they are in. This becomes important when you are trying to figure out how to solve linker issues.

Engine Module Renderer Module
UWorld FScene
UPrimitiveComponent / FPrimitiveSceneProxy FPrimitiveSceneInfo
FSceneView FViewInfo
ULocalPlayer FSceneViewState
ULightComponent / FLightSceneProxy FLightSceneInfo

And the same classes arranged by which thread has ownership of their state. It is important to always be mindful of what thread owns the state you are writing code for, to avoid causing race conditions.

Game Thread Rendering Thread
UWorld FScene
UPrimitiveComponent FPrimitiveSceneProxy / FPrimitiveSceneInfo
  FSceneView / FViewInfo
ULocalPlayer FSceneViewState
ULightComponent FLightSceneProxy / FLightSceneInfo
Material classes
Class Description
FMaterial An interface to a material used for rendering. Provides access to material properties (e.g. blend mode). Contains a shader map used by the renderer to retrieve individual shaders.
FMaterialResource UMaterial's implementation of the FMaterial interface.
FMaterialRenderProxy A material's representation on the rendering thread. Provides access to an FMaterial interface and the current value of each scalar, vector, and texture parameter.
UMaterialInterface [abstract] Game thread interface for material functionality. Used to retrieve the FMaterialRenderProxy used for rendering and the UMaterial that is used as the source.
UMaterial A material asset. Authored as a node graph. Computes material attributes used for shading, sets blend mode, etc.
UMaterialInstance [abstract] An instance of a UMaterial. Uses the node graph in the UMaterial but provides different parameters (scalars, vectors, textures, static switches). Each instance has a parent UMaterialInterface. Therefore a material instance's parent may be a UMaterial or another UMaterialInstance. This creates a chain that will eventually lead to a UMaterial.
UMaterialInstanceConstant A UMaterialInstance that may only be modified in the editor. May provide scalar, vector, texture, and static switch parameters.
UMaterialInstanceDynamic A UMaterialInstance that may be modified at runtime. May provide scalar, vector, and texture parameters. It cannot provide static switch parameters and it cannot be the parent of another UMaterialInstance.
Primitive components and proxies

Primitive components are the basic unit of visibility and relevance determination. For example, occlusion and frustum culling happen on a per-primitive basis. Therefore it is important when designing a system to think about how big to make components. Each component has a bounds that is used for various operations like culling, shadow casting, and light influence determination.

Components only become visible to the scene (and therefore the renderer) when they are registered. Game thread code that changes a component's properties must call MarkRenderStateDirty() on the component to propagate the change to the rendering thread.

FPrimitiveSceneProxy and FPrimitiveSceneInfo

FPrimitiveSceneProxy is the rendering thread version of UPrimitiveComponent that is intended to be subclassed depending on the component type. It lives in the Engine module and has functions called during rendering passes. FPrimitiveSceneInfo is the primitive component state that is private to the renderer module.

Important FPrimitiveSceneProxy methods
Function Description
GetViewRelevance Called from InitViews at the beginning of the frame, and returns a populated FPrimitiveViewRelevance.
DrawDynamicElements Called to draw the proxy in any passes which the proxy is relevant to. Only called if the proxy indicated it has dynamic relevance.
DrawStaticElements Called to submit StaticMesh elements for the proxy when the primitive is being attached on the game thread. Only called if the proxy indicated it has static relevance.
Scene rendering order

The renderer processes the scene in the order that it wants to composite data to the render targets. For example, the Depth only pass is rendered before the Base pass, so that Heirarchical Z (HiZ) will be populated to reduce shading cost in the base pass. This order is statically defined by the order pass functions are called in C++.

Relevance

FPrimitiveViewRelevance is the information on what effects (and therefore passes) are relevant to the primitive. A primitive may have multiple elements with different relevance, so FPrimitiveViewRelevance is effectively a logical OR of all the element's relevancies. This means that a primitive can have both opaque and translucent relevance, or dynamic and static relevance; they are not mutually exclusive.

FPrimitiveViewRelevance also indicates whether a primitive needs to use the dynamic and/or static rendering path with bStaticRelevance and bDynamicRelevance.

Drawing Policies

Drawing policies contain the logic to render meshes with pass specific shaders. They use the FVertexFactory interface to abstract the mesh type, and the FMaterial interface to abstract the material details. At the lowest level, a drawing policy takes a set of mesh material shaders and a vertex factory, binds the vertex factory's buffers to the Rendering Hardware Interface (RHI), binds the mesh material shaders to the RHI, sets the appropriate shader parameters, and issues the RHI draw call.

Drawing Policy methods
Function Description
Constructor Finds the appropriate shader from the given vertex factory and material shader map, stores these references.
CreateBoundShaderState Creates an RHI bound shader state for the drawing policy.
Matches/Compare Provides methods to sort the drawing policy with others in the static draw lists. Matches must compare on all the factors that DrawShared depends on.
DrawShared Sets RHI state that is constant between drawing policies that return true from Matches. For example, most drawing policies sort on material and vertex factory, so shader parameters depending only on the material can be set, and the vertex buffers specific to the vertex factory can be bound. State should always be set here if possible instead of SetMeshRenderState, since DrawShared is called fewer times in the static rendering path.
SetMeshRenderState Sets RHI state that is specific to this mesh, or anything not set in DrawShared. This is called many more times than DrawShared so performance is especially critical here.
DrawMesh Actually issues the RHI draw call.
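
To make the division of responsibilities in the table concrete, here is a minimal mock of a drawing policy. It is only a sketch: FMockRHI and FSketchDrawingPolicy are hypothetical types, the integer IDs stand in for FVertexFactory and FMaterial references, and the real methods take engine-specific parameters that are not reproduced here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the RHI: it only records the calls it receives.
struct FMockRHI
{
    std::vector<std::string> Calls;
};

// Hypothetical drawing policy keyed on a vertex factory and a material.
class FSketchDrawingPolicy
{
public:
    FSketchDrawingPolicy(int InVertexFactoryId, int InMaterialId)
        : VertexFactoryId(InVertexFactoryId), MaterialId(InMaterialId) {}

    // Must compare every factor that DrawShared depends on.
    bool Matches(const FSketchDrawingPolicy& Other) const
    {
        return VertexFactoryId == Other.VertexFactoryId && MaterialId == Other.MaterialId;
    }

    // State shared by all matching policies: vertex buffers, material parameters.
    void DrawShared(FMockRHI& RHI) const
    {
        RHI.Calls.push_back("DrawShared vf=" + std::to_string(VertexFactoryId) +
                            " mat=" + std::to_string(MaterialId));
    }

    // State specific to one mesh; called far more often than DrawShared.
    void SetMeshRenderState(FMockRHI& RHI, int MeshId) const
    {
        RHI.Calls.push_back("SetMeshRenderState mesh=" + std::to_string(MeshId));
    }

    // Issues the (mock) draw call.
    void DrawMesh(FMockRHI& RHI, int MeshId) const
    {
        RHI.Calls.push_back("DrawMesh mesh=" + std::to_string(MeshId));
    }

private:
    int VertexFactoryId;
    int MaterialId;
};
```

Two policies built from the same vertex factory and material match each other, so a caller is free to invoke DrawShared once and then SetMeshRenderState and DrawMesh per mesh.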

Rendering paths

UE4 has a dynamic path which provides more control but is slower to traverse, and a static rendering path which caches scene traversal as close to the RHI level as possible. The difference is mostly high level, since they both use drawing policies at the lowest level. Each rendering pass (drawing policy) needs to be sure to handle both rendering paths if needed.

Dynamic rendering path

The dynamic rendering path uses TDynamicPrimitiveDrawer and calls DrawDynamicElements on each primitive scene proxy to render. The set of primitives that need to use the dynamic path to be rendered is tracked by FViewInfo::VisibleDynamicPrimitives. Each rendering pass needs to iterate over this array, and call DrawDynamicElements on each primitive's proxy. DrawDynamicElements of the proxy then needs to assemble as many FMeshElements as it needs and submit them with DrawRichMesh or TDynamicPrimitiveDrawer::DrawMesh. This ends up creating a new temporary drawing policy, calling CreateBoundShaderState, DrawShared, SetMeshRenderState, and finally DrawMesh.

The dynamic rendering path provides a lot of flexibility because each proxy has a callback in DrawDynamicElements where it can execute logic specific to that component type. It also has minimal insertion cost but high traversal cost, because there is no state sorting, and nothing is cached.
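
The traversal cost of the dynamic path can be illustrated by counting calls. The sketch below is a simplified model, not engine code: FSketchProxy, FCallCounts and RenderDynamicPath are hypothetical names, and each element simply increments the counters in the order the text describes.

```cpp
#include <vector>

// Counts how often each drawing policy step runs.
struct FCallCounts
{
    int CreateBoundShaderState = 0;
    int DrawShared = 0;
    int SetMeshRenderState = 0;
    int DrawMesh = 0;
};

// Hypothetical proxy: reports how many mesh elements it assembles per frame.
struct FSketchProxy
{
    int NumElements;
};

// Every element goes through a new temporary drawing policy, so all four
// steps run for every element and nothing is cached or shared.
void DrawDynamicElement(FCallCounts& Counts)
{
    ++Counts.CreateBoundShaderState;
    ++Counts.DrawShared;
    ++Counts.SetMeshRenderState;
    ++Counts.DrawMesh;
}

// Models one pass iterating the visible dynamic primitives and asking each
// proxy to submit its elements.
FCallCounts RenderDynamicPath(const std::vector<FSketchProxy>& VisibleDynamicPrimitives)
{
    FCallCounts Counts;
    for (const FSketchProxy& Proxy : VisibleDynamicPrimitives)
    {
        for (int Element = 0; Element < Proxy.NumElements; ++Element)
        {
            DrawDynamicElement(Counts);
        }
    }
    return Counts;
}
```

In this model, two proxies with 2 and 3 elements produce five DrawShared calls for five draws, even if all five elements would share the same state.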

Static rendering path

The static rendering path is implemented through static draw lists. Meshes are inserted into the draw lists when they are attached to the scene. During this insertion, DrawStaticElements on the proxy is called to collect the FStaticMeshElements. A drawing policy instance is then created and stored, along with the result of CreateBoundShaderState. The new drawing policy is sorted based on its Compare and Matches functions and inserted into the appropriate place in the draw list (see TStaticMeshDrawList::AddMesh). In InitViews, a bitarray containing visibility data for the static draw list is initialized and passed into TStaticMeshDrawList::DrawVisible where the draw list is actually drawn. DrawShared is only called once for all the drawing policies that match each other, while SetMeshRenderState and DrawMesh are called for each FStaticMeshElement (see TStaticMeshDrawList::DrawElement).

The static rendering path moves a lot of work to attach time, which significantly speeds up scene traversal at rendering time. Static draw list rendering is about 3x faster on the rendering thread for Static Meshes, which allows a lot more Static Meshes in the scene. Because static draw lists cache data at attach time, they can only cache view independent state. Primitives that are rarely reattached but often rendered are good candidates for the static draw lists.
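
A rough model of a static draw list shows where the savings come from. This is a sketch under simplifying assumptions: FSketchStaticDrawList, FPolicyKey and FDrawStats are hypothetical, buckets are kept in insertion order rather than sorted with Compare, and the visibility bit array is a plain std::vector<bool>.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical key: the factors a drawing policy's Matches would compare.
struct FPolicyKey
{
    int VertexFactoryId;
    int MaterialId;

    bool Matches(const FPolicyKey& Other) const
    {
        return VertexFactoryId == Other.VertexFactoryId && MaterialId == Other.MaterialId;
    }
};

struct FDrawStats
{
    int DrawSharedCalls = 0;
    int DrawMeshCalls = 0;  // one SetMeshRenderState + DrawMesh pair each
};

class FSketchStaticDrawList
{
public:
    // Attach time: put the mesh in the bucket of a matching policy, or start
    // a new bucket. Returns the index used in the visibility bit array.
    std::size_t AddMesh(const FPolicyKey& Key)
    {
        const std::size_t ElementIndex = NumElements++;
        for (FBucket& Bucket : Buckets)
        {
            if (Bucket.Key.Matches(Key))
            {
                Bucket.Elements.push_back(ElementIndex);
                return ElementIndex;
            }
        }
        Buckets.push_back(FBucket{Key, {ElementIndex}});
        return ElementIndex;
    }

    // Render time: shared state is set once per bucket that has at least one
    // visible element; per-mesh work runs for each visible element.
    FDrawStats DrawVisible(const std::vector<bool>& Visibility) const
    {
        FDrawStats Stats;
        for (const FBucket& Bucket : Buckets)
        {
            bool bSharedStateSet = false;
            for (std::size_t ElementIndex : Bucket.Elements)
            {
                if (ElementIndex >= Visibility.size() || !Visibility[ElementIndex])
                {
                    continue;
                }
                if (!bSharedStateSet)
                {
                    ++Stats.DrawSharedCalls;
                    bSharedStateSet = true;
                }
                ++Stats.DrawMeshCalls;
            }
        }
        return Stats;
    }

private:
    struct FBucket
    {
        FPolicyKey Key;
        std::vector<std::size_t> Elements;
    };

    std::vector<FBucket> Buckets;
    std::size_t NumElements = 0;
};
```

With three visible meshes, two of which share a key, this model sets shared state twice and issues three draws. The bucketing work happens in AddMesh at attach time, and the bucket contents depend only on the key, not on any view.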

The static rendering path can expose bugs because of the way it only calls DrawShared once per state bucket. These bugs can be difficult to detect, since they depend on the rendering order and the attach order of meshes in the scene. Special view modes such as lighting only, unlit, etc. will force all primitives to use the dynamic path, so if a bug goes away when forcing the dynamic rendering path, there is a good chance it is due to an incorrect implementation of a drawing policy's DrawShared and/or Matches function.
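
The class of bug described here can be reproduced in a small model. In the sketch below, every name is hypothetical, and bTwoSided is an invented example of a factor that DrawShared sets but a faulty Matches forgets to compare; it is not taken from the engine.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical policy: DrawShared depends on both fields.
struct FSketchPolicy
{
    int MaterialId;
    bool bTwoSided;  // assumed example of state set in DrawShared
};

// State the mock RHI holds at the moment a mesh is drawn.
struct FBoundState
{
    int MaterialId;
    bool bTwoSided;
};

// Buggy: ignores bTwoSided even though DrawShared sets it.
bool MatchesBuggy(const FSketchPolicy& A, const FSketchPolicy& B)
{
    return A.MaterialId == B.MaterialId;
}

// Correct: compares every factor DrawShared depends on.
bool MatchesCorrect(const FSketchPolicy& A, const FSketchPolicy& B)
{
    return A.MaterialId == B.MaterialId && A.bTwoSided == B.bTwoSided;
}

// Draws the policies in order. "DrawShared" runs only when a policy does not
// match the previous one, as in a static draw list bucket.
template <typename MatchFn>
std::vector<FBoundState> DrawInOrder(const std::vector<FSketchPolicy>& Policies, MatchFn Matches)
{
    std::vector<FBoundState> StateAtDraw;
    FBoundState Current{-1, false};
    for (std::size_t Index = 0; Index < Policies.size(); ++Index)
    {
        const bool bSharesState = Index > 0 && Matches(Policies[Index - 1], Policies[Index]);
        if (!bSharesState)
        {
            // DrawShared: bind the state of this policy.
            Current = FBoundState{Policies[Index].MaterialId, Policies[Index].bTwoSided};
        }
        StateAtDraw.push_back(Current);
    }
    return StateAtDraw;
}
```

With the buggy comparison, a two-sided mesh drawn after a one-sided mesh of the same material inherits the one-sided state. Swapping the attach order changes which mesh is wrong, which is why such bugs appear order dependent.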

High level Rendering order

Here is a description of the control flow when rendering a frame starting from FDeferredShadingSceneRenderer::Render:

Operation Description
GSceneRenderTargets.Allocate Reallocates the global scene render targets to be large enough for the current view, if needed.
InitViews Initializes primitive visibility for the views through various culling methods, sets up dynamic shadows that are visible this frame, intersects shadow frustums with the world if necessary (for whole scene shadows or preshadows).
PrePass / Depth only pass RenderPrePass / FDepthDrawingPolicy. Renders occluders, outputting only depth to the depth buffer. This pass can operate in several modes: disabled, occlusion only, or complete depths, depending on what is needed by active features. The usual purpose of this pass is to initialize Hierarchical Z to reduce the shading cost of the Base pass, which has expensive pixel shaders.
Base pass RenderBasePass / TBasePassDrawingPolicy. Renders opaque and masked materials, outputting material attributes to the GBuffer. Lightmap contribution and sky lighting are also computed here and put in scene color.
Issue Occlusion Queries / BeginOcclusionTests Kicks off latent occlusion queries that will be used in the next frame's InitViews. These are done by rendering bounding boxes around the objects being queried, and sometimes grouping bounding boxes together to reduce draw calls.
Lighting Shadowmaps are rendered for each light and light contribution is accumulated to scene color, using a mix of standard deferred and tiled deferred shading. Light is also accumulated in the translucency lighting volumes.
Fog Fog and atmosphere are computed per-pixel for opaque surfaces in a deferred pass.
Translucency Translucency is accumulated into an offscreen render target where it has fogging applied per-vertex so it can integrate into the scene. Lit translucency computes final lighting in a single pass to blend correctly.
Post Processing Various post process effects are applied using the GBuffers. Translucency is composited into the scene.

This is a fairly simplified and high level view. To get more details, look through the relevant code or the log output of a 'profilegpu'.
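
As a reading aid, the table can be mirrored by a mock renderer whose Render function fixes the pass order simply by the order of its calls. FSketchSceneRenderer is hypothetical; InitViews, RenderPrePass, RenderBasePass and BeginOcclusionTests follow names used in the table, while AllocateSceneRenderTargets, RenderLights, RenderFog, RenderTranslucency and RenderPostProcessing are assumed names for this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical renderer that only logs which pass ran, in order.
struct FSketchSceneRenderer
{
    std::vector<std::string> PassLog;

    void AllocateSceneRenderTargets() { PassLog.push_back("AllocateSceneRenderTargets"); }
    void InitViews()                  { PassLog.push_back("InitViews"); }
    void RenderPrePass()              { PassLog.push_back("RenderPrePass"); }
    void RenderBasePass()             { PassLog.push_back("RenderBasePass"); }
    void BeginOcclusionTests()        { PassLog.push_back("BeginOcclusionTests"); }
    void RenderLights()               { PassLog.push_back("RenderLights"); }
    void RenderFog()                  { PassLog.push_back("RenderFog"); }
    void RenderTranslucency()         { PassLog.push_back("RenderTranslucency"); }
    void RenderPostProcessing()       { PassLog.push_back("RenderPostProcessing"); }

    // The pass order is defined statically by the order of these calls.
    void Render()
    {
        AllocateSceneRenderTargets();
        InitViews();
        RenderPrePass();   // depth first, so the base pass shades fewer pixels
        RenderBasePass();
        BeginOcclusionTests();
        RenderLights();
        RenderFog();
        RenderTranslucency();
        RenderPostProcessing();
    }
};
```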

Render Hardware Interface (RHI)

The RHI is a thin layer above the platform specific graphics API. The RHI abstraction level in UE4 is as low level as possible, with the intention that most features can be written in platform independent code and 'just work' on all platforms that support the required feature level.

Feature sets are quantized into ERHIFeatureLevel to keep the complexity low. If a platform cannot support all of the features required for a Feature Level, it must drop down in levels until it can.

Feature Level Description
SM5 Generally corresponds with D3D11 Shader Model 5, except only 16 textures can be used because of OpenGL 4.3 limits. Supports tessellation, compute shaders and cubemap arrays. The deferred shading path is supported.
SM4 Corresponds to D3D11 Shader Model 4, which is generally the same as SM5, except no tessellation, compute shaders or cubemap arrays. The deferred shading path is supported. Eye Adaptation is not supported as it uses compute shaders.
ES2 Corresponds to the features supported by most OpenGL ES2 mobile devices. Uses a pared down forward shading path.
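
The "drop down in levels" rule can be sketched as a simple fallback search. This is an illustration only: ESketchFeatureLevel mirrors the three rows of the table rather than the actual ERHIFeatureLevel enumerators, and FPlatformCaps with its capability flags is a hypothetical simplification of what each level requires.

```cpp
#include <initializer_list>

// Mirrors the table above; the real ERHIFeatureLevel enumerators may differ.
enum class ESketchFeatureLevel { ES2, SM4, SM5 };

// Hypothetical capability flags reported by a platform.
struct FPlatformCaps
{
    bool bTessellation;
    bool bComputeShaders;
    bool bCubemapArrays;
    bool bDeferredShading;
};

// True if the platform supports everything the level requires (per the table).
bool SupportsAllFeatures(ESketchFeatureLevel Level, const FPlatformCaps& Caps)
{
    switch (Level)
    {
    case ESketchFeatureLevel::SM5:
        return Caps.bDeferredShading && Caps.bTessellation &&
               Caps.bComputeShaders && Caps.bCubemapArrays;
    case ESketchFeatureLevel::SM4:
        return Caps.bDeferredShading;
    case ESketchFeatureLevel::ES2:
        return true;  // lowest level in this sketch
    }
    return false;
}

// Tries the highest level first and drops down until one is fully supported.
ESketchFeatureLevel SelectFeatureLevel(const FPlatformCaps& Caps)
{
    for (ESketchFeatureLevel Level : {ESketchFeatureLevel::SM5, ESketchFeatureLevel::SM4})
    {
        if (SupportsAllFeatures(Level, Caps))
        {
            return Level;
        }
    }
    return ESketchFeatureLevel::ES2;
}
```

A platform that supports deferred shading but lacks compute shaders lands on SM4 in this model, which matches the table's note that features built on compute shaders, such as Eye Adaptation, are unavailable there.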

Rendering state grouping

Render states are grouped based on what part of the pipeline they affect. For example, RHISetDepthState sets all state relevant to depth buffering.

Rendering state defaults

Since there are so many rendering states, it is not practical to set them all every time we want to draw something. Instead, UE4 has an implicit set of states which are assumed to be set to the defaults (and therefore must be restored to those defaults after they are changed), and a much smaller set of states which have to be set explicitly. The set of states that do not have implicit defaults are:

  • RHISetRenderTargets

  • RHISetBoundShaderState

  • RHISetDepthState

  • RHISetBlendState

  • RHISetRasterizerState

  • Any dependencies of the shaders set by RHISetBoundShaderState

All other states are assumed to be at their defaults (as defined by the relevant TStaticState; for example, the default stencil state is set by RHISetStencilState(TStaticStencilState<>::GetRHI())).
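
One way to honour the "restore the default after changing it" contract is a scope guard. The sketch below is not how the engine does it (the text only says the defaults must be restored); it swaps in an RAII helper for illustration, and FMockRHIState, FScopedStencilState, DrawWithStencil and the integer state values are all hypothetical.

```cpp
// Hypothetical RHI state: 0 stands for the default stencil state.
struct FMockRHIState
{
    static const int DefaultStencilState = 0;
    int StencilState = DefaultStencilState;
};

// Sets a non-default stencil state and restores the default on scope exit,
// so later draws can keep assuming the implicit default.
class FScopedStencilState
{
public:
    FScopedStencilState(FMockRHIState& InRHI, int NewState) : RHI(InRHI)
    {
        RHI.StencilState = NewState;
    }

    ~FScopedStencilState()
    {
        RHI.StencilState = FMockRHIState::DefaultStencilState;
    }

    FScopedStencilState(const FScopedStencilState&) = delete;
    FScopedStencilState& operator=(const FScopedStencilState&) = delete;

private:
    FMockRHIState& RHI;
};

// Returns the stencil state a draw call issued inside the scope would see.
int DrawWithStencil(FMockRHIState& RHI, int State)
{
    FScopedStencilState Scope(RHI, State);
    return RHI.StencilState;
}
```

After DrawWithStencil returns, the mock state is back at its default, so code that runs afterwards does not need to set the stencil state explicitly.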

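Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include the original source link and this notice.
Article link: https://blog.csdn.net/pizi0475/article/details/47442657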
