Render lab built on top of Vulkan, aiming to create a real-time, planet-scale large scene. Some widely adopted techniques are also implemented, such as deferred rendering, physically based rendering, bloom, screen space ambient occlusion, screen space reflection, depth of field, skeleton animation, etc.
Finally, I’ve managed to get runtime atmosphere working (thanks to the great work and detailed doc from https://github.com/ebruneton/precomputed_atmospheric_scattering), along with a scheme for generating IBL images at runtime.
Youtube Video:
Screen Shots:
Also some sun rise/set shots:
I placed some extremely large, simple geometries in the scene to illustrate the sense of scale the atmosphere can give you:
What you can do in the scene:
This project works ONLY ON WINDOWS for now. It could potentially be ported to other platforms by touching only a few parts of the code.
Install Vulkan SDK
Visit https://vulkan.lunarg.com/sdk/home, then download and install the Vulkan SDK. You’ll have “VK_SDK_PATH” set automatically.
Install CMake
Visit https://cmake.org/download/, download and install CMake.
Generate Project
Open a command prompt, go to the root of your local clone (e.g. “C:\VulkanLearn” for me), and type cmake . -G "Visual Studio [version] Win64" (e.g. [version] = 15 2017 for me). Open the generated project and build.
I created this project to get familiar with Vulkan through various common rendering technologies. It is also a minor engine that handles scene management and data, coordinates with the underlying Vulkan layer, and gets things drawn on screen. I’ve already added a lot of functionality that helps create a scene in a few lines of code. However, there’s still a vast gap between this project and a common game engine, both in terms of utilities that ease the work and a UI editor that lets you do things dynamically rather than write code and rebuild.
The scene graph of this application is similar to Unity. I mean, I implemented it exactly the way Unity works. The whole scene is composed of many objects, and each of them may contain one or more components that handle various kinds of work.
I use the ring of three swapchain images as the base count for organizing per-frame rendering work. Each frame has an index used to acquire the corresponding resources as well as synchronization primitives.
Each frame index (0, 1 and 2) holds its own resources, including a command pool, frame buffers and synchronization primitives. I don’t use a per-frame descriptor pool, however: I allocate 3 times the space for every uniform and bind it with an offset chosen by frame index, so I never have to change descriptor sets or descriptor pools.
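The offset arithmetic behind this "3 times larger" scheme can be sketched as below. This is only an illustration under assumptions: the names `kFrameCount`, `AlignUp`, `TotalUniformSize` and `FrameOffset` are hypothetical and not taken from the project, and the alignment value is assumed to come from the device limit `minUniformBufferOffsetAlignment`.

```cpp
#include <cstdint>

// Hypothetical constant: three frames, matching the three swapchain images.
constexpr uint32_t kFrameCount = 3;

// Round size up to the next multiple of alignment.
// Assumes alignment is a power of two, as Vulkan guarantees for
// minUniformBufferOffsetAlignment.
constexpr uint64_t AlignUp(uint64_t size, uint64_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Total bytes to allocate for one uniform block so that every frame
// owns its own aligned slice.
constexpr uint64_t TotalUniformSize(uint64_t blockSize, uint64_t alignment) {
    return AlignUp(blockSize, alignment) * kFrameCount;
}

// Byte offset of the slice that belongs to frameIndex (0, 1 or 2).
constexpr uint32_t FrameOffset(uint32_t frameIndex, uint64_t blockSize,
                               uint64_t alignment) {
    return static_cast<uint32_t>(AlignUp(blockSize, alignment) *
                                 (frameIndex % kFrameCount));
}
```

One way to consume such an offset in Vulkan is to pass it through the `pDynamicOffsets` parameter of `vkCmdBindDescriptorSets` for a descriptor of type `VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC`; whether the project does exactly that is an assumption here.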
Memory management is fairly sophisticated in my project. There are 2 levels of memory management. The memory level manages the relationship between memory and its holders (buffers, images). The buffer & image level, as the name suggests, manages the memory of both buffers and images, to ease their use when organizing rendering and to avoid per-frame operations for better performance.
Memory level management isn’t fully general, since you can’t simply allocate one chunk of memory for everything. Buffers and images must be bound to separate memory (at least the validation layer told me so), and different images cannot share the same memory (also according to the validation layer). Therefore, I separated memory usage into buffer and image.
There are 32 chunks of memory in the “Buffer Memory Pool”, exactly matching Vulkan’s “VK_MAX_MEMORY_TYPES” for a physical device, and each of them consists of a size, a data pointer, a handle to the Vulkan memory object, and a KEY. The key is very important because it indexes the actual binding information. It is generated and kept by each buffer and used to look up information in the binding table, and the “type index” is then used to acquire the Vulkan memory object from the “Buffer Memory Pool”.
Image memory management is a lot simpler, since each image must bind a different memory object (I’m not sure why). So the image memory pool doesn’t maintain a binding list for multiple images, and a binding info table isn’t necessary either. The only thing shared with buffer memory management is the lookup table, which is used for key -> memory node indexing. Remember that the key is also kept inside every image.
I created a class “SharedBufferManager” to manage a big buffer from which various types of buffers are allocated. When command buffers are generated, this big buffer is bound along with an offset and range. I do this to follow the best practice from NVIDIA’s document, without knowing why ;). I do know that for uniform buffers, binding them with “vkoffsets” is a lot cheaper than switching descriptor sets, not to mention updating them. This way I can avoid both switching and updating descriptor sets, which seems like a perfect path to go.
Every buffer I use is created from this “SharedBufferManager”. It contains a key used to index its own sub-region of the “SharedBufferManager” buffer, with information like “numBytes” and “offset”. The “SharedBufferManager” buffer itself is a normal buffer that holds another type of key, which is used to look up its information and the Vulkan object “VkDeviceMemory” in the memory manager. The final graph is something like this (the red “Shared Buffer” stands for the kind of buffer that the application actually uses):
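To make the key -> sub-region idea concrete, here is a minimal sketch of a sub-allocator over one big buffer. It is not the project's “SharedBufferManager”: the class name `SharedBufferSketch`, the `kInvalidKey` sentinel and the simple bump allocation without freeing are all assumptions made for illustration. Only the “numBytes” and “offset” fields mirror the description above.

```cpp
#include <cstdint>
#include <unordered_map>

// Sub-region of the big buffer, as described: an offset and a byte count.
struct SubRegion {
    uint64_t offset;
    uint64_t numBytes;
};

// Hypothetical sentinel returned when the big buffer is full.
constexpr uint32_t kInvalidKey = 0xFFFFFFFFu;

// Hypothetical stand-in for a shared buffer manager: hands out aligned
// sub-regions of one large buffer and remembers them by key.
class SharedBufferSketch {
public:
    SharedBufferSketch(uint64_t capacity, uint64_t alignment)
        : m_capacity(capacity), m_alignment(alignment) {}

    // Reserve numBytes and return the key the caller keeps.
    // Naive bump allocation: regions are never freed or reused.
    uint32_t Allocate(uint64_t numBytes) {
        uint64_t offset =
            (m_used + m_alignment - 1) / m_alignment * m_alignment;
        if (offset + numBytes > m_capacity) return kInvalidKey;
        m_used = offset + numBytes;
        uint32_t key = m_nextKey++;
        m_regions.emplace(key, SubRegion{offset, numBytes});
        return key;
    }

    // Look up the sub-region for a key; throws std::out_of_range if unknown.
    const SubRegion& Region(uint32_t key) const { return m_regions.at(key); }

private:
    uint64_t m_capacity;
    uint64_t m_alignment;
    uint64_t m_used = 0;
    uint32_t m_nextKey = 0;
    std::unordered_map<uint32_t, SubRegion> m_regions;
};
```

With a sketch like this, each application-level buffer only stores its key, and the offset and range needed at bind time come from `Region(key)`.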
There’re multiple shared buffer managers, and each one of them is used with a specific purpose:
Each material contains 4 descriptor sets, corresponding to global, per-frame, per-object and material.
A material class alone is not enough. I need multiple instances of the same material holding different values, and a way to hook these instances up to actual mesh vertices.
First I’d like to show the whole data structure of the database-like skeleton animation system.
During initialization of a scene, if a bone structure is detected in a mesh, its information is stored into Animation Diction (animation information, per-object key frame information), Per-Bone Indirect Data (used to index into per-bone data) and Per-Bone Data (holds the default T-pose bone transforms for all bones). When a model with skeleton animation is about to be rendered, we have to acquire 2 transforms:
Combining both transforms with the object’s model-view-projection matrix gives the final clip space position for a frame:
However, we can’t simply multiply them all together, because we need dual quaternion interpolation to blend the final bone transform across multiple bones. Therefore, the resulting bone matrix is converted into a dual quaternion and written to Per-Frame Bone Data, and the vertex shader fetches it through its bone indices and Per-Frame Bone Indirect Data. The shader does the interpolation and transform work, and the rest is exactly the same as for a normal model.
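The blending step can be illustrated on the CPU with the sketch below. It is a generic dual quaternion linear blend, not the project's shader code: all type and function names (`Quat`, `DualQuat`, `FromRotationTranslation`, `Blend`, `Transform`) are hypothetical, bones are assumed to be rigid transforms (unit rotation plus translation), and `Blend` assumes at least one bone with weights that do not sum to zero.

```cpp
#include <cmath>
#include <cstddef>

struct Quat { float w, x, y, z; };
struct Vec3 { float x, y, z; };
struct DualQuat { Quat real, dual; };  // real = rotation, dual encodes translation

// Hamilton product of two quaternions.
Quat Mul(const Quat& a, const Quat& b) {
    return {a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
            a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
            a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
            a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w};
}
Quat Conj(const Quat& q) { return {q.w, -q.x, -q.y, -q.z}; }
Quat Scale(const Quat& q, float s) { return {q.w * s, q.x * s, q.y * s, q.z * s}; }
Quat Add(const Quat& a, const Quat& b) { return {a.w + b.w, a.x + b.x, a.y + b.y, a.z + b.z}; }
float Dot(const Quat& a, const Quat& b) { return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z; }

// Build a dual quaternion from a unit rotation r and a translation t:
// dual = 0.5 * (0, t) * r.
DualQuat FromRotationTranslation(const Quat& r, const Vec3& t) {
    Quat tq{0.0f, t.x, t.y, t.z};
    return {r, Scale(Mul(tq, r), 0.5f)};
}

// Weighted linear blend followed by normalization. Bones whose rotation
// lies in the opposite hemisphere from the first bone are negated so the
// blend takes the shorter rotation path.
DualQuat Blend(const DualQuat* bones, const float* weights, std::size_t count) {
    DualQuat out{{0.0f, 0.0f, 0.0f, 0.0f}, {0.0f, 0.0f, 0.0f, 0.0f}};
    for (std::size_t i = 0; i < count; ++i) {
        float sign = (i > 0 && Dot(bones[i].real, bones[0].real) < 0.0f) ? -1.0f : 1.0f;
        float w = weights[i] * sign;
        out.real = Add(out.real, Scale(bones[i].real, w));
        out.dual = Add(out.dual, Scale(bones[i].dual, w));
    }
    float invLen = 1.0f / std::sqrt(Dot(out.real, out.real));
    out.real = Scale(out.real, invLen);
    out.dual = Scale(out.dual, invLen);
    return out;
}

// Apply a unit dual quaternion to a point: rotate, then translate by
// the vector part of 2 * dual * conj(real).
Vec3 Transform(const DualQuat& dq, const Vec3& p) {
    Quat rotated = Mul(Mul(dq.real, Quat{0.0f, p.x, p.y, p.z}), Conj(dq.real));
    Quat trans = Scale(Mul(dq.dual, Conj(dq.real)), 2.0f);
    return {rotated.x + trans.x, rotated.y + trans.y, rotated.z + trans.z};
}
```

For example, blending two pure translations (2, 0, 0) and (0, 4, 0) with equal weights moves the origin to (1, 2, 0), the same result a plain linear blend would give. The difference from matrix blending only shows up once rotations are involved.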
If you’re interested in dual quaternions, I happened to write down some of the concepts here.
Here’s the layout of my GBuffer. You can see some channels are marked as reserved. That’s just because I’m too lazy to remove them. Besides, I might add something that requires one or more channels in the future :)
You may notice that there’s no position in the GBuffer. Positions are reconstructed from the depth buffer, which saves tons of bandwidth per frame. To get an accurate world space position far away from the camera, I use a reversed depth buffer, i.e. depth 1 at the near plane and 0 at an infinite far plane. The reason is that floating point values are much denser near 0. Without reversed depth, the nonlinear depth output of a projection matrix, which already squeezes most of the distance range into values close to 1, combined with the non-uniform precision of floats would ruin the whole reconstruction. Reversing the depth changes the situation completely, since the two distributions largely cancel each other out, resulting in a more balanced distribution (still not uniform, though).
This pass does exactly the same job as the GBuffer Pass, except that it deals only with meshes that have bones and weighted skin vertices.
We can render motion vectors for the various objects shown on screen. However, pixels that are never touched by any of them produce no motion vectors. This doesn’t seem like a problem for the skybox, since it doesn’t move with the character. But it fails when the character rotates its view direction, because motion blur can’t be applied there. Therefore, an extra pass fills the gap by recording motion caused only by view rotation.
These 2 passes mainly convert per-pixel motion into tiles, where each tile holds the largest motion recorded within its rect. The result is used in both the Temporal Resolve Pass and the Post Process Pass.
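The per-tile reduction can be sketched on the CPU as follows. This is only an illustration of "keep the largest motion in each tile". The function name `TileMax`, the row-major layout and the tile size parameter are assumptions, and the real passes run on the GPU.

```cpp
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// For each tileSize x tileSize block of a row-major motion vector image,
// keep the vector with the largest magnitude. Partial tiles at the right
// and bottom edges are handled by rounding the tile count up.
std::vector<Vec2> TileMax(const std::vector<Vec2>& motion, std::size_t width,
                          std::size_t height, std::size_t tileSize) {
    std::size_t tilesX = (width + tileSize - 1) / tileSize;
    std::size_t tilesY = (height + tileSize - 1) / tileSize;
    std::vector<Vec2> tiles(tilesX * tilesY, Vec2{0.0f, 0.0f});
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const Vec2& v = motion[y * width + x];
            Vec2& best = tiles[(y / tileSize) * tilesX + (x / tileSize)];
            // Compare squared lengths to avoid a square root per pixel.
            if (v.x * v.x + v.y * v.y > best.x * best.x + best.y * best.y) {
                best = v;
            }
        }
    }
    return tiles;
}
```

A consumer such as a motion blur pass can then read one value per tile to decide how far it needs to sample, instead of scanning every pixel in the neighborhood.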