
Massive space battles where two opposing teams try to bring down each other's carrier ships. Built in a custom C++ engine to achieve battles on a scale not found in other games.

DevBlog 18 - Using profiler tools to make the game fast; Instancing && Batch Rendering

Using profiling tools (RenderDoc and the Visual Studio profiler) to optimize the game and explore ways to make it run fast!


Using profiling to inform coding decisions: a real-life example.
To start off, I added the ability for ships to respawn, which gives the player some forgiveness when they die.
That works by adding a spawn component to the entity that needs to respawn fighter ships.
When respawning, you need to know how long you must wait, and showing that requires text.

sa vid17 respawntextrendering


So, I created a text rendering system.
I didn't really want to deal with using fonts or any of that.
I had an idea to create a digital clock font, and I wanted to test that out.
This works by having quads that are either on or off.
We can define letters based on turning on certain quads.
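
To make the on/off idea concrete, here is a minimal sketch. It assumes a classic seven-segment layout and covers digits only; the actual font in the game may use more quads and a different arrangement. The names `SEG_A` through `SEG_G`, `glyphMask`, and `segmentOn` are made up for this illustration.

```cpp
#include <cstdint>

// Hypothetical segment layout (classic seven-segment display):
//    aaa
//   f   b
//    ggg
//   e   c
//    ddd
constexpr std::uint32_t SEG_A = 1u << 0; // top
constexpr std::uint32_t SEG_B = 1u << 1; // upper right
constexpr std::uint32_t SEG_C = 1u << 2; // lower right
constexpr std::uint32_t SEG_D = 1u << 3; // bottom
constexpr std::uint32_t SEG_E = 1u << 4; // lower left
constexpr std::uint32_t SEG_F = 1u << 5; // upper left
constexpr std::uint32_t SEG_G = 1u << 6; // middle

// Returns the set of quads that are "on" for a digit, or 0 (blank) otherwise.
constexpr std::uint32_t glyphMask(char c) {
    switch (c) {
        case '0': return SEG_A | SEG_B | SEG_C | SEG_D | SEG_E | SEG_F;
        case '1': return SEG_B | SEG_C;
        case '2': return SEG_A | SEG_B | SEG_G | SEG_E | SEG_D;
        case '3': return SEG_A | SEG_B | SEG_G | SEG_C | SEG_D;
        case '4': return SEG_F | SEG_G | SEG_B | SEG_C;
        case '5': return SEG_A | SEG_F | SEG_G | SEG_C | SEG_D;
        case '6': return SEG_A | SEG_F | SEG_G | SEG_E | SEG_C | SEG_D;
        case '7': return SEG_A | SEG_B | SEG_C;
        case '8': return SEG_A | SEG_B | SEG_C | SEG_D | SEG_E | SEG_F | SEG_G;
        case '9': return SEG_A | SEG_B | SEG_C | SEG_D | SEG_F | SEG_G;
        default:  return 0u;
    }
}

// True if the given quad should be drawn for this glyph.
constexpr bool segmentOn(std::uint32_t mask, std::uint32_t segment) {
    return (mask & segment) != 0u;
}
```

With this encoding, a whole glyph fits in one integer, which is what makes the later tricks cheap.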

sa vid17 inworld text rendering

But I wanted to use some rendering tricks to make this faster.
So for each letter, I used a bitvector to compact what should and should not be rendered.
In the vertex shader, I collapse quads based on this bitvector.
This lets me use a single draw call per letter glyph.

text render vertex shader
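
For readers who cannot see the shader image, the collapsing idea can be sketched on the CPU in C++. This is not the shader from the game, just an illustration of the per-vertex decision. It assumes each quad is drawn as two non-indexed triangles (6 vertices) and that "collapsing" means moving every vertex of a disabled quad to the same point, so its triangles have zero area and produce no pixels. `Vec2`, `kVertsPerQuad`, and `collapseIfOff` are hypothetical names.

```cpp
#include <cstdint>

struct Vec2 { float x, y; };

// Assumption: each quad is two triangles with no index buffer.
constexpr int kVertsPerQuad = 6;

// Mimics what a vertex shader could do per vertex: keep the vertex in place
// if its quad's bit is set, otherwise move it to a shared point so the whole
// quad becomes degenerate (zero area).
inline Vec2 collapseIfOff(Vec2 position, int vertexIndex, std::uint32_t glyphBits) {
    const int quadIndex = vertexIndex / kVertsPerQuad;          // which quad owns this vertex
    const bool on = ((glyphBits >> quadIndex) & 1u) != 0u;      // is that quad enabled?
    return on ? position : Vec2{0.0f, 0.0f};
}
```

The appeal of this approach is that the vertex buffer for a glyph never changes; only the small bitvector differs from letter to letter.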

But I wanted to take this a bit further.
What if we could have a single draw call for an entire text block?
I used OpenGL instancing to achieve that.
I pack all of the bitvectors into a single array on the CPU.
Then I do a large instance render in a single draw call, yielding all the text rendered.
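
The CPU-side packing step might look roughly like the sketch below. It is an assumption-heavy illustration: the `GlyphInstance` layout, the digits-only `kDigitBits` table (standard seven-segment codes), and the `buildInstances` helper are all invented here, and the real engine likely stores different per-instance data.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-instance data: one entry per character in the text block.
struct GlyphInstance {
    std::uint32_t bits; // which quads are on for this glyph
    float offsetX;      // where this glyph sits within the block
};

// Stand-in glyph table: seven-segment codes for the digits 0-9 only.
constexpr std::uint32_t kDigitBits[10] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F
};

// Packs a whole string into one contiguous array on the CPU.
std::vector<GlyphInstance> buildInstances(const std::string& text, float glyphAdvance) {
    std::vector<GlyphInstance> instances;
    instances.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        // Anything outside the table renders as a blank glyph in this sketch.
        const std::uint32_t bits = (c >= '0' && c <= '9') ? kDigitBits[c - '0'] : 0u;
        instances.push_back({bits, static_cast<float>(i) * glyphAdvance});
    }
    // A renderer would now upload `instances` to a GPU buffer and issue one
    // instanced draw (for example glDrawArraysInstanced) with
    // instances.size() as the instance count.
    return instances;
}
```

The draw call count then stays at one per text block no matter how long the string is.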

This is great... but can we render all text in a single draw call?
Here is where I used batching. Each text block prepares all its data and then requests a render.
But instead of drawing right away, the system accumulates this data and batches it together.
Then, when everything has been submitted, we invoke one large draw call to render all the text at once.
(Technically, there is a cutoff: it will render everything batched so far if some threshold is met.)
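
The accumulate-then-flush pattern, including the threshold, can be sketched like this. The `TextBatcher` class and its method names are hypothetical, the threshold is counted in glyphs purely for illustration, and the actual draw is replaced by a counter so the sketch stays independent of OpenGL.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class TextBatcher {
public:
    explicit TextBatcher(std::size_t maxGlyphs) : maxGlyphs_(maxGlyphs) {}

    // Called by a text block instead of drawing immediately.
    void submit(const std::vector<std::uint32_t>& glyphBits) {
        // If adding this block would exceed the threshold, draw what we have first.
        if (!pending_.empty() && pending_.size() + glyphBits.size() > maxGlyphs_) {
            flush();
        }
        pending_.insert(pending_.end(), glyphBits.begin(), glyphBits.end());
    }

    // Called at the end of the frame, or early when the threshold is hit.
    void flush() {
        if (pending_.empty()) return;
        // A real renderer would upload pending_ and issue one instanced draw here.
        ++drawCalls_;
        pending_.clear();
    }

    std::size_t drawCalls() const { return drawCalls_; }
    std::size_t pendingGlyphs() const { return pending_.size(); }

private:
    std::size_t maxGlyphs_;
    std::size_t drawCalls_ = 0;
    std::vector<std::uint32_t> pending_;
};
```

In the common case every text block in a frame ends up in a single flush; the threshold only matters when the amount of text is large enough to outgrow the batch buffer.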

So, in terms of speed, single glyphs should be slower than instancing, and instancing should be slower than batching.
But this is not what I observed.
I used Visual Studio's C++ profiler to find the bottleneck.
It turned out to be a silly bug I should have caught: a loop from a previous approach was meant to be removed but was still there, multiplying the amount of work done for each character in the string.
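
To show why a leftover loop like that hurts so much, here is a hypothetical reconstruction of the general shape of the problem. It is not the actual code from the engine. It assumes the stray loop walked the whole string once per character, and it counts "glyph writes" instead of doing real rendering work.

```cpp
#include <cstddef>
#include <string>

// Buggy shape: an old inner loop redoes the whole string for every character,
// so the work grows with the square of the string length.
std::size_t workWithLeftoverLoop(const std::string& text) {
    std::size_t glyphWrites = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        for (std::size_t j = 0; j < text.size(); ++j) { // leftover from an older approach
            ++glyphWrites;
        }
    }
    return glyphWrites;
}

// Intended shape: each character is processed exactly once.
std::size_t workFixed(const std::string& text) {
    std::size_t glyphWrites = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        ++glyphWrites;
    }
    return glyphWrites;
}
```

Under that assumption, a 5-character string costs 25 writes instead of 5, and the gap widens quickly for longer text. Extra CPU work of this kind can hide the gains from instancing and batching, and a CPU profiler will point at the hot loop.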

sa vid17 visualstudio profiler o

I used RenderDoc to check that the draw calls looked correct.
I created a stress test level to push the system to its limits and confirm that things were working.
I saw massive performance improvements using the batching system.

sa vid17 debug with renderdoc
