Author Topic: [SOLVED] VertexBuffer caching for chunks rendering (Read 300 times)

Ray1184 · « **on:** April 05, 2024, 05:03:36 pm »

Hi everyone,
I have some questions about VertexBuffer usage in my engine implementation.

My engine is basically a standard top-down tile-based game engine, where the map can be very large.
To keep performance good, in the init phase I split the map into N chunks of 16x16 tiles, and I render only the visible chunks (this way I can keep a high FPS even on low-end devices).
To further improve performance, I use VertexBuffers for static chunks, uploading the vertices only once.

I created a VertexBufferProvider class, which is actually a simple VertexBuffer cache, and it works like this:

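void hpms::TilesPoolRenderingWorkflow::Render(hpms::Window* window, hpms::Drawable* item)
{
    // Fetch the cached VertexBuffer for this chunk (the provider creates it on first request).
    sf::VertexBuffer* vertexBuffer = hpms::VertexBufferProvider::GetVertexBuffer(item->GetId(), sf::PrimitiveType::Triangles, 0);

    // Only touch the vertices when the chunk changed or a full refresh is forced.
    if (item->IsUpdateVertices() || item->IsForceAll())
    {
        auto* chunk = dynamic_cast<hpms::TilesPool*>(item);
        // other code...
    }
}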

If the VertexBuffer does not exist in the cache, the VertexBufferProvider creates it and returns it; otherwise it just returns the cached one.
Currently the VertexBuffer cache has no size limit, so if I have 2000 chunks and the player walks across the whole map, 2000 VBs will be created.
My idea was to use a FIFO queue, just to keep the number of VBs reasonable. In that case, what would be a reasonable VB cache size, considering that each chunk contains 1536 vertices (16x16 tiles x 6 vertices each)?
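A minimal sketch of such a bounded FIFO cache might look like the following. It is not the actual VertexBufferProvider: `ChunkBuffer` is a hypothetical stand-in for `sf::VertexBuffer` so the example compiles without SFML, and `FifoBufferCache`, `Get`, `Size` and `Contains` are names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for sf::VertexBuffer, so the sketch needs no SFML.
struct ChunkBuffer
{
    int chunkId = 0;
};

// Bounded FIFO cache: when full, the entry that was inserted first is destroyed.
class FifoBufferCache
{
public:
    explicit FifoBufferCache(std::size_t capacity) : m_capacity(capacity)
    {
        assert(capacity > 0);
    }

    // Returns the buffer for chunkId. If it was not cached, a new one is
    // created (evicting the oldest entry when the cache is full) and
    // `created` is set to true, meaning the vertices must be uploaded again.
    ChunkBuffer& Get(int chunkId, bool& created)
    {
        auto it = m_buffers.find(chunkId);
        if (it != m_buffers.end())
        {
            created = false;
            return *it->second;
        }

        if (m_buffers.size() == m_capacity)
        {
            m_buffers.erase(m_order.front()); // drop the oldest inserted chunk
            m_order.pop_front();
        }

        auto buffer = std::make_unique<ChunkBuffer>();
        buffer->chunkId = chunkId;
        ChunkBuffer& ref = *buffer;
        m_buffers.emplace(chunkId, std::move(buffer));
        m_order.push_back(chunkId);
        created = true;
        return ref;
    }

    std::size_t Size() const { return m_buffers.size(); }
    bool Contains(int chunkId) const { return m_buffers.count(chunkId) != 0; }

private:
    std::size_t m_capacity;
    std::unordered_map<int, std::unique_ptr<ChunkBuffer>> m_buffers;
    std::deque<int> m_order; // insertion order, oldest at the front
};
```

With this design the capacity should be at least the largest number of chunks that can be visible at once, otherwise a chunk that is still on screen could be evicted and re-uploaded every frame. Note that plain FIFO evicts by insertion order, not by last use.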

Thanks a lot

Ray

eXpl0it3r · « **Reply #1 on:** April 06, 2024, 05:32:02 pm »

The reason the internet advises writing games, not engines, is that when writing games you naturally run into the necessary limits, so you can write code that matches the game instead of randomly guessing and most likely totally overestimating your requirements.

As such, I recommend sitting down and thinking about what kind of game you want to make - pick something very small - and then deciding how many tiles you realistically need.

Secondly, writing performance-optimized code without actual profiling results rarely works out.
Are you really struggling with frame time, and is the vertex transfer really the bottleneck? Or could your dynamic_cast actually be a much heavier tax on the CPU, for example?

Never optimize in a vacuum. That, however, means you need a realistic load, which means you need a game to run instead of just an engine; thus, see my first point.

Nobody here can give you a number that isn't just a random guess, since we don't know the game or what its performance profile looks like.

Ray1184 · « **Reply #2 on:** April 06, 2024, 06:16:16 pm »

Hi, thanks for your reply.
I generalized by writing "engine". I'm actually writing my own engine for my own game, not a general-purpose engine.
In my game I will have some standard maps, where performance is not a problem, but I plan to add some procedurally generated open-world locations that could reach up to 16k x 16k tiles (about 1 million chunks of 16x16 tiles).

Maybe my question was wrong. Sure, there isn't a magic number for VB caching; I actually wanted to understand whether VB caching could be a good optimization strategy and, if so, whether there is some sort of "safety threshold" beyond which it would be better not to go.
In this case every VB will hold a chunk of 256 tiles, so 512 triangles.
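To get a feeling for the numbers, the memory cost of cached chunks can be estimated at compile time. The sketch below assumes that "16k" means 16384 tiles per side and that one vertex takes 20 bytes (two floats for position, four bytes of color, two floats for texture coordinates); both values are assumptions, not something stated in this thread, and the function names are made up for illustration.

```cpp
#include <cstdint>

// Number of chunks along one side of the map, rounding up for partial chunks.
constexpr std::uint64_t ChunksPerSide(std::uint64_t tilesPerSide, std::uint64_t chunkSide)
{
    return (tilesPerSide + chunkSide - 1) / chunkSide;
}

// Vertices needed for one square chunk drawn as two triangles per tile.
constexpr std::uint64_t VerticesPerChunk(std::uint64_t chunkSide, std::uint64_t verticesPerTile)
{
    return chunkSide * chunkSide * verticesPerTile;
}

// Total vertex memory for a given number of cached chunks.
constexpr std::uint64_t BytesForChunks(std::uint64_t chunkCount,
                                       std::uint64_t verticesPerChunk,
                                       std::uint64_t bytesPerVertex)
{
    return chunkCount * verticesPerChunk * bytesPerVertex;
}

constexpr std::uint64_t kBytesPerVertex   = 20;                       // assumption
constexpr std::uint64_t kChunksPerSide    = ChunksPerSide(16384, 16); // 1024
constexpr std::uint64_t kChunkCount       = kChunksPerSide * kChunksPerSide;
constexpr std::uint64_t kVerticesPerChunk = VerticesPerChunk(16, 6);  // 1536
constexpr std::uint64_t kBytesPerChunk    = BytesForChunks(1, kVerticesPerChunk, kBytesPerVertex);
constexpr std::uint64_t kBytesFor2000     = BytesForChunks(2000, kVerticesPerChunk, kBytesPerVertex);
constexpr std::uint64_t kBytesAllChunks   = BytesForChunks(kChunkCount, kVerticesPerChunk, kBytesPerVertex);
```

Under these assumptions one chunk costs 30,720 bytes, 2000 cached chunks cost about 61 MB, and caching every chunk of a 16384 x 16384 map would need about 30 GiB, so some upper bound on the cache is needed for the big maps.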

You're right about the dynamic_cast inside the rendering loop. In fact, I'm already thinking about an alternative solution (so far it doesn't seem to be the bottleneck, but I'd prefer to avoid it).

kojack · « **Reply #3 on:** April 06, 2024, 07:00:37 pm »

I made a project a while ago (using SFML) that needed to display all of Australia (in enough detail to watch a car driving).
I used a pool of 9 render texture tiles, each as big as the screen. As the view centre crossed into a new tile, I reused 3 of the tiles to generate the next 3 tiles. If you don't cross a boundary, it's just rendering at most 4 sprites to do the ground.
For smaller tiles without a larger landblock, you could progressively render tiles into the render texture as you approach the edge, so it's not rendering all of them in one hit (frame time spikes).
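One possible way to organise such a 9-tile pool is to map each screen-sized tile to a slot with a modulo, so that crossing a boundary automatically recycles the 3 slots that fell out of the 3x3 window. This is only a sketch of that idea, not necessarily how the project above did it; `TilePool`, `Recentre`, `SlotFor` and `TileContaining` are hypothetical names, and the actual render texture redraw is left as a comment.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct TileCoord
{
    int x;
    int y;
};

// Mathematical modulo: result is in [0, n) for negative values too.
constexpr int Mod(int a, int n) { return ((a % n) + n) % n; }

// Each tile of any 3x3 window maps to a distinct slot in [0, 9).
constexpr int SlotFor(TileCoord t) { return Mod(t.x, 3) + 3 * Mod(t.y, 3); }

// Screen-sized tile that contains a world position.
inline TileCoord TileContaining(float worldX, float worldY, float tileW, float tileH)
{
    assert(tileW > 0.f && tileH > 0.f);
    return {static_cast<int>(std::floor(worldX / tileW)),
            static_cast<int>(std::floor(worldY / tileH))};
}

class TilePool
{
public:
    // Moves the 3x3 window so it is centred on `centre`.
    // Returns how many slots had to be regenerated.
    int Recentre(TileCoord centre)
    {
        int regenerated = 0;
        for (int dy = -1; dy <= 1; ++dy)
        {
            for (int dx = -1; dx <= 1; ++dx)
            {
                const TileCoord wanted{centre.x + dx, centre.y + dy};
                Slot& slot = m_slots[SlotFor(wanted)];
                if (!slot.valid || slot.tile.x != wanted.x || slot.tile.y != wanted.y)
                {
                    // A real implementation would redraw the slot's render texture here.
                    slot.tile = wanted;
                    slot.valid = true;
                    ++regenerated;
                }
            }
        }
        return regenerated;
    }

private:
    struct Slot
    {
        TileCoord tile{0, 0};
        bool valid = false;
    };
    std::array<Slot, 9> m_slots{};
};
```

Moving the centre by one tile horizontally or vertically regenerates exactly 3 slots, and staying inside the same tile regenerates none. To avoid the frame time spike, the redraw of those 3 slots could be spread over several frames as the view approaches the edge.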

Although these days I'd use a tile shader. Technically a 16k x 16k world of tiles can be done with 1 triangle and a 16k x 16k tile index texture (to then look up a tile texture atlas). Moving around in the world is just sending a position to the shader.
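The per-pixel lookup that such a tile shader would perform can be sketched on the CPU in C++. This is not shader code and not taken from any real project: `TileWorld`, `AtlasCoord` and `SampleAtlasCoord` are hypothetical names, the index "texture" is a plain vector, and the atlas is assumed to be a grid of square tiles of the same pixel size as the world tiles.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pixel position inside the tile atlas.
struct AtlasCoord
{
    float x;
    float y;
};

struct TileWorld
{
    int widthInTiles;
    int heightInTiles;
    std::vector<std::uint16_t> tileIndices; // row-major, one atlas index per tile
    int atlasColumns;                       // tiles per row in the atlas
    float tileSize;                         // tile edge length in pixels
};

// For a world-space pixel, finds the tile it falls in, reads that tile's
// atlas index, and returns the matching pixel position inside the atlas.
inline AtlasCoord SampleAtlasCoord(const TileWorld& world, float worldX, float worldY)
{
    const int tileX = static_cast<int>(std::floor(worldX / world.tileSize));
    const int tileY = static_cast<int>(std::floor(worldY / world.tileSize));
    assert(tileX >= 0 && tileX < world.widthInTiles);
    assert(tileY >= 0 && tileY < world.heightInTiles);

    const std::size_t cell = static_cast<std::size_t>(tileY) * world.widthInTiles + tileX;
    const int index = world.tileIndices[cell];

    // Offset of the pixel inside its tile.
    const float inTileX = worldX - tileX * world.tileSize;
    const float inTileY = worldY - tileY * world.tileSize;

    // Top-left corner of the tile image inside the atlas.
    const int atlasX = index % world.atlasColumns;
    const int atlasY = index / world.atlasColumns;

    return {atlasX * world.tileSize + inTileX, atlasY * world.tileSize + inTileY};
}
```

In a shader version the same arithmetic would run per fragment, with the camera position added to the fragment position to get the world coordinates, which is why moving around only requires sending a position.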

Ray1184 · « **Reply #4 on:** April 07, 2024, 01:40:14 pm »

Quote from: kojack on April 06, 2024, 07:00:37 pm

I made a project a while ago (using SFML) that needed to display all of Australia (in enough detail to watch a car driving).
I used a pool of 9 render texture tiles, each as big as the screen. As the view centre crossed into a new tile, I reused 3 of the tiles to generate the next 3 tiles. If you don't cross a boundary, it's just rendering at most 4 sprites to do the ground.
For smaller tiles without a larger landblock, you could progressively render tiles into the render texture as you approach the edge, so it's not rendering all of them in one hit (frame time spikes).

So, as I understand it, you render the screen tile and the 8 nearest tiles; if you move left, for example, you discard the 3 tiles on the right and prepare 3 new tiles on the left.
But in your case, rendering issues aside, I assume that you don't load the entire map at once, but load it in steps (a sort of "streaming")?

I also found out that my performance problem was not a rendering issue but the structure of the chunks. Currently I check each chunk to see whether it fits on the screen, but with that many chunks this performs very poorly. I will opt for a better structure, such as a quadtree.
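For chunks laid out on a regular grid, an alternative to a quadtree is to compute the range of visible chunk indices directly from the view rectangle, so the cost depends only on how many chunks are on screen. The sketch below shows that alternative technique, not the quadtree and not the fix that was eventually used; `ViewRect`, `ChunkRange` and `VisibleChunks` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// View rectangle in world pixels.
struct ViewRect
{
    float left;
    float top;
    float width;
    float height;
};

// Inclusive range of chunk indices on both axes.
struct ChunkRange
{
    int minX;
    int minY;
    int maxX;
    int maxY;

    bool Empty() const { return minX > maxX || minY > maxY; }
    int Count() const { return Empty() ? 0 : (maxX - minX + 1) * (maxY - minY + 1); }
};

// Returns the chunks overlapped by `view` on a grid of square chunks whose
// top-left corner is at the world origin. The result is clamped to the map;
// a view completely outside the map yields an empty range.
inline ChunkRange VisibleChunks(const ViewRect& view, float chunkSizePx,
                                int chunksPerRow, int chunksPerColumn)
{
    assert(chunkSizePx > 0.f && chunksPerRow > 0 && chunksPerColumn > 0);

    int minX = static_cast<int>(std::floor(view.left / chunkSizePx));
    int minY = static_cast<int>(std::floor(view.top / chunkSizePx));
    // ceil(...) - 1 so a view edge lying exactly on a chunk border
    // does not pull in the next chunk.
    int maxX = static_cast<int>(std::ceil((view.left + view.width) / chunkSizePx)) - 1;
    int maxY = static_cast<int>(std::ceil((view.top + view.height) / chunkSizePx)) - 1;

    minX = std::max(minX, 0);
    minY = std::max(minY, 0);
    maxX = std::min(maxX, chunksPerRow - 1);
    maxY = std::min(maxY, chunksPerColumn - 1);

    return {minX, minY, maxX, maxY};
}
```

The renderer would then loop only over `minX..maxX` and `minY..maxY` instead of testing every chunk. A quadtree remains the better fit when chunks are sparse or vary in size.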

Ray1184 · « **Reply #5 on:** April 07, 2024, 08:27:34 pm »

OK, I solved it; the problem was caused by the wrong chunk search. With the latest implementation I don't see any performance problems, even with 16k maps. Thanks to everyone.

kojack · « **Reply #6 on:** April 08, 2024, 12:40:35 am »

Cool that you solved it.

Quote from: Ray1184 on April 07, 2024, 01:40:14 pm

So, as I understand it, you render the screen tile and the 8 nearest tiles; if you move left, for example, you discard the 3 tiles on the right and prepare 3 new tiles on the left.

Yep.

Quote from: Ray1184 on April 07, 2024, 01:40:14 pm

But in your case, rendering issues aside, I assume that you don't load the entire map at once, but load it in steps (a sort of "streaming")?

Mine was a bit different, since my world data for each world chunk had several forms. The main data was a collection of 15 million road connections between GPS points. I loaded that for all of Australia at once (since I needed to do A* pathfinding on it). It was less than 200MB of data for the whole country.
If I stored that area as small tiles I'd probably stream it in.

Ray1184 · « **Reply #7 on:** April 08, 2024, 10:12:07 am »

Quote from: kojack on April 08, 2024, 12:40:35 am

Mine was a bit different, since my world data for each world chunk had several forms. The main data was a collection of 15 million road connections between GPS points. I loaded that for all of Australia at once (since I needed to do A* pathfinding on it). It was less than 200MB of data for the whole country.
If I stored that area as small tiles I'd probably stream it in.

Ah OK, I understand. In my case tiles are 16px (with 4x/5x upscaling to recreate an old-school pixelated effect), so big maps could reach several GB. In that case I have to think about some sort of chunk streaming, or something else such as map splicing.
