Hi, I've been working with SFML every now and then for a couple of years, so I'd say I'm an experienced user.
I'm working on a strategy game and am currently improving its graphics and graphics performance. Most of my map is covered by grass, which is animated by vertex shaders (see the attached screenshot to get an idea of the look and perspective; it looks a bit crappy in the screenshot somehow).
The grass is implemented simply as untextured triangles in vertex arrays. To counteract the resulting horrible aliasing, I draw everything into a 4K render texture and downsample from it.
Of course there are NPCs and other things moving in my game, so I can't draw all grass triangles in one vertex array; things need to be drawn in a specific order. Therefore I divided my grass into vertex arrays each covering 1024x32 pixels, collect everything (grass vertex arrays, NPCs, etc.) in a vector, sort it and draw it.
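To make the collect-sort-draw step concrete, here is a minimal sketch using only the standard library. The `DrawItem` struct, its `sortY` key and the `sortForDrawing` function are hypothetical names for illustration; the real code holds SFML drawables, and its actual sort criterion is only assumed here to be a vertical position.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for anything that gets drawn (a grass strip, an NPC, ...).
struct DrawItem {
    float sortY;        // assumed sort key: y position of the item's base line
    std::string label;  // placeholder for the real drawable object
};

// Order items back to front: smaller sortY (further up the screen) is drawn first.
// stable_sort keeps the insertion order of items that share the same key.
void sortForDrawing(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.sortY < b.sortY;
                     });
}
```

After sorting, the vector is walked front to back and each item is drawn in turn, which is why every grass strip ends up as its own draw call.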
Unsurprisingly, when zooming out I sooner or later run into performance problems.
To be specific, here are some measurements:
- 2000 draw calls (mostly from grass vertex arrays)
- 300k triangles (each representing one grass blade)
-> resulting in 50-60 FPS
I want to achieve about 100 FPS under these conditions without losing too much graphics quality.
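Expressed as frame times, the goal looks like this. The helper names below are made up for illustration; the arithmetic is just 1000 ms divided by the frame rate.

```cpp
// Time available per frame, in milliseconds, at a given frame rate.
constexpr double frameTimeMs(double fps) {
    return 1000.0 / fps;
}

// Milliseconds that have to be saved per frame to get from currentFps to targetFps.
constexpr double msToSave(double currentFps, double targetFps) {
    return frameTimeMs(currentFps) - frameTimeMs(targetFps);
}
```

At 50 FPS a frame takes 20 ms and at 60 FPS about 16.7 ms, while 100 FPS leaves only 10 ms, so roughly 7 to 10 ms per frame would have to go.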
To get a deeper insight I did some more measurements:
- I increased the area covered by each grass vertex array to 1024x64, thereby reducing the number of draw calls. This improves performance only slightly: 1300 draw calls, 330k triangles -> 65 FPS -> I'm likely not draw-call limited
- I decreased the area covered by a single triangle while keeping the number of triangles constant. When zoomed out, both cases give the same FPS: 1000 draw calls, 275k triangles, 75 FPS each --> likely not fill-rate limited
- I looked at CPU and GPU utilization: the CPU is at 12% (about 1 1/2 cores); the GPU jumps between 99% and 0% and is in the range 0-50% most of the time, without the scene changing, which is really strange. Additionally, neither the GPU nor the CPU boosts to the max.
This led me to the conclusion that I may be bandwidth limited. I'm not sure about SFML's internal implementation of vertex arrays: are they sent from the CPU to the GPU every frame?
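If the vertex data really is uploaded every frame, the amount of data can be estimated roughly. The sketch below assumes 20 bytes per vertex (two floats for position, four bytes of colour, two floats for texture coordinates, which matches the layout of `sf::Vertex` in SFML 2.x as far as I know) and three vertices per grass blade; the function names are hypothetical.

```cpp
#include <cstddef>

// Assumption: 8 bytes position + 4 bytes colour + 8 bytes texture coordinates.
constexpr std::size_t kBytesPerVertex = 20;
constexpr std::size_t kVerticesPerTriangle = 3;

// Bytes that would be sent per frame if every triangle were re-uploaded.
constexpr double uploadBytesPerFrame(std::size_t triangles) {
    return static_cast<double>(triangles * kVerticesPerTriangle * kBytesPerVertex);
}

// The same figure per second, in megabytes (1 MB = 1e6 bytes).
constexpr double uploadMBPerSecond(std::size_t triangles, double fps) {
    return uploadBytesPerFrame(triangles) * fps / 1e6;
}
```

For 300k triangles this comes to 18 MB per frame, or about 1.08 GB/s at 60 FPS. That does not obviously saturate a PCIe 3.0 x16 link (theoretical peak around 16 GB/s), but the estimate ignores any per-upload driver overhead across roughly 2000 separate arrays, so it is only a rough indication.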
Therefore I have a couple of questions about possibilities to improve performance:
- if vertex arrays are sent to the GPU every frame, is there a possibility for me to change SFML to keep vertex arrays in GPU memory (without needing to break SFML's whole design)?
- would you try to use geometry shaders to ideally cut bandwidth by 2/3 by generating the triangles on the GPU?
- are there other ways for me to improve SFML's performance internally (without needing to break the design or recode everything)?
- can you suggest profiling programs etc. to find the bottleneck?
- do you have other suggestions (maybe a completely different design?) to achieve the same with better performance? Does OpenGL offer some stuff I could use without too much effort implementing it? (I have to say I'm not really experienced with pure OpenGL)
I think there is a huge bottleneck somewhere, because I think my PC should be able to manage at least 5k draw calls and a million triangles with the same performance:
i7 5820k
16 GB RAM
R9 290
Thanks in advance and kind regards
Chaia*