Author Topic: Maximum capability of VertexArray rendering  (Read 3874 times)

sidewinder

  • Newbie
  • *
  • Posts: 16
Maximum capability of VertexArray rendering
« on: May 07, 2019, 01:03:18 pm »
Hello!

First off, I have tried to find the answer to these questions in the forums. I can find a lot of people talking around the subject, but no one who actually answers the questions beyond touching on them. If such answers exist, please give me the link(s) :-)

I'm playing around with different rendering techniques in SFML, and from what I understand, drawing VertexArrays directly is the most efficient way short of writing the OpenGL code yourself?

When I tried VertexArrays it was a lot quicker than sprites/shapes, but I feel that I can't really grasp what the maximum capability is. So I decided to test this.

Here is my minimal test-code for testing with 1 million particles where each particle is a square or quad:

#include "SFML/Graphics.hpp"
#include <iostream>
#include <string>
using namespace std;

int main(int argc, char *argv[])
{
  //Create objects
  sf::RenderWindow window;
  window.create(sf::VideoMode(1024, 768), "Test");

  //One million rectangle particles are 4 million vertices
  sf::VertexArray particles(sf::Quads, 4000000);

  sf::Clock clock;
  int microsecondsInSecond = 1000000;

  sf::Text text;
  sf::Font font;
  //Change this to an actual font location
  if (!font.loadFromFile("content/VeraMono.ttf"))
    return 1; //without a font the fps text cannot be drawn
  text.setFont(font);
  text.setCharacterSize(24);
  text.setFillColor(sf::Color::Red);

  while (window.isOpen())
  {
    // handle events
    sf::Event event;
    while (window.pollEvent(event))
    {
      if (event.type == sf::Event::Closed)
        window.close();
    }

    //simulate updating the position and color of the particles
    for (int i = 0; i < 1000000; i++)
    {
      particles[i * 4].position = sf::Vector2f(100, 100);
      particles[i * 4 + 1].position = sf::Vector2f(200, 100);
      particles[i * 4 + 2].position = sf::Vector2f(200, 200);
      particles[i * 4 + 3].position = sf::Vector2f(100, 200);
      for (int j = i * 4; j < i * 4 + 4; j++)
      {
        //index with j (the vertex), not i (the particle)
        particles[j].color.r = static_cast<sf::Uint8>(255);
        particles[j].color.g = static_cast<sf::Uint8>(255);
        particles[j].color.b = static_cast<sf::Uint8>(255);
        particles[j].color.a = static_cast<sf::Uint8>(255);
      }
    }

    //clear window with black
    window.clear(sf::Color::Black);

    //Draw
    window.draw(particles);
    window.draw(text);
    window.display();

    //Print current approximate fps
    std::string fps("fps: " + std::to_string((microsecondsInSecond / clock.restart().asMicroseconds())));
    text.setString(fps);
  }
}
 

On a pretty decent desktop computer this gives me roughly 7-8 fps. That seems way too low :-/
With 100 000 particles I get around 70.

So four questions:

  • Did I make some mistake in my code that makes it slow?
  • Is there some other way in SFML to make it even faster?
  • Am I naive to think that a million particles each frame should be doable? (I recall people mentioning "drawing millions of vertices" in other threads, but I could have misunderstood)
  • Is one million particles unreasonable in any way? Should I try to rewrite my particle effects so that they use fewer particles?

As a naive example, a circle of light could be drawn as 360 triangles, which is 360 * 3 = 1080 vertices. If 100 000 particles is the limit for staying above 60 fps, does that mean I could only keep roughly 100 lights on screen at the same time? That seems low...
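
As a rough check on the vertex count above: 360 separate triangles need 3 vertices each, but the same circle drawn as a triangle fan shares the centre and the edge vertices between neighbours. A small sketch of the arithmetic (the function names are made up for illustration; `sf::Triangles` and `sf::TriangleFan` are the SFML 2.x primitive types I have in mind):

```cpp
#include <cstddef>

// Vertices needed when every triangle is stored separately (e.g. sf::Triangles).
constexpr std::size_t triangleListVertices(std::size_t segments)
{
    return segments * 3;
}

// Vertices needed for the same circle as a fan (e.g. sf::TriangleFan):
// one centre vertex, one per segment, plus one repeated to close the circle.
constexpr std::size_t triangleFanVertices(std::size_t segments)
{
    return segments + 2;
}

static_assert(triangleListVertices(360) == 1080, "matches the 360 * 3 figure");
static_assert(triangleFanVertices(360) == 362, "about a third of the list size");
```

The trade-off, as far as I can tell, is that each fan has to be its own draw call, whereas separate triangles for many lights can share one vertex array.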


Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32504
Re: Maximum capability of VertexArray rendering
« Reply #1 on: May 07, 2019, 01:32:50 pm »
Quote
Did I make some mistake in my code that makes it slow?
Nope. What you should keep in mind, in case it was not clear, is that you're updating every vertex every frame. So you upload 4000000 * sizeof(Vertex) = 80 MB of data 8 times per second to your GPU memory, which comes to a bandwidth of 640 MB/s. I don't know how high your system can go, but that seems like quite a lot. So I wouldn't call it "slow", and I don't think you can do much better as long as everything changes every frame.
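
The arithmetic behind those figures can be sketched as below. The `Vertex` struct is a stand-in with the same members as `sf::Vertex` in SFML 2.x (position, colour, texture coordinates), used here so the snippet compiles without SFML; the helper name is invented, and MB means 10^6 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in mirroring sf::Vertex: 2 floats + 4 colour bytes + 2 floats.
struct Vertex
{
    float x, y;              // position
    std::uint8_t r, g, b, a; // colour
    float u, v;              // texture coordinates
};

static_assert(sizeof(Vertex) == 20, "20 bytes per vertex on common platforms");

// Bytes sent per second when the whole array is re-uploaded every frame.
constexpr double uploadBytesPerSecond(std::size_t vertexCount, double framesPerSecond)
{
    return static_cast<double>(vertexCount) * sizeof(Vertex) * framesPerSecond;
}

// 4 000 000 vertices * 20 bytes = 80 MB per frame; at 8 fps that is 640 MB/s.
static_assert(uploadBytesPerSecond(4000000, 1.0) == 80000000.0, "80 MB per frame");
static_assert(uploadBytesPerSecond(4000000, 8.0) == 640000000.0, "640 MB/s at 8 fps");
```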

Quote
Is there some other way in SFML to make it even faster?
You can try sf::VertexBuffer, which is the same as sf::VertexArray but which lives on the GPU. However since you have to reupload the data every frame anyway, I doubt it would make a significant difference.

Quote
Am I naive to think that a million particles each frame should be doable? (I recall people mentioning "drawing millions of vertices" in other threads, but I could have misunderstood)
Depends on the system where your app is running... And with static data uploaded once on the GPU (sf::VertexBuffer), yes it would be much more doable.

Quote
Is one million particles unreasonable in any way? Should I try to rewrite my particle effects so that they use fewer particles?
One million is roughly the number of pixels on your screen, so yes, one million rectangles (not even points) seems a little bit too much. But it depends what particle effect you're trying to achieve of course.

Quote
100 lights on screen at the same time? That seems low...
That sounds really high to me. What kind of environment has 100 different lights in the same area?

You should first decide what effect(s) you want to implement, and then try to find the most efficient way to do it -- the latter depends on the former ;) Not the other way round.
Laurent Gomila - SFML developer

sidewinder

Re: Maximum capability of VertexArray rendering
« Reply #2 on: May 07, 2019, 05:07:44 pm »
Thanks for the answer!

Quote
You can try sf::VertexBuffer, which is the same as sf::VertexArray but which lives on the GPU. However since you have to reupload the data every frame anyway, I doubt it would make a significant difference.

Ah, I see. Looking at the documentation, I guess this would be the way to do it, for example when updating only the first and fourth particles:


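sf::Vertex particles[40000];
sf::VertexBuffer buffer(sf::Quads);
buffer.create(40000);

    //Update the first particle (1080 vertices starting at vertex 0)
    buffer.update(particles, 1080, 0);
    //Update the fourth particle (index 3): advance both the source pointer
    //and the destination offset by 3 * 1080 vertices
    buffer.update(particles + 1080 * 3, 1080, 1080 * 3);

window.draw(buffer);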
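
A note on the offsets: as far as I understand the SFML 2.5 documentation, `sf::VertexBuffer::update(vertices, vertexCount, offset)` reads `vertexCount` vertices starting at the pointer you pass and writes them starting `offset` vertices into the buffer. So to refresh particle number k you have to advance both the source pointer and the destination offset by k times the vertices per particle. The sketch below uses a hypothetical `FakeBuffer` in plain C++ (not an SFML class) and a made-up `updateParticle` helper, only to show the index arithmetic:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x; float y; };

// Hypothetical stand-in for a GPU buffer; it copies data the way
// update(vertices, vertexCount, offset) is described above.
struct FakeBuffer
{
    std::vector<Vertex> storage;

    explicit FakeBuffer(std::size_t vertexCount) : storage(vertexCount) {}

    bool update(const Vertex* vertices, std::size_t vertexCount, std::size_t offset)
    {
        if (offset + vertexCount > storage.size())
            return false; // would write past the end of the buffer
        std::copy(vertices, vertices + vertexCount,
                  storage.begin() + static_cast<std::ptrdiff_t>(offset));
        return true;
    }
};

// Re-upload only particle number `index` (0-based) from the CPU-side array.
bool updateParticle(FakeBuffer& buffer, const std::vector<Vertex>& particles,
                    std::size_t verticesPerParticle, std::size_t index)
{
    const std::size_t first = index * verticesPerParticle;
    assert(first + verticesPerParticle <= particles.size());
    // Same displacement on both sides: source pointer and destination offset.
    return buffer.update(particles.data() + first, verticesPerParticle, first);
}
```

With 1080 vertices per particle, the fourth particle is index 3, so both the pointer and the offset move by 3240 vertices.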

 

One weird thing though: when I set the size in sf::Vertex particles[size]; to a large enough value, the whole program refuses to start. Do you have any idea why, or is that something compiler dependent?

Quote
That sounds really high to me. What kind of environment has 100 different lights in the same area?

Maybe :) I was thinking of some Diablo 3-esque scene where lots of spells fill up the whole screen ;)


Quote
You should first decide what effect(s) you want to implement, and then try to find the most efficient way to do it -- the latter depends on the former ;) Not the other way round

You are right of course. Here is the stuff I was trying to implement.

First I did this fire particle effect:


And then I wanted to enhance it a bit with some lighting effects, and I realized that the fps dropped quite a bit compared to the fire alone. Not unreasonably, but I wanted to know more about what the actual limitations would be, and what was possible.



Buffer seems like the way to go. And an adjustment of my expectations ;)

There is always the possibility of pre-rendering some of the stuff as well, but where is the fun in that?



 
« Last Edit: May 07, 2019, 09:52:03 pm by sidewinder »

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 10801
Re: Maximum capability of VertexArray rendering
« Reply #3 on: May 07, 2019, 08:13:31 pm »
Not mentioned here so far, but often times, especially for effects like fire, you don't actually want/need to update all the data from and by the CPU, but instead you can just write a shader. That way you move the fire "calculation" from the CPU to the GPU and you only have to transfer a few vertices or ideally just one position.
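
To make that a bit more concrete: this kind of approach usually relies on each particle's position being a pure function of the current time and a few per-particle constants, so nothing has to be stored or re-uploaded between frames. Below is a rough sketch of such a function in plain C++. In practice the same arithmetic would live in a vertex shader, with the time passed in as a uniform. All names and the motion model (constant drift, fixed lifetime) are invented for illustration and are not taken from any real fire effect.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x; float y; };

// Position of one particle at a given time, computed from scratch each frame.
// The particle restarts at the emitter every `lifetime` seconds; `phase`
// (0..1) staggers particles so they do not all restart at once.
inline Vec2 particlePosition(Vec2 emitter, Vec2 velocity, float lifetime,
                             float phase, float timeSeconds)
{
    assert(lifetime > 0.f);
    // Age within the current life cycle, in [0, lifetime).
    const float age = std::fmod(timeSeconds + phase * lifetime, lifetime);
    return Vec2{emitter.x + velocity.x * age, emitter.y + velocity.y * age};
}
```

Because the result depends only on the inputs, the CPU would only need to send the time (and perhaps the emitter position) each frame, while the per-particle constants could stay in a static buffer.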

Additionally, don't forget that games are usually just one deception after the other. So just because you see other games having implemented X, doesn't actually mean that they did, it's more likely that they just made it look like they did, while applying various tricks to make it look so. ;)
Official FAQ: https://www.sfml-dev.org/faq.php
Official Discord Server: https://discord.gg/nr4X7Fh
——————————————————————
Dev Blog: https://duerrenberger.dev/blog/

sidewinder

Re: Maximum capability of VertexArray rendering
« Reply #4 on: May 07, 2019, 09:50:04 pm »
Quote
Not mentioned here so far, but often times, especially for effects like fire, you don't actually want/need to update all the data from and by the CPU, but instead you can just write a shader. That way you move the fire "calculation" from the CPU to the GPU and you only have to transfer a few vertices or ideally just one position.

Additionally, don't forget that games are usually just one deception after the other. So just because you see other games having implemented X, doesn't actually mean that they did, it's more likely that they just made it look like they did, while applying various tricks to make it look so. ;)

Good insights!

About shaders, I actually used shaders to blur and fade the light, but the way I use them, it's something I apply to the draw call, like:

window.draw(lightSprite, &shader);

But when you say "ideally just one position", how would that look? Can you do a draw with just the shader? Or with an empty VertexArray, or what? Also, can you keep state in the shader, like the previous positions of particles and such?

Laurent

Re: Maximum capability of VertexArray rendering
« Reply #5 on: May 08, 2019, 08:39:56 am »
Quote
One weird thing though: when I set the size in sf::Vertex particles[size]; to a large enough value, the whole program refuses to start. Do you have any idea why, or is that something compiler dependent?
Your array lives on the stack, which has a very limited size. Use a dynamically allocated array on the heap (std::vector) instead.
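
For scale: at 20 bytes per vertex, 4 million vertices take about 80 MB, while the default stack is typically only a few megabytes (1 MB on Windows and 8 MB on Linux are commonly quoted defaults, but this depends on compiler, linker and OS settings). A sketch of the heap-allocated version follows, using a stand-in `Vertex` struct with the same members as `sf::Vertex` so that it compiles without SFML; the helper names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Stand-in mirroring sf::Vertex (position, colour, texture coordinates).
struct Vertex
{
    float x, y;
    std::uint8_t r, g, b, a;
    float u, v;
};

// Memory needed for a given number of vertices.
constexpr std::size_t bytesNeeded(std::size_t vertexCount)
{
    return vertexCount * sizeof(Vertex);
}

// std::vector keeps its elements on the heap; only the small vector object
// itself lives on the stack, so large counts do not overflow the stack.
inline std::vector<Vertex> makeParticles(std::size_t particleCount)
{
    const std::size_t verticesPerQuad = 4;
    assert(particleCount <= std::numeric_limits<std::size_t>::max() / verticesPerQuad);
    return std::vector<Vertex>(particleCount * verticesPerQuad);
}
```

With SFML, the vector's contents can then be passed wherever a `const sf::Vertex*` is expected, via `particles.data()`.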

sidewinder

Re: Maximum capability of VertexArray rendering
« Reply #6 on: May 08, 2019, 01:41:42 pm »
Thanks!

So now I've got it working, and I have a pretty good grasp of which parts take the longest.

  • Just drawing the pre-filled buffer: about 1500 fps
  • Updating the whole buffer: no real change, as you predicted, 7-8 fps
  • Updating 1/10 of the buffer: 250 fps
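
Converting those rates to time per frame makes the cost of the update easier to see, since frame times add up while fps values do not. A quick sketch (the helper name is made up; the fps figures are the ones measured above, taking 8 for the 7-8 range):

```cpp
// Milliseconds spent per frame at a given frame rate.
constexpr double frameTimeMs(double fps)
{
    return 1000.0 / fps;
}

constexpr double drawOnlyMs    = frameTimeMs(1500.0); // about 0.67 ms
constexpr double fullUpdateMs  = frameTimeMs(8.0);    // 125 ms
constexpr double tenthUpdateMs = frameTimeMs(250.0);  // 4 ms

// Time attributable to updating, once the draw-only cost is subtracted.
constexpr double fullUpdateCostMs  = fullUpdateMs - drawOnlyMs;
constexpr double tenthUpdateCostMs = tenthUpdateMs - drawOnlyMs;
```

So drawing alone costs well under a millisecond, and almost all of the 125 ms in the full-update case goes to updating the data.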

Feels like I can make quite a lot of improvements with this!