
Author Topic: Optimizing Rendering  (Read 8419 times)


jmcmorris

  • Jr. Member
  • **
  • Posts: 64
Optimizing Rendering
« on: July 16, 2015, 01:27:35 am »
Hello! I'm looking for ideas on how to optimize my rendering. I've been profiling my code and found that a great deal of time is spent in GlContext::setActive(). I am using several RenderTextures that I render to, call display() on, and then pass to a shader. Because of this, the program is constantly switching render targets, which, from what I understand, is what triggers the calls to setActive().

Some possibilities might be:
  • an amazing optimization that makes switching the context not so expensive (I can dream, right?)
  • Rework my shaders so they don't need this. Although I'm not sure that is possible...
  • I'm being stupid
  • Something else?

One thing I have done so far is to try to minimize the switching by grouping all uses of each RenderTexture together.

I can post code here; however, it is quite spread out. I'm mostly looking for ideas at the moment and/or checking whether I'm just doing something completely wrong. Thanks!

(Oh, and I'm on either SFML 2.1 or 2.2. I can't remember but it is fairly recent. I think from Sept. 2014.)

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 11016
Re: Optimizing Rendering
« Reply #1 on: July 16, 2015, 01:49:10 am »
Do you have a performance issue?

Did you profile in release mode?

jmcmorris

  • Jr. Member
  • **
  • Posts: 64
Re: Optimizing Rendering
« Reply #2 on: July 16, 2015, 05:23:15 am »
Yes to both of those. My rendering is taking up a great deal of time and so I was investigating if it is possible to slim that down.

binary1248

  • SFML Team
  • Hero Member
  • *****
  • Posts: 1405
  • I am awesome.
Re: Optimizing Rendering
« Reply #3 on: July 16, 2015, 10:33:25 am »
How much time exactly does setActive() take up? And how did you determine this? I can imagine that the number of calls is relatively high, but setActive() really doesn't do much. It does way less than the actual drawing functions, that's for sure. As such, I don't really see how it can take up so much of the time.

Also, when switching contexts frequently, the time you measure (if you did measure the time) might not all come from SFML's code. When switching contexts, the operating system tries to flush the command queue of the previous context, and this might block in high-load situations. If you issue a lot of draw calls to your RenderTextures and then switch contexts, those draw commands need to be flushed out of the queue before the operating system lets the code continue, which leads to a lot of time being spent in the makeCurrent() call of the context implementation.
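To put a number on how much time a single call takes, one option is to accumulate wall-clock time per call site instead of relying only on a sampling profiler. The following is a minimal sketch using only the C++ standard library; `TimingStats`, `ScopedTimer` and `averageMicroseconds` are hypothetical helper names, not part of SFML.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulated wall-clock time and call count for one label.
// (Hypothetical helper type, not part of SFML.)
struct TimingStats {
    std::chrono::steady_clock::duration total =
        std::chrono::steady_clock::duration::zero();
    long calls = 0;
};

// Measures the lifetime of the object and adds it to stats[label]
// when it goes out of scope.
class ScopedTimer {
public:
    ScopedTimer(std::map<std::string, TimingStats>& stats, std::string label)
        : stats_(stats), label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        TimingStats& entry = stats_[label_];
        entry.total += std::chrono::steady_clock::now() - start_;
        ++entry.calls;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::map<std::string, TimingStats>& stats_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Average time per call in microseconds; 0 if nothing was recorded.
double averageMicroseconds(const TimingStats& s) {
    if (s.calls == 0) return 0.0;
    return std::chrono::duration<double, std::micro>(s.total).count()
           / static_cast<double>(s.calls);
}
```

In a render loop this could wrap the call in question, for example `{ ScopedTimer t(stats, "setActive"); rt.setActive(true); }`. Because of the flushing described above, the result would still include any time spent waiting on queued draw commands. Calling `glFinish()` just before the timed scope is one way to drain the queue first, so that the remaining time is closer to the cost of the switch itself.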

You really shouldn't focus on trying to cut down on the number of calls to setActive(). It is called as much as necessary and no more. I'm pretty sure the majority of the time spent in your code is still distributed over other things, especially the other parts of the drawing process.

The easiest way to make things better is the same as with any other application:
  • Minimize state changes, batch similar/identical things to draw together
  • Minimize draw calls
  • Minimize target changes
  • Minimize the number of shader changes, if there are similar shaders, figure out how you can combine them and differentiate via uniforms
  • Go for broke and just learn a bit of OpenGL; this opens up a whole new world of possibilities (including shader subroutines, which SFML doesn't natively support)
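The first few points in the list above can be illustrated without any graphics API. The sketch below assumes a hypothetical `DrawCommand` record whose integer IDs stand in for a render target, a shader and a texture. It counts how many state changes a submission order causes, and sorts the commands so that identical state ends up adjacent. None of these names come from SFML.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical draw command: the IDs stand in for a render target,
// a shader and a texture. Not an SFML type.
struct DrawCommand {
    int target;
    int shader;
    int texture;
};

// Counts how often target, shader or texture differ from the previous
// command, i.e. how many state changes this submission order needs.
std::size_t countStateChanges(const std::vector<DrawCommand>& cmds) {
    std::size_t changes = 0;
    for (std::size_t i = 1; i < cmds.size(); ++i) {
        if (cmds[i].target  != cmds[i - 1].target)  ++changes;
        if (cmds[i].shader  != cmds[i - 1].shader)  ++changes;
        if (cmds[i].texture != cmds[i - 1].texture) ++changes;
    }
    return changes;
}

// Groups commands by target first (the most expensive switch in this
// thread), then by shader, then by texture. stable_sort keeps the
// submission order among commands that share all three.
void sortByState(std::vector<DrawCommand>& cmds) {
    std::stable_sort(cmds.begin(), cmds.end(),
        [](const DrawCommand& a, const DrawCommand& b) {
            return std::tie(a.target, a.shader, a.texture)
                 < std::tie(b.target, b.shader, b.texture);
        });
}
```

Reordering like this is only safe when the result does not depend on draw order, for example when the draws do not overlap or blend with each other. Where order matters, the sort key would need a layer or depth component in front.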

eigenbom

  • Full Member
  • ***
  • Posts: 228
Re: Optimizing Rendering
« Reply #4 on: September 03, 2015, 06:57:28 am »
Instead of starting my own thread, I'm just going to chime in here and say that I'm experiencing the same issue. I have lots of RenderTextures, and the profiler reveals that wglMakeCurrent is taking a substantial amount of time. As far as I can tell, SFML uses one GL context per render texture and then has to switch between them. I was curious whether the time spent in setActive() was really the command queue being flushed, as binary1248 suggested, so I made two minimal examples.

(A) uses 20 manually created FBOs (main2 in the code below) and (B) uses 20 sf::RenderTextures (main1 in the code below). (A) runs at around 340 fps and (B) runs at around 220 fps on my machine. MSVC's profiler output for (B) shows that wglMakeCurrent takes up a big chunk of the total run time.
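Frame rates are easier to compare as frame times. Using the approximate figures quoted above, 340 fps is about 2.9 ms per frame and 220 fps is about 4.5 ms, so (B) costs roughly 1.6 ms more per frame. Spreading that evenly over the 20 render textures is an assumption, since the extra cost need not be uniform, but it gives roughly 0.08 ms per render texture per frame. A small sketch of the arithmetic, with hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

// Converts frames per second to milliseconds per frame.
double frameTimeMs(double fps) {
    return 1000.0 / fps;
}

// Extra per-frame cost of the slower variant (fpsB) over the faster one
// (fpsA), divided evenly across `targets` render targets. The even split
// is an assumption made only for illustration.
double extraMsPerTarget(double fpsA, double fpsB, int targets) {
    return (frameTimeMs(fpsB) - frameTimeMs(fpsA)) / targets;
}
```

For the numbers in this post, that would be `extraMsPerTarget(340.0, 220.0, 20)`.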

I talked about it a little on Twitter, and I think that having one context per render texture is not a great approach when the render texture count is high. Also, a discussion here covers a few strategies for optimising the use of many FBOs. One strategy is to have a single FBO and rebind different textures to it as you render. It's a very specific use case, but unfortunately it seems a little tricky with SFML's one-context-per-render-texture approach.



#include <GL/glew.h>
#include <SFML/Graphics.hpp>
// #include <SFML/OpenGL.hpp>
// #include "mm/common/profiler.h"
#include <array>
#include <cstdlib>   // std::exit, EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>
#include <utility>   // std::pair
#include <Windows.h>

// as per: http://developer.download.nvidia.com/devzone/devcenter/gamegraphics/files/OptimusRenderingPolicies.pdf
extern "C" {
        __declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
}

int main1();
int main2();
void drawThing();

int main(){
        // return main1();   // (B) 20 sf::RenderTextures
        return main2();      // (A) 20 manually created FBOs
}


static const int W = 800;
static const int N = 20;

int main1()
{
        sf::RenderWindow window(sf::VideoMode(W, W), "test");
        window.setFramerateLimit(1000);

       
        std::array<sf::RenderTexture, N> rts;
        for (auto& rt: rts)     rt.create(W, W);
       
        sf::Clock clock;
        int frameCount = 0;
        while (window.isOpen())
        {
                // BROFILER_FRAME("App");

                sf::Event event;
                while (window.pollEvent(event))
                {
                        if (event.type == sf::Event::Closed) window.close();
                        if (event.type == sf::Event::KeyPressed){
                                if (event.key.code == sf::Keyboard::Escape){
                                        window.close();
                                }
                        }
                }

                // draw to texture
                for (auto& rt : rts){
                        rt.clear(sf::Color(255,0,0));
                        rt.setActive(true);
                        //rt.pushGLStates();
                        rt.setView(sf::View(sf::FloatRect(0,W,W,W)));

                        glViewport(0, 0, W, W);

                        glMatrixMode(GL_PROJECTION);
                        glPushMatrix();
                        glLoadIdentity();
                        glOrtho(0.0, W, 0.0, W, -1.0, 1.0);
                        glMatrixMode(GL_MODELVIEW);
                        glPushMatrix();
                        glLoadIdentity();
               
                        drawThing();

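                        // pop each matrix from the stack it was pushed on
                        glPopMatrix();                  // restores MODELVIEW
                        glMatrixMode(GL_PROJECTION);
                        glPopMatrix();                  // restores PROJECTION
                        glMatrixMode(GL_MODELVIEW);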

                        /*
                        sf::RectangleShape shape(sf::Vector2f(W / 2, W / 2));
                        shape.setOrigin(W / 4, W / 4);
                        shape.setPosition(W / 2, W / 2);
                        rt.draw(shape);
                        */



                        rt.display();

                        //rt.popGLStates();
                }
                window.setActive(true);
                window.pushGLStates();
                // draw textures to window
                window.clear();
                for (auto& rt : rts){
                        auto spr = sf::Sprite(rt.getTexture());
                        window.draw(spr);
                }
                window.display();
                window.popGLStates();

                frameCount++;
                if (frameCount >= 100){ // exactly 100 frames have elapsed, matching the division below
                        std::cout << 1./(clock.getElapsedTime().asSeconds() / 100) << std::endl;
                        frameCount = 0;
                        clock.restart();
                }
        }

        return EXIT_SUCCESS;
}

// Creates a w x h texture plus an FBO rendering into it; returns { fbo, texture }.
std::pair<GLuint, GLuint> makeFBO(int w, int h){
        GLuint tex, fb;

        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

        //NULL means reserve texture memory, but texels are undefined
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0, GL_BGRA, GL_UNSIGNED_BYTE, NULL);
        //-------------------------
        glGenFramebuffers(1, &fb);
        glBindFramebuffer(GL_FRAMEBUFFER, fb);

        //Attach 2D texture to this FBO
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
       
        //Does the GPU support current FBO configuration?
        GLenum status;
        status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
        switch (status)
        {
        case GL_FRAMEBUFFER_COMPLETE:
                std::cout << "good" << std::endl;
                break;
        default:
                std::cerr << "framebuffer incomplete, status: " << status << std::endl;
                std::exit(EXIT_FAILURE);
        }

        return { fb, tex };
}

int main2()
{      
        sf::RenderWindow window(sf::VideoMode(W, W), "test");
        window.setFramerateLimit(1000);
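        // GLEW must be initialised after the window has created a GL context
        if (glewInit() != GLEW_OK) return EXIT_FAILURE;
        std::array<std::pair<GLuint, GLuint>, N> fbos; // { fbo, texture } pairs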
        for (int i = 0; i < N; i++){
                fbos[i] = makeFBO(W, W);
        }
       
        sf::Clock clock;
        int frameCount = 0;
        while (window.isOpen())
        {
                // BROFILER_FRAME("App");

                sf::Event event;
                while (window.pollEvent(event))
                {
                        if (event.type == sf::Event::Closed) window.close();
                        if (event.type == sf::Event::KeyPressed){
                                if (event.key.code == sf::Keyboard::Escape){
                                        window.close();
                                }
                        }
                }

                // draw to fbos
                for (auto& fbo: fbos){

                        window.pushGLStates();

                        glMatrixMode(GL_PROJECTION);
                        glPushMatrix();
                        glLoadIdentity();
                        glOrtho(0.0, W, 0.0, W, -1.0, 1.0);
                        glMatrixMode(GL_MODELVIEW);
                        glPushMatrix();
                        glLoadIdentity();

                        // same core entry point that makeFBO() used to create the FBO
                        glBindFramebuffer(GL_FRAMEBUFFER, fbo.first);
                        glClearColor(1, 0, 0, 1);
                        glClear(GL_COLOR_BUFFER_BIT);

                        drawThing();

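                        // pop each matrix from the stack it was pushed on
                        glPopMatrix();                  // restores MODELVIEW
                        glMatrixMode(GL_PROJECTION);
                        glPopMatrix();                  // restores PROJECTION
                        glMatrixMode(GL_MODELVIEW);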

                        window.popGLStates();

                }
                glBindFramebuffer(GL_FRAMEBUFFER, 0); // back to the window's default framebuffer

                // draw textures to window
                window.clear();

                window.pushGLStates();
                glMatrixMode(GL_PROJECTION);
                glPushMatrix();
                glLoadIdentity();
                glOrtho(0.0, W, 0.0, W, -1.0, 1.0);
                glMatrixMode(GL_MODELVIEW);
                glPushMatrix();
                glLoadIdentity();
               
                for (auto& fb: fbos){
                        glColor3f(1, 1, 1);
                        glEnable(GL_TEXTURE_2D);
                        glBindTexture(GL_TEXTURE_2D, fb.second);

                        glBegin(GL_QUADS);
                        glTexCoord2f(0, 0); glVertex3f(0, 0, 0);
                        glTexCoord2f(0, 1); glVertex3f(0, W, 0);
                        glTexCoord2f(1, 1); glVertex3f(W, W, 0);
                        glTexCoord2f(1, 0); glVertex3f(W, 0, 0);
                        glEnd();

                        // auto spr = sf::Sprite(rt.getTexture());
                        // window.draw(spr);                                           
                }

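                // pop each matrix from the stack it was pushed on
                glPopMatrix();                  // restores MODELVIEW
                glMatrixMode(GL_PROJECTION);
                glPopMatrix();                  // restores PROJECTION
                glMatrixMode(GL_MODELVIEW);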

                window.popGLStates();
               
                window.display();

                frameCount++;
                if (frameCount >= 100){ // exactly 100 frames have elapsed, matching the division below
                        std::cout << 1. / (clock.getElapsedTime().asSeconds() / 100) << std::endl;
                        frameCount = 0;
                        clock.restart();
                }
        }

        return EXIT_SUCCESS;
}

void drawThing(){
        const int W = 500; // local size; shadows the global W (800)

        // do drawing  
        glBegin(GL_QUADS);
        glColor3f(1, 1, 1);
        glVertex2f(0, 0);
        glVertex2f(W, 0);
        glVertex2f(W, W/2);
        glVertex2f(0, W / 2);
        glColor3f(1, 0, 1);
        glVertex2f(0, 0);
        glVertex2f(W/2, 0);
        glVertex2f(W/2, W / 4);
        glVertex2f(0, W / 4);
        glEnd();


}