SFML community forums

Help => Graphics => Topic started by: jmcmorris on July 16, 2015, 01:27:35 am

Title: Optimizing Rendering
Post by: jmcmorris on July 16, 2015, 01:27:35 am
Hello! I'd like to know if anyone has ideas on how I can optimize my rendering. I've been profiling my code and have found that a great deal of time is spent in GlContext::setActive(). I am using several RenderTextures that I render to, display(), and then pass to a shader. Because of this, the program is constantly switching what it's rendering to, which, from what I understand, is what triggers the calls to setActive().

Some possibilities might be:

One thing I have done so far is to try to minimize the switching by grouping the use of a RenderTexture together.
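
A minimal sketch of the grouping idea, not the poster's actual code: it assumes draws can be queued and reordered by target before being issued, which only holds when draws to different targets do not depend on each other's order. DrawCommand, countTargetSwitches and groupByTarget are hypothetical names, and a "switch" here simply models one activation of a different render target.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one queued draw: which render target it goes to.
struct DrawCommand {
    int targetId;   // index of the RenderTexture this draw is meant for
    int drawableId; // what to draw (kept so the ordering within a target is visible)
};

// Counts how often the active target changes while walking the queue in order.
// Each change models one activation of a different render target.
std::size_t countTargetSwitches(const std::vector<DrawCommand>& queue) {
    std::size_t switches = 0;
    int current = -1; // no target active yet
    for (const DrawCommand& cmd : queue) {
        if (cmd.targetId != current) {
            ++switches;
            current = cmd.targetId;
        }
    }
    return switches;
}

// Groups draws by target; stable_sort keeps the original order within a target.
std::vector<DrawCommand> groupByTarget(std::vector<DrawCommand> queue) {
    std::stable_sort(queue.begin(), queue.end(),
                     [](const DrawCommand& a, const DrawCommand& b) {
                         return a.targetId < b.targetId;
                     });
    return queue;
}
```

For a queue whose targets arrive as 0, 1, 0, 1, 2, 0, countTargetSwitches reports 6 activations; after groupByTarget the same draws need 3.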

I can post code here however it is quite spread out. I'm mostly looking for ideas at the moment and/or seeing if I'm just doing something completely wrong. Thanks!

(Oh, and I'm on either SFML 2.1 or 2.2. I can't remember but it is fairly recent. I think from Sept. 2014.)
Title: Re: Optimizing Rendering
Post by: eXpl0it3r on July 16, 2015, 01:49:10 am
Do you have a performance issue?

Did you profile in release mode?
Title: Re: Optimizing Rendering
Post by: jmcmorris on July 16, 2015, 05:23:15 am
Yes to both of those. My rendering is taking up a great deal of time and so I was investigating if it is possible to slim that down.
Title: Re: Optimizing Rendering
Post by: binary1248 on July 16, 2015, 10:33:25 am
How much time exactly does setActive() take up? And how did you determine this? I can imagine that the number of calls is relatively high, but setActive() really doesn't do much. It does way less than the actual drawing functions, that's for sure. As such, I don't really see how it can take up so much of the time.

Also, when switching contexts frequently, the time you measure (if you did measure it) might not all come from SFML's code. When switching contexts, the operating system tries to flush the command queue of the previous context, and this might block in high load situations. If you issued a lot of drawing to your RenderTextures and then switch contexts, those draw commands will need to be flushed out of the queue before the operating system lets the code continue, which leads to a high time spent in the makeCurrent() call of the context implementation.
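
One way to answer the "how much time exactly" question is to wrap the suspected calls in a scoped timer and compare the totals per label. The sketch below is a hypothetical helper (TimingLog and ScopedTimer are not part of SFML) that uses only std::chrono. As described above, wall-clock time measured around a context switch may include the flush of previously queued draw commands, so a large total under one label does not by itself show that the labelled call is the expensive part.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulated wall-clock time and call count for one label.
struct Sample {
    std::chrono::steady_clock::duration total{};
    long calls = 0;
};

// Hypothetical helper, not part of SFML: collects samples keyed by label.
class TimingLog {
public:
    void add(const std::string& label, std::chrono::steady_clock::duration d) {
        Sample& s = samples_[label];
        s.total += d;
        ++s.calls;
    }
    long calls(const std::string& label) const {
        auto it = samples_.find(label);
        return it == samples_.end() ? 0 : it->second.calls;
    }
    double seconds(const std::string& label) const {
        auto it = samples_.find(label);
        if (it == samples_.end()) return 0.0;
        return std::chrono::duration<double>(it->second.total).count();
    }
private:
    std::map<std::string, Sample> samples_;
};

// RAII timer: measures from construction to destruction and records into the log.
class ScopedTimer {
public:
    ScopedTimer(TimingLog& log, std::string label)
        : log_(log), label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() { log_.add(label_, std::chrono::steady_clock::now() - start_); }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
private:
    TimingLog& log_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Usage would look like `{ ScopedTimer t(log, "setActive"); rt.setActive(true); }` around each call of interest, followed by a comparison of `log.seconds("setActive")` against the total for the draw calls.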

You really shouldn't focus on trying to cut down on the number of calls to setActive(). It is called as much as necessary and no more. I'm pretty sure the majority of the time spent in your code is still distributed over other things, especially the other parts of the drawing process.

The easiest way to make things better is the same as with any other application:
Title: Re: Optimizing Rendering
Post by: eigenbom on September 03, 2015, 06:57:28 am
Instead of starting my own thread I'm just going to chime in here and say that I'm experiencing the same issues. I have lots of RenderTextures and the profiler reveals that wglMakeCurrent is taking a substantial amount of time. As far as I can tell, SFML uses one GL context per RenderTexture and then has to switch between them. I was curious whether the time in the setActive() call was the buffer being flushed, as binary1248 suggested, so I made two minimal examples.

(A) uses 20 manually created FBOs and (B) uses 20 sf::RenderTextures. The code for each is below. (A) runs at around 340 fps and (B) runs at 220 fps on my machine. MSVC's profiler shows (B) on the right and you can see wglMakeCurrent takes up a big chunk of the total run-time.

I talked about it a little on twitter (https://twitter.com/eigenbom/status/639265708497436676) and I think that having one context per RenderTexture is not a great approach when the RenderTexture count is high. Also, a discussion here (http://stackoverflow.com/questions/2198541/what-is-the-best-way-to-handle-fbos-in-opengl) covers a few strategies to optimise the use of many FBOs. One strategy is to have a single FBO and then rebind different textures to it as you render. It's a very specific use case, but unfortunately seems a little tricky with the one-context-per-RenderTexture approach of SFML.
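
The single-FBO strategy can be sketched as a call sequence. The block below does not talk to OpenGL: CallRecorder is a hypothetical stand-in that only records the order of calls, where real code would use glBindFramebuffer, glFramebufferTexture2D and glClear. It also assumes all textures share the same size, so the viewport does not need to change between attachments.

```cpp
#include <string>
#include <vector>

// Stand-in for the OpenGL calls involved; it only records the call order.
// In real code these would be glBindFramebuffer, glFramebufferTexture2D and glClear.
struct CallRecorder {
    std::vector<std::string> calls;
    void bindFramebuffer(unsigned fbo) {
        calls.push_back("bindFramebuffer " + std::to_string(fbo));
    }
    void attachColorTexture(unsigned tex) {
        calls.push_back("attachColorTexture " + std::to_string(tex));
    }
    void clear() { calls.push_back("clear"); }
};

// Renders into every texture through one shared framebuffer object.
template <typename DrawFn>
void renderToTextures(CallRecorder& gl, unsigned sharedFbo,
                      const std::vector<unsigned>& textures, DrawFn draw) {
    gl.bindFramebuffer(sharedFbo);  // bind the single FBO once
    for (unsigned tex : textures) {
        gl.attachColorTexture(tex); // swap the colour attachment, no context switch
        gl.clear();
        draw(tex);                  // caller issues the draw calls for this texture
    }
    gl.bindFramebuffer(0);          // back to the default framebuffer
}
```

The point of the pattern is that the framebuffer is bound once per frame and only the attachment changes per texture, all within one GL context.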

(http://bp.io/wp/wp-content/uploads/2015/09/wglmakecurrent.png)

#include <GL/glew.h>
#include <SFML/Graphics.hpp>
// #include <SFML/OpenGL.hpp>
// #include "mm/common/profiler.h"
#include <array>
#include <cstdlib>   // std::exit, EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>
#include <utility>   // std::pair
#include <Windows.h>

// as per: http://developer.download.nvidia.com/devzone/devcenter/gamegraphics/files/OptimusRenderingPolicies.pdf
extern "C" {
        __declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
}

int main1();
int main2();
void drawThing();

int main(){
        // return main1(); // (B): 20 sf::RenderTextures
        return main2();    // (A): 20 manually created FBOs
}


static const int W = 800;
static const int N = 20;

int main1()
{
        sf::RenderWindow window(sf::VideoMode(W, W), "test");
        window.setFramerateLimit(1000);

       
        std::array<sf::RenderTexture, N> rts;
        for (auto& rt: rts)     rt.create(W, W);
       
        sf::Clock clock;
        int frameCount = 0;
        while (window.isOpen())
        {
                // BROFILER_FRAME("App");

                sf::Event event;
                while (window.pollEvent(event))
                {
                        if (event.type == sf::Event::Closed) window.close();
                        if (event.type == sf::Event::KeyPressed){
                                if (event.key.code == sf::Keyboard::Escape){
                                        window.close();
                                }
                        }
                }

                // draw to texture
                for (auto& rt : rts){
                        rt.clear(sf::Color(255,0,0));
                        rt.setActive(true);
                        //rt.pushGLStates();
                        rt.setView(sf::View(sf::FloatRect(0,W,W,W)));

                        glViewport(0, 0, W, W);

                        glMatrixMode(GL_PROJECTION);
                        glPushMatrix();
                        glLoadIdentity();
                        glOrtho(0.0, W, 0.0, W, -1.0, 1.0);
                        glMatrixMode(GL_MODELVIEW);
                        glPushMatrix();
                        glLoadIdentity();
               
                        drawThing();

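                        // Pop the modelview matrix, then switch modes to pop the projection matrix
                        glPopMatrix();
                        glMatrixMode(GL_PROJECTION);
                        glPopMatrix();
                        glMatrixMode(GL_MODELVIEW);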

                        /*
                        sf::RectangleShape shape(sf::Vector2f(W / 2, W / 2));
                        shape.setOrigin(W / 4, W / 4);
                        shape.setPosition(W / 2, W / 2);
                        rt.draw(shape);
                        */



                        rt.display();

                        //rt.popGLStates();
                }
                window.setActive(true);
                window.pushGLStates();
                // draw textures to window
                window.clear();
                for (auto& rt : rts){
                        auto spr = sf::Sprite(rt.getTexture());
                        window.draw(spr);
                }
                window.display();
                window.popGLStates();

                frameCount++;
                // Report the average fps over exactly 100 frames
                if (frameCount >= 100){
                        std::cout << 1./(clock.getElapsedTime().asSeconds() / 100) << std::endl;
                        frameCount = 0;
                        clock.restart();
                }
        }

        return EXIT_SUCCESS;
}

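// Creates a w x h texture plus an FBO that renders into it; returns { framebuffer id, texture id }
std::pair<GLuint,GLuint> makeFBO(int w, int h){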
        GLuint tex, fb;

        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

        //NULL means reserve texture memory, but texels are undefined
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0, GL_BGRA, GL_UNSIGNED_BYTE, NULL);
        //-------------------------
        glGenFramebuffers(1, &fb);
        glBindFramebuffer(GL_FRAMEBUFFER, fb);

        //Attach 2D texture to this FBO
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
       
        //Does the GPU support current FBO configuration?
        GLenum status;
        status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
        switch (status)
        {
        case GL_FRAMEBUFFER_COMPLETE:
                std::cout << "good" << std::endl;
                break;
        default:
                std::cerr << "Framebuffer incomplete, status: 0x" << std::hex << status << std::endl;
                std::exit(EXIT_FAILURE);
        }

        return { fb, tex };
}

int main2()
{      
        sf::RenderWindow window(sf::VideoMode(W, W), "test");
        window.setFramerateLimit(1000);
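        // GLEW loads the GL entry points from the window's context; bail out if that fails
        if (glewInit() != GLEW_OK) return EXIT_FAILURE;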
        std::array<std::pair<GLuint,GLuint>, N> fbos; // { framebuffer id, texture id }
        for (int i = 0; i < N; i++){
                fbos[i] = makeFBO(W, W);
        }
       
        sf::Clock clock;
        int frameCount = 0;
        while (window.isOpen())
        {
                // BROFILER_FRAME("App");

                sf::Event event;
                while (window.pollEvent(event))
                {
                        if (event.type == sf::Event::Closed) window.close();
                        if (event.type == sf::Event::KeyPressed){
                                if (event.key.code == sf::Keyboard::Escape){
                                        window.close();
                                }
                        }
                }

                // draw to fbos
                for (auto& fbo: fbos){

                        window.pushGLStates();

                        glMatrixMode(GL_PROJECTION);
                        glPushMatrix();
                        glLoadIdentity();
                        glOrtho(0.0, W, 0.0, W, -1.0, 1.0);
                        glMatrixMode(GL_MODELVIEW);
                        glPushMatrix();
                        glLoadIdentity();

                        // Use the same (non-EXT) entry point that created the FBO in makeFBO()
                        glBindFramebuffer(GL_FRAMEBUFFER, fbo.first);
                        glClearColor(1, 0, 0, 1);
                        glClear(GL_COLOR_BUFFER_BIT);

                        drawThing();

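                        // Pop the modelview matrix, then switch modes to pop the projection matrix
                        glPopMatrix();
                        glMatrixMode(GL_PROJECTION);
                        glPopMatrix();
                        glMatrixMode(GL_MODELVIEW);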

                        window.popGLStates();

                }
                // Back to the window's default framebuffer
                glBindFramebuffer(GL_FRAMEBUFFER, 0);

                // draw textures to window
                window.clear();

                window.pushGLStates();
                glMatrixMode(GL_PROJECTION);
                glPushMatrix();
                glLoadIdentity();
                glOrtho(0.0, W, 0.0, W, -1.0, 1.0);
                glMatrixMode(GL_MODELVIEW);
                glPushMatrix();
                glLoadIdentity();
               
                for (auto& fb: fbos){
                        glColor3f(1, 1, 1);
                        glEnable(GL_TEXTURE_2D);
                        glBindTexture(GL_TEXTURE_2D, fb.second);

                        glBegin(GL_QUADS);
                        glTexCoord2f(0, 0); glVertex3f(0, 0, 0);
                        glTexCoord2f(0, 1); glVertex3f(0, W, 0);
                        glTexCoord2f(1, 1); glVertex3f(W, W, 0);
                        glTexCoord2f(1, 0); glVertex3f(W, 0, 0);
                        glEnd();

                        // auto spr = sf::Sprite(rt.getTexture());
                        // window.draw(spr);                                           
                }

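                // Pop the modelview matrix, then switch modes to pop the projection matrix
                glPopMatrix();
                glMatrixMode(GL_PROJECTION);
                glPopMatrix();
                glMatrixMode(GL_MODELVIEW);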

                window.popGLStates();
               
                window.display();

                frameCount++;
                // Report the average fps over exactly 100 frames
                if (frameCount >= 100){
                        std::cout << 1. / (clock.getElapsedTime().asSeconds() / 100) << std::endl;
                        frameCount = 0;
                        clock.restart();
                }
        }

        return EXIT_SUCCESS;
}

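void drawThing(){
        // Local size of the test shape; named differently so it does not shadow the global W
        const int size = 500;

        // Two overlapping quads drawn with immediate-mode GL
        glBegin(GL_QUADS);
        glColor3f(1, 1, 1);
        glVertex2f(0, 0);
        glVertex2f(size, 0);
        glVertex2f(size, size / 2);
        glVertex2f(0, size / 2);
        glColor3f(1, 0, 1);
        glVertex2f(0, 0);
        glVertex2f(size / 2, 0);
        glVertex2f(size / 2, size / 4);
        glVertex2f(0, size / 4);
        glEnd();
}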