
Topic: Triple Buffering one more time


oomek

  • Jr. Member
  • **
  • Posts: 90
Triple Buffering one more time
« on: July 15, 2018, 08:05:30 pm »
I'm involved in the development of an app that frequently reloads batches of images from disk, which causes significant stuttering in the animations. Enabling triple buffering in the driver does not help, because Texture::loadFromStream() causes a delay of 16-32ms the first time it's called in a frame.

I've narrowed it down to the glGenTextures(1, &texture) call in the Texture::create() function.
When I disable vsync there is no delay.

So my questions are:

1. Why is glGenTextures() causing the lag?
2. Is there any way to make triple buffering actually do what it's intended to do, so the app has one more frame for image reloading?
3. Is there any way to rewrite the image loading function so that it does not recreate the texture every time Texture::loadFromStream() is called, but instead keeps persistent textures and just calls swap() on them?


Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32498
Re: Triple Buffering one more time
« Reply #1 on: July 16, 2018, 08:44:46 am »
If your textures are all the same size, you can try loading an sf::Image and then calling Texture::update.
Laurent Gomila - SFML developer
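
The suggestion above could be sketched roughly as follows. This is only a minimal sketch, not SFML's own code: it is written as a template so that it compiles without SFML, and it assumes that `Texture` and `Image` behave like `sf::Texture` and `sf::Image` (`getSize()`, `update(image)`, `loadFromImage(image)`). The names `Upload` and `uploadReusing` are made up for illustration.

```cpp
#include <cassert>

// Outcome of one upload attempt (illustrative name, not part of SFML).
enum class Upload { Updated, Recreated, Failed };

// Hypothetical helper: keeps the existing texture storage when the new image
// has the same dimensions and falls back to a full reload otherwise.
// Texture/Image are assumed to expose the same members as sf::Texture/sf::Image.
template <typename Texture, typename Image>
Upload uploadReusing(Texture& texture, const Image& image) {
    const auto size = image.getSize();
    assert(size.x > 0 && size.y > 0 && "image must be loaded before uploading");

    if (texture.getSize() == size) {
        texture.update(image);  // same size: only the pixels are replaced
        return Upload::Updated;
    }
    // Different size: the texture has to be created again.
    return texture.loadFromImage(image) ? Upload::Recreated : Upload::Failed;
}
```

With SFML this would be called as `uploadReusing(texture, image)` after `image.loadFromFile(...)` has succeeded. Only the `Updated` path avoids recreating the texture, so the benefit depends on how often consecutive images share a size.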

oomek

Re: Triple Buffering one more time
« Reply #2 on: July 16, 2018, 08:58:58 am »
Unfortunately they are not. They are screenshots, and although they are small (320x224 on average), the resolutions are all over the place.
« Last Edit: July 16, 2018, 09:02:08 am by oomek »

oomek

Re: Triple Buffering one more time
« Reply #3 on: July 16, 2018, 09:18:00 am »
What you said got me thinking, though. I could scan the directory and load the largest file first, so that a texture of that size is reserved, and then call update(const Image &image, unsigned int x, unsigned int y). Do you think that would work?

Update: Apparently not. This function only applies an offset; it does not resize the texture.
« Last Edit: July 16, 2018, 09:47:19 am by oomek »
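
One possible way around the limitation noted in the update above is to leave the texture at the size of the largest image, copy each smaller image into its top-left corner with update(image, 0, 0), and have the sprite show only that region through sf::Sprite::setTextureRect. The sketch below covers just the arithmetic and is not taken from the thread: `Size`, `Rect`, `Scale`, `usedRegion` and `scaleToCell` are illustrative stand-ins for sf::Vector2u, sf::IntRect and sf::Vector2f so that the code compiles without SFML.

```cpp
#include <cassert>

// Stand-ins for sf::Vector2u, sf::IntRect and sf::Vector2f (illustrative only).
struct Size  { unsigned width, height; };
struct Rect  { int left, top, width, height; };
struct Scale { float x, y; };

// Region of the shared, largest-size texture that holds the image after it
// has been copied to the top-left corner.
Rect usedRegion(Size image) {
    return Rect{0, 0, static_cast<int>(image.width), static_cast<int>(image.height)};
}

// Scale that stretches the used region (not the whole texture) to the cell
// the sprite should fill on screen.
Scale scaleToCell(Size image, Size cell) {
    assert(image.width > 0 && image.height > 0);
    return Scale{static_cast<float>(cell.width) / image.width,
                 static_cast<float>(cell.height) / image.height};
}
```

In SFML terms the results would be passed to sprite.setTextureRect(...) and sprite.setScale(...). Computing the scale from the image size rather than from texture.getSize() is what keeps the sprite filling its cell when the texture is larger than the image.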

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 11078
Re: Triple Buffering one more time
« Reply #4 on: July 16, 2018, 10:49:52 am »
The slowest part in your chain is most likely the disk reading. As such, you could offload reading from the disk to a separate thread and then load from memory into VRAM in the main thread.

However, if creating a texture really takes that long, I wonder whether your OpenGL driver is outdated.
Official FAQ: https://www.sfml-dev.org/faq/
Official Discord Server: https://discord.gg/nr4X7Fh
——————————————————————
Dev Blog: https://duerrenberger.dev/blog/
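
A minimal sketch of the "read on another thread, upload on the main thread" idea from the post above, using only the standard library. The function names `readFileBytes` and `readFileAsync` are made up for illustration, and the SFML call in the trailing comment is an assumption about how the bytes would be consumed.

```cpp
#include <cassert>
#include <fstream>
#include <future>
#include <iterator>
#include <string>
#include <vector>

// Reads a whole file into memory. Returns an empty vector if it cannot be opened.
std::vector<char> readFileBytes(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return {};
    return std::vector<char>(std::istreambuf_iterator<char>(file),
                             std::istreambuf_iterator<char>());
}

// Starts the disk read on another thread. The caller keeps rendering and
// collects the bytes later with future.get().
std::future<std::vector<char>> readFileAsync(const std::string& path) {
    assert(!path.empty());
    return std::async(std::launch::async, readFileBytes, path);
}

// On the main thread, once the future is ready (assumed usage with SFML):
//   std::vector<char> bytes = pending.get();
//   texture.loadFromMemory(bytes.data(), bytes.size());
```

Only the file I/O moves off the main thread here; decoding and the OpenGL upload still happen wherever loadFromMemory is called, so this alone would not remove a wait that originates in the driver.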

oomek

Re: Triple Buffering one more time
« Reply #5 on: July 16, 2018, 11:31:44 am »
The lag doesn't change if I preload the images into a memory buffer and then call loadFromMemory. I've checked with timers what the main cause of the delay is: I wrapped a timer around glGenTextures(), and the program sits there waiting until all the queued screen buffers have been displayed. That's why the delay is a constant 16-32ms (1-2 frames).
When I disable vsync there is no lag, and the whole image loading is bottlenecked by my SSD (0-2ms).

I've tested it with the latest NVIDIA and AMD drivers on both of my PCs, and the results are identical.

oomek

Re: Triple Buffering one more time
« Reply #6 on: July 16, 2018, 12:50:06 pm »
Here is a test app I wrote so you can see for yourself how long it takes to load each image when you press left/right:

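#include <SFML/Graphics.hpp>
#include <filesystem>
#include <iostream>
#include <string>
#include <vector>

// VS2015 ships the filesystem TS in std::experimental; with a C++17 compiler
// use std::filesystem instead. A plain alias avoids adding names to namespace std.
namespace fs = std::experimental::filesystem;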

int main() {

    // path to the folder with pngs
    std::string path = "images";

    // number of displayed elements, on my machine it's stuttering around 40
    int elements = 40;

    std::string largest = path + "/sweetl2.png"; // largest file 640x480

    sf::Font font;
    font.loadFromFile( "C:/Windows/Fonts/arial.ttf" );


    sf::Clock clock;
    int total_clock = 0;

    // building a file table here
    std::vector<std::string> file_table( 0 );
    for ( auto & p : fs::directory_iterator( path ))
    {
        fs::path file{ p };
        if ( file.extension().string() == ".png" )
            file_table.push_back( file.string() );
    }

    if ( file_table.size() == 0 )
    {
        std::cout << " Did not find any png files" << std::endl;
        std::cout << " Press enter to exit.";
        std::cin.get();
        return 0;
    }

    sf::RenderWindow window;
    window.create( sf::VideoMode::getDesktopMode(), "Borderless Fullscreen", sf::Style::None );
    window.setVerticalSyncEnabled( true );
    window.setMouseCursorVisible( false );
    window.setKeyRepeatEnabled( true );
    sf::Vector2u size = window.getSize();
   
    sf::RectangleShape bar( sf::Vector2f( 32, size.y ));
    bar.setFillColor( sf::Color( 255, 255, 255, 200 ));
    sf::Vector2f bar_pos = sf::Vector2f( 0, 0 );

    std::vector<sf::Sprite> sprite_array( elements );
    std::vector<sf::Texture> texture_array( elements );
    std::vector<sf::Text> text_array( elements );
    sf::Text total;
    total.setPosition( 0.0, size.y / elements );
    total.setOrigin( 0.0, -100.0 );
    total.setFont( font );

    int offset = 0;

    for ( int i = 0; i < elements; i++ )
    {
        texture_array[i].loadFromFile( largest );
        sprite_array[i].setTexture( texture_array[i] );
        auto texture_size = sprite_array[i].getTexture()->getSize();
        sprite_array[i].setScale( float( size.x ) / texture_size.x, float( size.y ) / texture_size.y );
        sprite_array[i].setPosition( size.x / elements * i, 0.0 );
    }

    for ( int i = 0; i < elements; i++ )
    {
        text_array[i].setFont( font );
        text_array[i].setCharacterSize( 20 );
        text_array[i].setOrigin( 0.0, -50.0 );
        text_array[i].setPosition( size.x / elements * i, size.y / elements );
    }

    bool reload = true;

    while ( window.isOpen() ) {
        sf::Event event;
        while ( window.pollEvent( event ))
        {
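            // event.key is only valid for key events, so check the type first
            if (( event.type == sf::Event::Closed )
                || ( event.type == sf::Event::KeyPressed && event.key.code == sf::Keyboard::Escape ))  window.close();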
            else if ( event.type == sf::Event::KeyPressed )
            {
                if ( event.key.code == sf::Keyboard::Left )
                {
                    reload = true;
                    offset--;
                }
                else if ( event.key.code == sf::Keyboard::Right )
                {
                    reload = true;
                    offset++;
                }
            }

        }

        if ( reload )
        {
            reload = false;
            total_clock = 0;

            for ( int i = 0; i < elements; i++ )
            {
                clock.restart();
               
                // wrap the index into [0, count) so that negative offsets work
                // and the last file in the table is not skipped
                const int count = static_cast<int>( file_table.size() );
                const int index = (( i + offset ) % count + count ) % count;

                // first method 16-32ms lag on first image, bar is stuttering
                // when pressing left/right
                texture_array[i].loadFromFile( file_table[index] );

                // second method 16ms lag on first image, bar is stuttering very rarely,
                // sprite textures are not scaling to the sprite size
                // sf::Image image;
                // image.loadFromFile( file_table[index] );
                // texture_array[i].update( image );
               
                text_array[i].setString( std::to_string( clock.getElapsedTime().asMilliseconds() ));
                total_clock += clock.getElapsedTime().asMilliseconds();
                sprite_array[i].setTexture( texture_array[i], true );
                auto texture_size = sprite_array[i].getTexture()->getSize();
                sprite_array[i].setScale( float( size.x ) / texture_size.x / elements, float( size.y ) / texture_size.y / elements );
            }
        }

        window.clear();

        for ( auto & s : sprite_array ) window.draw( s );

        bar_pos.x += 4;
        if ( bar_pos.x > size.x - 32 ) bar_pos.x = 0;
        bar.setPosition( bar_pos );
        window.draw( bar );

        for ( auto & t : text_array ) window.draw( t );
        total.setString( "Total: " + std::to_string( total_clock ) + "ms" );
        window.draw( total );

        window.display();
    }

    return 0;
}
 

Compiled in VS2015 with SFML 2.5.0


And a zip file with a batch of random images from the ones I use:

https://mega.nz/#!68NWkQ4A!7iSUTrud3C8AAvTg5ajzvMJoQIiPmeHCUx9Yix9jm3k
« Last Edit: July 16, 2018, 12:56:29 pm by oomek »

oomek

Re: Triple Buffering one more time
« Reply #7 on: July 16, 2018, 04:19:53 pm »
I'm still baffled as to why glGenTextures is so laggy when vsync is enabled. I tried to find some information about it, but found nothing.

binary1248

  • SFML Team
  • Hero Member
  • *****
  • Posts: 1405
  • I am awesome.
Re: Triple Buffering one more time
« Reply #8 on: July 17, 2018, 02:29:43 am »
It's been said on the forums many times already: You need to be careful when measuring OpenGL so naively.

As said many times before, OpenGL is probably the most asynchronous API you will ever see. You need to be lucky (or just very knowledgeable) to run into a function that actually blocks CPU execution until all its effects are complete when writing your typical (not purposely stupid) OpenGL code. Timing specific functions and assuming that the corresponding OpenGL operation must be responsible for any delays is just wrong. Under the hood, nothing prevents the driver from doing something you previously requested only when calling another function in the future.

There are a few things that have to be stated:
  • Because you activated V-Sync and presumably are using a 60Hz monitor, the driver will try to target a 16ms frame.
  • Where/When the driver does the waiting is completely implementation defined. Typically it does the waiting in the buffer swap i.e. inside sf::Window::display(). However, there is nothing preventing it from waiting somewhere else if it has good reason to do so. Double/Triple buffering allows it to stretch time a little since queued frames are only sent out 1-2 frames later.
  • On my system, the delay when loading new textures doesn't occur consistently when loading the first texture. It jumps about between the first 4 textures. However, the delay is always the same.
  • If you time the .display() call and/or the whole frame you will notice that a) the .display() call will not block for the usual 16ms when one of the delays during texture loading occurs and b) the total frame time is 16ms, meaning that all the delay either comes from .display() or texture loading.
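
The last point, timing .display() and the whole frame rather than individual OpenGL-backed calls, could be instrumented along these lines. This is a standard-library sketch under stated assumptions: `ScopedTimer` and `FrameTimes` are illustrative names, and where the timers are placed in the render loop is left to the caller.

```cpp
#include <cassert>
#include <chrono>

// Adds the lifetime of the object, in milliseconds, to an accumulator.
class ScopedTimer {
public:
    explicit ScopedTimer(double& accumulatedMs)
        : target_(accumulatedMs), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - start_;
        assert(elapsed.count() >= 0.0);  // steady_clock never runs backwards
        target_ += elapsed.count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    double& target_;
    std::chrono::steady_clock::time_point start_;
};

// Per-frame breakdown; the field names are illustrative only.
struct FrameTimes {
    double loadMs = 0.0;     // texture (re)loading
    double displayMs = 0.0;  // window.display(), where the V-Sync wait usually lands
    double frameMs = 0.0;    // the whole loop iteration

    // Time spent in neither measured section (events, drawing, ...).
    double otherMs() const { return frameMs - loadMs - displayMs; }
};
```

Wrapping the reload block, the window.display() call and the whole loop body in one `ScopedTimer` each would show whether a long texture load coincides with a short display() call, which is the pattern described in the list above.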

Now, we need to ask ourselves why we notice stuttering. Humans are better at perceiving changes than "absolute" values. In our case this means we notice stuttering not because a frame is "slow" in general, but because it is "slower" than the other frames. A jump from a 16ms frame time to a 33ms frame time might already be enough to cause this, even though, funnily enough, our brains would not consider something to be "constantly stuttering" if it ran at a constant 30 FPS, i.e. 33ms per frame. What is happening here, however, is a bit more severe than 33ms, which means that in those "longer frames" there is suddenly time that is unaccounted for.
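
The jump from 16ms to 33ms mentioned above follows from V-Sync rounding every frame up to a whole number of refresh periods. The function below is a deliberately simplified model, not a description of what any particular driver does: it assumes a frame can only be presented on a refresh boundary and that the application waits for it, and it ignores the extra queued frame that triple buffering may provide. The name `presentedFrameTimeMs` is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Visible frame time under the simplified V-Sync model: the work for the
// frame is rounded up to a whole number of refresh periods (at least one).
double presentedFrameTimeMs(double workMs, double refreshHz) {
    assert(workMs >= 0.0 && refreshHz > 0.0);
    const double periodMs = 1000.0 / refreshHz;
    const double periods = std::fmax(1.0, std::ceil(workMs / periodMs));
    return periods * periodMs;
}
```

At 60Hz the period is about 16.7ms, so under this model 2ms of work still yields a 16.7ms frame, while 18ms of work, only slightly over one period, yields a 33.3ms frame.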

Just digging through the call chain of sf::Texture::loadFromFile, I noticed that there are still a few places left where glFlush() is being called. glFlush()/glFinish() and friends are all pretty meh. They belong to those functions that were conceived in a time when single core CPUs were still a thing and GPUs were still glorified co-processors. There is really no good reason any more to use glFinish() in this day and age; there are better performing alternatives for the scenarios that would make it necessary. glFlush() is really weird. If you ask me, it is mainly there to overcome driver implementation quirks left over from the early days. It might still be necessary in some special cases, but in typical situations it really isn't. Like I said, SFML still makes use of glFlush(), now only in sf::Texture and sf::Shader, because I threw a bunch of glFlush() calls out from other places just a few months ago.

Commenting out the glFlush()es actually "fixed" the stuttering for me. From my experience, glFlush() is a mixed bag. Sometimes you don't notice it, other times it can be almost as bad performance-wise as a glFinish(). I guess it depends on the driver and whether it likes you/your application/your system. What many people don't know is that an implicit flush is performed when switching OpenGL contexts as well. Put another way, if you want good performance, don't switch contexts if you don't have to. I already improved this a little with my sf::RenderTexture optimization in SFML 2.5. What to do with the remaining explicit glFlush()es I haven't thought about yet. It will be hard to work around since sf::RenderTarget caches state really aggressively and you would need to rebind resources if you left out the glFlush()es.

If you feel brave and like to take risks, you can comment out the glFlush()es and see if it works out for you.
SFGUI # SFNUL # GLS # Wyrm <- Why do I waste my time on such a useless project? Because I am awesome (first meaning).

oomek

Re: Triple Buffering one more time
« Reply #9 on: July 17, 2018, 10:01:46 am »
First of all, thank you for the time you've spent preparing that elaborate reply. I've read it, but I have some doubts that I'd like to ask you to clarify.


Quote
On my system, the delay when loading new textures doesn't occur consistently when loading the first texture. It jumps about between the first 4 textures.

Is the delay always 16 or 33 ms on the first image? It is on my machine. If not, can you change the window's sf::Style from None to Fullscreen? You may not have the borderless optimizations running for some reason (there are many factors, but the most common ones are desktop DPI scaling other than 100% if you are on Windows 10, focus assist being enabled in the settings, a multi-monitor configuration, etc.).

Quote
Where/When the driver does the waiting is completely implementation defined. Typically it does the waiting in the buffer swap i.e. inside sf::Window::display()

On my machine display() always waits, except when the driver is already late because loading images took longer; so it either doesn't wait at all or waits for the remaining time until the next vsync. If the wait can be triggered, as you said, by any function other than the buffer swap, that really sucks.

Quote
If you feel brave and like to take risks, you can comment out the glFlush()es and see if it works out for you.

Thanks for the tip, I'll do that and report back.

oomek

Re: Triple Buffering one more time
« Reply #10 on: July 17, 2018, 11:53:45 am »
I've commented out the glFlush() calls, but unfortunately I still get a 33ms wait on reloading, and stuttering. But why doesn't it wait for just 16ms, like in the second method when I only call update()? When I do that, the animation is smooth even with 60 elements. I have triple buffering set to ON in the driver. Oh, and by the way, I hope you are not holding the arrow keys, as it will always stutter when you are continuously reloading images.
« Last Edit: July 17, 2018, 11:57:08 am by oomek »

binary1248

Re: Triple Buffering one more time
« Reply #11 on: July 17, 2018, 07:16:38 pm »
What can I say... I must admit, one of the big downsides to OpenGL is that its performance is really hard to predict in specific situations. In order to provide the abstraction it provides to us (which is still based on a long outdated model of the hardware) and at a reasonable speed, the specification leaves many things open to the implementers in order for them to be able to optimize better. This also means fewer guarantees for us and generally more guesswork. Unless you are someone like John Carmack, if you really need highly predictable performance characteristics you have to use either Direct3D or Vulkan (which was designed from the ground up to "fix" these kinds of problems). Maybe one day there will be a Vulkan implementation for SFML...