Author Topic: Different implementation for setFramerateLimit (Read 11411 times)

Laurent · « **on:** September 28, 2014, 07:41:43 pm »

Edit: Original thread start:

Quote from: Zyl on September 28, 2014, 07:25:48 pm

I tried some SDL. I find SFML better for OOP, Documentation and Guides/Tutorials.

2 cents:
- GL core spec should be used by default. Yes I can recompile SFML, but defaults matter.
- http://www.sfml-dev.org/tutorials/2.0/window-window.php says it uses sf::sleep for managing framerate limit, and that sf::sleep's "resolution depends on the underlying OS, and can be as high as 10 or 15 milliseconds". I find it very important to use the best (most accurate) solution on every OS, e.g. Windows has Multimedia Timers. Using an inaccurate thread-sleep in non-v-synced mode kind of defeats the purpose of a non-v-synced mode (reduced input lag).

Cheers.

Edit: Laurent's response:

Quote

http://www.sfml-dev.org/tutorials/2.0/window-window.php says it uses sf::sleep for managing framerate limit, and that sf::sleep's "resolution depends on the underlying OS, and can be as high as 10 or 15 milliseconds". I find it very important to use the best (most accurate) solution on every OS, e.g. Windows has Multimedia Timers.

Please have a look at the source code before making (wrong) comments about it

Zyl · « **Reply #1 on:** September 29, 2014, 12:42:58 am »

Quote from: Laurent on September 28, 2014, 07:41:43 pm

Please have a look at the source code

From SFML-2.1\src\SFML\System\Win32\SleepImpl.cpp:

Code:

#include <SFML/System/Win32/SleepImpl.hpp>
#include <windows.h>


namespace sf
{
namespace priv
{
////////////////////////////////////////////////////////////
void sleepImpl(Time time)
{
    ::Sleep(time.asMilliseconds());
}

} // namespace priv

} // namespace sf

This code relays directly to WinAPI's Sleep-function, which performs a not very accurate sleep.

From SFML-2.1\src\SFML\Window\Window.cpp:

Code:

void Window::display()
{
    // Display the backbuffer on screen
    if (setActive())
        m_context->display();

    // Limit the framerate if needed
    if (m_frameTimeLimit != Time::Zero)
    {
        sleep(m_frameTimeLimit - m_clock.getElapsedTime());
        m_clock.restart();
    }
}

Here, sf::sleep() is used for simulating delay, relaying to WinAPI Sleep() (in the case of Windows, anyway). The calculation is correct, but the result is fed to a function which can not and will not respect the delay. This reflects what I said. This also reflects what is written in the Tutorial.

binary1248 · « **Reply #2 on:** September 29, 2014, 01:14:04 am »

Ever heard of... you know... the master branch of SFML at GitHub?

eXpl0it3r · « **Reply #3 on:** September 29, 2014, 02:02:19 am »

The latest changes should make the sleep call a bit more accurate, as binary1248 pointed out.

The setFramerateLimit function has only been implemented for convenience. It's okay for small projects, but wasn't meant as a general solution for everything. As such it was never meant to be the best option for the given OS; it was built as an easy cross-platform implementation, which means using "sleep".

In my opinion, if a better resolution is required, then it's easiest to just write a function that provides it.

Zyl · « **Reply #4 on:** September 29, 2014, 02:43:49 am »

Quote from: binary1248 on September 29, 2014, 01:14:04 am

Ever heard of... you know... the master branch of SFML at GitHub?

I might have heard of it. I can't seem to remember?

The code is better, but for frame timing you should add a spinlock which queries a high precision timer. e.g. sf::sleepSpinMicroseconds(4500) does sf::sleep(4) and then spins till the remaining ~500µs are over. Rather use some cycles for active waiting than not match the requested framerate. Dorky code for a dorky kernel, but it gives better results.
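A minimal sketch of the sleep-then-spin idea described above, written against the C++ standard library rather than SFML so it stands alone. The name `sleepThenSpin` and the 500 µs default margin are assumptions for illustration; neither exists in SFML. If the OS oversleeps by more than the margin, the deadline is simply missed, so the margin has to be tuned per system.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper, not part of SFML: sleep for most of the interval,
// then busy-wait on a monotonic clock for the final stretch.
// Returns the time actually waited.
inline std::chrono::nanoseconds sleepThenSpin(
    std::chrono::nanoseconds total,
    std::chrono::nanoseconds spinMargin = std::chrono::microseconds(500))
{
    using Clock = std::chrono::steady_clock;
    assert(spinMargin >= std::chrono::nanoseconds::zero());

    const auto start = Clock::now();
    const auto deadline = start + total;

    // Coarse part: hand the thread back to the OS scheduler.
    if (total > spinMargin)
        std::this_thread::sleep_for(total - spinMargin);

    // Fine part: spin until the deadline. This burns CPU cycles on purpose.
    while (Clock::now() < deadline)
    {
    }

    return std::chrono::duration_cast<std::chrono::nanoseconds>(Clock::now() - start);
}
```

Calling `sleepThenSpin(std::chrono::microseconds(4500))` would sleep for roughly 4 ms and spin for the rest. The wait never returns early, but it can still return late if the sleep part overshoots.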

https://github.com/SFML/SFML/blob/master/src/SFML/Window/Window.cpp#L349 Might want to not call sf::sleep() if the argument is <= 0? No time to lose in this case.
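As a rough illustration of that guard, the sketch below only enters the sleep call when some frame time is actually left. `remainingFrameTime` and `sleepIfNeeded` are hypothetical names, and `std::this_thread::sleep_for` stands in for sf::sleep so the example does not depend on SFML.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Duration = std::chrono::microseconds;

// Hypothetical helper: how long a frame limiter still has to wait.
// Returns zero when the frame already took longer than the limit.
inline Duration remainingFrameTime(Duration frameTimeLimit, Duration elapsed)
{
    assert(frameTimeLimit >= Duration::zero());
    return elapsed < frameTimeLimit ? frameTimeLimit - elapsed : Duration::zero();
}

// Only enter the OS sleep call when there is something left to wait for.
// Returns true if a sleep was performed.
inline bool sleepIfNeeded(Duration frameTimeLimit, Duration elapsed)
{
    const Duration remaining = remainingFrameTime(frameTimeLimit, elapsed);
    if (remaining <= Duration::zero())
        return false; // frame is already late: no sleep call at all
    std::this_thread::sleep_for(remaining);
    return true;
}
```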

@exploiter: Is that how you guys see it? I believe this is important, as I said. Going without v-sync is common, and precise thread-sleeps are a massive programming myth you should not want anyone less than an advanced programmer have to deal with, especially not cross-platform. It's among the top 2 of things I'd expect a GL windowing API to do well, right after getting the rendering to appear on the screen.

eXpl0it3r · « **Reply #5 on:** September 29, 2014, 09:50:00 am »

If all the problem you have is an inaccurate FPS, then you're probably on the wrong path here. You shouldn't write code that is depending on a stable FPS, because a stable FPS is in all cases just a theoretical thought and will never work out in practice.

One of the main reasons why sf::sleep is used, is to take away load from the CPU. Suggesting to use a spinlock (which is questionable on its own anyways...) would defeat that purpose.
The same goes for the Multimedia Timers, what do you do with your thread when waiting for the next timer event? Busy-waiting loop?

Killing your CPU just to get 60 FPS for 95% of the time instead of 60±5 FPS isn't really what we want to achieve with setFramerateLimit. If there's a better way to implement a frame limiter which takes the load off the CPU, we're all ears - but please open a dedicated topic in the feature request forum. But don't forget to give an example on how said technique would be used. And don't forget it should ideally be cross-platform.

Zyl · « **Reply #6 on:** September 29, 2014, 05:43:27 pm »

Quote from: eXpl0it3r on September 29, 2014, 09:50:00 am

code that is depending on a stable FPS

My code does not depend on stable FPS. I do.

Quote from: eXpl0it3r on September 29, 2014, 09:50:00 am

One of the main reasons why sf::sleep is used, is to take away load from the CPU. Suggesting to use a spinlock (which is questionable on its own anyways...) would defeat that purpose.

How does an idle CPU improve the performance of your system? All the threading-mechanisms in the world strive to maximize CPU-usage. The current code even rounds down to the nearest whole millisecond, so you actually render more frames than requested, stressing the graphics card, which is much more susceptible to clocking down at high load/temperature than the CPU is, and that also becomes much more noticeable.
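To put a number on the rounding effect, here is a small worked calculation. It is an idealisation: it assumes the whole frame time is spent sleeping and that the sleep lasts exactly the truncated number of milliseconds, which a real scheduler will not guarantee. `effectiveFpsAfterTruncation` is a made-up name for this example.

```cpp
#include <cassert>
#include <cmath>

// Frame rate that results if the requested frame time is truncated to whole
// milliseconds and the sleep then lasts exactly that long (an idealisation).
inline double effectiveFpsAfterTruncation(double requestedFps)
{
    assert(requestedFps > 0.0);
    const double frameTimeMs = 1000.0 / requestedFps;   // 60 FPS -> 16.666... ms
    const double truncatedMs = std::floor(frameTimeMs); // 60 FPS -> 16 ms
    assert(truncatedMs >= 1.0); // limits above 1000 FPS would truncate to 0 ms
    return 1000.0 / truncatedMs;
}
```

Under these assumptions a requested limit of 60 FPS becomes 1000 / 16 = 62.5 FPS, and a limit of 90 FPS becomes 1000 / 11, roughly 90.9 FPS.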

Quote from: eXpl0it3r on September 29, 2014, 09:50:00 am

The same goes for the Multimedia Timers, what do you do with your thread when waiting for the next timer event? Busy-waiting loop?

This is a GL windowing API. We program highly interactive, highly demanding applications, which are absorbing all of the user's attention. There should be nothing more important than getting that one single application to run well while it is active.

Quote from: eXpl0it3r on September 29, 2014, 09:50:00 am

Killing your CPU

Personally, I do not believe in this. If your CPU takes a serious hit in durability from prolonged load, something is wrong.

Quote from: eXpl0it3r on September 29, 2014, 09:50:00 am

If there's a better way to implement a frame limiter which takes the load off the CPU

There isn't. This discussion is just about ideology of priority right now, not about scientific determination of the best solution. I might present my solution later. Then you can judge.

Jesper Juhl · « **Reply #7 on:** September 29, 2014, 06:03:56 pm »

While the snake game I'm currently implementing for my 5 year old daughter could easily run with 3-4digit FPS, I don't want that. It would kill the battery of her tablet and wouldn't make any visual difference what-so-ever. So I slow it down to ~60FPS - doesn't matter if it's 65, 60, 55 or even 30, I take a variable frame-rate into account. What does matter is saving CPU cycles and power. So I want the CPU to sleep, not spin to give the fastest possible response.

Many applications are like this.

Hiura · « **Reply #8 on:** September 29, 2014, 06:14:03 pm »

Could a mod move this discussion elsewhere please?

[edit: thanks]

@Zyl, I think you have misunderstood eXpl0it3r's message and/or you have some misconception about parallel programming and performance measurement. Let me highlight two main mistakes I see in your reasoning.

Quote from: Zyl on September 29, 2014, 05:43:27 pm

How does an idle CPU improve the performance of your system?

What eXpl0it3r means is that we use `sf::sleep` to let some other (light-)process use the CPU for a while since we cannot use it for anything useful right now. Therefore, you improve the system performance by letting someone else do something instead of doing nothing.

Quote from: Zyl on September 29, 2014, 05:43:27 pm

All the threading-mechanisms in the world strive to maximize CPU-usage.

There's a slight mistake here: they strive to maximise the «outcome» obtained per unit of CPU usage. By outcome I mean for example the number of frames per second or the result of a formula, etc...

If you just maximise the CPU-usage, then the following app is maybe the best app ever:

def spin = while (true);
cores = system.getCoresCount();
for i in [0..cores) dispatch thread(spin);

It guarantees it will use all the CPU power you have to do one thing: nothing.

Spinning is not a good thing. Avoid it as much as you can! In fact I'll go even further: avoid any synchronisation barrier (including locks) as much as you can.

Plus Jesper's comment is very important when it comes to mobile devices. Using spinning takes away all the optimisations an operating system could use to improve battery life (like cutting power to some CPU cores).

Zyl · « **Reply #9 on:** September 29, 2014, 06:32:31 pm »

Power: Yes. This is a good argument, but only so much for mobile use cases.

Maximization: Yes, obviously. However, if perfect frame times increase perceived product value, spinning does not do nothing. That is all I'm saying. It's not a good solution I agree. I wouldn't have suggested a combination with sleeping if I did. I only set my priorities different than you do, because jitter annoys me, much more so than the idea of having a few % unutilizable cycles on one core.

eXpl0it3r · « **Reply #10 on:** September 29, 2014, 07:46:16 pm »

I still think you're misled by the idea that jittery movement can be "fixed" by letting the CPU usage go to 100%. If you have problems with smooth movement, implement a fixed time step with proper interpolation.
Letting your stuff run at 10000 FPS only seems to fix the issue, but it actually just makes it appear less obvious. Using a fixed time step with proper interpolation will smooth the movement and will work for 30, 60, 120 or higher FPS.
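For readers who have not seen the technique, a fixed time step with interpolation can be sketched roughly as follows. This is the common accumulator pattern, not code from SFML; `State`, `step` and `runFixedTimestep` are names invented for the example, and the frame times are passed in as plain numbers so the loop can be followed without a window.

```cpp
#include <cassert>
#include <vector>

struct State { double position = 0.0; double velocity = 1.0; };

// Advance the simulation by one fixed step (simple Euler integration).
inline State step(State s, double dt) { s.position += s.velocity * dt; return s; }

// Hypothetical driver: consumes variable frame times, simulates in fixed
// steps, and returns the interpolated position that would be drawn per frame.
inline std::vector<double> runFixedTimestep(const std::vector<double>& frameTimes, double dt)
{
    assert(dt > 0.0);
    State previous, current;
    double accumulator = 0.0;
    std::vector<double> drawn;

    for (double frameTime : frameTimes)
    {
        accumulator += frameTime;
        while (accumulator >= dt)
        {
            previous = current;
            current = step(current, dt);
            accumulator -= dt;
        }
        // Blend the last two simulation states by the leftover fraction.
        const double alpha = accumulator / dt;
        drawn.push_back(previous.position + (current.position - previous.position) * alpha);
    }
    return drawn;
}
```

The simulation always advances in steps of `dt`, however long each frame took, and the drawn position is blended between the last two simulated states. The rendered motion therefore lags the simulation by at most one step but does not depend on the frame rate.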

I suggest you take an honest look at about any game or real time application out there. Count how many use 100% of the CPU when no huge calculation is going on. Personally I've yet to see anyone advocating spin-locks for FPS control in any kind of application. Thus, since nobody really uses such techniques and there have been more questions on how to reduce the CPU usage (i.e. when people didn't use VSync or any kind of framerate limiting), the implementation of setFramerateLimit won't change in the way you've suggested. But as I said before, if there's a better alternative to sleep(), which reduces CPU load and is more precise, we would really be interested in hearing about it.

Rejecting your idea however doesn't mean that you can't pursue it. I won't advise you to do so, but it's very well possible, and for special cases that need more precision probably advisable, to implement your own framerate limiter.

wintertime · « **Reply #11 on:** September 29, 2014, 07:54:09 pm »

Coming from a different angle:
I once read that using sleep inside a realtime application is bad, because it is equivalent to telling the Windows scheduler "I won't have anything to do for a very long time and I don't care when exactly I get woken up, power down the CPU please" and that the right solution for doing a short wait would be to instead block on WaitMessage, GetMessage, WaitForSingleObject, MsgWaitForMultipleObjectsEx or a similar function to signal the Windows scheduler "I have urgent work to do soon, but I'm temporarily blocked by this/these specific things, please wake me up asap when they are ready".
I think using SetTimer is a way to achieve this, but there may be some better way.

Btw., the same problem with sleep also exists in the sf::Window::waitEvent method, because it does not use the native GetMessage, but instead calls PeekMessage, checks joystick data and sleeps in a loop. If the event handling could unify handling of joystick and other messages it may be beneficial, even though it is a much less used method than pollEvent. That's because if internally the joystick handling and message handling could be better combined, the wait from setFramerateLimit may possibly be deferred and let intermediate message handling happen while not enough time has passed for that frame, through remembering the frame change in display and some clever logic inside the *Event methods.

Zyl · « **Reply #12 on:** October 07, 2014, 06:27:09 pm »

I have now experimented with this.

Results:

The MSDN documentation on timeBeginPeriod() and timeEndPeriod() is likely wrong/misleading. These functions should scope the rendering loop (program execution), not the sleep interval, as experimentation shows that this improves sleep time accuracy much more consistently. Wrapping each Sleep() call in these functions also allowed Sleep() to undersleep, which normally should not happen at all.
The (Windows Vista and later) desktop is (designed to appear) vertically synced. Swapping buffers (sf::RenderWindow::display()) will show a result only starting after the next v-blank (Definition, Relevancy in OpenGL), so unless you are syncing the point in time that your rendering finishes with the next v-blank, you will drop a frame quite regularly: You are only swapping buffers with the window content, not the screen; you rely on the OS to do the actual swap, and that swap is vertically synced. Trying to time this is by all sane means a pointless effort. Good reads: https://superuser.com/questions/558007/how-does-windows-aero-prevent-screen-tearing and https://answers.microsoft.com/en-us/windows/forum/windows_8-performance/dwm-and-vsync/aec4ef8e-e3ea-4255-a557-640e9c63eccc.
- As a consequence, the best perceived performance and frame stability in windowed mode on Windows is achieved by using the vertical sync extension, all in spite of the resulting perceived input lag.
- How vertical sync is implemented depends on the system however. My system guarantees 100% CPU usage when using v-sync, as some code spins to check v-blank status. The people who made it didn't know better than us. There are ways to work that out (kind of).
Improving sync with Sleep() by adding spin-locks may appear to improve perceived performance, and in tests cost ~80ms of spinning per second while rendering at 60Hz. As render time fluctuates however, which it naturally may very well do (especially in dynamic scenes), you will encounter mentioned problem on Windows, when in windowed mode: You are syncing the start of the rendering, not the moment it finishes, and hence frames get dropped entirely, because you might sometimes swap buffers twice between two v-blanks. In fullscreen mode (again, this is all on Windows), it may appear to stabilize the screen tearing line from missing synchronization. This as well, however, may become irrelevant as frame rendering times fluctuate.

Spin-locking is useful and effective in fullscreen mode when you cannot afford to render extra frames because the GPU happens to be the bottleneck. A big plus would be if you could do some useful work while you need to spin. It is not notably effective otherwise.
Code to react to long/short sleeping of the sleep-function however can improve framerate-stability reliably by changing the sleep-time given to it. This is neither (too) hard to write, expensive to execute nor difficult to maintain amongst different operating systems, and is the recommended solution from what I have tried. It's also better from a thread-management point of view as your program's CPU-requirements don't fluctuate chaotically. Example. <- This could also be nicely wrapped up in a FramerateTimer class or similar.
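One way to read the "react to long/short sleeping" suggestion is sketched below. It is only a sketch under stated assumptions, not the code behind the example mentioned in the post: the class name `FramerateTimer` is taken from the suggestion above, while its members and the smoothing factor of 1/4 are invented for illustration. The class keeps a running estimate of how much the OS sleep overshoots and shortens the next request by that amount.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Hypothetical class, not part of SFML: keeps a running estimate of how much
// the OS sleep overshoots and shortens the next request accordingly.
class FramerateTimer
{
public:
    using Duration = std::chrono::microseconds;

    explicit FramerateTimer(Duration frameTime) : m_frameTime(frameTime)
    {
        assert(frameTime > Duration::zero());
    }

    // How long to ask the OS to sleep, given the work time of this frame.
    Duration sleepRequest(Duration workTime) const
    {
        const Duration wanted = m_frameTime - workTime - m_bias;
        return std::max(wanted, Duration::zero());
    }

    // Feed back what was requested and how long the sleep really took.
    void report(Duration requested, Duration actual)
    {
        const Duration error = actual - requested; // > 0 means overslept
        m_bias += (error - m_bias) / 4;            // exponential smoothing
    }

    Duration bias() const { return m_bias; }

private:
    Duration m_frameTime;
    Duration m_bias{0};
};
```

In a render loop this would be used by measuring the frame's work time, sleeping for `sleepRequest(workTime)`, timing how long that sleep actually took, and passing both values to `report`. No spinning is involved, so CPU usage stays close to that of a plain sleep.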

The biggest problem in visual quality, when rendering without vertical sync, occurs when two frames which are not one but multiple intervals apart appear right next to each other.

Assume: You render your scene with an FPS-limit of 90 (11.11ms) to ensure definite fluid motion on your 60Hz (16.66ms) monitor. Although it may fluctuate, it always stays above 60Hz. Frame A is rendered and swapped to the front buffer. As it so happens, a v-blank just occurred and then for 11.11ms the upper two thirds of the screen are drawn using frame A. Frame B is then drawn to the lower and upper third, leaving A in the middle. As frame C starts to appear on the screen it draws directly over frame A, which was 22.22ms ago. This is twice as much delay between touching frames as is usually visible, and looks like part of the scene suddenly "jumping" ahead. Had you rendered at exactly 60 FPS, this case wouldn't have happened. A good solution is to target a framerate slightly below screen refresh rate, e.g. 59.7. Then, the tearing line will (ideally) move downwards over the course of some seconds, and not sleeping the correct amount of time won't result in the problem as likely as otherwise.
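The numbers in this example can be checked with a short calculation. It assumes perfectly constant render and refresh rates, which real systems do not deliver, and the function names are made up for the example. Under that assumption the tearing line needs 1 / |refresh - render| seconds to travel once across the screen.

```cpp
#include <cassert>
#include <cmath>

// Seconds for the tearing line to travel once across the screen when the
// render rate differs slightly from the refresh rate (idealised: both rates
// perfectly constant). This is the beat period 1 / |refresh - render|.
inline double tearLineCycleSeconds(double refreshHz, double renderFps)
{
    assert(refreshHz > 0.0 && renderFps > 0.0 && refreshHz != renderFps);
    return 1.0 / std::fabs(refreshHz - renderFps);
}

// Age difference, in milliseconds, between two frames that are `framesApart`
// render intervals apart, as in the A/C example above.
inline double frameGapMs(double renderFps, int framesApart)
{
    assert(renderFps > 0.0 && framesApart >= 0);
    return framesApart * 1000.0 / renderFps;
}
```

With a 60Hz monitor and a 59.7 FPS target, `tearLineCycleSeconds(60.0, 59.7)` is about 3.3 seconds, which matches the "some seconds" above. `frameGapMs(90.0, 2)` gives the 22.22ms gap between frames A and C.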

tl;dr: If a stable framerate matters at all, and your application is windowed, use v-sync, even if it is expensive and introduces input delay. In fullscreen, and when not using v-sync, render as close to slightly below (~ -0.5%) the monitor refresh rate as you can, whereas 1Hz less is better than 10Hz more.

Laurent · « **Reply #13 on:** October 07, 2014, 10:03:00 pm »

Quote

The MSDN documentation on timeBeginPeriod() and timeEndPeriod() is likely wrong/misleading. These functions should scope the rendering loop (program execution), not the sleep interval

The problem with this function is that it changes the resolution of the system-wide scheduler. We already discussed related issues on this forum, if you're interested.

About frame synchronization and smooth display: if you have not read them yet, I recommend searching for related posts on the Qt blog. As far as I remember, they have done a lot of research on the subject for their QtQuick back-end (based on OpenGL), and came up with interesting algorithms.
