Author Topic: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower  (Read 5409 times)

Ezekiel

  • Newbie
  • *
  • Posts: 23
    • View Profile
Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« on: August 15, 2016, 02:26:26 am »
Greetings All:

I have been upgrading one of my SFML 1.6 (VS2008) programs to SFML 2.4.0 (VS2013) and found that the math cycles between graphics frames are significantly slower than in the previous version.

I made a simplified version for the comparison, and its math cycles are about 18 times slower than the original.  A preprocessor switch selects which of the two Visual Studio versions the demo code is built for.  Here is the code...

I could use some help finding the faster path with the newer SFML 2.4.0.  There could also be some differences in the compilers and/or the computer hardware interface.

Specs:
AMD 64 X2 Dual Core Processor 5200+
Windows 7
NVIDIA GeForce 7600 GT

Ezekiel

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #1 on: August 16, 2016, 03:27:33 am »
Greetings All:

I did some additional testing by adding timing points to the main loop.  I found 2 changes between SFML 1.6 and SFML 2.4.0 that affect the speed of the main loop.

1st, the time taken by the App.pollEvent(Event) call.  Times measured:
SFML 1.6    SFML 2.4.0
~3us        ~29us

I worked around that by polling events only on the loop iteration that follows the frame display, rather than on every iteration.  This change improved the 1.6 version by about 25%, and the 2.4.0 loop rate by about 10 times.  After that, the 1.6 loop is still about 2.5 times faster than the 2.4.0 loop.

2nd, the time taken by the getElapsedTime call differs between the 1.6 and 2.4.0 versions.  Times measured:
SFML 1.6    SFML 2.4.0
~1.2us      3.000us

As shown, the 2.4.0 call takes about 2.5 times as long as the 1.6 call, which matches the remaining difference I see.  Also as shown, the 2.4.0 timing is far more consistent, while the 1.6 timing was very erratic.  Calling getElapsedTime frequently is required to start drawing each frame at the right time, so I haven't found a good way to solve this yet.

The good news is that the App.Draw sequence takes about 2.8ms in both versions, so there doesn't seem to be any timing change in the graphics section.

Here is the new version test code with the added time measurement...

Be forewarned, any measurements I made can be dramatically different with different hardware or operating systems.
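
For anyone who wants to repeat per-call measurements like these on their own hardware, here is a minimal sketch using only the standard library. The helper name averageMicroseconds is made up for illustration and is not taken from the test program in this thread; it calls a function many times and divides the total, so the cost of reading the clock is spread over the whole run.

```cpp
#include <chrono>
#include <cstddef>

// Average wall-clock cost of one call to `fn`, in microseconds.
// `fn` is called `calls` times back to back and the total is divided,
// so the clock is read only twice for the whole run.
// (Hypothetical helper, not part of SFML.)
template <typename Fn>
double averageMicroseconds(Fn&& fn, std::size_t calls)
{
    using Clock = std::chrono::steady_clock;

    const auto start = Clock::now();
    for (std::size_t i = 0; i < calls; ++i)
        fn();
    const std::chrono::duration<double, std::micro> total = Clock::now() - start;

    // Guard against division by zero when no calls were requested.
    return calls == 0 ? 0.0 : total.count() / static_cast<double>(calls);
}
```

With SFML 2.x this could be used as averageMicroseconds([&] { clock.getElapsedTime(); }, 100000), assuming clock is an sf::Clock. An optimizing compiler may remove work whose result is unused, so treat very small numbers with suspicion.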

I hope you find this interesting.  Any suggestions would be appreciated.

Mr_Blame

  • Full Member
  • ***
  • Posts: 192
    • View Profile
    • Email
Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #2 on: August 16, 2016, 11:59:53 am »
Did you compile both tests in release configuration? SFML versions (from 2.0, I think) have a lot of checks in their code when compiled in debug.

Ezekiel

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #3 on: August 16, 2016, 01:25:20 pm »
Greetings Mr_Blame:

I tested both versions as mentioned and compared the speeds between them before posting the responses.  I used Release, as Debug tends to slow things down quite a bit, so I didn't try Debug.  The timing measurements are shown in the program window when you get it running.  They can obviously differ from my measurements on different hardware or operating systems.

If you want to test the 1.6 version, VS 2008 is the latest Visual Studio version that the old SFML library was built for, so I believe that Visual Studio version will be required.

I used 2.3 for a while and upgraded to 2.4.0 a few days ago, which required no changes at all, but I'm not familiar with 2.0.  I'm also not very skilled at linking all the libs, so that may cause errors as well.  I use the following in the 2.4.0 include folder, and for the 1.6 version an older variant that I posted in an old thread a while ago.  Versions earlier than 2.2 may be more similar to the 1.6 version, but I'm not sure.

Include references to the libs...

I hope that helps.

dabbertorres

  • Hero Member
  • *****
  • Posts: 505
    • View Profile
    • website/blog
Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #4 on: August 16, 2016, 07:27:35 pm »
Well, at lines 149-152:
image.create(WindowX, WindowY, sf::Color(0, 50, 25));
texture.create(WindowX, WindowY);
texture.update(image);

You're creating an image in RAM and uploading it to VRAM every single frame. Of course that's going to be slow. I'm assuming this is demo code since the image isn't changing at all, but, I wouldn't be surprised if this is at least part of your slow-down.

However, the best thing to do, since you're using Visual Studio, would be to use its built-in profiler to figure out which function calls are taking up the most time.

Ezekiel

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #5 on: August 16, 2016, 10:38:33 pm »
Greetings dabbertorres:

The section you mentioned is only run on Event Resized; however, around line 300 in 2.4.0 mode, a similar sequence is done, minus the resizing functions.  In the main loop, a new random pixel is generated on every pass (lines 200 to 220) and that single pixel is modified in the image (line 230 in 2.4.0 mode).  The image is modified thousands of times during each frame, so this is best done in RAM, as I understand it.  Any improvements to the display time would be appreciated.

Pictures of the two versions are attached below.  The first line shows how many pixels are modified per frame, and the next shows the time per loop, during which the pixel is updated and the clock is checked to see if it's frame-time yet.  The next 4 lines are measured during each frame: the display time, poll time, solve time, and pixel mod time.  The last 2 lines show the time taken by the 2 getElapsedTime calls; the Math Loop time includes the same cost once, since it calls the function once.

I hope that helps.

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 11034
    • View Profile
    • development blog
    • Email
Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #6 on: August 17, 2016, 12:33:54 am »
As has been suggested, run a profiler and see where you're losing time.

Single pixel manipulation should still be fast enough, but the question really is, why do you need to do that? Texture manipulations should really happen through shaders nowadays.

And finally, does it really negatively impact your game or is it just a comparison for fun?
Official FAQ: https://www.sfml-dev.org/faq.php
Official Discord Server: https://discord.gg/nr4X7Fh
——————————————————————
Dev Blog: https://duerrenberger.dev/blog/

Ezekiel

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #7 on: August 17, 2016, 10:22:39 am »
Greetings eXpl0it3r:

The program I upgraded from SFML 1.6 to 2.4.0 is a Mandelbrot Fractal program.  Since it's a much larger program, I copied out the small sections needed to check the differences.  Since the fractal section is complicated and uses other classes, I made a random pixel section so it could be done in 20 lines in a single main function and still have similar graphics requirements.  Both formats do some math to solve a single pixel in an array of pixels, and update the display every frame.

Most of the changes required for the upgrade from 1.6 to 2.xx are spelling or capitalization, plus a few getSize().x calls rather than GetWidth(), except for the graphics section, which requires a texture step to display an image.  I apologize for my assumption: when the graphics section, with the only change mentioned, was much slower, it seemed to be a graphics change affecting the speed.  However, as mentioned in previous posts, the time changes are in the poll event call and the getElapsedTime call in the main loop, so this should have been posted in the window or system section, respectively.  Both versions display the image in the same amount of time, so graphics are affected by the timing changes, but aren't the culprit.

The program does a lot of calculation to solve a single pixel, so waiting longer for the poll event or getElapsedTime call slows the progression rate, and the image is the only place I could find to modify a single pixel in RAM.  If there are any faster ways, let me know and I will see if I can modify my test program.  I will also have to get to know the profiler; I haven't used that before.

eXpl0it3r

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #8 on: August 17, 2016, 11:31:06 am »
Since the Mandelbrot Fractal only needs to be "recalculated" when you change the parameters of the view you're looking at, you could update the texture only when the image has changed.
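
A minimal sketch of that "only upload when changed" idea, written without SFML so it stands on its own. The class DirtyPixelBuffer and its members are hypothetical names for illustration: pixel writes set a flag, and the upload callback only runs when the flag is set.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU-side RGBA pixel buffer that remembers whether it changed since the
// last upload. All names here are hypothetical, not part of SFML.
class DirtyPixelBuffer
{
public:
    DirtyPixelBuffer(std::size_t width, std::size_t height)
        : m_width(width), m_pixels(width * height * 4, 0), m_dirty(true)
    {
    }

    // Writes one opaque pixel and marks the buffer as changed.
    // No bounds checking: x and y must be inside the buffer.
    void setPixel(std::size_t x, std::size_t y,
                  std::uint8_t r, std::uint8_t g, std::uint8_t b)
    {
        std::uint8_t* p = &m_pixels[(y * m_width + x) * 4];
        p[0] = r;
        p[1] = g;
        p[2] = b;
        p[3] = 255;
        m_dirty = true;
    }

    // Calls upload(const std::uint8_t*) only if something changed since the
    // previous upload. Returns true when an upload actually happened.
    template <typename Upload>
    bool uploadIfDirty(Upload&& upload)
    {
        if (!m_dirty)
            return false;
        upload(m_pixels.data());
        m_dirty = false;
        return true;
    }

private:
    std::size_t m_width;
    std::vector<std::uint8_t> m_pixels;
    bool m_dirty;
};
```

Once per frame, something like buffer.uploadIfDirty([&](const std::uint8_t* px) { texture.update(px); }) would then skip the RAM to VRAM copy on frames where no pixel was written, assuming texture is an sf::Texture created with the same dimensions.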

The event polling has been shown to be slow when there are issues with gamepads/joysticks, however 29us is still very fast and should never really be a problem.
3ms for retrieving the elapsed time sounds a bit long, but on the other hand the question is, why do you need to call it many times during one frame iteration?

So am I getting the right conclusion that so far there hasn't been any actual performance issue?
0.029ms + 3ms are still way below 16.66ms that you need to render something at 60fps.

My Mandelbrot fractal renderer I wrote with SFML 2 doesn't suffer from rendering issues, but the main problem is to generate the pixel information quickly enough. This is usually not possible within 16.666ms which is why I used loading screens (iirc) and tried to parallelize the calculation.

And as a final note, fractals can also be rendered on the GPU directly with shaders. :D

Mario

  • SFML Team
  • Hero Member
  • *****
  • Posts: 879
    • View Profile
Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #9 on: August 17, 2016, 10:15:52 pm »
Did my own test run in VS 2015 (only 2.4):

  • GPU load on a GTX 680 is hardly measurable (work per frame takes between ~40µs and 610µs).
  • Most time spent indeed happens to be texture.update(image);, which is understandable. (57.1%)
  • App.pollEvent(event) takes the second place due to joystick handling. (13.9%)
  • Third place and no longer significant are the calls to clock.getElapsedTime(). (0.7%)
So with these stats I still decided to have a closer look into sf::Texture::update() and this is where it indeed gets interesting:



Maybe I'm missing something, but I'm not really sure what I should think of this. This might indeed be some bug or quirk, but it's not really in SFML's code, because this "throwaway" object has a very simple constructor:



This function only retrieves the current active texture identifier (read: an integer).

According to OpenGL.org this behavior is actually to be expected:

Quote
Note that this solution emphasizes correctness over performance; the glGetIntegerv call may not be particularly fast.

They provide another workaround (Direct State Access), although right now I'm not sure whether that's feasible in our situation or not. This is definitely something to look into I guess.



But besides that, it's still possible you've got some other bottleneck somewhere, so you'd have to do the measurement/analysis yourself.



Edit: Replaced the TextureSaver with some simple pushing/popping of OpenGL states (not 100% sure this would work here) and also tried removing the code altogether. The load moves to the actual texture loading, but overall I didn't get any noticeable performance difference (still around 3000 fps, if I manipulate the framerate capping).

« Last Edit: August 17, 2016, 10:53:52 pm by Mario »

Ezekiel

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #10 on: August 18, 2016, 01:48:50 am »
Greetings All:

I forgot to show what my first performance issue was.  See attached picture...

...it was about 10 times slower.  As mentioned in the second post, I solved the pollEvent() issue by only calling it once per frame, which also improved the 1.6 version.  I re-enabled the issue by commenting out "if(FetchPollTime)" on line 132.  29us every loop is a large waste of time; once per frame is not a big deal.  Anyone can check that if desired.

eXpl0it3r, the getElapsedTime() time is 3us, not 3ms; it would be MUCH WORSE if it were in milliseconds.

Modifying the random pixel loop would be easy: since solving each pixel takes a fixed number of assembly instructions, I could calculate how many pixels fit in ~16.5ms and run through that many, then check the time frequently to start the frame draw at the right time.  Fractals are more difficult: some pixels take 1 check (not even an iteration), and some never escape and so take the full iteration limit.  The limit can be in the billions for deep zooms.  So checking whether it's frame display time after each pixel is calculated is what shows the progress.  And now as I recall, on deep zooms, any Event functions can be delayed, since a single pixel can take a long time.
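
To illustrate why the per-pixel cost varies so much, here is a plain escape-time loop as a sketch. It is not the code from the actual fractal program, and it leaves out shortcuts such as the single-check early exit mentioned above. The function name escapeIterations is an assumption made for this example.

```cpp
// Returns how many iterations of z = z*z + c it takes for z to leave the
// radius-2 disc, or maxIterations if it never does within the limit.
// (Hypothetical helper: a textbook escape-time loop, no optimizations.)
int escapeIterations(double cr, double ci, int maxIterations)
{
    double zr = 0.0;
    double zi = 0.0;
    int n = 0;

    // |z|^2 <= 4 is the same test as |z| <= 2, without the square root.
    while (n < maxIterations && zr * zr + zi * zi <= 4.0)
    {
        const double nextZr = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = nextZr;
        ++n;
    }
    return n;
}
```

A point far outside the set, such as c = 2 + 2i, leaves after a single iteration, while c = 0 never escapes and always costs the full limit. That spread is what makes a fixed "N pixels per frame" budget unreliable for fractals.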

Well, I have my task in front of me: calculate pixels until frame-time without asking the clock.  Also, I haven't tried using threads for multi-core efficiency, or GPU shading.  Are there classes/functions in SFML that are convenient for GPU processing?  And if so, does shading prefer float, double or integer math?
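
One possible middle ground between asking the clock after every pixel and never asking it is to read the clock once per batch of work. The sketch below is only an illustration of that idea; runUntil and BatchResult are made-up names, and the clock is passed in as a callable so the same code works with std::chrono or, presumably, with an sf::Clock wrapper.

```cpp
#include <cstddef>

// Counters reported by runUntil (hypothetical type for this sketch).
struct BatchResult
{
    std::size_t workItems;  // how many times work() ran
    std::size_t clockReads; // how many times now() was called
};

// Runs work() in batches of batchSize until now() reaches deadline.
// The clock is read once per batch instead of once per work item.
// batchSize should be at least 1, otherwise the loop only polls the clock.
template <typename Now, typename TimePoint, typename Work>
BatchResult runUntil(Now&& now, TimePoint deadline,
                     std::size_t batchSize, Work&& work)
{
    BatchResult result{0, 0};
    for (;;)
    {
        ++result.clockReads;
        if (now() >= deadline)
            break;

        for (std::size_t i = 0; i < batchSize; ++i)
        {
            work();
            ++result.workItems;
        }
    }
    return result;
}
```

With a batch size of 256, a 3us clock read is paid once per 256 pixels rather than once per pixel. The trade-off is that the frame can start late by up to one batch, which matters on deep zooms where a single pixel can take a long time, so the batch size would need to shrink as the per-pixel cost grows.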

Mr_Blame

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #11 on: August 18, 2016, 10:12:34 am »
Do not call pollEvent once per frame, event queue may overflow!

Mario

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #12 on: August 18, 2016, 12:34:20 pm »
Mr_Blame is right, while that might work in general, there are cases where it will screw up, especially once you've got major frame drops.

Process all the events you need. 30µs is nothing. If that is a problem for you, you're indeed running the event loop too often, which might be due to the way you try to handle your events. Just slow down the event loop, if there isn't anything to process (just always render between update loops).
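
As a sketch of "process all the events" once per frame: the point is to loop until the queue is empty rather than to call pollEvent a single time. The helper below is written as a template so it compiles without SFML; drainEvents and FakeWindow are hypothetical names, and FakeWindow exists only to exercise the helper.

```cpp
#include <cstddef>
#include <queue>

// Drains every pending event and returns how many were handled.
// Works with any window type exposing `bool pollEvent(Event&)`, which is
// the shape of sf::Window::pollEvent in SFML 2.x.
template <typename Event, typename Window, typename Handler>
std::size_t drainEvents(Window& window, Handler&& handle)
{
    Event event;
    std::size_t handled = 0;
    while (window.pollEvent(event)) // keep going until the queue is empty
    {
        handle(event);
        ++handled;
    }
    return handled;
}

// Stand-in window used only to try the sketch without SFML.
struct FakeWindow
{
    std::queue<int> pending;

    bool pollEvent(int& out)
    {
        if (pending.empty())
            return false;
        out = pending.front();
        pending.pop();
        return true;
    }
};
```

With SFML this would presumably be called as drainEvents<sf::Event>(App, handler) right after each frame is displayed, so the polling cost is paid once per frame while the queue is still emptied completely.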

As for your question regarding GPU processing: SFML allows you to use geometry, vertex, and pixel shaders. To access stuff like CUDA, you'd have to use another library.

However, I guess it's still the best idea to just render your fractal in a pixel shader, especially considering shaders are very good and efficient at calculating things in parallel (on modern cards several thousand pixels in parallel). Here's a simple example for a Mandelbrot renderer.

eXpl0it3r

Re: Upgraded from 1.6 to 2.4.0 is ~18 Times Slower
« Reply #13 on: August 18, 2016, 01:26:06 pm »
I still don't see what the actual problem is here. Do you have a performance issue or not?

Are you just saying that SFML 2.4.0 performs differently than 1.6? If so, SFML's goal is not to be the fastest and best performing library out there. We focus on a clean API and maintainable code. As such, updates don't imply that performance will stay the same or get better, as long as the differences are insignificant.
In the past we had reports that the joystick checks within the event loop can be quite slow on Windows. In such cases, where people don't need joysticks/gamepads but need more frametime, we suggested commenting out those joystick calls in SFML's event polling code and rebuilding SFML.
Other solutions include the more complex and error-prone way of putting event handling into a dedicated thread with waitEvent(). That way the OS events can be handled independently and you can forward the events in a secure way (no race conditions, protected shared memory, etc.) to your main thread. Only go down this path if you really know what you're doing.
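
For the forwarding part of that last option, a mutex-protected queue is one common building block. The sketch below shows only the queue; LockedQueue is a hypothetical name and this is not SFML code. The thread that blocks in waitEvent() would call push(), and the main thread would call tryPop() once per frame.

```cpp
#include <mutex>
#include <queue>
#include <utility>

// Minimal thread-safe FIFO for handing events from one thread to another.
// (Hypothetical helper; every access takes the same mutex.)
template <typename T>
class LockedQueue
{
public:
    // Called by the producer, e.g. the thread blocked in waitEvent().
    void push(T value)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_items.push(std::move(value));
    }

    // Called by the consumer. Non-blocking: returns false when empty.
    bool tryPop(T& out)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_items.empty())
            return false;
        out = std::move(m_items.front());
        m_items.pop();
        return true;
    }

private:
    std::mutex m_mutex;
    std::queue<T> m_items;
};
```

Note that tryPop() has the same shape as pollEvent(), so the main loop can stay almost unchanged. Be aware that some operating systems only deliver window events to the thread that created the window, so which thread owns the window has to be decided first.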