
Topic: Vertex array limit 205 and dramatic performance degradation (CPU high load)


mkalex777

  • Full Member
  • ***
  • Posts: 206
    • View Profile
I found some strange behavior with View zooming.
I'm using an empty scene with a grid rendered with vertices for testing.
When I start the app it has a default zoom of 1.35 and consumes < 1% CPU (most of the time 0%, sometimes 1%).

But when I change the zoom, something strange happens at a threshold between the values 3.1046 and 3.1726.
When I reach this threshold, CPU usage jumps to 26% (100% load of a single CPU core) and stays at that level for any zoom higher than the threshold.
If I change the zoom back, CPU load drops to 0%.

What is happening? Why is there such an abrupt increase in CPU load?
I think zoom should not affect it. At least not so drastically.

I measured it with VSyncEnabled and SetFrameLimit(0).
I tested without VSync and it seems that this threshold doesn't affect fps. But what is happening with the CPU load?
This happens with a window size of 800x800. When I change the window size, the threshold shifts as well.

For example, with window size 400x400:
Zoom      CPU load
6.2112    0%
6.3452    26% (100% load of a single CPU core)

Window size 900x900:
Zoom      CPU load
2.7601    0%
2.8201    26%

I'm using the SFML.NET binding with CSFML 2.2 and SFML 2.2 on Windows 7 x64 with an nVidia GeForce 460, driver version 10.18.13.5390.

I tried commenting out the grid rendering, so the render loop only renders the debug text, and the issue disappears.
I can zoom out up to a value of 55 and CPU load stays at 0%...

The grid rendering code is the following:
        private void RenderGrid()
        {
            var alpha = 0.2F * _camera.Scale;
            alpha = alpha < 0F ? 0F : alpha;
            alpha = alpha > 1F ? 1F : alpha;
            var color = new Color(0x00, 0x00, 0x00, (byte)(255 * alpha));

            const float step = 50F;
            var size = (Vector2f)_window.Size / _camera.Scale + new Vector2f(step, step) * 2F;
            var offset = _camera.Center - size / 2F - new Vector2f(step, step);
            var vertices = new List<Vertex>();
            for (var y = step - (offset.Y % step); y < size.Y; y += step)
            {
                var sx = offset.X + 0.5F;
                var sy = offset.Y + y + 0.5F;
                vertices.Add(new Vertex(new Vector2f(sx, sy), color));
                vertices.Add(new Vertex(new Vector2f(sx + size.X, sy), color));
            }
            for (var x = step - (offset.X % step); x < size.X; x += step)
            {
                var sx = offset.X + x + 0.5F;
                var sy = offset.Y + 0.5F;
                vertices.Add(new Vertex(new Vector2f(sx, sy), color));
                vertices.Add(new Vertex(new Vector2f(sx, sy + size.Y), color));
            }
            _window.Draw(vertices.ToArray(), PrimitiveType.Lines);
        }
 

_camera.Scale is used in the following way:
            _viewWorld.Size = (Vector2f)_window.Size;
            _viewWorld.Zoom(1F / _camera.Scale);
            _viewWorld.Center = _camera.Center;
            _window.SetView(_viewWorld);
 

Note that the zoom values I mention are the ones passed to view.Zoom, i.e. Zoom = 1F / _camera.Scale;

I tried hardcoding the alpha channel of the color:
            var alpha = 1F;
            alpha = alpha < 0F ? 0F : alpha;
            alpha = alpha > 1F ? 1F : alpha;
 

But it didn't affect the issue...

I measured the vertex count that triggers the jump in CPU load, and it seems to be constant. CPU load jumps from 0% to 26% when the vertex count grows from 204 to 208, and this is independent of window size...

I ruled out memory allocation for the vertex list by allocating a fixed-size array of 500 vertices, but that did not change the result. CPU load jumps from zero to the maximum when the vertex count is more than 204...

Finally I tested whether the issue is really affected by zoom. To do that I replaced the zoom with a fixed value of 3F. Zoom actually doesn't affect the issue; it depends on vertex count only.

So the CPU load jumps when the line count is higher than 204/2 = 102 lines.
Why?
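
As a rough cross-check of the numbers above, the vertex count can be recomputed from the loop bounds in RenderGrid. This is only a sketch and not part of the original test code: the GridMath class and its method names are hypothetical, and the camera center is assumed to be (0, 0), which the post does not state. It mirrors the arithmetic size = window * zoom + 2 * step and counts loop iterations.

```csharp
using System;

// Hypothetical helper (not from the original project): recomputes how many
// vertices RenderGrid would emit for a given window size and view zoom.
public static class GridMath
{
    // Counts the iterations of one of the two loops in RenderGrid.
    // zoom is the value passed to View.Zoom, i.e. 1 / _camera.Scale.
    public static int LinesPerAxis(float windowSize, float zoom, float center, float step = 50f)
    {
        float size = windowSize * zoom + 2f * step;   // _window.Size / scale + 2 * step
        float offset = center - size / 2f - step;     // same offset as in RenderGrid
        int count = 0;
        for (float p = step - (offset % step); p < size; p += step)
            count++;
        return count;
    }

    // Two vertices per line, horizontal and vertical lines summed.
    // The camera center defaults to (0, 0), which is an assumption.
    public static int VertexCount(float windowWidth, float windowHeight, float zoom,
                                  float centerX = 0f, float centerY = 0f)
    {
        return 2 * (LinesPerAxis(windowWidth, zoom, centerX)
                  + LinesPerAxis(windowHeight, zoom, centerY));
    }
}

public static class GridMathDemo
{
    public static void Main()
    {
        // Zoom values reported on each side of the CPU-load jump.
        Console.WriteLine(GridMath.VertexCount(800f, 800f, 3.1046f));
        Console.WriteLine(GridMath.VertexCount(800f, 800f, 3.1726f));
        Console.WriteLine(GridMath.VertexCount(400f, 400f, 6.3452f));
        Console.WriteLine(GridMath.VertexCount(900f, 900f, 2.8201f));
    }
}
```

For all three window sizes, the visible world extent (window size * zoom) is about 2484 units just below the jump and about 2538 units just above it, which fits the observation that the jump follows the vertex count and not the zoom itself. The exact counts depend on the real camera center, so they may differ by a line or two from the reported 204 and 208.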
« Last Edit: October 07, 2015, 12:59:23 pm by mkalex777 »

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 11016
    • View Profile
    • development blog
    • Email
View zoom and performance (Draw lines with vertex array)
« Reply #1 on: October 07, 2015, 07:41:14 am »
Profile it.
Official FAQ: https://www.sfml-dev.org/faq.php
Official Discord Server: https://discord.gg/nr4X7Fh
——————————————————————
Dev Blog: https://duerrenberger.dev/blog/

mkalex777

Re: View zoom and performance (Draw lines with vertex array)
« Reply #2 on: October 07, 2015, 10:36:58 am »
Profile it.

What do I need to profile? The profiler shows that most of the time is spent in window.Draw. The nvidia profiler shows that most of the time is spent in glwritearray. And that does not seem to depend on the 204 vertex threshold. Also, with vblank sync disabled I didn't see any fps jumps at this threshold. But with vsync enabled I see the CPU load jump from 0% to full consumption of one core.
« Last Edit: October 07, 2015, 10:40:06 am by mkalex777 »

eXpl0it3r

View zoom and performance (Draw lines with vertex array)
« Reply #3 on: October 07, 2015, 10:42:29 am »
What if you actually limit your framerate with setFramerateLimit(60)?

mkalex777

Re: View zoom and performance (Draw lines with vertex array)
« Reply #4 on: October 07, 2015, 11:13:54 am »
What if you actually limit your framerate with setFramerateLimit(60)?

I'm using SetFrameLimit(0).

I tried it with SetFrameLimit(60). I got a stable high CPU load, so it eats CPU like hell even with vertex count < 204 :)

But my display refresh rate is actually 75 Hz, so I also tried SetFrameLimit(75). I get the same behavior as with SetFrameLimit(0): CPU load jumps from zero to the maximum at the 204 vertex threshold.

I performed a small test. I replaced the grid renderer with a circle shape of radius 100 and varied its point count. I get the same threshold: when the point count reaches 205, CPU load jumps from 0% to 26% (maximum consumption of a single CPU core).
That's logical: if there is a vertex count threshold, it should be the same for any kind of drawing. And it really does trigger the CPU load jump even with CircleShape.

So, what's going on in window.Draw when the vertex count exceeds 204?

Update: I found that OpenGL has some limits on vertex count, given by GL_MAX_ELEMENTS_VERTICES and GL_MAX_ELEMENTS_INDICES.
When the vertex count limit is exceeded, performance can drop off a cliff.
The maximum for nVidia cards is 65536.
I searched the SFML source code for these attributes and didn't find them.
205 vertices is far too small a limit for SFML; it is much smaller than the nVidia card limit of 65536.
So, where is the bottleneck?

Yes! It really is the vertex count that triggers the performance drop.

I changed the source code in the following way:
        private void RenderGrid()
        {
            var alpha = 0.2F * _camera.Scale;
            alpha = alpha < 0F ? 0F : alpha;
            alpha = alpha > 1F ? 1F : alpha;
            var color = new Color(0x00, 0x00, 0x00, (byte)(255 * alpha));

            const float step = 50F;
            var size = (Vector2f)_window.Size / _camera.Scale + new Vector2f(step, step) * 2F;
            var offset = _camera.Center - size / 2F - new Vector2f(step, step);
            var vertices = new List<Vertex>();
            for (var y = step - (offset.Y % step); y < size.Y; y += step)
            {
                var sx = offset.X + 0.5F;
                var sy = offset.Y + y + 0.5F;
                vertices.Add(new Vertex(new Vector2f(sx, sy), color));
                vertices.Add(new Vertex(new Vector2f(sx + size.X, sy), color));
            }
            _window.Draw(vertices.ToArray(), PrimitiveType.Lines); //  <= added lines to decrease
            vertices.Clear();                                                                //  <= vertex array length
            for (var x = step - (offset.X % step); x < size.X; x += step)
            {
                var sx = offset.X + x + 0.5F;
                var sy = offset.Y + 0.5F;
                vertices.Add(new Vertex(new Vector2f(sx, sy), color));
                vertices.Add(new Vertex(new Vector2f(sx, sy + size.Y), color));
            }
            _window.Draw(vertices.ToArray(), PrimitiveType.Lines);
        }
 

So now I split the vertex array into two parts and make two calls to _window.Draw instead of one.
And yes, the threshold shifted: high CPU load is now triggered at vertex count > 410.

Then I changed the source as follows, to call window.Draw once per line:
            for (var y = step - (offset.Y % step); y < size.Y; y += step)
            {
                var sx = offset.X + 0.5F;
                var sy = offset.Y + y + 0.5F;
                var array = new[]
                {
                    new Vertex(new Vector2f(sx, sy), color),
                    new Vertex(new Vector2f(sx + size.X, sy), color),
                };
                _window.Draw(array, PrimitiveType.Lines);
            }
            for (var x = step - (offset.X % step); x < size.X; x += step)
            {
                var sx = offset.X + x + 0.5F;
                var sy = offset.Y + 0.5F;
                var array = new[]
                {
                    new Vertex(new Vector2f(sx, sy), color),
                    new Vertex(new Vector2f(sx, sy + size.Y), color),
                };
                _window.Draw(array, PrimitiveType.Lines);
            }
 

And now the high CPU load disappears for any number of lines :)
But the grid is just a background. Vertices are also used in RectangleShape and CircleShape, so the vertex limit definitely needs to be raised from 205 to at least 30000.

So, how can I increase this limit?
A limit of 205 vertices looks like a joke, it's so small... :)
« Last Edit: October 07, 2015, 12:27:10 pm by mkalex777 »

victorlevasseur

  • Full Member
  • ***
  • Posts: 206
    • View Profile
There's no vertex limit. I use vertex arrays containing far more than 205 vertices and I don't have any problems...

mkalex777

There's no vertex limit. I use vertex arrays containing far more than 205 vertices and I don't have any problems...

It really exists, at least with SFML 2.2 and the SFML.NET binding. Vertex arrays with more than 205 vertices do work, but with such an amount the CPU (not the GPU) is loaded like hell.

Nexus

  • SFML Team
  • Hero Member
  • *****
  • Posts: 6287
  • Thor Developer
    • View Profile
    • Bromeon
Vertex arrays with more than 205 vertices do work, but with such an amount the CPU (not the GPU) is loaded like hell.
205 is not an "amount", it's absolutely nothing. Today's CPUs can easily handle millions of vertices if clever data structures and algorithms are used...

How come you've had problems with that, what was the actual code that led to it?
Zloxx II: action platformer
Thor Library: particle systems, animations, dot products, ...
SFML Game Development:

mkalex777

See the previous posts for the code sample.
In short:

If you pass more than 204 vertices (PrimitiveType.Lines) to RenderWindow.Draw, CPU load jumps to the maximum.
If you pass fewer than 204 vertices to RenderWindow.Draw, CPU load stays at 0% (zero).

The same goes for shapes.
If you set CircleShape.PointCount to more than 205, CPU load jumps to the maximum.
If you set CircleShape.PointCount to less than 205, CPU load stays at 0% (zero).

Set vsyncenabled(true) and framelimit(0) when monitoring CPU load. Otherwise you will always see maximum CPU load.
« Last Edit: October 08, 2015, 01:08:54 am by mkalex777 »

Nexus

Sorry I missed that. But it's obviously not a problem that can be generalized, hundreds of SFML users are using vertex arrays with many more vertices just fine. I have developed thor::ParticleSystem which can easily contain thousands of particles based on sf::VertexArray, without problems.

Everything so far indicates a problem in your graphics driver... Can you try to test another version (maybe even an earlier one)? By the way, this whole grid code is not useful for us, we need minimal complete examples in order to be sure that nothing else is interfering. It should be fairly easy to write one that draws only a vertex array and does nothing else.

Can you also compile CSFML (or even SFML) code directly? The fewer abstraction layers, the better...
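
A minimal complete example of the kind requested here might look like the sketch below. It is not code from the thread: the class name, window title, and line layout are made up for illustration, and it assumes the SFML.NET 2.2 namespaces (SFML.Graphics, SFML.System, SFML.Window). It draws a single vertex array with PrimitiveType.Lines and nothing else, using the vsync-on, no-frame-limit setup described earlier in the thread.

```csharp
using System;
using SFML.Graphics;
using SFML.System;
using SFML.Window;

public static class MinimalRepro
{
    // Must be even (two vertices per line). Edit this value to step across
    // the reported threshold, e.g. 204 versus 208.
    private const int VertexCount = 208;

    public static void Main()
    {
        var window = new RenderWindow(new VideoMode(800, 800), "Vertex count test");
        var running = true;
        window.Closed += (sender, e) => running = false;

        // Measurement setup described in the thread: vsync on, no frame limit.
        window.SetVerticalSyncEnabled(true);
        window.SetFramerateLimit(0);

        // Horizontal black lines, 3 pixels apart, built once outside the loop.
        var vertices = new Vertex[VertexCount];
        for (var i = 0; i < VertexCount; i += 2)
        {
            var y = 2f + 3f * (i / 2);
            vertices[i] = new Vertex(new Vector2f(0f, y), Color.Black);
            vertices[i + 1] = new Vertex(new Vector2f(800f, y), Color.Black);
        }

        while (running)
        {
            window.DispatchEvents();
            window.Clear(Color.White);
            window.Draw(vertices, PrimitiveType.Lines);
            window.Display();
        }

        window.Close();
    }
}
```

Running it once with VertexCount = 204 and once with 208 while watching CPU load would show whether the jump reproduces without the camera, grid, and debug text code.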

mkalex777

Actually I'm using the SFML.NET binding, which uses CSFML 2.2. I will try to test this issue on the latest SFML later. I need to configure the build on a Windows machine.

The issue is actually hidden, because when you pass more than 205 vertices it still works and is not that slow. So it can give the illusion that everything works fine. But it doesn't.
When I optimized the code to split vertex arrays into parts of 100 vertices each and pass them separately to RenderWindow.Draw, I got much better performance and low CPU load!
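
The splitting described above could be sketched roughly as follows. The VertexBatching class and its Split method are hypothetical names, not part of SFML.NET or of the original project, and the helper is generic so it can be tried without SFML. Note that the thread does not establish why smaller batches help; this only illustrates the workaround.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits an array into consecutive batches.
public static class VertexBatching
{
    // Yields arrays of at most batchSize items, in the original order.
    // For PrimitiveType.Lines, batchSize should be even so that the two
    // vertices of a line never end up in different batches.
    public static IEnumerable<T[]> Split<T>(T[] source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (var start = 0; start < source.Length; start += batchSize)
        {
            // The last batch may be shorter than batchSize.
            var length = Math.Min(batchSize, source.Length - start);
            var batch = new T[length];
            Array.Copy(source, start, batch, 0, length);
            yield return batch;
        }
    }
}
```

In the grid code from the first post, each batch returned by VertexBatching.Split(vertices.ToArray(), 100) would be passed to _window.Draw(batch, PrimitiveType.Lines) in a foreach loop. This allocates a new array per batch on every frame, so it trades some garbage for smaller draw calls.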

PS: recently I fought with performance issues and found a lot of tricks to improve SFML app performance. Now my app runs 10x faster than before and I get about 500 fps in the usual gameplay mode. With vblank sync enabled it loads the CPU at just 3-7%...  :)
« Last Edit: October 11, 2015, 02:44:39 am by mkalex777 »

mkalex777

I tested this limitation on CSFML 2.3 and it reproduces there too: the limit is 205 vertices.

victorlevasseur

And with SFML 2.3 (in C++)?

mkalex777

And with SFML 2.3 (in C++)?

I still haven't tested it with plain SFML; I need to configure the build. I have primarily worked with CSFML/SFML.NET 2.2 and 2.3.

eXpl0it3r

Vertex array limit 205 and dramatic performance degradation (CPU high load)
« Reply #14 on: October 12, 2015, 08:49:21 am »
Maybe one day you'll realize that your driver acts weird and it has nothing to do with SFML.

Have you disabled "Threaded Optimization" as suggested before?