Author Topic: Slow window.clear() after drawing to RenderTexture [Intel GPU]  (Read 4821 times)

Glocke

  • Sr. Member
  • ****
  • Posts: 289
  • Hobby Dev/GameDev
    • View Profile
Slow window.clear() after drawing to RenderTexture [Intel GPU]
« on: February 21, 2015, 11:36:10 am »
Hi, I'm experimenting with my lighting system by adding more lights. Unfortunately, having ~25 light sources is extremely slow on my netbook. So I started reducing the code to a minimal example, and here it is (see spoiler).
(shader code)

I removed the entire scene-drawing process and left only (A) drawing the lighting texture to a render texture and (B) clearing the scene. So the texture isn't drawn to the window at all, yet clearing the empty window takes a lot of time. Here's the output on my machine:
146ms to clear scene
0ms to clear scene
144ms to clear scene
0ms to clear scene
145ms to clear scene
1ms to clear scene
70ms to clear scene
0ms to clear scene
147ms to clear scene
0ms to clear scene
147ms to clear scene
0ms to clear scene
145ms to clear scene
0ms to clear scene
144ms to clear scene
0ms to clear scene
145ms to clear scene
0ms to clear scene
71ms to clear scene
1ms to clear scene
94ms to clear scene
59ms to clear scene
68ms to clear scene
71ms to clear scene
55ms to clear scene
72ms to clear scene
146ms to clear scene
0ms to clear scene
70ms to clear scene
 
I'm not sure where those "0ms" come from; maybe my graphics card is limiting the framerate on its own.
So that's a lot of milliseconds just for clearing the screen. I experimented a bit:
The more light sources are drawn to the render texture, the longer it takes to clear the window.

Somewhere on the forum I read that render textures use FBOs if available, and glxinfo says GL_EXT_framebuffer_object is available.

So I don't know what's wrong :S I already have the latest driver for my graphics card (a no-name Intel chip) from the official repository of my Ubuntu distribution (which isn't outdated either^^). Hunting for a more up-to-date non-official driver is currently too risky for me. I'm glad my netbook is working well, so I don't want to touch a running system :P In the past, I always had problems after trying to update my graphics drivers (no matter which machine :S )...

So maybe you've got an idea other than "update your graphics card driver" .. it is up-to-date (from my distribution's point of view).

Kind regards

/EDIT: Improved thread title
« Last Edit: February 23, 2015, 02:08:49 pm by Glocke »
Current project: Racod's Lair - Rogue-inspired Coop Action RPG

Mario

  • SFML Team
  • Hero Member
  • *****
  • Posts: 879
    • View Profile
Re: Slow window.clear()
« Reply #1 on: February 21, 2015, 04:05:43 pm »
Have you tried whether VSync has any impact on this? Although I'd expect it to slow down displaying the scene rather than clearing it.

Glocke

Re: Slow window.clear()
« Reply #2 on: February 21, 2015, 04:39:50 pm »
Have you tried whether VSync has any impact on this? Although I'd expect it to slow down displaying the scene rather than clearing it.
I tried it but it had no effect.

Glocke

Re: Slow window.clear()
« Reply #3 on: February 23, 2015, 11:42:01 am »
I tested a dummy shader (which doesn't affect the pixels): clearing the window runs a lot faster.
Does anyone have an idea what's wrong?

/EDIT: Here's some more simplified code
#include <iostream>
#include <string>
#include <SFML/Graphics.hpp>

// lighting shader: exponential falloff around `center`
std::string const my_shader =
        "uniform vec2 center;"
        "uniform float radius;"
        "uniform float intensity;"
        "void main() {"
        "       float dist = distance(gl_TexCoord[0].xy, center);"
        "       float color = exp(-1.5*dist/(radius/3.0));"
        "       gl_FragColor = vec4(gl_Color.xyz * color, intensity / 255.0);"
        "}";

// pass-through shader; GLSL requires main() to return void
std::string const dummy_shader = "void main() { gl_FragColor = gl_Color; }";

int main(int argc, char* argv[]) {
        sf::RenderWindow window{{640, 480}, "Test"};
        sf::RenderTexture buffer;
        buffer.create(window.getSize().x, window.getSize().y);
       
        // used to draw enlightened area
        sf::VertexArray array{sf::Quads};
        array.append({{0.f, 0.f}});
        array.append({{640.f, 0.f}});
        array.append({{640.f, 480.f}});
        array.append({{0.f, 480.f}});
        for (std::size_t i = 0u; i < 4u; ++i) {
                array[i].texCoords = array[i].position;
                array[i].color = sf::Color::Red;
        }
       
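        sf::Shader shader;
        // any command-line argument selects the lighting shader
        std::string const& source = (argc > 1) ? my_shader : dummy_shader;
        if (!shader.loadFromMemory(source, sf::Shader::Fragment)) {
                std::cerr << "failed to compile shader" << std::endl;
                return 1;
        }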
       
        sf::Clock clock;
        while (window.isOpen()) {
                sf::Event event;
                while (window.pollEvent(event)) {
                        if (event.type == sf::Event::Closed) {
                                window.close();
                        }
                }
               
                clock.restart();
                buffer.clear();
                for (auto i = 0u; i < 50u; ++i) {
                        buffer.draw(array, &shader);
                }
                buffer.display();
                std::cout << clock.restart().asMilliseconds() << "ms render buffer" << std::endl;
               
                window.clear();
                std::cout << clock.restart().asMilliseconds() << "ms to clear scene" << std::endl;
               
                sf::Sprite tmp{buffer.getTexture()};
                window.draw(tmp);
                window.display();
                std::cout << clock.restart().asMilliseconds() << "ms to draw buffer" << std::endl;
        }
}

Calling it without arguments uses the dummy shader, calling it with any arguments uses the previous shader.
« Last Edit: February 23, 2015, 11:47:41 am by Glocke »

Glocke

Re: Slow window.clear()
« Reply #4 on: February 23, 2015, 01:39:50 pm »
Meanwhile, I updated my graphics card driver using my vendor's official tool. Unfortunately it didn't solve my problem.

I also created some profiling analysis using gcc's -pg flag. But the output doesn't help from my point of view. Any ideas about the analysis?

Btw
lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Atom Processor D4xx/D5xx/N4xx/N5xx Integrated Graphics Controller
 
« Last Edit: February 23, 2015, 02:05:44 pm by Glocke »

binary1248

  • SFML Team
  • Hero Member
  • *****
  • Posts: 1405
  • I am awesome.
    • View Profile
    • The server that really shouldn't be running
Re: Slow window.clear() after drawing to RenderTexture
« Reply #5 on: February 23, 2015, 02:10:12 pm »
From my experience measuring the OpenGL pipeline, clearing any buffer (the default framebuffer, i.e. your window, or a secondary buffer, i.e. your RenderTexture using the FBO implementation) has to wait for all fragment operations to complete, probably because there is a single bus from the per-sample stage to the actual memory containing the data.

There is a reason why it is called a pipeline. OpenGL commands have to be executed in order since you can do things like read back from a framebuffer in your shader. You can think of clearing anything like running a special "clear fragment shader" on it, since the majority of the work that has to be done is essentially the same.

Your dummy shader is probably being heavily optimized by the driver since there are no data dependencies in it. Since it is a pass-through shader, all invocations are probably executed in parallel as opposed to 1 per available compute unit.

As for the frame timings in your original post, you have to remember: just because you don't see anything doesn't mean that the GPU is not doing anything. You are drawing to an off-screen surface (FBO) and this takes up GPU time. You could perfectly well read back the data using getTexture(), and it has to be what you expect. The specification guarantees this, and since the driver/GPU can't predict what you will do in the future, it has to perform all operations that you request. Any commands that the driver sends down to the GPU will get fed into the pipeline regardless of whether it "makes sense" to you or not. If you want to save time, do less work, regardless of what kind of work it is. All work takes time, and the question is always how efficiently it is done, and this is not only true for GPUs ;).
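To make the timing attribution concrete, here is a toy CPU-side model. It is not OpenGL or SFML code, and it is not how any real driver works; `ToyGpuQueue`, `simulateFrame` and the per-draw cost are made up for illustration. It only assumes what is described above: draw calls queue work, and a later command (here the clear) has to wait for everything queued before it.

```cpp
#include <cassert>
#include <utility>

// Toy model of a deferred command queue (hypothetical, for illustration).
class ToyGpuQueue {
public:
    // Queue a command with a made-up cost in milliseconds.
    // Returns how long the caller is blocked: zero, the work is deferred.
    int submit(int costMs) {
        assert(costMs >= 0);
        pendingMs_ += costMs;
        return 0;
    }

    // A command that must wait for all previously queued work.
    // Returns how long the caller is blocked.
    int submitBlocking(int costMs) {
        assert(costMs >= 0);
        const int blocked = pendingMs_ + costMs;
        pendingMs_ = 0;
        return blocked;
    }

private:
    int pendingMs_ = 0;
};

// One frame of the test program: `draws` shader draws into the off-screen
// buffer, then a window clear that costs nothing by itself.
// Returns {time billed to the draws, time billed to the clear}.
std::pair<int, int> simulateFrame(int draws, int costPerDrawMs) {
    ToyGpuQueue gpu;
    int drawTime = 0;
    for (int i = 0; i < draws; ++i) {
        drawTime += gpu.submit(costPerDrawMs);
    }
    const int clearTime = gpu.submitBlocking(0);
    return {drawTime, clearTime};
}
```

With 50 draws at an invented 3 ms each, `simulateFrame(50, 3)` bills 0 ms to the draws and 150 ms to the clear, which has the same shape as the measurements in the first post.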
SFGUI # SFNUL # GLS # Wyrm <- Why do I waste my time on such a useless project? Because I am awesome (first meaning).

Glocke

Re: Slow window.clear() after drawing to RenderTexture
« Reply #6 on: February 23, 2015, 03:16:36 pm »
Thanks a lot for these details!

If you want to save time, do less work, regardless of what kind of work it is. All work takes time, and the question is always how efficiently it is done, and this is not only true for GPUs ;).
Do you have concrete recommendations for my problem? My game makes heavy use of light effects, which are based on my shader. Of course I could prerender lightmaps for each light (depending on radius, intensity and color). But "animating" those lights would be difficult.

My current idea is to slightly modify the radius and intensity of a light source over time to create e.g. flaring lights. This would be difficult when using prerendered lightmaps: I'd need to precalculate each frame of those "flaring animations" and handle lots of additional textures.

In my opinion, having about ~25 visible light sources in a scene isn't much - even on a netbook. I'm not quite sure how to improve my system. Any ideas?

Btw I haven't tested it on another system yet. My other computers are much more powerful, so my self-imposed "has to run on a netbook" restriction cannot be tested on those machines. Of course a netbook is the exact opposite of a gaming machine... that's why I focus on 2D :)

(modified example code inside spoiler)
« Last Edit: February 23, 2015, 03:33:23 pm by Glocke »

Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32504
    • View Profile
    • SFML's website
    • Email
Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #7 on: February 23, 2015, 03:39:10 pm »
Quote
Of course I could prerender lightmaps for each light (depending on radius, intensity and color). But "animating" those lights would be difficult.
Create a generic light attenuation texture (grayscale), and a sprite that uses it:
- scale it to change the radius
- change its alpha to change its intensity
- change its color to change its... color
... then draw that sprite, no shader needed, straight-forward and cheap to animate.

Wouldn't it work?
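A minimal sketch of how such a generic attenuation texture could be generated once at startup, assuming the falloff formula from the shader in Reply #3. The function names and the pixel layout (white RGB, falloff stored in the alpha channel) are assumptions for illustration, not something specified in this thread; a grayscale RGB layout would be the obvious alternative.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Falloff from the thread's shader: exp(-1.5 * dist / (radius / 3)).
float falloff(float dist, float radius) {
    assert(radius > 0.f);
    return std::exp(-1.5f * dist / (radius / 3.f));
}

// Builds RGBA pixels for a square light texture with `size` pixels per side.
// Assumed layout: white RGB, falloff in alpha. The light radius is size / 2.
std::vector<std::uint8_t> makeLightPixels(unsigned size) {
    assert(size > 0u);
    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(size) * size * 4u);
    const float center = size / 2.f;
    const float radius = size / 2.f;
    for (unsigned y = 0; y < size; ++y) {
        for (unsigned x = 0; x < size; ++x) {
            // Measure from the pixel centre, hence the +0.5.
            const float dx = (x + 0.5f) - center;
            const float dy = (y + 0.5f) - center;
            const float value = falloff(std::sqrt(dx * dx + dy * dy), radius);
            const std::size_t i = (static_cast<std::size_t>(y) * size + x) * 4u;
            pixels[i + 0] = 255;
            pixels[i + 1] = 255;
            pixels[i + 2] = 255;
            pixels[i + 3] = static_cast<std::uint8_t>(value * 255.f + 0.5f);
        }
    }
    return pixels;
}
```

In SFML 2.x a buffer like this can be passed to `sf::Image::create(width, height, pixels)` and turned into a texture with `sf::Texture::loadFromImage`, so the exponential is evaluated once at startup instead of per fragment in every frame.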
Laurent Gomila - SFML developer

binary1248

Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #8 on: February 23, 2015, 03:41:43 pm »
My game makes heavy use of light effects, which are based on my shader. Of course I could prerender lightmaps for each light (depending on radius, intensity and color). But "animating" those lights would be difficult.

My current idea is to slightly modify the radius and intensity of a light source over time to create e.g. flaring lights. This would be difficult when using prerendered lightmaps: I'd need to precalculate each frame of those "flaring animations" and handle lots of additional textures.
Cache whatever makes sense for future use and simply recalculate the rest. Static lights can obviously be reused; however, it lies in the nature of dynamic lighting that it is expensive. Even big games struggle to render a large number of dynamic lights simultaneously. Generally, in order to save memory, they resort to using light/shadow maps of lower resolution than the targets on which they are cast, leading to the "blockiness" you see if you look very closely at the edges of shadows in certain games. A common misconception is that newer games use up so much graphics memory because of larger and larger texture sizes. While that is true, it is only half of the story. In the settings of such games you can often adjust the lighting/shadow quality, which has a direct impact on memory usage for the reason I just explained.
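As a rough worked example of the memory argument, assuming an uncompressed RGBA8 target (4 bytes per pixel) and the 640x480 window from the test program. `lightmapBytes` is a hypothetical helper, not an SFML function.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for an uncompressed RGBA8 render target.
// `divisor` shrinks both dimensions; 2 means half resolution per axis.
std::size_t lightmapBytes(unsigned width, unsigned height, unsigned divisor) {
    assert(divisor > 0u);
    const std::size_t w = width / divisor;
    const std::size_t h = height / divisor;
    return w * h * 4u;
}
```

`lightmapBytes(640, 480, 1)` gives 1,228,800 bytes, while `lightmapBytes(640, 480, 2)` gives 307,200 bytes. Halving the resolution per axis quarters the memory, and it also quarters the number of pixels that have to be shaded when drawing into the lightmap.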

In my opinion, having about ~25 visible light sources in a scene isn't much - even on a netbook. I'm not quite sure how to improve my system. Any ideas?
Maybe you should try lighting/shadowing using the more "classical" method that is used in 3D as well. Instead of overlaying a "light texture", you directly compute the alterations the lighting makes when rendering the final fragment. This way, you will have to potentially compute more, but it will save you the texture lookup which can be quite expensive on a lot of hardware. The same applies to shadows. Read on how they are done in 3D and apply it to 2D with optimizations where they can be done.
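A CPU-side reference sketch of what "computing the lighting when rendering the final fragment" could look like. This is only one possible reading of the suggestion: the `Light` struct, the additive combination of lights and the clamp to 1 are assumptions, and in practice the loop would live in a fragment shader rather than in C++.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Light {          // hypothetical light description
    float x, y;         // position in pixels
    float radius;       // same meaning as the shader's `radius`
    float intensity;    // 0..1
};

// Brightness factor in [0, 1] for the fragment at (px, py): the clamped sum
// of all light contributions, using the thread's exponential falloff.
float lightFactor(float px, float py, const std::vector<Light>& lights) {
    float sum = 0.f;
    for (const Light& l : lights) {
        assert(l.radius > 0.f);
        const float dist = std::hypot(px - l.x, py - l.y);
        sum += l.intensity * std::exp(-1.5f * dist / (l.radius / 3.f));
    }
    return std::min(sum, 1.f);
}

// Final colour channel: the scene value scaled by the light factor.
float shade(float sceneValue, float px, float py, const std::vector<Light>& lights) {
    return sceneValue * lightFactor(px, py, lights);
}
```

Compared with the overlay approach there is no intermediate light texture to fill and sample, but every fragment now pays for a loop over all lights, so whether this is faster on a given GPU would have to be measured.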

Of course a netbook is the exact opposite of a gaming machine... that's why I focus on 2D :)
This is another misunderstanding that a lot of people have. Just because something is 2D doesn't mean the GPU has less work to do. Internally, it does just as much as it would for a 3D scene. Obviously the vertex count is going to be lower (1 less dimension) but all the rest, matrix multiplication, perspective divide, fragment operations, framebuffer operations etc. will be the same regardless whether you are in 2D or 3D.

Glocke

Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #9 on: February 23, 2015, 03:45:35 pm »
Create a generic light attenuation texture (grayscale), and a sprite that uses it:
- scale it to change the radius
- change its alpha to change its intensity
- change its color to change its... color
... then draw that sprite, no shader needed, straight-forward and cheap to animate.

Wouldn't it work?
This idea is awesome! I already have a maximum light radius, so rendering this texture for sf::Color::White and the maximum light radius at startup should work. So I'd draw my scene and then iterate over my lights: move, scale, colorize (incl. alpha) the sprite and draw it as usual (sf::BlendMultiply). Did I understand your idea right?
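A small sketch of the per-light bookkeeping this implies, including the flickering effect mentioned earlier in the thread. The helper names, the 0..1 intensity range and the sine-based flicker are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct LightSpriteParams {     // hypothetical helper type
    float scale;               // uniform sprite scale factor
    std::uint8_t alpha;        // alpha of the sprite colour (intensity)
};

// Maps a light's radius and intensity (0..1) to sprite settings, given the
// radius the shared light texture was rendered for.
LightSpriteParams spriteParams(float radius, float intensity, float maxRadius) {
    assert(maxRadius > 0.f && radius >= 0.f);
    const float clamped = std::fmin(std::fmax(intensity, 0.f), 1.f);
    return {radius / maxRadius, static_cast<std::uint8_t>(clamped * 255.f + 0.5f)};
}

// Simple flicker: the radius oscillates by +/- `amount` (a fraction of the
// base radius) at `hz` cycles per second.
float flickerRadius(float baseRadius, float amount, float seconds, float hz) {
    const float twoPi = 6.2831853f;
    return baseRadius * (1.f + amount * std::sin(twoPi * hz * seconds));
}
```

The results would then go into `sf::Sprite::setScale(scale, scale)` and into the alpha component of the colour passed to `sf::Sprite::setColor`, so animating a light only changes two numbers per frame and needs no extra textures.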


Read on how they are done in 3D and apply it to 2D with optimizations where they can be done.
Thanks, I'll do so!

This is another misunderstanding that a lot of people have. Just because something is 2D doesn't mean the GPU has less work to do. Internally, it does just as much as it would for a 3D scene. Obviously the vertex count is going to be lower (1 less dimension) but all the rest, matrix multiplication, perspective divide, fragment operations, framebuffer operations etc. will be the same regardless whether you are in 2D or 3D.
Of course you're right! Having one dimension less reduces a lot of complexity in people's minds, but not in the rendering process .. unfortunately :D
« Last Edit: February 23, 2015, 03:48:49 pm by Glocke »

Laurent

Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #10 on: February 23, 2015, 04:07:48 pm »
Quote
This idea is awesome! I already have a maximum light radius, so rendering this texture for sf::Color::White and the maximum light radius at startup should work. So I'd draw my scene and then iterate over my lights: move, scale, colorize (incl. alpha) the sprite and draw it as usual (sf::BlendMultiply). Did I understand your idea right?
Yes, except you'd have to draw all your light sprites to a render-texture with normal blending, and this render-texture to your scene with multiplicative blending.
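Why the intermediate render-texture matters can be seen on a single pixel. The sketch below is a simplified CPU model with one colour channel in [0, 1]; the function names are hypothetical. Each entry of `brightness` is one light's strength at that pixel, used as the alpha of a white sprite in the two-pass version and as a gray value in the direct version.

```cpp
#include <cassert>
#include <vector>

// Pass 1: alpha-blend every light onto a lightmap pixel cleared to black.
float lightmapValue(const std::vector<float>& brightness) {
    float dst = 0.f;
    for (float a : brightness) {
        assert(a >= 0.f && a <= 1.f);
        dst = 1.f * a + dst * (1.f - a);   // src * alpha + dst * (1 - alpha)
    }
    return dst;
}

// Pass 2: multiply the scene by the finished lightmap.
float twoPass(float scene, const std::vector<float>& brightness) {
    return scene * lightmapValue(brightness);
}

// For comparison: multiplying every light straight onto the scene.
float directMultiply(float scene, const std::vector<float>& brightness) {
    for (float a : brightness) {
        scene *= a;   // every light darkens what the previous ones lit
    }
    return scene;
}
```

For a pixel fully lit by one light and out of range of a second one (`{1.0, 0.0}`) with a scene value of 0.8, the two-pass version keeps 0.8, while the direct multiply ends at 0: the second light blacks out what the first one lit.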

Glocke

Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #11 on: February 23, 2015, 04:33:48 pm »
you'd have to draw all your light sprites to a render-texture with normal blending, and this render-texture to your scene with multiplicative blending.
I just modified the code in that way.
(See spoiler)

It's now running faster, but it's still quite slow (>40ms). Maybe I should consider using fewer dynamic lights, shouldn't I?

/EDIT: Of course the maximum texture size isn't necessary. I'll shrink this later.
« Last Edit: February 23, 2015, 04:40:10 pm by Glocke »

Laurent

Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #12 on: February 23, 2015, 04:46:29 pm »
Disable v-sync when you benchmark your code.

Glocke

Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #13 on: February 23, 2015, 04:58:07 pm »
Disable v-sync when you benchmark your code.
Enabling and disabling it had no effect (I guess my driver is vsync'ing anyway).

Btw I now use MAX_RADIUS * 1.5f as the size of the render texture. The intuitive choice might be MAX_RADIUS * 2.f, but the shader's falloff is non-linear, so 1.5f fits best and trims the (redundant) pure black pixels.
Combining the lightmaps and applying them now takes 15-32ms, depending on the lights' radius.
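To put a number on the 1.5 versus 2.0 choice, here is a small calculation of the falloff value at the texture edge. It assumes the shader formula from Reply #3 at full intensity, a light of radius MAX_RADIUS centred in a square texture, and distances measured in the same units as the radius; `edgeValue` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// Falloff value at the nearest edge of a square texture whose side is
// `sideFactor * maxRadius`, for a centred light of radius `maxRadius`.
// Formula from the thread: exp(-1.5 * dist / (radius / 3)).
float edgeValue(float sideFactor) {
    assert(sideFactor > 0.f);
    const float distOverRadius = sideFactor / 2.f;   // centre to nearest edge
    return std::exp(-4.5f * distOverRadius);         // -1.5 / (1/3) = -4.5
}
```

Under these assumptions the edge value is about 0.034 (roughly 9/255) for a factor of 1.5 and about 0.011 (roughly 3/255) for a factor of 2.0. Whether the cut-off at 1.5 is visible will depend on the light's intensity and on the scene underneath.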

binary1248

Re: Slow window.clear() after drawing to RenderTexture [Intel GPU]
« Reply #14 on: February 23, 2015, 05:01:54 pm »
Enabling and disabling it had no effect (I guess my driver is vsync'ing anyway).
https://github.com/SFML/SFML/issues/727