I'm new to shaders and have a performance problem I'd like more experienced eyes on.
I have a container called LayeredDrawable that wraps sf::Sprite and also holds a pointer to the shader applied to it. This way I can bind and unbind shaders easily. In my engine, I take these drawable instances, apply their shader, and bake the result into the render target (the window). Here's the code for that:
    void Engine::Draw(LayeredDrawable* _drawable)
    {
        // For now, support at most one shader.
        // Grab the shader and image, apply to a new render target, pass this render target into Draw()
        LayeredDrawable* context = _drawable;
        sf::Shader* shader = context->GetShader();

        if (shader != nullptr) {
            sf::RenderTexture* postFX = new sf::RenderTexture();
            const sf::Texture* original = context->getTexture();
            postFX->create(original->getSize().x, original->getSize().y);
            postFX->draw(sf::Sprite(*context->getTexture()), shader); // bake
            context->setTexture(postFX->getTexture());
            Draw(context, false);
            context->setTexture(*original);
            delete postFX;
        }
        else {
            Draw(context, true);
        }
    }
The signature for the final draw call is `Draw(Drawable* _drawable, bool applyShaders)`. If `applyShaders` is true, the global (scene) shaders are applied on the final pass. Some elements, like the UI, don't need this and pass false. In the example above, a sprite container that already has its own shader is drawn with `applyShaders` set to false so that the scene shaders don't affect it as well. The sprite is then drawn to the render window:
    void Engine::Draw(Drawable* _drawable, bool applyShaders)
    {
        if (!_drawable) {
            return;
        }

        if (applyShaders) {
            window->draw(*_drawable, state);
        }
        else {
            window->draw(*_drawable);
        }
    }
All of this works well and runs fast. Today I created a new GUI component that uses shaders for some selectable elements, and I noticed that once 3 items are selected the game slows down significantly. When an item is selected, a greyscale shader is applied to its sprite to show that it can no longer be selected. Here's the code for that:
    void ChipSelectionCust::Draw()
    {
        Engine::GetInstance().Draw(custSprite, false);

        if (IsInView()) {
            cursorSmall.setPosition(2.f*(7.0f + (cursorPos*16.0f)), 2.f*103.f); // TODO: Make this relative to cust instead of screen

            for (int i = 0; i < chipCount; i++) {
                icon.setPosition(2.f*(9.0f + (i*16.0f)), 2.f*106.f);
                icon.SetShader(nullptr);

                if (!queue[i].state) {
                    icon.SetShader(&greyscale);
                    Engine::GetInstance().Draw(&icon);
                }
                else {
                    Engine::GetInstance().Draw(icon, false);
                }
            }

            icon.SetShader(nullptr);

            for (int i = 0; i < selectCount; i++) {
                icon.setPosition(2*97.f, 2.f*(25.0f + (i*16.0f)));
                Engine::GetInstance().Draw(icon, false);
            }

            if (cursorPos < 5) {
                // Draw the selected chip info
                label.setColor(sf::Color::White);

                if (cursorPos < chipCount) {
                    // Draw the selected chip card
                    sf::IntRect cardSubFrame = TextureResourceManager::GetInstance().GetTextureRectFromChipID(queue[cursorPos].data->GetID());
                    chipCard.setTextureRect(cardSubFrame);
                    chipCard.SetShader(nullptr);

                    if (!queue[cursorPos].state) {
                        chipCard.SetShader(&greyscale);
                        Engine::GetInstance().Draw((LayeredDrawable*)&chipCard);
                    }
                    else {
                        Engine::GetInstance().Draw(chipCard, false);
                    }

                    label.setPosition(2.f*16.f, 16.f);
                    label.setString(queue[cursorPos].data->GetShortName());
                    Engine::GetInstance().Draw(label, false);

                    // the order here is very important:
                    label.setString(std::to_string(queue[cursorPos].data->GetDamage()));
                    label.setOrigin(label.getLocalBounds().width*2.f, 0);
                    label.setPosition(2.f*(label.getLocalBounds().width + 60.f), 143.f);
                    Engine::GetInstance().Draw(label, false);

                    label.setPosition(2.f*16.f, 143.f);
                    label.setOrigin(0, 0);
                    label.setString(std::string()+queue[cursorPos].data->GetCode());
                    label.setColor(sf::Color(225, 180, 0));
                    Engine::GetInstance().Draw(label, false);
                }

                // Draw the small cursor
                Engine::GetInstance().Draw(cursorSmall, false);
            }
            else
                Engine::GetInstance().Draw(cursorBig, false);
        }
    }
As you can see, I reset the shader binding to `nullptr` before each item so that, unless the greyscale shader is set again, the Engine::Draw() call does not apply any shader and sends the sprite directly to the render window. It's the same icon over and over, so I reuse one sprite across the draw calls.
Finally, here's the greyscale.frag.txt code. I do not believe it is the culprit:
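    uniform sampler2D texture;

    void main()
    {
        // Flip the y coordinate when sampling; 1.0 (not 1) keeps this valid on strict GLSL 1.10 compilers.
        vec4 pixel = texture2D(texture, vec2(gl_TexCoord[0].x, 1.0 - gl_TexCoord[0].y));
        vec4 color = gl_Color * pixel;
        // Greyscale: replace each channel with the average of r, g and b.
        color.rgb = vec3((color.r+color.b+color.g)/3.0);
        gl_FragColor = color;
    }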
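For reference, the only colour math in that shader is a plain channel average. Here is a CPU-side sketch of the same computation; the `Rgba` struct and the `toGreyscale` name are made up for illustration and are not part of my engine or of SFML:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical colour type with normalised channels in [0, 1],
// mirroring a GLSL vec4. Not an SFML type.
struct Rgba {
    float r, g, b, a;
};

// Same math as the fragment shader: every colour channel becomes the
// average of r, g and b, and alpha is left untouched.
Rgba toGreyscale(const Rgba& c) {
    const float grey = (c.r + c.b + c.g) / 3.0f;
    return Rgba{grey, grey, grey, c.a};
}
```

So per pixel it is one texture lookup, one multiply and one average, which is why I doubt the shader body itself is the expensive part.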
My hunch is that there are simply too many calls to the GPU, although I don't think that should be a problem: 2D shaders are trivial compared to complex 3D scenes, so it's concerning that this new UI slows the draw pipeline down so much. If toggling the greyscale shader is the culprit, my only idea at the moment is to sort the sprites by their applied shader, bake each group into one big render target the size of the window's framebuffer, and then place the results into the scene one by one. That way there is at most 1 call per shader.
Is this the case?
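To make the sorting idea concrete, here is a rough sketch of the grouping step only, with no actual rendering. `Shader`, `QueuedDraw`, `countShaderSwitches` and `sortByShader` are hypothetical stand-ins I made up for this post (in the engine they would map onto sf::Shader and LayeredDrawable), and I'm assuming draw order between groups doesn't matter, which would not hold for overlapping sprites:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for sf::Shader.
struct Shader {
    int id;
};

// Hypothetical queued draw: which shader to use and which sprite to draw.
struct QueuedDraw {
    const Shader* shader;  // nullptr means "draw without a shader"
    int spriteId;          // placeholder for the real sprite
};

// Counts how often the bound shader would change if the queue were drawn
// in order, starting with no shader bound.
std::size_t countShaderSwitches(const std::vector<QueuedDraw>& queue) {
    std::size_t switches = 0;
    const Shader* bound = nullptr;
    for (const QueuedDraw& d : queue) {
        if (d.shader != bound) {
            ++switches;
            bound = d.shader;
        }
    }
    return switches;
}

// Groups draws by shader, unshaded draws first. std::stable_sort keeps the
// original submission order inside each group.
void sortByShader(std::vector<QueuedDraw>& queue) {
    std::stable_sort(queue.begin(), queue.end(),
        [](const QueuedDraw& a, const QueuedDraw& b) {
            if (a.shader == nullptr || b.shader == nullptr) {
                return a.shader == nullptr && b.shader != nullptr;
            }
            return std::less<const Shader*>()(a.shader, b.shader);
        });
}
```

With an alternating queue (grey, none, grey, none, grey) the unsorted order changes the bound shader on every draw, while the sorted order changes it once. Whether that reduction is what actually matters for my slowdown is exactly what I'm unsure about.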