Do you suggest to bind the texture every draw call (i.e. comment out this line)?
The recommendation to minimize state changes can be found not only in official guides of OpenGL 1.1 (which SFML fortunately still supports) but also on more contemporary ones (Apple, ARM).
I know that changing state often is a bad thing, but what I also know is that changing
client state has much less of an impact than changing server state. Anything that has to do with client-side arrays is purely client side i.e. is performed in the driver. Whether we choose to do a bit of caching ourselves or leave it up to the driver process is just a matter of taste. In the end, it is executed by the same CPU anyway, which is why there is no difference. Moving this responsibility over to the driver will let it optimize much better if it sees any opportunities. Not only that, but moving it out of RenderTarget will also make room for newer RenderTarget implementations (using newer OpenGL), which will result in much better performance anyway. There is no point talking about optimizing that last 1% when we know that OpenGL 1.1 is far from the optimal way of doing things anyway.
Due to various considerations (e.g. undocumented limit for # of vertices, application data structures that need to be contiguous in memory, etc.), even when batching is used there may be multiple draw() calls sharing the same state. And even if not completely, perhaps partially.
I'm not saying that it is always possible to draw your whole scene in a single draw call, but the people who count on the current optimization are literally doing
nothing themselves to make the situation any better. Cutting down the number of batches will have a greater (and more noticeable) impact than having to enable and disable a few states more than necessary. You should normally start to optimize where it has a greater impact, and not where you don't have to think about it because it is taken care of by someone else.
If you want to see how far batching can go, you can go look at the SFGUI renderers. If you look at the SFGUI test application, there are well over a thousand primitives that get rendered every frame, and that using 3 draw calls. People who say batching isn't possible at all in their application without even trying either have no idea what a batch is, or are just too lazy to optimize performance themselves.
As a user, I would prefer SFML to stay on the safe side with regard to best practices for performance, just like it does so well for C++. I'm not familiar with the new feature you're implementing or the overall maintenance efforts of the Graphics library, but I can say that I found RenderTarget quite elegant in its design when I implemented #1015.
Like I said above, this "best practice" is written more with newer OpenGL in mind, and also has more noticeable effects when applied to newer OpenGL. I'm not a friend of going to great depths just to follow "best practice" even though it doesn't really benefit us and even hampers our efforts to maintain/extend the implementation. Getting rid of the state cache is purely a removal, and I don't know about you, but to me removing code which does not do what it was intended to do is an improvement. Encapsulation even had to be broken in Texture because the state cache had to track which texture was last active, already a minus point.