I really don't see why, from a technical perspective, what you are trying to achieve is only doable by creating a temporary sprite every frame...
Assuming that the user should be able to actually recognize something on their screen, said sprite would have to be drawn over multiple frames anyway, meaning that it would have to be recreated multiple times with very similar, if not identical, parameters. Except in special cases (static linking combined with some heavy-duty LTO), the call to the constructor of sf::Sprite, and indirectly to the constructors of its members, will not be optimized away and will have some kind of CPU performance impact. The fact that it doesn't stick out (yet) in a measurement does not mean it doesn't exist.
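To make this concrete, here is a minimal sketch of the usual pattern, written against the SFML 2.x API with a hypothetical "player.png" asset: construct the sprite once outside the main loop and only touch its cheap per-frame state inside it.

```cpp
#include <SFML/Graphics.hpp>

int main()
{
    sf::RenderWindow window(sf::VideoMode(800, 600), "Sprite reuse");

    sf::Texture texture;
    if (!texture.loadFromFile("player.png")) // hypothetical asset
        return 1;

    // Construct the sprite once, before the main loop...
    sf::Sprite sprite(texture);

    while (window.isOpen())
    {
        sf::Event event;
        while (window.pollEvent(event))
        {
            if (event.type == sf::Event::Closed)
                window.close();
        }

        // ...and only update its cheap per-frame state here.
        sprite.move(1.f, 0.f);

        window.clear();
        window.draw(sprite);
        window.display();
    }
}
```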
While I am not the original author of the documentation, I can say that the term "lightweight" was introduced when switching from SFML 1.6 to SFML 2.0 in order to emphasize that sf::Sprite no longer holds a huge chunk of texture data like it used to in SFML 1.6. Making any other form of performance guarantee, especially when relying on the graphics hardware/drivers to do the right thing, is not such a good idea. Something that runs fast on one system does not have to run fast on another, especially when it comes to OpenGL.
When considering such optimizations, we always weigh what kind of effect they might have on the behaviour and performance of the program. It was already clear back then that frequent creation of sf::Sprites and all the other drawables would cause VBOs to be created and destroyed all the time. Based on inspection of publicly available code, it was also clear that this usage pattern wasn't really something to take into consideration, since performance-conscious users would, as explained above, reuse objects over a longer period of time, or skip the pre-made drawables entirely and manage vertex data themselves (see the sketch below).
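As a rough illustration of that second approach (the entity layout and sizes here are mine, only the SFML 2.x API itself is real), a single long-lived sf::VertexArray can hold one quad per entity and be drawn with one call using a shared texture:

```cpp
#include <SFML/Graphics.hpp>
#include <vector>

struct Entity { sf::Vector2f position; }; // illustrative entity type

void drawEntities(sf::RenderTarget& target, const std::vector<Entity>& entities,
                  const sf::Texture& texture, sf::VertexArray& vertices)
{
    const sf::Vector2f size(32.f, 32.f); // illustrative sprite size
    vertices.clear();                    // reuse the same storage every frame
    vertices.setPrimitiveType(sf::Quads);

    for (const Entity& e : entities)
    {
        vertices.append(sf::Vertex(e.position,                             sf::Vector2f(0.f,    0.f)));
        vertices.append(sf::Vertex(e.position + sf::Vector2f(size.x, 0.f), sf::Vector2f(size.x, 0.f)));
        vertices.append(sf::Vertex(e.position + size,                      sf::Vector2f(size.x, size.y)));
        vertices.append(sf::Vertex(e.position + sf::Vector2f(0.f, size.y), sf::Vector2f(0.f,    size.y)));
    }

    // One draw call for all entities, with the shared texture bound.
    target.draw(vertices, sf::RenderStates(&texture));
}
```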
We optimize for best and average case performance. Worst case performance is something no library/API/programming language will optimize for, since it is virtually impossible to predict all the ways something can be used incorrectly, let alone feasible to write workarounds for every single case. Unlike algorithms that operate on input data which cannot be known beforehand and have to reckon with the worst, we assume developers do not program randomly and have some motivation to get the best performance out of their application by whatever means necessary. As long as there are enough ways to get good performance, we won't let worst case performance regressions hinder any optimizations made to improve the former.
Last but not least, the main reason this change was necessary, regardless of performance impact, is that client-side arrays are no longer supported in core OpenGL 3.2+ and OpenGL ES 2+. In order to future-proof SFML and be able to gradually move in this direction, we had to add an implementation that uses VBOs for vertex data storage instead of system RAM. Disabling this change in any way would lock the implementation, and your application, into using legacy OpenGL and/or OpenGL ES 1.0 forever.
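For what it's worth, that same storage strategy is already exposed to users through sf::VertexBuffer (available since SFML 2.5). A minimal sketch, with purely illustrative vertex data:

```cpp
#include <SFML/Graphics.hpp>

int main()
{
    // Vertex data lives in a VBO on the GPU instead of system RAM.
    sf::VertexBuffer buffer(sf::Triangles, sf::VertexBuffer::Static);

    sf::Vertex triangle[3] = {
        sf::Vertex(sf::Vector2f(  0.f,   0.f), sf::Color::Red),
        sf::Vertex(sf::Vector2f(100.f,   0.f), sf::Color::Green),
        sf::Vertex(sf::Vector2f(  0.f, 100.f), sf::Color::Blue)
    };

    buffer.create(3);        // allocate the GPU storage once
    buffer.update(triangle); // upload the vertex data

    // ... later, per frame: window.draw(buffer);
}
```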
Future work can always address even such usage patterns. Making use of a single large VBO, from which we allocate fragments to every drawable, is something I have had on my list for quite some time; I just haven't got around to doing it yet. The necessary backend changes to make this possible are already being worked on, so it's just a matter of time until it sees the light of day.