I appreciate the enthusiasm, but it seems like you have little understanding of the complexities of audio and video editing, capturing, and encoding. This is not a trivial task at all, and it has so many possibilities that it would be hard to even find a simple subset. Even then, it would still not fit the scope of SFML.
There is plenty of capture software out there (e.g. OBS Studio) and even more video and audio editing software (DaVinci, Audacity, etc.). I'd claim that it's easier to use capture software than to try to write your own video encoder for your animation.