Let's argue from the point of view of a user, not a developer. This should be the main approach when designing any API, because that's what an API is: an interface for the user.
As a user, I don't know that a window is required to keep the clipboard, and I'm not interested in all that ownership stuff. It's something the library should take care of for me; otherwise I could just use low-level libraries directly.
First, when looking up the functionality in the SFML documentation, I would not expect the clipboard functionality to reside in a class that is responsible for displaying a window. Continuing with the usage, I don't want to have a window around every time I access the clipboard. Since clipboards have global semantics (even if they're internally accessed through one specific window), I would like to deal with them accordingly. Additionally, there are cases where I have no window at all, or where I have multiple ones and would need to pick one arbitrarily.
An sf::Clipboard class doesn't complicate the implementation much. As soon as a window is around, we can use that one. In the X11 implementation, there is already a global list of windows, which is required for the focus requests. It could be slightly more implementation effort to create a dummy window when there is no window -- but don't forget, this is an additional feature you wouldn't even have with pure sf::Window member functions.
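To make the idea concrete, here is a rough sketch of how such a class could pick its owner window. Everything in it is made up for illustration: NativeWindow, openWindows() and usesDummyWindow() are hypothetical stand-ins, not SFML's actual internals, and a plain string takes the place of the real OS clipboard calls.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a platform window; not SFML's real type.
struct NativeWindow {
    bool dummy = false;
};

// Hypothetical stand-in for the backend's global list of open windows.
std::vector<NativeWindow*>& openWindows() {
    static std::vector<NativeWindow*> windows;
    return windows;
}

class Clipboard {
public:
    static void setString(const std::string& text) {
        owner();           // make sure some window exists to own the clipboard
        storage() = text;  // a real backend would talk to the OS here
    }

    static std::string getString() {
        owner();
        return storage();
    }

    // Exposed only so the sketch can show which window was picked.
    static bool usesDummyWindow() { return owner().dummy; }

private:
    // Use any existing window; fall back to a lazily created hidden one.
    static NativeWindow& owner() {
        if (!openWindows().empty())
            return *openWindows().front();
        static NativeWindow hidden{true};
        return hidden;
    }

    // Stand-in for the system clipboard contents.
    static std::string& storage() {
        static std::string text;
        return text;
    }
};
```

From the user's side this is just Clipboard::setString("hello") and Clipboard::getString(), with or without a window open. The choice of window stays an implementation detail.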
If necessary, things can still be implemented in the sf::Window class (or its backends), and the sf::Clipboard class could either access them directly, or via friend through sf::Window.
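The friend variant could look roughly like this. Again, this is only a sketch under my own assumptions: the Window class below is a simplified stand-in for sf::Window, clipboardText() is a hypothetical private helper, and a static string replaces the platform code.

```cpp
#include <string>

class Clipboard;

// Simplified stand-in for sf::Window; only the clipboard part is shown.
class Window {
    friend class Clipboard;  // the only code allowed to reach the helper below

    // Hypothetical private helper that would wrap the backend's clipboard code.
    static std::string& clipboardText() {
        static std::string text;
        return text;
    }
};

// The public, window-independent interface simply forwards to Window.
class Clipboard {
public:
    static void setString(const std::string& text) {
        Window::clipboardText() = text;
    }

    static std::string getString() {
        return Window::clipboardText();
    }
};
```

The window class keeps the platform details, but none of them appear in its public interface, so the user only ever sees Clipboard.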