I disabled everything but the renderings (like physics etc.), but its still a lot of code.
Well, then instead of only disabling stuff, why not also reduce stuff? It doesn't have to be all or nothing. If you can't separate out sections of your rendering for testing purposes, you have more problems to worry about than something getting stuck every now and then.
1. what does window.display() actually do besides flipping buffers? I checked the code and looks like eventually all it does is glCopyTexSubImage2D() so I don't see how it can get stuck? is there some magic inside I'm missing?
sf::Window::display() doesn't call glCopyTexSubImage2D(), sf::RenderTexture::display() does. Pick one and stick to it. If your window variable really calls glCopyTexSubImage2D(), then I suggest perhaps renaming it to renderTexture or similar so you don't confuse yourself any longer...
2. did you ever encountered a problem similar to this in the past?
No. And it is impossible that we encountered such a problem, because what you describe - "window.display() getting stuck" - isn't even the actual problem, it's what you think the cause of the problem is. There is a difference.
3. how can I debug that function?
You don't have to debug that function, you have to debug your own code. That function does what it's supposed to. Looking for the cause of a problem deep inside library code that functions properly is just a waste of time.
4. which steps should I do to figure out the problem?
Like zsbzsb and I have already said, reduce your code down to an example. The fact that you don't even consider the possibility of reducing your code to something that is reproducible in an example is preventing you from solving your own problem. Without such code, we can't do anything else for you but guess as well.
in WglContext.cpp. SwapBuffers is what taking so long, but the debugger can't find the SwapBuffers() symbol. trying to figure out why..
If this is your methodology for finding performance bottlenecks, you really need to read a bit more on proper profiling techniques. Yes, it is true that a lot of time is probably spent in SwapBuffers() in your application, but that doesn't mean that you have to be able to look inside of it. SwapBuffers() is an operating system function that swaps the back and front framebuffers, so you obviously won't be able to debug inside of it.
If you had read a bit more about it, you would probably have seen that it is allowed to block if the OpenGL command queue is full, meaning the GPU needs more time to finish off all the operations you told it to perform. What are these operations? Everything you did, up to 3 frames ago. GPUs tend to run up to 3 frames behind the CPU, and if the GPU starts to lag even further behind, SwapBuffers() will have to block.
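To make the blocking behaviour a bit more concrete, here is a rough toy model of it in plain C++. It is not how any real driver works internally. The function name `simulateSwapBlocking` and all of its parameters are made up for illustration, and the only assumption it encodes is the one above: the CPU may queue up at most `maxFramesAhead` frames before the swap has to wait for the GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not real driver behaviour): the CPU needs cpuMs to
// record a frame, the GPU needs gpuMs to render it, and the CPU is allowed to
// run at most maxFramesAhead frames ahead of the GPU.
// Returns, for every frame, how long the simulated buffer swap had to block.
std::vector<double> simulateSwapBlocking(std::size_t frames, double cpuMs,
                                         double gpuMs, std::size_t maxFramesAhead)
{
    std::vector<double> gpuDone(frames, 0.0); // time at which the GPU finishes frame i
    std::vector<double> blocked(frames, 0.0); // time the CPU spent waiting in the swap
    double cpuTime = 0.0;

    for (std::size_t i = 0; i < frames; ++i)
    {
        // The CPU records and submits the commands for frame i.
        cpuTime += cpuMs;

        // The GPU starts frame i once it is submitted and the previous frame is done.
        const double previousDone = (i > 0) ? gpuDone[i - 1] : 0.0;
        gpuDone[i] = std::max(cpuTime, previousDone) + gpuMs;

        // The swap blocks until the frame queued maxFramesAhead frames ago is finished.
        if (i >= maxFramesAhead && gpuDone[i - maxFramesAhead] > cpuTime)
        {
            blocked[i] = gpuDone[i - maxFramesAhead] - cpuTime;
            cpuTime = gpuDone[i - maxFramesAhead];
        }
    }

    return blocked;
}
```

In this model, calling `simulateSwapBlocking(6, 1.0, 4.0, 3)` gives no blocking for the first three frames, and after that the swap settles at waiting `gpuMs - cpuMs` every frame. If the GPU is the faster side, e.g. `simulateSwapBlocking(6, 4.0, 1.0, 3)`, the swap never blocks at all. The time shows up in the swap, but it is caused by the work submitted before it.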
How can your OpenGL command queue become so full? Easy: You just did too much "OpenGL stuff". If you keep track of how much "stuff" you are actually doing during your rendering, then you will know whether to expect framerate drops or not. You can't expect the GPU to render a frame with 100000000 commands in the same time as one with 10000 commands. The key to understanding why it takes so long is to understand what you are doing.
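Keeping track of the "stuff" doesn't need anything fancy. A minimal sketch of a per-frame counter could look like the following. `FrameStats`, `recordDraw()` and `reset()` are hypothetical names of my own, not part of SFML or OpenGL, and you would have to call `recordDraw()` yourself next to every draw call in your own code.

```cpp
#include <cstddef>

// Hypothetical helper, not part of SFML: counts how much work was
// submitted during the current frame.
struct FrameStats
{
    std::size_t drawCalls = 0; // number of draw calls issued this frame
    std::size_t vertices = 0;  // total number of vertices submitted this frame

    // Call this next to every draw call, passing the number of vertices drawn.
    void recordDraw(std::size_t vertexCount)
    {
        ++drawCalls;
        vertices += vertexCount;
    }

    // Call this once per frame, after logging or inspecting the counters.
    void reset()
    {
        *this = FrameStats{};
    }
};
```

Log the counters once per frame right before displaying, then reset them. If the frames that "get stuck" are the ones where the counters spike, you know where to start looking.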
There is no "magic", the more work there is, the longer it takes. It's that simple. SFML doesn't create any extra work, OpenGL doesn't either. Only you control how much work is being produced.
The first step to solving this problem is to understand your own code. Show us that you do by reconstructing it in a minimal example and you might even solve the problem yourself while doing it.