Subtraction rarely works, if only because of compression.
You cannot remove vocals perfectly, but you can reduce them.
That can be done by taking a Fourier transform, attenuating the frequencies in the vocal range, and applying the inverse Fourier transform to reconstruct the waveform.
That will not work, as those pesky "vocal frequencies" are 90% of the important "music frequencies" as well, so this would reduce pretty much everything except extreme bass and extreme treble.
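To make the Fourier suggestion and the objection to it concrete, here is a minimal sketch using NumPy. The function name `attenuate_band`, the 300-3400 Hz band and the 0.25 gain are assumptions chosen for illustration, not values from this thread. The test signal is synthetic: three pure tones stand in for bass, a vocal and a guitar.

```python
import numpy as np

def attenuate_band(signal, sample_rate, low_hz=300.0, high_hz=3400.0, gain=0.25):
    """Scale down everything between low_hz and high_hz (hypothetical defaults)."""
    spectrum = np.fft.rfft(signal)                        # forward transform
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= low_hz) & (freqs <= high_hz)      # bins in the "vocal" band
    spectrum[in_band] *= gain                             # attenuate, do not zero
    return np.fft.irfft(spectrum, n=len(signal))          # back to a waveform

# Synthetic one-second mix: bass below the band, vocal and guitar inside it.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
bass = np.sin(2 * np.pi * 100 * t)
vocal = np.sin(2 * np.pi * 1000 * t)
guitar = np.sin(2 * np.pi * 2000 * t)

filtered = attenuate_band(bass + vocal + guitar, sample_rate)

# The filter cannot tell vocal from guitar: both are cut by the same factor.
expected = bass + 0.25 * (vocal + guitar)
print(np.allclose(filtered, expected))
```

The vocal does get quieter, but so does every instrument sharing that band, which is exactly the objection raised above.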
Subtraction isn't perfect because of compression, as you mentioned, and also because of reverb, doubling and other stereo vocal effects, but it really is the best you can do. You could additionally reduce the "presence" range of 2-3 kHz, but that will also affect your music quite a bit.
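A rough sketch of subtraction, assuming it means the usual trick of subtracting one stereo channel from the other so that anything panned dead centre cancels. The helper name `subtract_channels` and the synthetic signals are hypothetical; the delayed copy is only a crude stand-in for a stereo vocal effect.

```python
import numpy as np

def subtract_channels(left, right):
    """Return a mono signal in which identical (centre-panned) content cancels."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    return 0.5 * (left - right)

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
vocal = np.sin(2 * np.pi * 440 * t)    # assumed panned dead centre
guitar = np.sin(2 * np.pi * 220 * t)   # assumed panned hard left

# Ideal case: the vocal is identical in both channels and cancels completely.
clean = subtract_channels(vocal + guitar, vocal)
print(np.allclose(clean, 0.5 * guitar))

# With a stereo effect (a delayed vocal copy in the right channel only),
# the channels no longer match and part of the vocal survives.
right_wet = vocal + 0.3 * np.roll(vocal, 40)
residue = subtract_channels(vocal + guitar, right_wet) - 0.5 * guitar
print(np.max(np.abs(residue)) > 0.1)
```

Anything else mixed to the centre (often bass and kick drum) cancels along with the vocal, and the result is mono.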
In an ideal world you could use BSS (blind source separation) and vocal recognition to remove the vocal source. But in the real world... "that ain't happening".