If one thread only reads a value while the other writes it, volatile will be required so that the compiler doesn't optimize away the read operations in the first thread -- because it wrongly thinks that the variable never changes.
Why would a compiler do that? To create a thread, you have to call a function from the system API, passing the thread entry point as an argument. As the compiler knows nothing about the called function, it cannot assume anything about it. In particular, it cannot assume that the thread entry point is dead code that never runs. Thus, the compiler sees that in some part of the code some variable gets written, and in some other part of the code the same variable gets read. Because it knows nothing about when exactly the variable is written, it leaves both the reads and the writes alone.
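To make that concrete, here's a rough sketch of the situation (the names are mine, not from the original question): a plain bool written by a second thread and polled by the first. The entry point escapes through the thread-creation call, which is exactly the opaque call the compiler cannot see through. Note that polling a plain bool like this is still a data race; the proper fix comes up below.

```cpp
#include <thread>

bool done = false;              // plain bool, deliberately not volatile

void worker() { done = true; }  // the "thread entry point"

int main()
{
    std::thread t(&worker);     // opaque call: the entry point escapes here,
                                // so the compiler can't prove 'done' is never
                                // written while main() keeps running
    while (!done)               // still a data race -- see the fix below
        ;
    t.join();
}
```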
When I say "it knows nothing about when the variable is written", I assume that at least one of these is true:
1. The compiler is threading-aware (if the platform supports threading, you can be virtually certain it is; if the compiler supports C++11, it has to be).
2. Between the reads, there's an intervening call to a function the compiler knows nothing about.
Because 1. is true on every compiler that could possibly be used by a programmer who doesn't deal with really weird platforms, and is certainly true on all platforms supported by SFML, the only reason a compiler could optimize the reads out is undefined behavior in the code or a compiler bug. In the first case, fix your code. In the second case, report the bug and maybe upgrade your compiler.
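And "fix your code" means using a proper synchronization primitive rather than volatile. A minimal sketch with std::atomic (my example, not from the original thread):

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> done{false};  // well-defined cross-thread flag, no volatile

void worker() { done.store(true, std::memory_order_release); }

int main()
{
    std::thread t(&worker);
    while (!done.load(std::memory_order_acquire))
        ;                       // the compiler must re-load on every iteration
    t.join();
}
```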
In the case where only 2. is true, you're dealing with a weird compiler for a weird system, so you should be very well aware of its quirks, because it's rather unlikely that this is the only one. And even then, such a compiler probably doesn't do any aggressive optimizations, so you could get away without volatile.
All that assumes the compiler sees the whole program during its optimization stage. If it doesn't, there are even more optimization barriers, so the point still stands.
Bear in mind that I'm being theoretical here; everything I just said is based on knowledge of data- and control-flow analysis and on the guarantees provided by the standard. So, do you actually know a compiler where you really need volatile in order to make the situation in question work?
This is all with the assumption that there is enough memory on the device that you are coding on to create and use proper sync objects.
A spinlock can take as little as one byte. Or even zero bytes, if you can squash it together with something else - remember that the simplest kind of spinlock needs only a single bit. Yet it's often more than enough to do proper synchronization. If you run on a single CPU, you won't even need any instructions to issue memory barriers.
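For reference, here's what such a minimal spinlock can look like with std::atomic_flag, which is typically a single byte. This is just a sketch, not production code (no backoff, no yielding):

```cpp
#include <atomic>

std::atomic_flag spin = ATOMIC_FLAG_INIT;   // one byte of state, starts clear

void lock()
{
    while (spin.test_and_set(std::memory_order_acquire))
        ;                                   // busy-wait until the holder clears it
}

void unlock()
{
    spin.clear(std::memory_order_release);
}
```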
The only place that I have ever needed volatile is on embedded devices. That is when volatile is needed most (in my experience).
Yes, but it's all about unusual kinds of memory, like memory-mapped registers. You can do proper synchronization without volatile.
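For example, volatile is the right tool when every access has to reach the hardware, like reading a memory-mapped status register (the address and bit below are made up for illustration), but that's about the memory itself, not about synchronizing threads:

```cpp
#include <cstdint>

// Hypothetical status register address; on a real device it comes from the datasheet.
volatile std::uint32_t* const STATUS_REG =
    reinterpret_cast<volatile std::uint32_t*>(0x40021000);

bool device_ready()
{
    return (*STATUS_REG & 0x1u) != 0;       // each call really reads the register
}
```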