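Ah. pthread_mutex_lock() / pthread_mutex_unlock(), as EXTERNAL
functions, will never be optimized away or inlined. The compiler cannot
see into them, so it has to assume they may read or write any memory
they can reach. Now, the calls all being sequence points, if you simply do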
    pthread_mutex_lock(&mutex);     /* mutex: some shared pthread_mutex_t */
    xval = x;
    pthread_mutex_unlock(&mutex);
the compiler is not allowed to move statements out of the locked section
or reorder them across the calls (without need for any volatile
qualifiers). The hardware, however, would still be allowed to reorder the
memory accesses; this is the reason why mutex implementations involve
memory barriers.
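
To make the point concrete, here is a minimal sketch in C, assuming a
POSIX threads implementation. The names mutex, writer, read_x and
run_demo are made up for illustration and are not from the thread. x is
deliberately not volatile; the lock/unlock calls are the only thing
keeping the accesses in place.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state; the names are illustrative only. */
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int x = 0;                      /* note: NOT volatile */

/* Writer thread: increments x 100000 times, always under the lock. */
static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&mutex);
        x = x + 1;              /* stays between lock and unlock */
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

/* Reader: the three-line pattern from the post, wrapped in a function. */
int read_x(void)
{
    int xval;
    pthread_mutex_lock(&mutex);
    xval = x;
    pthread_mutex_unlock(&mutex);
    return xval;
}

/* Runs two writers to completion and returns the final value of x. */
int run_demo(void)
{
    pthread_t t1, t2;
    int rc;

    x = 0;
    rc = pthread_create(&t1, NULL, writer, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, writer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return read_x();
}
```

Calling run_demo() from main() (build with -pthread on gcc or clang)
should give 200000 every time, since no increment can be lost while the
mutex is held.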
The main problem is the lack of a memory model for multi-threaded
applications at the level of the language (C or C++). Fortunately, this
is about to change with C++0x and probably C1X.
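
As a rough sketch of what a language-level memory model buys you: the
atomics proposed for C1X were eventually standardized in C11 as
<stdatomic.h>. Assuming a compiler that ships that header plus POSIX
threads, ordering can be stated in the source itself instead of relying
on external calls. The names payload, ready, producer and consume are
made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared state; the names are illustrative only. */
static int payload;             /* plain data, not atomic, not volatile */
static atomic_int ready;        /* flag that publishes payload */

/* Producer: write the data, then publish it with a release store. */
static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Consumer: wait for the flag with acquire loads, then read the data. */
int consume(void)
{
    pthread_t t;
    int rc;
    int seen;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    /* The acquire load pairs with the release store in producer(). */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;                       /* spin until published */

    seen = payload;             /* guaranteed to observe 42 */
    pthread_join(t, NULL);
    return seen;
}
```

The release/acquire pair constrains both the compiler and the hardware,
which is exactly the guarantee the language could not express before.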
tim