On Sun, Jul 10, 2011 at 06:05:45PM +0300, Dan Muresan wrote:
> Ah. pthread_mutex_lock() / unlock(), as EXTERNAL functions,
> will never be optimized away or inlined. Now, since the calls
> are all sequence points, if you simply do
>
>     pthread_mutex_lock();
>     xval = x;
>     pthread_mutex_unlock();
>
> the compiler is not allowed to move statements out of the
> locked section or reorder them in any way (without need for
> any volatile qualifiers).

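For concreteness, the locked copy described in the quoted
text might be written as follows. This is only a sketch:
xval_lock, read_xval() and write_xval() are hypothetical
names chosen for illustration, and it assumes that every
writer of xval takes the same mutex.

```c
#include <pthread.h>

/* Shared variable and the mutex assumed to guard every access to it.
   Both names are illustrative, not taken from the original code. */
int xval;
pthread_mutex_t xval_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: update xval while holding the lock. */
void write_xval(int v)
{
    pthread_mutex_lock(&xval_lock);
    xval = v;
    pthread_mutex_unlock(&xval_lock);
}

/* Reader side: take one snapshot of xval while holding the lock. */
int read_xval(void)
{
    int x;

    pthread_mutex_lock(&xval_lock);
    x = xval;
    pthread_mutex_unlock(&xval_lock);
    return x;
}
```

A caller would take the snapshot once with read_xval() and
use only that local copy afterwards.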
OK. This is becoming an interesting discussion, so please
allow me to restate clearly the context.
We have a variable 'int xval' that is being modified
by forces unknown to the code we are discussing. This
code is a function f() which uses the value of xval,
but the algorithm implemented by f() requires that
the same value is used at all points where f() uses
xval during a single invocation of that function.
So we have:

    extern int xval;

    void f(void)
    {
        int a, b, c, ... x, y, z;

        x = xval;
        // lots of code using a ... z;
    }

Since f() has many more local variables than the CPU
has registers, the compiler could be tempted to reuse
the register used to store x for some other purpose
at some point in the code. It could do that in two ways:
1. Store x on the stack, and read that location when
x is required again, or
2. Just reuse the register without saving it, and read
the memory location 'xval' again when required.
(2) could make the algorithm fail, because xval could
have changed. So we want to prevent that from happening.
The solution I propose is to declare xval volatile.
This forces the compiler to read it just once, as
expressed by the source code. So it either will do (1),
or maybe decide not to trash the register holding x at
all but use another one.
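A minimal sketch of that idea follows. It assumes xval is
defined in the same file (rather than extern) only so that
the fragment is self-contained, and the two trivial uses of
x stand in for the real algorithm.

```c
/* Defined here only to keep the sketch self-contained; in the
   discussion above it is an extern modified elsewhere. */
volatile int xval;

/* Uses the value of xval at two points. Because xval is volatile,
   the one read written in the source is the only read performed;
   every later use goes through the local copy x. */
int f(void)
{
    const int x = xval;   /* the only access to xval */
    int first = x;        /* first use of the snapshot */
    /* ... lots of code that may exhaust the registers ... */
    int second = x;       /* later use, still the same snapshot */

    return first == second;   /* 1 when both uses saw the same value */
}
```

Without the qualifier, the concern described above is that
the compiler could reload xval for the second use instead
of keeping or spilling x.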
The solution you propose is to protect xval by a mutex.
I invite you to consider the following:
A. If xval is being modified by an interrupt handler
then clearly you can't use the mutex - you can't risk
blocking the interrupt handler.
B. From the point of view of the code we are discussing
*there is no difference* between xval being modified by
an interrupt handler, or by another thread. The difference
is completely irrelevant to f(). The only thing that
matters is that xval can change while f() executes.
You would probably accept the 'volatile' if xval is
being written by an interrupt handler. Given (B),
there is no good reason to reject it in either case.
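Point (B) can be illustrated with a signal handler standing
in for the interrupt handler. This is a sketch under
assumptions: on_signal() and the choice of SIGINT are
illustrative, xval is given the type sig_atomic_t that ISO C
asks for in objects written by handlers, and raise() merely
simulates the asynchronous event arriving in the middle
of f().

```c
#include <signal.h>

/* Written by the handler, read by f(). */
volatile sig_atomic_t xval = 7;

/* Stand-in for the interrupt handler: it only stores a new value. */
void on_signal(int signo)
{
    (void)signo;
    xval = 42;
}

/* Takes one snapshot, is "interrupted", and keeps using the snapshot. */
int f(void)
{
    const int x = xval;   /* single read of the volatile object */

    raise(SIGINT);        /* the handler runs here and changes xval */

    return x;             /* still the value read at entry */
}
```

The handler has to be installed first, for example with
signal(SIGINT, on_signal). From the point of view of f()
it makes no difference whether the store to xval came
from this handler or from another thread.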

> I see. But as I said, in general the cache coherency problem
> is worse than the pipeline reordering problem -- i.e. when
> there are multiple CPUs/cores using different caches, they
> may see actions out-of-order.

On that I absolutely agree - cache coherency is the real
problem, not pipelining. The latter should in fact be
transparent to a language such as C/C++.
Ciao,
--
FA