User's Manual

8-16 Vol. 3
MULTIPLE-PROCESSOR MANAGEMENT
By the principles discussed in Section 8.2.3.2,
processor 2’s first and second loads cannot be reordered, and
processor 3’s first and second loads cannot be reordered.
If r1 == 1 and r2 == 0, processor 0’s store appears to precede processor 1’s
store with respect to processor 2.
Similarly, r3 == 1 and r4 == 0 imply that processor 1’s store appears to precede
processor 0’s store with respect to processor 3.
Because the memory-ordering model ensures that any two stores appear to execute
in the same order to all processors (other than those performing the stores), this set
of return values is not allowed.
8.2.3.8 Locked Instructions Have a Total Order
The memory-ordering model ensures that all processors agree on a single execution
order of all locked instructions, including those that are larger than 8 bytes or are not
naturally aligned. This is illustrated by Example 8-8.
Processor 2 and processor 3 must agree on the order of the two executions of XCHG.
Without loss of generality, suppose that processor 0’s XCHG occurs first.
If r5 == 1, processor 1’s XCHG into y occurs before processor 3’s load from y.
Because the Intel-64 memory-ordering model prevents loads from being
reordered (see Section 8.2.3.2), processor 3’s loads occur in order and,
therefore, processor 1’s XCHG occurs before processor 3’s load from x.
Since processor 0’s XCHG into x occurs before processor 1’s XCHG (by
assumption), it occurs before processor 3’s load from x. Thus, r6 == 1.
A similar argument (referring instead to processor 2’s loads) applies if processor 1’s
XCHG occurs before processor 0’s XCHG.
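The argument above can be checked mechanically with a small sketch. The Python below is not part of the architecture definition. It assumes a simplified model in which every operation of Example 8-8 takes effect atomically at one point in a single global order and each processor's operations stay in program order. This is stricter than the actual memory-ordering model, but it captures the two properties the argument relies on: a total order of the XCHG instructions, and loads that are not reordered with other loads. The names PROGRAMS, interleavings, and run are hypothetical and chosen only for this illustration.

```python
# Sketch: enumerate every interleaving of Example 8-8 under an assumed
# single-global-order model and collect the reachable register values.

# Each operation is (kind, a, b):
#   ("xchg", mem, reg)  models  xchg [mem], reg   (atomic swap)
#   ("load", reg, mem)  models  mov reg, [mem]
PROGRAMS = [
    [("xchg", "x", "r1")],                          # processor 0
    [("xchg", "y", "r2")],                          # processor 1
    [("load", "r3", "x"), ("load", "r4", "y")],     # processor 2
    [("load", "r5", "y"), ("load", "r6", "x")],     # processor 3
]


def interleavings(threads):
    """Yield every merge of the per-processor lists that keeps program order."""
    if all(not t for t in threads):
        yield []
        return
    for i, t in enumerate(threads):
        if t:
            # Take the next operation of processor i, then recurse on the rest.
            rest = threads[:i] + [t[1:]] + threads[i + 1:]
            for tail in interleavings(rest):
                yield [t[0]] + tail


def run(schedule):
    """Execute one global order and return (r3, r4, r5, r6)."""
    mem = {"x": 0, "y": 0}          # initially x == y == 0
    reg = {"r1": 1, "r2": 1}        # initially r1 == r2 == 1
    for kind, a, b in schedule:
        if kind == "xchg":
            mem[a], reg[b] = reg[b], mem[a]   # swap memory and register
        else:
            reg[a] = mem[b]                   # load memory into register
    return (reg["r3"], reg["r4"], reg["r5"], reg["r6"])


OUTCOMES = {run(s) for s in interleavings(PROGRAMS)}
FORBIDDEN = (1, 0, 1, 0)   # r3 == 1, r4 == 0, r5 == 1, r6 == 0

if __name__ == "__main__":
    for outcome in sorted(OUTCOMES):
        print(outcome)
    print("forbidden outcome reachable:", FORBIDDEN in OUTCOMES)
```

Under these assumptions the forbidden combination never appears in OUTCOMES, because it would require processor 2 and processor 3 to observe the two XCHG instructions in opposite orders. The sketch says nothing about behaviors that depend on store buffering, which this model does not represent.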
Example 8-8. Locked Instructions Have a Total Order
Processor 0      Processor 1      Processor 2      Processor 3
xchg [_x], r1    xchg [_y], r2    mov r3, [_x]     mov r5, [_y]
                                  mov r4, [_y]     mov r6, [_x]
Initially r1 == r2 == 1, x == y == 0
r3 == 1, r4 == 0, r5 == 1, r6 == 0 is not allowed
8.2.3.9 Loads and Stores Are Not Reordered with Locked Instructions
The memory-ordering model prevents loads and stores from being reordered with
locked instructions that execute earlier or later. The examples in this section illustrate
only cases in which a locked instruction is executed before a load or a store. The