User's Manual


Vol. 3 8-23

MULTIPLE-PROCESSOR MANAGEMENT

as the XCHG instruction or the LOCK prefix to ensure that a read-modify-write operation on memory is carried out atomically. Locking operations typically operate like I/O operations in that they wait for all previous instructions to complete and for all buffered writes to drain to memory (see Section 8.1.2, “Bus Locking”).
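As an illustration of an atomic read-modify-write, the following is a minimal sketch of a test-and-set lock built on the C11 atomic exchange operation. The type and function names (xchg_lock, xchg_lock_try_acquire, and so on) are hypothetical and do not come from this manual. On IA-32 targets, compilers typically translate the exchange into an XCHG with a memory operand, but that mapping is a property of the compiler, not something this sketch guarantees.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef struct {
    atomic_int state;
} xchg_lock;

void xchg_lock_init(xchg_lock *l)
{
    atomic_init(&l->state, 0);
}

/* Atomically swap 1 into the lock word. The lock was acquired only if
 * the previous value was 0. The read and the write happen as a single
 * indivisible operation, so two callers can never both observe 0. */
bool xchg_lock_try_acquire(xchg_lock *l)
{
    return atomic_exchange_explicit(&l->state, 1, memory_order_acquire) == 0;
}

/* Spin until the exchange reports that the lock was free. */
void xchg_lock_acquire(xchg_lock *l)
{
    while (!xchg_lock_try_acquire(l)) {
        /* busy-wait */
    }
}

/* Releasing needs only an ordinary store with release ordering. */
void xchg_lock_release(xchg_lock *l)
{
    atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

A caller would initialize the lock once with xchg_lock_init, then bracket each critical section with xchg_lock_acquire and xchg_lock_release.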

Program synchronization can also be carried out with serializing instructions (see Section 8.3). These instructions are typically used at critical procedure or task boundaries to force completion of all previous instructions before a jump to a new section of code or a context switch occurs. As with I/O and locking instructions, the processor waits until all previous instructions have been completed and all buffered writes have been drained to memory before executing the serializing instruction.

The SFENCE, LFENCE, and MFENCE instructions provide a performance-efficient way of ensuring load and store memory ordering between routines that produce weakly-ordered results and routines that consume that data. The functions of these instructions are as follows:

• SFENCE: Serializes all store (write) operations that occurred prior to the SFENCE instruction in the program instruction stream, but does not affect load operations.

• LFENCE: Serializes all load (read) operations that occurred prior to the LFENCE instruction in the program instruction stream, but does not affect store operations.

• MFENCE: Serializes all store and load operations that occurred prior to the MFENCE instruction in the program instruction stream.

Note that the SFENCE, LFENCE, and MFENCE instructions provide a more efficient method of controlling memory ordering than the CPUID instruction.
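The producer and consumer pattern described above can be sketched in C as follows. This is an illustration under stated assumptions, not code from this manual: the mailbox type, its functions, and the three macros are hypothetical names. When the compiler targets SSE2, the sketch uses the _mm_stream_si32, _mm_sfence, and _mm_lfence intrinsics from emmintrin.h; on any other target it falls back to plain stores and C11 fences so that it still compiles.

```c
#include <stdatomic.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32, _mm_lfence; also pulls in _mm_sfence */
#define STREAM_STORE(p, v) _mm_stream_si32((p), (v)) /* weakly-ordered store */
#define STORE_FENCE()      _mm_sfence()
#define LOAD_FENCE()       _mm_lfence()
#else
/* Assumed fallback for targets without SSE2: ordinary stores, C11 fences. */
#define STREAM_STORE(p, v) (*(p) = (v))
#define STORE_FENCE()      atomic_thread_fence(memory_order_release)
#define LOAD_FENCE()       atomic_thread_fence(memory_order_acquire)
#endif

#define PAYLOAD_WORDS 4

/* Hypothetical one-slot mailbox shared by one producer and one consumer. */
typedef struct {
    int payload[PAYLOAD_WORDS];
    atomic_int ready;            /* 0 = empty, 1 = payload is valid */
} mailbox;

void mailbox_init(mailbox *m)
{
    atomic_init(&m->ready, 0);
}

/* Producer: write the payload with weakly-ordered stores, then fence
 * the stores before publishing the flag. */
void mailbox_produce(mailbox *m, const int *src)
{
    for (int i = 0; i < PAYLOAD_WORDS; i++)
        STREAM_STORE(&m->payload[i], src[i]);
    STORE_FENCE();  /* earlier stores become visible before the flag store */
    atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/* Consumer: returns 1 and copies the payload if it is ready, else 0. */
int mailbox_consume(mailbox *m, int *dst)
{
    if (atomic_load_explicit(&m->ready, memory_order_acquire) == 0)
        return 0;
    LOAD_FENCE();   /* shown for illustration: orders the flag load before the payload loads */
    for (int i = 0; i < PAYLOAD_WORDS; i++)
        dst[i] = m->payload[i];
    return 1;
}
```

The store fence sits between the payload writes and the flag write because the flag is what tells the consumer that the data may be read. A full MFENCE would also work in both places, at a higher cost.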

The MTRRs were introduced in the P6 family processors to define the cache characteristics for specified areas of physical memory. The following are two examples of how memory types set up with MTRRs can be used to strengthen or weaken memory ordering for the Pentium 4, Intel Xeon, and P6 family processors:

• The strong uncached (UC) memory type forces a strong-ordering model on memory accesses. Here, all reads and writes to the UC memory region appear on the bus, and out-of-order or speculative accesses are not performed. This memory type can be applied to an address range dedicated to memory-mapped I/O devices to force strong memory ordering.

• For areas of memory where weak ordering is acceptable, the write-back (WB) memory type can be chosen. Here, reads can be performed speculatively and writes can be buffered and combined. For this type of memory, cache locking is performed on atomic (locked) operations that do not split across cache lines, which helps to reduce the performance penalty associated with the use of the typical synchronization instructions, such as XCHG, that lock the bus during the entire read-modify-write operation. With the WB memory type, the XCHG instruction locks the cache instead of the bus if the memory access is contained within a cache line.
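Whether a locked access is contained within a cache line can be checked with simple address arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the 64-byte line size is an assumption, since the actual line size depends on the processor and should be confirmed for the target, for example through CPUID.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 64-byte cache lines. Some processors use a different size. */
#define ASSUMED_CACHE_LINE_BYTES 64u

/* Returns true if an access of `size` bytes starting at `addr` touches
 * more than one cache line, i.e. its first and last bytes fall in
 * different lines. A zero-length access never splits. */
bool access_splits_cache_line(uintptr_t addr, size_t size)
{
    if (size == 0)
        return false;
    uintptr_t first_line = addr / ASSUMED_CACHE_LINE_BYTES;
    uintptr_t last_line  = (addr + size - 1) / ASSUMED_CACHE_LINE_BYTES;
    return first_line != last_line;
}
```

For example, a 4-byte operand at offset 60 occupies bytes 60 through 63 and stays within one line, whereas the same operand at offset 62 spills into the next line. Keeping lock variables naturally aligned avoids the split case.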