Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1
MULTIPLE-PROCESSOR MANAGEMENT
• For areas of memory where weak ordering is acceptable, the write back (WB)
memory type can be chosen. Here, reads can be performed speculatively and
writes can be buffered and combined. For this type of memory, cache locking is
performed on atomic (locked) operations that do not split across cache lines,
which helps to reduce the performance penalty associated with the use of the
typical synchronization instructions, such as XCHG, that lock the bus during the
entire read-modify-write operation. With the WB memory type, the XCHG
instruction locks the cache instead of the bus if the memory access is contained
within a cache line.
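As an illustration of the point above, the following is a minimal sketch of an
exchange-based spin lock whose lock word is aligned so that it cannot split
across cache lines. It is not taken from this manual: the names xchg_lock_t,
xchg_lock_init, xchg_trylock, xchg_lock, and xchg_unlock are hypothetical, the
64-byte line size is an assumption that should be checked for the target
processor, and the claim that the compiler emits XCHG for the C11 atomic
exchange is typical of x86 compilers but not guaranteed.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: 64-byte cache lines; verify for the target processor. */
#define CACHE_LINE_SIZE 64

/* Hypothetical lock type. Aligning the lock word to a cache-line boundary
 * guarantees the locked access never straddles two lines. */
typedef struct {
    alignas(CACHE_LINE_SIZE) atomic_int locked;
} xchg_lock_t;

static_assert(sizeof(xchg_lock_t) == CACHE_LINE_SIZE,
              "lock occupies exactly one assumed cache line");

static inline void xchg_lock_init(xchg_lock_t *l)
{
    atomic_init(&l->locked, 0);
}

/* Atomically swap in 1; the lock was acquired if the old value was 0.
 * On x86 this exchange is usually compiled to an XCHG instruction. */
static inline bool xchg_trylock(xchg_lock_t *l)
{
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

static inline void xchg_lock(xchg_lock_t *l)
{
    while (!xchg_trylock(l)) {
        /* Spin on a plain load so waiters do not issue locked
         * operations while the lock is held. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

static inline void xchg_unlock(xchg_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

A caller would initialize the lock once with xchg_lock_init and then bracket
each critical region with xchg_lock and xchg_unlock. Whether the exchange is
handled with a cache lock or a bus lock also depends on the memory type of the
page that holds the lock, as described above.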
The PAT was introduced in the Pentium III processor to enhance the caching
characteristics that can be assigned to pages or groups of pages. The PAT mechanism
is typically used to strengthen caching characteristics at the page level with
respect to the caching characteristics established by the MTRRs. Table 10-7 shows
the interaction of the PAT with the MTRRs.
We recommend that software written to run on Intel Core 2 Duo, Intel Core Duo,
Pentium 4, Intel Xeon, and P6 family processors assume the processor-ordering
model or a weaker memory-ordering model. The Intel Core 2 Duo, Intel Core Duo,
Pentium 4, Intel Xeon, and P6 family processors do not implement a strong
memory-ordering model, except when using the UC memory type. Although the
Pentium 4, Intel Xeon, and P6 family processors support processor ordering, Intel
does not guarantee that future processors will support this model. To make software
portable to future processors, it is recommended that operating systems provide
critical region and resource control constructs and APIs (application program
interfaces) based on I/O, locking, and/or serializing instructions, and that these
be used to synchronize access to shared areas of memory in multiple-processor
systems. Also, software should not depend on processor ordering in situations where
the system hardware does not support this memory-ordering model.
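The recommendation above can be illustrated at the application level with a
minimal sketch that states its ordering requirement explicitly instead of
relying on the processor's ordering model. It is not taken from this manual: it
uses C11 atomics as a stand-in for the synchronization constructs an operating
system or language runtime would provide, and the names shared_payload,
payload_ready, publish, and try_consume are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared data: a payload and a flag that publishes it. */
static int shared_payload;
static atomic_bool payload_ready;

/* Writer: store the payload, then set the flag with release semantics so
 * the payload store cannot become visible after the flag store, whatever
 * ordering model the hardware implements. */
void publish(int value)
{
    shared_payload = value;
    atomic_store_explicit(&payload_ready, true, memory_order_release);
}

/* Reader: load the flag with acquire semantics; the payload is read only
 * after the flag has been observed as set. Returns false if nothing has
 * been published yet. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&payload_ready, memory_order_acquire))
        return false;
    *out = shared_payload;
    return true;
}
```

Because the ordering requirement is written into the source, the compiler is
responsible for emitting whatever instructions the target needs. The same code
therefore remains correct on a processor with a weaker memory-ordering model,
where plain stores and loads alone might not suffice.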
7.3 PROPAGATION OF PAGE TABLE AND PAGE
DIRECTORY ENTRY CHANGES TO MULTIPLE
PROCESSORS
In a multiprocessor system, when one processor changes a page table or page
directory entry, the changes must also be propagated to all other processors. This process
is commonly referred to as “TLB shootdown.” The propagation of changes to page
table or page directory entries can be done using memory-based semaphores and/or
interprocessor interrupts (IPI).
For example, the following describes a simple TLB shootdown sequence for an Intel
64 or IA-32 processor:
1. Begin barrier — Stop all but one processor; that is, cause all but one to HALT or
stop in a spin loop.
2. Let the active processor change the necessary PTEs and/or PDEs.