Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1
CHAPTER 7
MULTIPLE-PROCESSOR MANAGEMENT
The Intel 64 and IA-32 architectures provide mechanisms for managing and
improving the performance of multiple processors connected to the same system
bus. These include:
• Bus locking and/or cache coherency management for performing atomic
operations on system memory.
• Serializing instructions. These instructions apply only to the Pentium 4, Intel
Xeon, P6 family, and Pentium processors.
• An advanced programmable interrupt controller (APIC) located on the processor
chip (see Chapter 8, “Advanced Programmable Interrupt Controller (APIC)”). This
feature was introduced by the Pentium processor.
• A second-level cache (level 2, L2). For the Pentium 4, Intel Xeon, and P6 family
processors, the L2 cache is included in the processor package and is tightly
coupled to the processor. For the Pentium and Intel486 processors, pins are
provided to support an external L2 cache.
• A third-level cache (level 3, L3). For Intel Xeon processors, the L3 cache is
included in the processor package and is tightly coupled to the processor.
• Hyper-Threading Technology. This extension to the Intel 64 and IA-32
architectures enables a single processor core to execute two or more threads
concurrently (see Section 7.6, “Hyper-Threading and Multi-Core Technology”).
These mechanisms are particularly useful in symmetric-multiprocessing (SMP)
systems. However, they can also be used when an Intel 64 or IA-32 processor and a
special-purpose processor (such as a communications, graphics, or video processor)
share the system bus.
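As an illustration of the first mechanism in the list above (atomic operations
on system memory), the following C sketch has several threads increment one
shared counter with a C11 atomic read-modify-write. It is not taken from this
manual: the names run_counter_demo, worker, shared_counter, and MAX_THREADS are
hypothetical, and the example assumes a C11 compiler with POSIX threads. On
Intel 64 and IA-32 targets, compilers typically implement atomic_fetch_add with
a LOCK-prefixed instruction, but the instruction actually chosen is up to the
compiler.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 16              /* hypothetical limit for this sketch */

static atomic_long shared_counter;  /* one memory location shared by all threads */

struct worker_arg {
    long iterations;                /* increments each thread performs */
};

static void *worker(void *p)
{
    const struct worker_arg *arg = p;

    for (long i = 0; i < arg->iterations; i++) {
        /* Atomic read-modify-write: no concurrent increment is lost. */
        atomic_fetch_add(&shared_counter, 1);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter value,
 * or -1 if a thread could not be created. */
long run_counter_demo(int nthreads, long iterations)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    pthread_t tid[MAX_THREADS];
    struct worker_arg arg = { iterations };
    int started = 0;

    atomic_store(&shared_counter, 0);

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }

    /* Wait for every thread that was actually started. */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    if (started != nthreads)
        return -1;

    return atomic_load(&shared_counter);
}
```

Called as run_counter_demo(4, 100000), the function returns 400000 when all
threads start, because every increment is applied atomically. With a plain,
non-atomic long, concurrent increments from different processors could
overwrite one another and the total could fall short.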
These multiprocessing mechanisms serve the following purposes:
• To maintain system memory coherency — When two or more processors are
attempting simultaneously to access the same address in system memory, some
communication mechanism or memory access protocol must be available to
promote data coherency and, in some instances, to allow one processor to
temporarily lock a memory location.
• To maintain cache consistency — When one processor accesses data cached on
another processor, it must not receive incorrect data. If it modifies data, all other
processors that access that data must receive the modified data.
• To allow predictable ordering of writes to memory — In some circumstances, it is
important that memory writes be observed externally in precisely the same order
as programmed.
• To distribute interrupt handling among a group of processors — When several
processors are operating in a system in parallel, it is useful to have a centralized