Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3A, System Programming Guide, Part 1

MEMORY CACHE CONTROL

The processor’s caches are for the most part transparent to software. When enabled, instructions and data flow through these caches without the need for explicit software control. However, knowledge of the behavior of these caches may be useful in optimizing software performance. For example, knowledge of cache dimensions and replacement algorithms gives an indication of how large a data structure can be operated on at once without causing cache thrashing.
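As a rough illustration of how cache dimensions bound the working set, the following Python sketch picks the largest square tile for a blocked algorithm that keeps three tiles in use at once (as a blocked matrix multiply does). The 32-KByte cache size, the 8-byte element size, and the name `max_tile_dim` are assumptions made for this example, not figures from this manual. Real sizing would also have to account for associativity and the replacement algorithm.

```python
import math

def max_tile_dim(cache_bytes, elem_bytes, tiles_in_flight=3):
    """Return the largest n such that `tiles_in_flight` n-by-n tiles of
    `elem_bytes`-sized elements fit in `cache_bytes` (capacity only)."""
    if cache_bytes <= 0 or elem_bytes <= 0 or tiles_in_flight <= 0:
        raise ValueError("all sizes must be positive")
    # Elements available to each tile once the cache is split evenly.
    elems_per_tile = cache_bytes // (elem_bytes * tiles_in_flight)
    # A square tile of side n holds n*n elements.
    return math.isqrt(elems_per_tile)

# Hypothetical 32-KByte data cache holding 8-byte elements.
n = max_tile_dim(32 * 1024, 8)
print("tile side:", n, "bytes used:", 3 * n * n * 8)
```

A tile one element wider would exceed the assumed capacity, which is the point at which a blocked loop would begin to evict its own data.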

In multiprocessor systems, maintenance of cache consistency may, in rare circumstances, require intervention by system software. For these rare cases, the processor provides privileged cache control instructions for use in flushing caches and forcing memory ordering.

The Pentium III, Pentium 4, and Intel Xeon processors introduced several instructions that software can use to improve the performance of the L1, L2, and L3 caches, including the PREFETCHh and CLFLUSH instructions and the non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD). The use of these instructions is discussed in Section 10.5.5, “Cache Management Instructions.”

10.2 CACHING TERMINOLOGY

IA-32 processors (beginning with the Pentium processor) and Intel 64 processors use the MESI (modified, exclusive, shared, invalid) cache protocol to maintain consistency with internal caches and caches in other processors (see Section 10.4, “Cache Control Protocol”).

When the processor recognizes that an operand being read from memory is cacheable, the processor reads an entire cache line into the appropriate cache (L1, L2, L3, or all). This operation is called a cache line fill. If the memory location containing that operand is still cached the next time the processor attempts to access the operand, the processor can read the operand from the cache instead of going back to memory. This operation is called a cache hit.
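The read behavior described above can be sketched with a toy model in Python. This is only an illustration: the 64-byte line size, the class name `ReadCache`, and the absence of any capacity limit or eviction are assumptions of the sketch, not properties of a particular processor.

```python
LINE_SIZE = 64  # bytes per cache line; an assumed value that varies by processor

class ReadCache:
    """Toy model that tracks which lines are cached (no capacity or eviction)."""

    def __init__(self):
        self.lines = set()  # line-aligned base addresses currently cached
        self.fills = 0
        self.hits = 0

    def read(self, addr):
        base = addr - (addr % LINE_SIZE)  # start of the line containing addr
        if base in self.lines:
            self.hits += 1
            return "hit"
        # Read miss: the entire line is brought in, not just the operand.
        self.lines.add(base)
        self.fills += 1
        return "line fill"

cache = ReadCache()
for addr in (0x1000, 0x1004, 0x103F, 0x1040):
    print(hex(addr), cache.read(addr))
```

The first read fills the line starting at 0x1000, so the next two reads (which fall in the same 64-byte line) are hits. The read at 0x1040 falls in the next line and causes a second line fill.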

When the processor attempts to write an operand to a cacheable area of memory, it first checks whether a cache line for that memory location exists in the cache. If a valid cache line does exist, the processor (depending on the write policy currently in force) can write the operand into the cache instead of writing it out to system memory. This operation is called a write hit. If a write misses the cache (that is, a valid cache line is not present for the area of memory being written to), the processor performs a cache line fill (write allocation). Then it writes the operand into the cache line and (depending on the write policy currently in force) can also write it out to memory. If the operand is to be written out to memory, it is written first into the store buffer, and then written from the store buffer to memory when the system bus is available. (Note that for the Pentium processor, write misses do not result in a cache line fill; they always result in a write to memory. For this processor, only read misses result in cache line fills.)
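The write cases above can be sketched in the same toy style. The class name `WriteCache`, the flags `write_allocate` and `write_through` (standing in for "the write policy currently in force"), and the 64-byte line size are all assumptions of this sketch. Routing the no-allocate case through the store buffer is also a simplification of the model rather than a statement about the Pentium processor.

```python
LINE_SIZE = 64  # bytes per cache line; an assumed value

class WriteCache:
    """Toy model of write hits, write allocation, and the store buffer."""

    def __init__(self, write_allocate=True, write_through=False):
        self.write_allocate = write_allocate
        self.write_through = write_through
        self.lines = set()       # line-aligned base addresses currently cached
        self.store_buffer = []   # addresses waiting to be written to memory

    def write(self, addr):
        base = addr - (addr % LINE_SIZE)
        if base in self.lines:
            event = "write hit"
        elif self.write_allocate:
            self.lines.add(base)  # cache line fill, then write into the line
            event = "write miss, line fill"
        else:
            # No write allocation: the write goes to memory and the line
            # stays uncached (queued here as a modeling simplification).
            self.store_buffer.append(addr)
            return "write miss, memory only"
        if self.write_through:
            self.store_buffer.append(addr)  # also written out to memory
        return event

    def drain(self):
        """Model the bus becoming available: pending stores leave in order."""
        drained, self.store_buffer = self.store_buffer, []
        return drained

allocating = WriteCache()
print(allocating.write(0x2000), "|", allocating.write(0x2008))

no_allocate = WriteCache(write_allocate=False)
print(no_allocate.write(0x2000), "|", no_allocate.write(0x2008))
print([hex(a) for a in no_allocate.drain()])
```

With write allocation, the second write to the same line is a write hit. Without it, both writes miss, because a write miss never brings the line into the cache.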

When operating in an MP system, IA-32 processors (beginning with the Intel486 processor) and Intel 64 processors have the ability to snoop other processors’