Uncore Manual

Reference Number: 329468-002 173
Uncore Performance Monitoring
R3QPI Performance Monitoring
2.10 R3QPI PERFORMANCE MONITORING
2.10.1 Overview of the R3QPI Box
R3QPI is the interface between the Ring and the Intel® QPI Link Layer, which packetizes requests. It is
responsible for translating between ring protocol packets and the flits used to transmit data across
the Intel® QPI interface. It performs credit checking between the local Intel® QPI LL, the remote
Intel® QPI LL, and other agents on the local ring.
The R3QPI agent provides several functions:
• Interface between Ring and Intel® QPI:
One of the primary attributes of the ring is its ability to convey Intel® QPI semantics with no
translation. For example, this architecture enables initiators to communicate with a local Home
agent in exactly the same way as with a remote Home agent on another socket. With this philosophy,
the R3QPI block is lean and does very little with regard to the Intel® QPI protocol aside from
mirroring requests between the ring and the Intel® QPI interface.
• Intel® QPI routing:
In order to optimize layout and latency, both full-width Intel® QPI interfaces share the same ring
stop. Therefore, an Intel® QPI packet might be received on one interface and simply forwarded
along on the other Intel® QPI interface. The R3QPI has sufficient routing logic to determine whether
a request, snoop, or response is targeting the local socket or should be forwarded along to
the other interface. This routing remains isolated to R3QPI and does not impede traffic on the
Ring.
• Intel® QPI Home Snoop Protocol (with early snoop optimizations for DP):
The R3QPI agent implements two latency-reducing optimizations: one for dual sockets that issues
snoops within the socket for incoming requests, and one that returns data satisfying
Direct2Core (D2C) requests.
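The routing decision described in the list above can be pictured with a minimal Python sketch. It is a software illustration only, not the hardware logic: the node-ID comparison, the `Target` enum, and the `route_incoming` function are all hypothetical names introduced here, and the sketch assumes that a packet's destination can be reduced to a single node ID.

```python
from enum import Enum


class Target(Enum):
    """Where a packet received on one QPI interface goes next (hypothetical model)."""
    LOCAL_RING = "local ring"
    OTHER_QPI_LINK = "other QPI interface"


def route_incoming(dest_node_id: int, local_node_id: int) -> Target:
    """Pick a target for a request, snoop, or response arriving on a QPI interface.

    Assumption: the destination is identified by a node ID that can be
    compared with the local socket's node ID.
    """
    if dest_node_id == local_node_id:
        # The packet targets this socket, so it is placed on the ring.
        return Target.LOCAL_RING
    # Otherwise it is forwarded on the other interface without using the ring.
    return Target.OTHER_QPI_LINK
```

In this model, only packets that resolve to `Target.LOCAL_RING` ever consume ring bandwidth, which mirrors the statement that forwarding stays isolated to R3QPI.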
2.10.2 R3QPI Performance Monitoring Overview
Each R3QPI Link supports event monitoring through three 44b-wide counters
(R3_Ly_PCI_PMON_CTR/CTL{2:0}). Each of these three counters can be programmed to count
almost any R3QPI event (see NOTE for exceptions). The R3QPI counters can increment by a maximum
of 8b per cycle.
For information on how to set up a monitoring session, refer to Section 2.1, “Uncore Per-Socket
Performance Monitoring Control”.
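As an illustration of what the 44b counter width means for software that samples these counters, the following Python sketch computes the number of events between two raw reads, allowing for a single wraparound. The helper names are hypothetical and not part of any Intel interface. The per-cycle increment is left as a parameter because the sketch does not assume a particular reading of the per-cycle limit.

```python
COUNTER_WIDTH_BITS = 44                       # counter width stated in this section
COUNTER_MASK = (1 << COUNTER_WIDTH_BITS) - 1  # largest value a counter can hold


def counter_delta(previous: int, current: int) -> int:
    """Events counted between two raw reads, assuming at most one wrap."""
    return (current - previous) & COUNTER_MASK


def min_cycles_to_wrap(increment_per_cycle: int) -> int:
    """Fewest cycles in which a counter starting at zero can wrap around."""
    if increment_per_cycle <= 0:
        raise ValueError("increment_per_cycle must be positive")
    # Ceiling division: the counter wraps once the total reaches 2**44.
    return -(-(1 << COUNTER_WIDTH_BITS) // increment_per_cycle)
```

A sampling tool could use `min_cycles_to_wrap` to pick a read interval short enough that `counter_delta` never sees more than one wrap between reads.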
NOTE
Only counter 0 can be used for tracking occupancy events. Only counter 2 can be
used to count ring events.
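The restrictions in the NOTE amount to a small assignment problem when several events are monitored at once. The Python sketch below shows one way a tool might check a requested set of events against them. The event class labels (`"occupancy"`, `"ring"`, `"other"`), the table, and the function name are assumptions made for illustration; real event names and their classes come from the event tables in this manual.

```python
from typing import Dict, Optional

# Counters each (hypothetical) event class may use, following the NOTE:
# occupancy events only on counter 0, ring events only on counter 2.
ALLOWED_COUNTERS = {
    "occupancy": (0,),
    "ring": (2,),
    "other": (0, 1, 2),
}


def assign_counters(events: Dict[str, str]) -> Optional[Dict[str, int]]:
    """Map each event name to a counter index, or return None if impossible.

    `events` maps an event name to its class ("occupancy", "ring", "other").
    """
    # Place the most constrained events first so they get their only counter.
    ordered = sorted(events, key=lambda name: len(ALLOWED_COUNTERS[events[name]]))
    assignment: Dict[str, int] = {}
    used = set()
    for name in ordered:
        free = [c for c in ALLOWED_COUNTERS[events[name]] if c not in used]
        if not free:
            return None  # e.g. two occupancy events requested together
        assignment[name] = free[0]
        used.add(free[0])
    return assignment
```

With only three counters, placing the single-counter classes first is enough to find a valid assignment whenever one exists.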
2.10.2.1 R3QPI PMON Registers - On Overflow and the Consequences (PMI/Freeze)
If an overflow is detected from an R3QPI performance counter that is enabled to communicate its
overflow (R3_Ly_PCI_PMON_CTL.ov_en is set to 1), the overflow bit is set at the box level
(R3_Ly_PCI_PMON_BOX_STATUS.ov) and an overflow message is sent to the UBox. When the UBox
receives the overflow signal, U_MSR_PMON_GLOBAL_STATUS.ov_rq is set (see Table 2-3,
“U_MSR_PMON_GLOBAL_STATUS Register – Field Definitions”) and a PMI can be generated.
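The overflow sequence described above can be traced with a small behavioral model in Python. This is a sketch under stated assumptions, not a register-accurate description: the class names are hypothetical, each status field is reduced to a single boolean rather than its real bit layout, and `pmi_enabled` is an invented flag standing in for whatever global configuration decides whether a PMI is actually generated.

```python
from dataclasses import dataclass, field
from typing import List

COUNTER_MASK = (1 << 44) - 1  # 44b-wide counters


@dataclass
class UBoxModel:
    ov_rq: bool = False        # stands in for U_MSR_PMON_GLOBAL_STATUS.ov_rq
    pmi_raised: bool = False
    pmi_enabled: bool = True   # hypothetical flag: "a PMI can be generated"

    def receive_overflow(self) -> None:
        self.ov_rq = True
        if self.pmi_enabled:
            self.pmi_raised = True


@dataclass
class R3QPIBoxModel:
    ubox: UBoxModel
    counters: List[int] = field(default_factory=lambda: [0, 0, 0])
    ov_en: List[bool] = field(default_factory=lambda: [False, False, False])
    box_status_ov: bool = False  # stands in for R3_Ly_PCI_PMON_BOX_STATUS.ov

    def count(self, index: int, amount: int) -> None:
        """Add `amount` events to one counter and propagate any overflow."""
        total = self.counters[index] + amount
        self.counters[index] = total & COUNTER_MASK
        # Overflow is reported only if that counter's ov_en is set.
        if total > COUNTER_MASK and self.ov_en[index]:
            self.box_status_ov = True
            self.ubox.receive_overflow()
```

In the model, a counter that wraps with `ov_en` clear changes nothing outside the box, while one with `ov_en` set marks the box status and then the UBox status, in that order.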