Details on the Cache Scrubber
=============================
The cache scrubber reduces the likelihood of EDP, WP, and CP events by
shortening the data lifetime in the Ecache, and by eliminating parity
errors where possible. (See "Errors and Events" below for an
explanation of the EDP, WP, and CP event types.)
The cache scrubber is enabled by default. It scans the entire Ecache
of every CPU in the system once every ten seconds.
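To give a rough sense of what a ten-second full scan implies, the sketch below computes how many Ecache lines would have to be examined per second. The 64-byte line size and the 4 MB Ecache size are assumptions chosen for illustration, not values stated in this document, and the function name is hypothetical.

```python
def lines_per_second(ecache_bytes, line_bytes=64, scan_period_s=10):
    """Average number of Ecache lines examined per second per CPU.

    line_bytes=64 is an assumed line size; scan_period_s=10 comes from
    the "once every ten seconds" statement above.
    """
    total_lines = ecache_bytes // line_bytes
    return total_lines / scan_period_s


# Example with an assumed 4 MB Ecache: 65536 lines over 10 seconds.
rate = lines_per_second(4 * 1024 * 1024)
print(rate)
```

With these assumed figures, the scrubber would need to visit a few thousand lines per second on each CPU.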
On an idle CPU, it scrubs all clean lines (lines whose contents are
identical to the copy in system memory) as well as dirty lines (lines
holding newer data than the copy in system memory) that have good
parity. This shortens the lifetime of data in the Ecache on an idle
CPU, reducing the likelihood that a parity error will affect critical
system or user data.
On a busy CPU, it only scrubs clean lines with bad parity (which might
otherwise lead to EDP or CP events). Clean lines with good parity and
dirty lines are left in the Ecache so as to not adversely impact system
performance.
To avoid causing WP events, the cache scrubber never scrubs dirty lines
with bad parity. The program using such a line may overwrite it before
it is accessed or flushed, which prevents a parity event from
occurring at all. (This is sometimes referred to as the natural
scrubbing behavior of a busy system.)
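The scrubbing rules described above can be summarized as a small decision function. This is only a sketch of the policy as stated in the text, not the actual kernel implementation; the names `Action` and `scrub_decision` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    SCRUB = "scrub"
    LEAVE = "leave"


def scrub_decision(cpu_busy: bool, dirty: bool, parity_ok: bool) -> Action:
    """Return what the scrubber does with one Ecache line, per the text."""
    # Dirty lines with bad parity are never scrubbed: flushing them
    # would cause a WP event.
    if dirty and not parity_ok:
        return Action.LEAVE
    # Idle CPU: all clean lines, plus dirty lines with good parity.
    if not cpu_busy:
        return Action.SCRUB
    # Busy CPU: only clean lines with bad parity are scrubbed.
    if not dirty and not parity_ok:
        return Action.SCRUB
    # Busy CPU: clean lines with good parity and dirty lines stay cached.
    return Action.LEAVE


# Print the full decision table.
for busy in (False, True):
    for dirty in (False, True):
        for ok in (True, False):
            print(busy, dirty, ok, scrub_decision(busy, dirty, ok).value)
```

Checking the dirty, bad-parity case first keeps the "never scrub" rule independent of whether the CPU is idle or busy.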
Errors and Events
=================
UltraSPARC processors can detect errors that are reported in the
following types of events (as detailed in the UltraSPARC-I/II User's
Manual, P/N 802-7220-02):
ETP
A parity error was detected by the CPU when reading from the
Ecache Tag SRAM. This is a fatal error because system coherency
has been lost. The system will reset (POR) and Starfire domains
will arbstop (UPA Fatal error). No Solaris error message will be
generated.
EDP
A parity error was detected by the CPU when reading from the
Ecache Data SRAM on a cache hit.
LDP
A parity error was detected by the CPU while reading main
memory through its Ultra Data Buffer (UDB) chip on an Ecache
miss. Note that the Ecache itself is not involved. This can occur
when the CPU is reading non-cacheable data (for example, a frame
buffer or I/O device), or when filling a line of cache from main
memory.
WP
A parity error was detected by one of the UDB chips while data
was being written back from the Ecache into main memory. The UDB
chips convert the data with bad parity into data with bad ECC, so
that a subsequent access to the same physical address will result
in a UE. (See UE below.) (The conversion of a parity error to a
latent UE does not occur on either UltraSPARC-IIi or -IIe, which
is one of the reasons why improved error handling is not
available on those processors.)
CP
A parity error was detected during a copyout transaction; that
is, a data transfer from one CPU's Ecache to another CPU. This
error is detected by the UDB chips of the providing CPU,
resulting in the CP event. The providing CPU's UDB chips convert
the data with bad parity to data with bad ECC, so that the UDBs
of the receiving CPU will report a UE event. (See UE below.)
UE
An uncorrectable memory error has occurred. This event refers to
an error in the main system memory, reported by the system data
bus on a read access. The underlying source of this error could
be main memory, another CPU module (see CP above), or another UPA
device (for example, the I/O controller). The UDB chips detect
this error.
CE
A correctable error was detected when reading from main memory,
or when reading from another CPU's UDB chips. The data read has
been corrected and valid data is given to the CPU and the CPU's
Ecache. This error is detected by the UDB chips.
BERR
A bus error has occurred during an attempt to read from a memory
address. Either there is no device at that address, or the
device at that address has returned a bus error. Therefore, bus
errors are caused by a programming error or by a corrupted or
defective device.
TO
A bus timeout was encountered during an attempt to read from a
memory address. Too much time has elapsed waiting for a device
at that address to respond.
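For quick reference, the event types above can be collected into a lookup table. The sketch below only transcribes what the descriptions say; the dictionary layout and the helper name `detected_by` are hypothetical, and the detector is left as None where the text does not name one.

```python
# Event summary transcribed from the descriptions above.
# "detector" is None where the text does not say which component
# detects the condition.
EVENTS = {
    "ETP": {"detector": "CPU", "when": "read from Ecache Tag SRAM"},
    "EDP": {"detector": "CPU", "when": "read from Ecache Data SRAM on a hit"},
    "LDP": {"detector": "CPU", "when": "read of main memory via UDB on a miss"},
    "WP": {"detector": "UDB", "when": "writeback from Ecache to main memory"},
    "CP": {"detector": "UDB", "when": "copyout to another CPU"},
    "UE": {"detector": "UDB", "when": "uncorrectable error on a read access"},
    "CE": {"detector": "UDB", "when": "correctable error on a read"},
    "BERR": {"detector": None, "when": "bus error on a read"},
    "TO": {"detector": None, "when": "bus timeout on a read"},
}

# Events whose likelihood the cache scrubber is said to reduce.
SCRUBBER_TARGETS = {"EDP", "WP", "CP"}


def detected_by(detector):
    """Return the sorted event codes attributed to the given detector."""
    return sorted(code for code, info in EVENTS.items()
                  if info["detector"] == detector)


print(detected_by("CPU"))
print(detected_by("UDB"))
```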
This article comes from the ChinaUnix blog. To view the original, see: http://blog.chinaunix.net/u/4019/showart_24868.html