Some material on prefetch from the Intel manuals:
IA-32 Intel Architecture Software Developer's Manual Vol.3
9.5.4 Cache Management Instructions
The PREFETCHh instructions allow a program to suggest to the processor that a cache line from a specified location in system memory be prefetched into the cache hierarchy.
9.8 EXPLICIT CACHING
The Pentium III processor introduced four new instructions, the PREFETCHh instructions, that provide software with explicit control over the caching of data. These instructions provide "hints" to the processor that the data requested by a PREFETCHh instruction should be read into cache hierarchy now or as soon as possible, in anticipation of its use. The instructions provide different variations of hint that allow selection of the cache level into which data will be read.
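For readers who want to see what the four hint variants look like from C, here is a minimal sketch. It assumes an x86 compiler that ships `<xmmintrin.h>` (GCC, Clang, and MSVC do); `prefetch_all_hints` is a made-up helper name, not something from the manual. Which cache level each hint actually targets is implementation-specific, so the comments only name the instruction each call is expected to map to.

```c
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants (x86 SSE) */

/* Hypothetical helper: issue each of the four PREFETCHh variants for the
 * cache line that contains p. Real code would pick a single hint; all four
 * appear here only to show the spelling of each one. */
void prefetch_all_hints(const void *p)
{
    const char *addr = (const char *)p;

    _mm_prefetch(addr, _MM_HINT_T0);   /* PREFETCHT0:  temporal data */
    _mm_prefetch(addr, _MM_HINT_T1);   /* PREFETCHT1:  temporal data, hint for a farther cache level */
    _mm_prefetch(addr, _MM_HINT_T2);   /* PREFETCHT2:  temporal data, hint for a still farther level */
    _mm_prefetch(addr, _MM_HINT_NTA);  /* PREFETCHNTA: non-temporal data */
}
```

The calls only pass hints to the processor. They do not read or write the data as far as the program can observe, so the contents of the buffer are unchanged afterwards.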
The PREFETCHh instructions can help reduce the long latency typically associated with reading data from memory and thus help prevent processor "stalls." However, these instructions should be used judiciously. Overuse can lead to resource conflicts and hence reduce the performance of an application. Also, these instructions should only be used to prefetch data from memory; they should not be used to prefetch instructions.
Intel Pentium 4 and Intel Xeon Processor Optimization Reference Manual
Software Data Prefetch
The prefetch instruction can hide the latency of data access in performance-critical sections of application code by allowing data to be fetched in advance of its actual usage. The prefetch instructions do not change the user-visible semantics of a program, although they may affect the program’s performance. The prefetch instructions merely provide a hint to the hardware and generally will not generate exceptions or faults.
The prefetch instructions load either non-temporal data or temporal data in the specified cache level. This data access type and the cache level are specified as a hint. Depending on the implementation, the instruction fetches 32 or more aligned bytes, including the specified address byte, into the instruction-specified cache levels.
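Since a prefetch brings in the whole aligned block that contains the given address, one prefetch per block is enough to cover a buffer. The sketch below illustrates that under an assumed 64-byte line size (`LINE_BYTES` is my assumption; the manual only says "32 or more" and the real value depends on the processor). `line_base` and `prefetch_range` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 (x86 SSE) */

#define LINE_BYTES 64u  /* ASSUMPTION: line size in bytes, must be a power of two */

/* Round an address down to the start of the aligned line that contains it. */
uintptr_t line_base(uintptr_t addr, uintptr_t line_bytes)
{
    return addr & ~(line_bytes - 1);
}

/* Issue one prefetch for every line touched by [p, p + nbytes).
 * Returns the number of prefetches issued. */
size_t prefetch_range(const void *p, size_t nbytes)
{
    if (nbytes == 0)
        return 0;

    uintptr_t first = line_base((uintptr_t)p, LINE_BYTES);
    uintptr_t last  = line_base((uintptr_t)p + nbytes - 1, LINE_BYTES);
    size_t issued = 0;

    for (uintptr_t a = first; ; a += LINE_BYTES) {
        _mm_prefetch((const char *)a, _MM_HINT_T0);
        issued++;
        if (a == last)   /* stop after the line holding the final byte */
            break;
    }
    return issued;
}
```

With these assumptions, a 256-byte buffer aligned to 64 bytes needs 4 prefetches, while a 64-byte range starting one byte past an aligned boundary straddles 2 lines and needs 2.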
The prefetch instruction is implementation-specific; applications need to be tuned to each implementation to maximize performance.
NOTE. Using the prefetch instructions is recommended only if data does not fit in cache.
The prefetch instructions merely provide a hint to the hardware, and they will not generate exceptions or faults except for a few special cases. However, excessive use of prefetch instructions may waste memory bandwidth and result in performance penalty due to resource constraints.
Nevertheless, the prefetch instructions can lessen the overhead of memory transactions by preventing cache pollution and by using the caches and memory efficiently. This is particularly important for applications that share critical system resources, such as the memory bus.
The prefetch instructions are mainly designed to improve application performance by hiding memory latency in the background. If segments of an application access data in a predictable manner, for example, using arrays with known strides, then they are good candidates for using prefetch to improve performance.
Use the prefetch instructions in:
. predictable memory access patterns
. time-consuming innermost loops
. locations where the execution pipeline may stall if data is not available.
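The "array with a known stride in an inner loop" case might be sketched as below. This is only an illustration, not code from the manual: `sum_strided` is a made-up function, and `PREFETCH_DISTANCE` (how many iterations ahead to prefetch) is an assumed tuning value that, as the manual says, has to be tuned for each implementation.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 (x86 SSE) */

#define PREFETCH_DISTANCE 16  /* ASSUMPTION: iterations to look ahead; tune per CPU */

/* Sum every stride-th element of a[0..n), hinting the processor to start
 * loading the element that will be needed PREFETCH_DISTANCE iterations later. */
double sum_strided(const double *a, size_t n, size_t stride)
{
    double sum = 0.0;

    if (stride == 0)  /* a zero stride would never advance */
        return 0.0;

    for (size_t i = 0; i < n; i += stride) {
        size_t ahead = i + PREFETCH_DISTANCE * stride;
        if (ahead < n)  /* only form addresses inside the array */
            _mm_prefetch((const char *)&a[ahead], _MM_HINT_T0);
        sum += a[i];
    }
    return sum;
}
```

The result is the same with or without the prefetch line; only the timing can differ. Whether it helps at all depends on the data not already fitting in cache and on how well the hardware prefetcher handles the pattern on its own, so measure before keeping it.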