PCI Bus
The Peripheral Component Interconnect (PCI) bus is the standard I/O bus
on recent computers in general, and PCs in particular. There's a lot
more good information out there about it than I could even pretend to
write, so here are some references.
References
- PCI - Webopedia Definition and Links (these guys give lots of good
  explanations of concepts, not much really PCI-specific)
- Adaptec, PCI, 64-Bit and 66-MHz Benefits (good concept-level description)
- Adaptec, PCI Bridges
- pci_sokos.htm (full of terrific low-level details)
- Rubini, Alessandro, and Jonathan Corbet, Linux Device
  Drivers, 2nd Edition, Chapter 15, O'Reilly and Associates,
  2001 (Linux-centric -- surprise, surprise. Does a good job
  covering PCI).
- PCI Express Whitepaper from PCI-SIG
Just as a brief note, it was developed by Intel in 1993 to replace
the various busses which had been in use on both PCs and Macintoshes.
To Intel's credit, it is a remarkably architecture-neutral bus. A
very brief description would be that it is a 32-bit, 33MHz bus with
multiplexed address and data, and very nice capabilities for
autoconfiguration ("Plug and Play"). It also supports both old, 5
volt devices and newer, 3.3 volt devices.
There are many extensions to PCI. Best known is that it has simply
been extended to 64 bits and 66 MHz. In addition, there is a variant
called PC-104+, which is a 32-bit PCI bus in highly shock- and
vibration-resistant packaging. PCI-X is a backward-compatible
extension to PCI, with PCI-X itself running at up to 133 MHz and
PCI-X 2.0 adding 266 MHz and 533 MHz modes. This latter also defines
a 16 bit interface for space-constrained applications, and a new bus
mastering protocol (PCI SIG likes to call this peer-to-peer) that
looks a lot like messaging.
All transfers on the PCI bus are "burst" transfers. What this means
is that once a device obtains the bus to perform a transfer, it is
able to hang on to the bus indefinitely, and keep sending more data
every bus cycle. (There's actually a timer in the bus controller which
will take control back after some configurable time period, to keep
transfers from being too long. The longer the transfers are, the better
the throughput, but long transfers can cause unacceptable delays for
other devices.)
Autoconfiguration
One of the nicest features of PCI is its support for
autoconfiguration. In addition to every device having an address on
the PCI bus, every card has its own block of registers, selected by
which slot it is plugged into. This is referred to as the card's
configuration space, and can be queried (and parts of it can be
written) by the CPU. This normally occurs at boot time; it may be
performed by the BIOS prior to starting the boot loader, or it may be
performed by the OS as it boots.
Here's a picture of the configuration space for a PCI device (taken
from the Rubini page above):
The most important parts of the configuration space (IMHO) are:
Vendor and Device ID
The Vendor ID is a 16 bit number, assigned by the PCI SIG. You
can look this number up in a database to find out who built the
card. The device ID is another 16 bit number, assigned by the
vendor. You can look this up in a database to find out the
device model number. Put them together and you can know what
kind of device you're going to be talking to, so you can run the
right device driver.
Class Code
This is a 24 bit number, defined by the PCI SIG, which
identifies what kind of device is on the card. The difference
between this and the vendor/device id fields is that this will
specify something like "serial port" while the vendor and device
ID fields will say "Bob's Card Shop Model XY-Zowie." You can
run the device based on its class code, but to take advantage of
any extra features (like the fact it might be an 8-port card
instead of a single-port card) requires the vendor and device IDs.
Base Registers
Up to six base registers can be specified, for the devices
located on the card. If you have fewer than six logical
devices you will actually use fewer than these; if you have
more, you will have to get into some ugly hacks (for instance,
on an eight port serial card I have, six of the ports' base
addresses are specified in the base addresses, while two are at
fixed offsets from the first two of the six). Unlike the vendor
and device ID fields, and the class codes, the base register
addresses are read/write.
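As a rough sketch of how software might pull these fields out of a device's configuration space, here is a small Python function. It assumes the standard header layout (vendor ID at offset 0x00, device ID at 0x02, revision and class code in the doubleword at 0x08, six base registers starting at 0x10) and little-endian byte order; the function name and the sample values in the usage lines are made up for illustration.

```python
import struct

def parse_config_header(cfg):
    """Extract the fields discussed above from raw configuration-space bytes."""
    # Vendor ID and device ID are the first two 16-bit little-endian words.
    vendor_id, device_id = struct.unpack_from("<HH", cfg, 0x00)
    # The doubleword at 0x08 holds the revision ID in its low byte and the
    # 24-bit class code in the upper three bytes.
    (rev_class,) = struct.unpack_from("<I", cfg, 0x08)
    # Six 32-bit base registers start at offset 0x10.
    bars = list(struct.unpack_from("<6I", cfg, 0x10))
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "revision": rev_class & 0xFF,
        "class_code": rev_class >> 8,
        "bars": bars,
    }

# Build a fake 256-byte configuration space (all values are invented).
sample = bytearray(256)
struct.pack_into("<HH", sample, 0x00, 0x1234, 0x5678)
struct.pack_into("<I", sample, 0x08, 0x07000201)
struct.pack_into("<I", sample, 0x10, 0x0000E001)
header = parse_config_header(bytes(sample))
print(hex(header["vendor_id"]), hex(header["device_id"]), hex(header["class_code"]))
```

A driver would match on the vendor and device IDs first, and fall back to the class code only for generic handling.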
PCI Interrupt Handling
As with many aspects of the PCI bus, one of the challenges is to
design the interrupt handling so that it can be mapped into the
interrupt scheme expected by the CPU. The basic solution is the same
as for other aspects of the bus: chips called ``bridges'' are used
for the translation.
PCI uses four pins, called INTA-INTD, for interrupt requests. When an
interrupt is required, the proper pin is asserted. A card which
only has a single interrupt will normally use INTA; a card with two
(they exist! Particularly cards with more than one logical device)
will use INTA and INTB, and so forth.
The bus wiring determines how the requested interrupt is presented to
the bridge chip; the standard doesn't specify how this routing should
be performed. I've come across one source that says the routing for
current PCs is:
Pin  | Slot 1 | Slot 2 | Slot 3 | Slot 4 | Slot 5 |
INTA | PIRQ1  | PIRQ2  | PIRQ3  | PIRQ4  | PIRQ4  |
INTB | PIRQ2  | PIRQ3  | PIRQ4  | PIRQ1  | PIRQ1  |
INTC | PIRQ3  | PIRQ4  | PIRQ1  | PIRQ2  | PIRQ2  |
INTD | PIRQ4  | PIRQ1  | PIRQ2  | PIRQ3  | PIRQ3  |
(where the PIRQ# is the interrupt as presented to the bridge chip).
So if in fact all the devices are using INTA, they will be routed to
different pins (except for cards 4 and 5). Notice that this is how
they are wired, not how they have to be wired; it would be entirely
possible for a bus to route all 20 interrupts from these five devices
to different inputs on the bridge. On a PC, the BIOS programs the
bridge to route its PIRQ inputs to Intel IRQ requests in an emulated
pair of 8259s.
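The routing table above has a simple pattern: for slots 1 through 4 the PIRQ number rotates by one for each slot, and slot 5 is wired the same as slot 4. Here is a small sketch that encodes the table and checks that reading of it. The names are mine, and the rotation formula is just my observation about this particular table, not something the standard requires.

```python
PINS = ["INTA", "INTB", "INTC", "INTD"]

# The routing table from above, one row per pin, one column per slot.
ROUTING = {
    "INTA": ["PIRQ1", "PIRQ2", "PIRQ3", "PIRQ4", "PIRQ4"],
    "INTB": ["PIRQ2", "PIRQ3", "PIRQ4", "PIRQ1", "PIRQ1"],
    "INTC": ["PIRQ3", "PIRQ4", "PIRQ1", "PIRQ2", "PIRQ2"],
    "INTD": ["PIRQ4", "PIRQ1", "PIRQ2", "PIRQ3", "PIRQ3"],
}

def pirq_from_table(slot, pin):
    """Look up the bridge input for a slot (1-5) and pin name."""
    return ROUTING[pin][slot - 1]

def pirq_by_rotation(slot, pin):
    """Compute the same answer by rotating; slot 5 shares slot 4's wiring."""
    effective_slot = min(slot, 4)
    index = (effective_slot - 1 + PINS.index(pin)) % 4
    return "PIRQ%d" % (index + 1)

for slot in range(1, 6):
    for pin in PINS:
        assert pirq_from_table(slot, pin) == pirq_by_rotation(slot, pin)

print(pirq_from_table(2, "INTA"))
```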
When the device requests its interrupt, the bridge responds with an
Interrupt Acknowledge (INTA) bus cycle; the card responds with an
interrupt vector. This vector is an eight-bit number loaded into a
device configuration register by the BIOS or the OS at boot time.
I haven't been able to find a specification of the arbitration that
decides which device wins when multiple devices attempt to interrupt
simultaneously.
PCI Commands
There are a total of 16 possible commands on a PCI cycle. They're in
the following table:
Command | Command Type |
0000 | Interrupt Acknowledge |
0001 | Special Cycle |
0010 | I/O Read |
0011 | I/O Write |
0100 | reserved |
0101 | reserved |
0110 | Memory Read |
0111 | Memory Write |
1000 | reserved |
1001 | reserved |
1010 | Configuration Read |
1011 | Configuration Write |
1100 | Multiple Memory Read |
1101 | Dual Address Cycle |
1110 | Memory-Read Line |
1111 | Memory Write and Invalidate |
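If you ever find yourself staring at a logic analyzer trace, a lookup table like the following is handy for turning the four command bits into a name. This is only a sketch built from the table above; the function name is made up.

```python
# Command names keyed by the 4-bit command code, copied from the table above.
PCI_COMMANDS = {
    0b0000: "Interrupt Acknowledge",
    0b0001: "Special Cycle",
    0b0010: "I/O Read",
    0b0011: "I/O Write",
    0b0110: "Memory Read",
    0b0111: "Memory Write",
    0b1010: "Configuration Read",
    0b1011: "Configuration Write",
    0b1100: "Multiple Memory Read",
    0b1101: "Dual Address Cycle",
    0b1110: "Memory-Read Line",
    0b1111: "Memory Write and Invalidate",
}

def decode_command(code):
    """Return the command name for a 4-bit code; unlisted codes are reserved."""
    if not 0 <= code <= 0b1111:
        raise ValueError("command code must fit in 4 bits")
    return PCI_COMMANDS.get(code, "reserved")

print(decode_command(0b1010))
```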
Here are some notes on the different transfer types (taken almost
verbatim from pci_sokos.htm).
Interrupt Acknowledge (0000)
The interrupt controller automatically recognizes and reacts to
the INTA (interrupt acknowledge) command. In the data phase, it transfers
the interrupt vector to the AD lines.
Special Cycle (0001)
AD15-AD0         | Message            |
0x0000           | Processor Shutdown |
0x0001           | Processor Halt     |
0x0002           | x86 Specific Code  |
0x0003 to 0xFFFF | Reserved           |
I/O Read (0010) and I/O Write (0011)
Input/Output device read or write operation. The AD lines contain
a byte address (AD0 and AD1 must be decoded).
PCI I/O ports may be 8 or 16 bits.
PCI allows 32 bits of address space. On IBM compatible machines, the
Intel CPU is limited to 16 bits of I/O space, which is further limited
by some ISA cards that may also be installed in the machine (many ISA
cards only decode the lower 10 bits of address space, and thus mirror
themselves throughout the 16 bit I/O space). This limit assumes that the
machine supports ISA or EISA slots in addition to PCI slots.
The PCI configuration space may also be accessed through I/O
ports 0x0CF8 (Address) and 0x0CFC (Data). The address port must be
written first.
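The notes above don't spell out what gets written to the address port. As a sketch, assuming the conventional PC layout for that register (enable bit in bit 31, bus number in bits 23-16, device in bits 15-11, function in bits 10-8, and a doubleword-aligned register offset in bits 7-2), the value could be built like this. The function name is made up, and actually performing the port I/O is left out since it needs privileged access.

```python
CONFIG_ADDRESS_PORT = 0x0CF8  # written first
CONFIG_DATA_PORT = 0x0CFC     # then read or written

def config_address(bus, device, function, offset):
    """Build the 32-bit value for the address port (assumed layout, see above)."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= offset < 256:
        raise ValueError("offset must be within the 256-byte configuration space")
    return (
        0x80000000            # enable bit
        | (bus << 16)
        | (device << 11)
        | (function << 8)
        | (offset & 0xFC)     # low two bits are always zero
    )

print(hex(config_address(0, 0, 0, 0)))
```

Software would write this value to port 0x0CF8 and then read or write the selected doubleword through port 0x0CFC.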
Memory Read (0110) and Memory Write (0111)
A read or write to the system memory space. The AD lines contain
a doubleword address. AD0 and AD1 do not need to be decoded. The Byte
Enable lines (C/BE) indicate which bytes are valid.
Configuration Read (1010) and Configuration Write (1011)
A read or write to the PCI device configuration space, which is
256 bytes in length. It is accessed in doubleword units.
AD0 and AD1 contain 0, AD2-7 contain the doubleword address, AD8-10
are used for selecting the addressed function in a multifunction unit,
and the remaining AD lines are not used.
Multiple Memory Read (1100)
This is an extension of the memory read bus cycle. It is used to read large
blocks of memory without caching, which is beneficial for long sequential
memory accesses.
Dual Address Cycle (1101)
Two address cycles are necessary when a 64 bit address is used,
but only 32 physical address lines exist. The least significant portion
of the address is placed on the AD lines first, followed by the most
significant 32 bits. The second address cycle also contains the command
for the type of transfer (I/O, Memory, etc). The PCI bus supports a 64 bit
I/O address space, although this is not available on Intel based PCs due
to limitations of the CPU.
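To make the ordering concrete, here is a sketch of how a 64 bit address would be split across the two address cycles, low half first. The function name is mine, and the idea that an address which fits in 32 bits needs no second cycle is my reading of the description above.

```python
def split_address(addr):
    """Split a 64-bit address into (low 32 bits, high 32 bits)."""
    if not 0 <= addr < 1 << 64:
        raise ValueError("address must fit in 64 bits")
    low = addr & 0xFFFFFFFF   # placed on the AD lines in the first address cycle
    high = addr >> 32         # placed on the AD lines in the second address cycle
    return low, high

def needs_dual_address_cycle(addr):
    """An address that fits in 32 bits can use a single address cycle."""
    return split_address(addr)[1] != 0

low, high = split_address(0x0000000123456780)
print(hex(low), hex(high))
```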
Memory-Read Line (1110)
This cycle is used to read in more than two 32 bit data blocks,
typically up to the end of a cache line. It is more efficient than
normal memory read bursts for a long series of sequential memory accesses.
Memory Write and Invalidate (1111)
This indicates that a minimum of one cache line is to be transferred.
This allows main memory to be updated, saving a cache write-back cycle.
PCI Express
PCI Express (a bus formerly known as 3GIO) is the successor to PCI.
In fact, while the hardware is very different from PCI's, the software
model is unchanged. If a manufacturer moves a device from a PCI
implementation to a PCI Express implementation (and makes no other
changes), the old drivers will all continue to work.
Physical Layer
The first thing to notice about PCI Express is that it isn't a bus.
Instead, each PCI express slot is independently connected to a
switch. The communication between devices and the bus controller is
very reliable, with CRC being used to check for errors in transmission.
Devices are connected to PCI Express with two differential signal
pairs: one for transmitting and one for receiving. Signals are
transmitted across these signal pairs at 2.5 Gb/s/direction (and
they're anticipating scaling to 10 Gb/s/direction later).
Notice that this means data transfer across a single signal
pair is already substantially faster than across PCI (32 bits *
33 MHz = roughly 1 Gb/s, half-duplex), with later scaling.
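Here's the arithmetic behind that comparison, as a quick sketch. It compares raw signalling rates only and ignores any protocol or encoding overhead on either side, which is a simplification on my part.

```python
PCI_WIDTH_BITS = 32
PCI_CLOCK_HZ = 33_000_000
PCIE_LANE_BITS_PER_S = 2_500_000_000  # per direction

# Peak PCI rate: one 32-bit word per clock, shared by both directions.
pci_bits_per_s = PCI_WIDTH_BITS * PCI_CLOCK_HZ

# How many times faster one PCI Express signal pair is, in one direction.
ratio = PCIE_LANE_BITS_PER_S / pci_bits_per_s

print("PCI: %.3f Gb/s" % (pci_bits_per_s / 1e9))
print("PCI Express signal pair: %.1f Gb/s" % (PCIE_LANE_BITS_PER_S / 1e9))
print("ratio: %.2f" % ratio)
```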
But there's more. What I just described is what PCI-SIG calls a
"lane." Where PCI has a 120-pin connector, a one-lane PCI
Express device uses a 36 pin connector. The specification
allows a device to use more than one lane; adding another lane
adds another 2.5 Gb/s/direction. You can get up to an 8x PCI
Express connector before it takes more pins than PCI did; 16x
(used for graphics) and 32x connectors have been defined.
The last thing to comment on is that the architecture
automatically supports slower devices: if you have a (say) 2x
board, it'll plug into an 8x connector just fine. It just won't
use all the 8x pins, and will only run at 2x speed. But you
can't plug a wider card into a narrower slot.
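The fitting rule can be written down in a few lines. This is only a sketch of the rule as I've described it (a card runs at its own width if the slot is at least that wide, and doesn't fit otherwise); the function names are made up, and the rate is the raw 2.5 Gb/s/direction per lane.

```python
LANE_GBPS = 2.5  # raw rate per lane, per direction

def usable_lanes(card_lanes, slot_lanes):
    """Return the lanes a card will use in a slot, or None if it won't fit."""
    if card_lanes > slot_lanes:
        return None           # a wider card can't plug into a narrower slot
    return card_lanes         # extra slot lanes simply go unused

def link_gbps(card_lanes, slot_lanes):
    """Raw link rate per direction in Gb/s, or None if the card won't fit."""
    lanes = usable_lanes(card_lanes, slot_lanes)
    return None if lanes is None else lanes * LANE_GBPS

print(usable_lanes(2, 8), link_gbps(2, 8))
print(usable_lanes(16, 8))
```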
Last modified: Fri Apr 22 09:28:45 MDT 2005
Original link: http://www.cs.nmsu.edu/~pfeiffer/classes/473/notes/pci.html