
On x86 we use dedicated I/O instructions such as IN and OUT for I/O-mapped (port-mapped) I/O, and ordinary memory instructions such as MOV for memory-mapped I/O, as far as I know. That is all fine, but who decides which method is used? If I want to build my own device (a peripheral), can I freely choose whether it uses I/O-mapped or memory-mapped I/O to communicate with the PC? Or must every device support both?

I can't work out who decides which I/O method is used to communicate with a device.

Tom Milberg
  • IN and OUT are very low level and not used in modern systems. See https://stackoverflow.com/questions/3215878/what-are-in-out-instructions-in-x86-used-for – qwr Jun 29 '19 at 16:44
  • @qwr That's not correct. Many fundamental system components still use port IO. – fuz Jun 29 '19 at 16:51
  • The manufacturer of the peripheral card or interface decides. The peripheral manufacturer can support one, the other, or both. – Michael Petch Jun 29 '19 at 16:56
  • @1201ProgramAlarm: memory-mapping the video RAM isn't quite MMIO: it's just memory, not I/O registers that have side effects for reading or writing. That's why it can be marked as write-combining (WC) memory type, not UC (uncacheable). You'd typically have separate PCI memory regions: one for the actual video memory, and one for MMIO registers. – Peter Cordes Jun 29 '19 at 18:26
  • As Michael Petch mentioned, it is the hardware: the board/chip/firmware on the product decides. There can be designs where some of the items you can reach are reachable via either path; historically, some types of accesses are I/O based and others memory mapped. – old_timer Jun 29 '19 at 19:40
  • @fuz: Well, not "many" - almost all of the ancient stuff that used IO ports is now deprecated/obsolete, and the only thing left that still uses IO ports is CMOS/RTC. The most recent thing I can think of that used IO ports was UHCI (20+ years old now). Note that part of the reason for this is that PCI card manufacturers want their card to work on other computers (e.g. with CPUs that never supported IO ports). – Brendan Jun 29 '19 at 19:48
  • @Brendan You forget that x86 CPUs are not only used in computers. If I understand correctly, there are still 80386-based microcontrollers in production. – Martin Rosenau Jun 29 '19 at 20:04

2 Answers


As Michael Petch said in his comment, the manufacturer decides, though it doesn't always have full freedom.
Standards and specifications can mandate the address space to use. Some are generic (e.g. OHCI, for USB 1.0, refers to an "uncachable address space", which on x86 can be either IO or MMIO); others are not (e.g. the PC Client TPM spec maps the TPM registers by locality, based on the MMIO area used).


As far as I know, and as far as this answer is concerned, MMIO adoption went mainstream with the advent of PCI [1].
PCI BARs (Base Address Registers) have a special format that lets software know which address space the card is using (and how much of it is needed):

[Image: PCI BAR format]

Bit 0 is read-only (set by the manufacturer) and tells which address space the card uses: 0 for memory (MMIO), 1 for IO.
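To make the BAR layout concrete, here is a minimal sketch in C of how software might decode a raw 32-bit BAR value. The function names are mine, not from any spec or OS, and the sketch assumes 32-bit BARs only (64-bit memory BARs, which span two registers, are not handled). The sizing helper takes the value read back after writing all ones to the BAR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a BAR selects the address space: 1 = IO, 0 = memory (MMIO). */
bool bar_is_io(uint32_t bar)
{
    return (bar & 0x1u) != 0;
}

/* Strip the flag bits: IO BARs keep flags in bits 1:0, memory BARs in bits 3:0. */
uint32_t bar_base(uint32_t bar)
{
    return bar_is_io(bar) ? (bar & ~0x3u) : (bar & ~0xFu);
}

/*
 * Region size, given the value read back after writing 0xFFFFFFFF to the BAR.
 * The device hard-wires the low address bits to zero, so the size is the
 * two's complement of the address mask.
 */
uint32_t bar_size(uint32_t readback)
{
    uint32_t mask = bar_base(readback);

    /* An IO BAR may implement only the low 16 address bits (hypothetical
     * handling: treat the missing upper half as all ones). */
    if (bar_is_io(readback) && (mask & 0xFFFF0000u) == 0)
        mask |= 0xFFFF0000u;

    return ~mask + 1u;
}
```

For instance, a readback of 0xFFFFFF01 describes an IO BAR of 256 bytes, and 0xFFFFF000 describes a 4KiB memory BAR. Actually reading and writing the BAR requires configuration-space access, which this sketch leaves out.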

The IO space has the advantage over MMIO of not requiring any setup, whereas MMIO needs a virtual-to-physical mapping with the correct caching type.
However, the IO space is very small: only 64KiB + 3B.
In fact, PCI 2.2 limits the IO space used by a single BAR to at most 256 bytes.

[Image: 256 IO bytes per BAR]

Sorry for the image, copying from the PDF spec gives me gibberish

Furthermore, pointers don't work in the IO space, and some devices work with pointers (e.g. USB controllers, GbE and so on).
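As a rough illustration of the pointer point: once an MMIO region is mapped, a register is reached through an ordinary (volatile) pointer, whereas a port number can only be used as an operand of IN/OUT. The helpers below are a hypothetical sketch, not taken from any driver framework. They assume `base` already points at a correctly mapped, uncached register window; in a test, a plain array can stand in for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the 32-bit register at byte offset `off` inside a mapped window.
 * `volatile` keeps the compiler from caching or eliding the access. */
uint32_t mmio_read32(volatile void *base, size_t off)
{
    return *(volatile uint32_t *)((volatile uint8_t *)base + off);
}

/* Write the 32-bit register at byte offset `off` inside a mapped window. */
void mmio_write32(volatile void *base, size_t off, uint32_t value)
{
    *(volatile uint32_t *)((volatile uint8_t *)base + off) = value;
}
```

On real hardware the mapping and caching type have to be set up by the OS first, which is exactly the setup cost that port IO avoids.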

IO is certainly used for legacy devices (from before MMIO was a thing).
I used to think that IO was used for devices with a small number of registers, but that's not always true: for example, the Power Manager Control registers of the PCH (chipset) are IO mapped and occupy 128B.

Sometimes a device supports both IO and MMIO. This requires two BARs; an example is the SMBus controller of the PCH:

[Image: SMBus controller's BARs]

It has two BARs (note the default value, one is for IO the other for MMIO) that control the same set of registers.
The documentation specifies that both can be used.

I cannot give an exact rule for when IO is used rather than MMIO.
I don't think there is a difference in performance at the bus level; the distinction is just a bit in the TLP sent over the PCIe link.
However, I've never investigated the matter, and the IO instructions are serialising, so there is a performance penalty at the software level.

My rule of thumb is that IO is/can be used if any of the following is true:

  • The device is a legacy one (there is really no freedom of choice here).
  • Your device is not using pointers (because IO has no pointers) and the register set is small.
  • The registers are mostly used for control and report the status of the device or the whole system (because IO instructions are serialising).

These are just rules of thumb, based on my reading and memory; there are many exceptions and counterexamples.
Today the tendency is to use MMIO. This may require more decode logic (more address lines to decode), but the PCI spec simplifies it by allowing a device to round its decoding up to 4KiB.
One example is the PCI configuration space: in PCI it was accessed through IO (with a technique similar to the register stacking used in, e.g., the VGA controller), but in PCIe it is memory mapped.
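The two configuration access styles can be sketched as address computations. The details below are not in the answer itself and come from my recollection of the PCI and PCIe specs: the legacy mechanism writes an address word to IO port 0xCF8 and then transfers data through port 0xCFC, while PCIe gives each function a 4KiB window at a fixed offset from a memory-mapped base (ECAM). The function names are hypothetical, and the port accesses themselves are omitted because they need ring 0.

```c
#include <stdint.h>

/* Legacy PCI: the value written to IO port 0xCF8 to select a config dword.
 * Bit 31 is the enable bit; the register offset is dword aligned. */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t reg)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(dev & 0x1Fu) << 11)
         | ((uint32_t)(func & 0x07u) << 8)
         | (reg & 0xFCu);
}

/* PCIe: byte offset of a config register from the memory-mapped ECAM base.
 * Each function gets 4KiB, so the register offset is 12 bits wide. */
uint64_t pcie_ecam_offset(uint8_t bus, uint8_t dev, uint8_t func, uint16_t reg)
{
    return ((uint64_t)bus << 20)
         | ((uint64_t)(dev & 0x1Fu) << 15)
         | ((uint64_t)(func & 0x07u) << 12)
         | (reg & 0xFFFu);
}
```

The first form is the "stacked register" style: one port selects, another transfers. The second turns every configuration register into a plain memory address.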

There is no need to consider other buses, as PCIe is the main bus on modern PCs; everything else goes through a PCIe device (e.g. USB uses xHCI PCI devices).
The only exceptions are the off-core devices (e.g. the LAPICs, the TXT registers). These are accessed through memory-mapped IO, I think because it's more performant: these accesses don't make it to the system agent (the devices are close to their core and inside the CPU package anyway), so using a (serialising) IO instruction would slow them down significantly.
Plus, there is a nice spot at the top of the 4GiB where Intel can reclaim memory without putting too much pressure on the other devices.
Fun fact: ports 0xf8-0xff are reserved, dating from when the FPU was a coprocessor (x87) and the CPU used these ports to communicate with it.


[1] Before that, other PnP buses were already available (e.g. PnP ISA and MCA), but decoding memory accesses was mostly done to give access to ROMs and on-card RAM. Mapping registers to memory was not yet a thing, I guess.

Margaret Bloom
  • So I (as peripheral's developer) decide what to use and tell it to CPU via PCI communication? – Tom Milberg Jun 30 '19 at 14:31
  • @TomMilberg Unless you have to follow a standard that constrains you to either IO or MMIO, you are free to choose either one. The PCI BARs registers are for the software, so it can use the right method. – Margaret Bloom Jun 30 '19 at 14:37
  • I understand that, but must software know beforehand which one to use, or can I somehow report it? – Tom Milberg Jun 30 '19 at 18:05
  • @TomMilberg The PCI BAR registers have a bit for that: bit 0 is 0 for MMIO and 1 for IO. It's read-only and set by the card manufacturer. That's how it is reported. The size of the region is also reported by the PCI BAR (by writing the base address as all ones and reading the value back), but the software kind of knows the size already, as it must know which registers are present and how to use them. – Margaret Bloom Jun 30 '19 at 19:18

If I want to build my own device (a peripheral) can I choose freely whether I use I/O mapped or memory mapped I/O to communicate with PC?

What kind of device?

If it's a legacy device (e.g. an ancient "PS/2 controller" or serial port or parallel port or ..) or a standardized device (e.g. implementing AHCI or NVMe or xHCI), then it has to comply with an existing (formal or de facto) specification.

Otherwise (no existing specification that has to be complied with): if it's a USB device then you can't use IO ports or MMIO (it's responding to requests on a serial bus); if it's a PCI device and needs high performance it should use MMIO (because IO ports are a performance problem); and if it's a PCI device that doesn't need high performance it shouldn't be a PCI device at all (it should be USB).

Brendan