Commit 9726840d authored by Will Deacon

docs/memory-barriers.txt: Update I/O section to be clearer about CPU vs thread

The revised I/O ordering section of memory-barriers.txt introduced in
4614bbde ("docs/memory-barriers.txt: Rewrite "KERNEL I/O BARRIER
EFFECTS" section") loosely refers to "the CPU", whereas the ordering
guarantees generally apply within a thread of execution that can migrate
between cores, with the scheduler providing the relevant barrier
semantics.

Reword the section to refer to "CPU thread" and call out ordering of
MMIO writes separately from ordering of writes to memory. Ben also
spotted that the string accessors are native-endian, so fix that up too.

Link: https://lkml.kernel.org/r/080d1ec73e3e29d6ffeeeb50b39b613da28afb37.camel@kernel.crashing.org
Fixes: 4614bbde ("docs/memory-barriers.txt: Rewrite "KERNEL I/O BARRIER EFFECTS" section")
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
parent 0cde62a4
@@ -2523,27 +2523,37 @@ guarantees:
     ioremap()), the ordering guarantees are as follows:
 
     1. All readX() and writeX() accesses to the same peripheral are ordered
-       with respect to each other. This ensures that MMIO register writes by
-       the CPU to a particular device will arrive in program order.
+       with respect to each other. This ensures that MMIO register accesses
+       by the same CPU thread to a particular device will arrive in program
+       order.
 
-    2. A writeX() by the CPU to the peripheral will first wait for the
-       completion of all prior CPU writes to memory. This ensures that
-       writes by the CPU to an outbound DMA buffer allocated by
-       dma_alloc_coherent() will be visible to a DMA engine when the CPU
-       writes to its MMIO control register to trigger the transfer.
-
-    3. A readX() by the CPU from the peripheral will complete before any
-       subsequent CPU reads from memory can begin. This ensures that reads
-       by the CPU from an incoming DMA buffer allocated by
-       dma_alloc_coherent() will not see stale data after reading from the
-       DMA engine's MMIO status register to establish that the DMA transfer
-       has completed.
-
-    4. A readX() by the CPU from the peripheral will complete before any
-       subsequent delay() loop can begin execution. This ensures that two
-       MMIO register writes by the CPU to a peripheral will arrive at least
-       1us apart if the first write is immediately read back with readX()
-       and udelay(1) is called prior to the second writeX():
+    2. A writeX() issued by a CPU thread holding a spinlock is ordered
+       before a writeX() to the same peripheral from another CPU thread
+       issued after a later acquisition of the same spinlock. This ensures
+       that MMIO register writes to a particular device issued while holding
+       a spinlock will arrive in an order consistent with acquisitions of
+       the lock.
+
+    3. A writeX() by a CPU thread to the peripheral will first wait for the
+       completion of all prior writes to memory either issued by, or
+       propagated to, the same thread. This ensures that writes by the CPU
+       to an outbound DMA buffer allocated by dma_alloc_coherent() will be
+       visible to a DMA engine when the CPU writes to its MMIO control
+       register to trigger the transfer.
+
+    4. A readX() by a CPU thread from the peripheral will complete before
+       any subsequent reads from memory by the same thread can begin. This
+       ensures that reads by the CPU from an incoming DMA buffer allocated
+       by dma_alloc_coherent() will not see stale data after reading from
+       the DMA engine's MMIO status register to establish that the DMA
+       transfer has completed.
+
+    5. A readX() by a CPU thread from the peripheral will complete before
+       any subsequent delay() loop can begin execution on the same thread.
+       This ensures that two MMIO register writes by the CPU to a peripheral
+       will arrive at least 1us apart if the first write is immediately read
+       back with readX() and udelay(1) is called prior to the second
+       writeX():
 
        writel(42, DEVICE_REGISTER_0); // Arrives at the device...
        readl(DEVICE_REGISTER_0);
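The DMA guarantees in the hunk above (bullets 3 and 4 of the revised list) describe a common driver pattern: fill a coherent buffer, then ring a doorbell register; later, poll a status register, then read the buffer. The following is a minimal userspace sketch of that program order only. Everything in it is an assumption made for illustration: `mock_writel()`, `mock_readl()`, `struct fake_dma_dev`, `DMA_CTRL_START` and `DMA_STATUS_DONE` are hypothetical stand-ins, and plain volatile accesses do not provide the barrier semantics that the kernel's real `writel()`/`readl()` supply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register bits, not taken from any real device. */
#define DMA_CTRL_START  0x1u
#define DMA_STATUS_DONE 0x1u

/* Hypothetical device model: two 32-bit "registers" in ordinary memory. */
struct fake_dma_dev {
    volatile uint32_t ctrl;
    volatile uint32_t status;
};

/*
 * Userspace stand-ins for writel()/readl(). They only model the access
 * itself; the kernel accessors additionally provide the ordering that
 * memory-barriers.txt documents.
 */
static inline void mock_writel(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
}

static inline uint32_t mock_readl(const volatile uint32_t *reg)
{
    return *reg;
}

/* Outbound path: memory write first, then the MMIO doorbell (bullet 3). */
static void start_tx(struct fake_dma_dev *dev, uint32_t *dma_buf,
                     uint32_t payload)
{
    assert(dev != NULL && dma_buf != NULL);
    dma_buf[0] = payload;                     /* write to the DMA buffer */
    mock_writel(DMA_CTRL_START, &dev->ctrl);  /* then trigger the transfer */
}

/* Inbound path: MMIO status read first, then the memory read (bullet 4). */
static int finish_rx(struct fake_dma_dev *dev, const uint32_t *dma_buf,
                     uint32_t *out)
{
    assert(dev != NULL && dma_buf != NULL && out != NULL);
    if (!(mock_readl(&dev->status) & DMA_STATUS_DONE))
        return -1;                            /* transfer not complete yet */
    *out = dma_buf[0];                        /* safe only after the status read */
    return 0;
}
```

In a real driver the buffer would come from `dma_alloc_coherent()` and the registers from an `ioremap()` mapping; the sketch keeps both in ordinary memory so the sequence can be compiled and stepped through outside the kernel.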
@@ -2559,10 +2569,11 @@ guarantees:
 
     These are similar to readX() and writeX(), but provide weaker memory
     ordering guarantees. Specifically, they do not guarantee ordering with
-    respect to normal memory accesses or delay() loops (i.e. bullets 2-4
-    above) but they are still guaranteed to be ordered with respect to other
-    accesses to the same peripheral when operating on __iomem pointers
-    mapped with the default I/O attributes.
+    respect to locking, normal memory accesses or delay() loops (i.e.
+    bullets 2-5 above) but they are still guaranteed to be ordered with
+    respect to other accesses from the same CPU thread to the same
+    peripheral when operating on __iomem pointers mapped with the default
+    I/O attributes.
 
  (*) readsX(), writesX():
 
@@ -2600,8 +2611,10 @@ guarantees:
     These will perform appropriately for the type of access they're actually
     doing, be it inX()/outX() or readX()/writeX().
 
-All of these accessors assume that the underlying peripheral is little-endian,
-and will therefore perform byte-swapping operations on big-endian architectures.
+With the exception of the string accessors (insX(), outsX(), readsX() and
+writesX()), all of the above assume that the underlying peripheral is
+little-endian and will therefore perform byte-swapping operations on big-endian
+architectures.
 
 ========================================
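The endianness fix in the last hunk can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: `model_readl()` and `model_readsl()` are hypothetical names, and the device is modelled as a plain byte array. A real `readsl()` reads repeatedly from a single register address, whereas the model treats the successive values as a byte stream. The first function returns the same CPU value on any host because it interprets the bytes as little-endian, which is the effect of the byte swap on big-endian architectures. The second copies the bytes into memory unchanged, which is what "native-endian" means for the string accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of a little-endian-assuming accessor such as readl(): the four
 * device bytes are always interpreted least-significant byte first, so
 * the returned value is identical on little- and big-endian hosts.
 */
static uint32_t model_readl(const uint8_t *reg)
{
    assert(reg != NULL);
    return (uint32_t)reg[0] |
           ((uint32_t)reg[1] << 8) |
           ((uint32_t)reg[2] << 16) |
           ((uint32_t)reg[3] << 24);
}

/*
 * Model of a string accessor such as readsl(): `count` 32-bit items are
 * stored to memory in the byte order the device presented them, with no
 * byte swapping on any host.
 */
static void model_readsl(const uint8_t *stream, void *dst, size_t count)
{
    assert(stream != NULL && dst != NULL);
    memcpy(dst, stream, count * sizeof(uint32_t));
}
```

With device bytes `78 56 34 12`, `model_readl()` yields `0x12345678` everywhere, while the buffer filled by `model_readsl()` holds the same four bytes in the same order, so reading it back as a `uint32_t` gives a host-dependent value.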