    [PATCH] I/O space write barrier · e1c5245a
    Jesse Barnes authored
    On some platforms (e.g. SGI Challenge, Origin, and Altix machines), writes
    to I/O space issued from different CPUs are not ordered with respect to
    one another.  For the most part, this isn't a problem, since drivers
    generally hold a spinlock around code that does writeX calls.  But if the
    last operation a driver does before it releases a lock is a write, and
    some other CPU takes the lock and immediately does a write, the second
    CPU's write could arrive at the device before the first's.
    
    This patch adds a mmiowb() call to deal with this sort of situation, and
    adds some documentation describing I/O ordering issues to
    deviceiobook.tmpl.  The idea is to mirror the regular, cacheable memory
    barrier operation, wmb.  Example of the problem this new macro solves:
    
    CPU A:  spin_lock_irqsave(&dev_lock, flags);
    CPU A:  ...
    CPU A:  writel(newval, ring_ptr);
    CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
            ...
    CPU B:  spin_lock_irqsave(&dev_lock, flags);
    CPU B:  writel(newval2, ring_ptr);
    CPU B:  ...
    CPU B:  spin_unlock_irqrestore(&dev_lock, flags);
    
    In this case, newval2 could be written to ring_ptr before newval.  Fixing
    it is easy though:
    
    CPU A:  spin_lock_irqsave(&dev_lock, flags);
    CPU A:  ...
    CPU A:  writel(newval, ring_ptr);
    CPU A:  mmiowb(); /* ensure no other writes beat us to the device */
    CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
            ...
    CPU B:  spin_lock_irqsave(&dev_lock, flags);
    CPU B:  writel(newval2, ring_ptr);
    CPU B:  ...
    CPU B:  mmiowb();
    CPU B:  spin_unlock_irqrestore(&dev_lock, flags);
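
The same pattern can be sketched as a driver-style helper.  This is a
userspace illustration only: mock_lock(), mock_unlock(), mock_writel() and
mock_mmiowb() are hypothetical stand-ins for spin_lock_irqsave(),
spin_unlock_irqrestore(), writel() and mmiowb().  They just record the order
in which they were called, and update_ring() is an invented name, not code
from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel primitives: each one only
 * records that it was called, so the call order can be inspected. */
enum op { OP_LOCK, OP_WRITEL, OP_MMIOWB, OP_UNLOCK };

#define TRACE_MAX 16
static enum op trace[TRACE_MAX];
static int trace_len;

static void record(enum op o)
{
    if (trace_len < TRACE_MAX)
        trace[trace_len++] = o;
}

static void mock_lock(void)   { record(OP_LOCK); }
static void mock_unlock(void) { record(OP_UNLOCK); }
static void mock_mmiowb(void) { record(OP_MMIOWB); }

static void mock_writel(uint32_t val, volatile uint32_t *addr)
{
    *addr = val;
    record(OP_WRITEL);
}

/* Stands in for the device's ring pointer register. */
static volatile uint32_t ring_ptr;

/* The pattern from the example above: the barrier sits between the
 * last MMIO write and the release of the lock. */
static void update_ring(uint32_t newval)
{
    mock_lock();
    mock_writel(newval, &ring_ptr);
    mock_mmiowb();  /* keep this write ahead of the next lock holder's */
    mock_unlock();
}

/* Self-check: the most recent update must end with barrier, unlock. */
static void check_barrier_before_unlock(void)
{
    assert(trace_len >= 2);
    assert(trace[trace_len - 2] == OP_MMIOWB);
    assert(trace[trace_len - 1] == OP_UNLOCK);
}
```

The point of the sketch is the placement: the barrier is the last thing done
before the lock is dropped, so every path that writes to the device under the
lock needs one before its unlock.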
    
    Note that this doesn't address a related case where the driver needs a
    given write to actually reach the device before proceeding.  This should
    be dealt with by immediately reading a register from the card that has no
    side effects.  According to the PCI spec, the read will not complete until
    all writes posted ahead of it have been delivered to the device.  If no
    such register is available (in the case of card resets perhaps), reading
    from config space is sufficient (though it may return all ones if the card
    isn't responding to read cycles).  I've tried to describe how mmiowb()
    differs from PCI posted write flushing in the patch to deviceiobook.tmpl.
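
To make the distinction concrete, here is a small userspace model of posted
write flushing.  Everything in it is hypothetical: struct mock_bridge,
mock_writel(), mock_readl() and write_and_flush() are invented for
illustration and only imitate the behaviour described above, where a write
may sit in a bridge until a read from the device pushes it out.

```c
#include <stdint.h>

#define QUEUE_DEPTH 8

/* Toy model of a bridge that posts writes on their way to a device. */
struct mock_bridge {
    uint32_t pending[QUEUE_DEPTH]; /* writes posted but not yet delivered */
    int      npending;
    uint32_t device_reg;           /* value the device has actually seen */
};

/* Deliver every posted write to the device, oldest first. */
static void drain(struct mock_bridge *b)
{
    for (int i = 0; i < b->npending; i++)
        b->device_reg = b->pending[i];
    b->npending = 0;
}

/* A posted write returns immediately; the device may not have seen it. */
static void mock_writel(struct mock_bridge *b, uint32_t val)
{
    if (b->npending == QUEUE_DEPTH)
        drain(b);                  /* queue full: make room first */
    b->pending[b->npending++] = val;
}

/* A read cannot complete until the writes ahead of it are delivered. */
static uint32_t mock_readl(struct mock_bridge *b)
{
    drain(b);
    return b->device_reg;
}

/* Write, then read back, so the write is known to have arrived. */
static uint32_t write_and_flush(struct mock_bridge *b, uint32_t val)
{
    mock_writel(b, val);
    return mock_readl(b);
}
```

In the model, a bare mock_writel() leaves device_reg unchanged until
something reads from the device.  That is the case mmiowb() does not cover:
it orders writes from different CPUs against each other, but does not wait
for them to arrive.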
    Signed-off-by: Jesse Barnes <jbarnes@sgi.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>