    locking/qspinlock/x86: Micro-optimize virt_spin_lock() · 94af3a04
    Uros Bizjak authored
    Optimize virt_spin_lock() to use simpler and faster:
    
      atomic_try_cmpxchg(*ptr, &val, new)
    
    instead of:
    
      atomic_cmpxchg(*ptr, val, new) == val
    
    The x86 CMPXCHG instruction returns success in the ZF flag, so
    this change saves a compare after the CMPXCHG.
    
    Also optimize the retry loop a bit. atomic_try_cmpxchg() fails iff
    the value at &lock->val is non-zero, so there is no need to load and
    compare the lock value again - cpu_relax() can be called
    unconditionally in this case. This allows us to generate the
    optimized:
    
      1f:	ba 01 00 00 00       	mov    $0x1,%edx
      24:	8b 03                	mov    (%rbx),%eax
      26:	85 c0                	test   %eax,%eax
      28:	75 63                	jne    8d <...>
      2a:	f0 0f b1 13          	lock cmpxchg %edx,(%rbx)
      2e:	75 5d                	jne    8d <...>
    ...
      8d:	f3 90                	pause
      8f:	eb 93                	jmp    24 <...>
    
    instead of:
    
      1f:	ba 01 00 00 00       	mov    $0x1,%edx
      24:	8b 03                	mov    (%rbx),%eax
      26:	85 c0                	test   %eax,%eax
      28:	75 13                	jne    3d <...>
      2a:	f0 0f b1 13          	lock cmpxchg %edx,(%rbx)
      2e:	85 c0                	test   %eax,%eax
      30:	75 f2                	jne    24 <...>
    ...
      3d:	f3 90                	pause
      3f:	eb e3                	jmp    24 <...>
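
    The loop shape behind the first listing can be sketched in portable
    C11; this is a minimal userspace toy, not the kernel code, and the
    toy_spin_lock() helper name is hypothetical. C11
    atomic_compare_exchange_strong() plays the role of the kernel's
    atomic_try_cmpxchg(): it returns success as a bool, which on x86
    compiles to LOCK CMPXCHG followed by a conditional jump on ZF, with
    no separate compare:

    ```c
    #include <stdatomic.h>
    #include <stdio.h>

    /* Toy userspace sketch of the optimized loop shape (NOT kernel code). */
    static void toy_spin_lock(atomic_int *lock)   /* hypothetical helper */
    {
    	for (;;) {
    		int val = atomic_load(lock);      /* mov (%rbx),%eax */
    		if (val == 0 &&
    		    atomic_compare_exchange_strong(lock, &val, 1))
    			return;                   /* cmpxchg succeeded */
    		/* cmpxchg can only fail when the lock value is non-zero,
    		 * so pause and retry unconditionally - no re-compare. */
    	}
    }

    int main(void)
    {
    	atomic_int lock = 0;

    	toy_spin_lock(&lock);             /* uncontended: acquires at once */
    	printf("%d\n", atomic_load(&lock));
    	return 0;
    }
    ```

    In the uncontended case the loop exits on the first iteration and the
    program prints 1 (the held-lock value in this toy).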
    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Waiman Long <longman@redhat.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: https://lore.kernel.org/r/20240422120054.199092-1-ubizjak@gmail.com