    locking/atomic/x86: Introduce arch_sync_try_cmpxchg() · 636d6a8b
    Uros Bizjak authored
    Introduce the arch_sync_try_cmpxchg() macro to improve code that uses
    the sync_try_cmpxchg() locking primitive. The new definition uses the
    existing __raw_try_cmpxchg() macro, but with its own "lock; " prefix.
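    
    As a rough user-space illustration of the two calling conventions (not
    the kernel implementation), the sketch below contrasts a cmpxchg-style
    helper, which returns the value it found, with a try_cmpxchg-style
    helper, which returns success and refreshes the expected value on
    failure. The names demo_cmpxchg() and demo_try_cmpxchg() are
    hypothetical, and the GCC/Clang __atomic_compare_exchange_n() builtin
    is assumed here as a stand-in for the kernel's inline assembly.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical cmpxchg-style helper: returns the value found in *ptr.
 * The caller has to compare the result with 'old' to learn whether the
 * exchange happened.
 */
static uint32_t demo_cmpxchg(uint32_t *ptr, uint32_t old, uint32_t new)
{
	/* On failure the builtin writes the current value of *ptr into 'old';
	 * on success 'old' already equals the value that was found. */
	__atomic_compare_exchange_n(ptr, &old, new, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/*
 * Hypothetical try_cmpxchg-style helper: returns true on success.
 * On failure *old is updated with the value currently in *ptr, so a
 * retry loop needs neither a separate reload nor a separate compare.
 */
static bool demo_try_cmpxchg(uint32_t *ptr, uint32_t *old, uint32_t new)
{
	return __atomic_compare_exchange_n(ptr, old, new, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

    With the second form the success/failure result comes directly from
    the compare-and-exchange operation, which is what lets the compiler
    drop the extra compare seen in the first listing.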
    
    The new macros improve assembly of the cmpxchg loop in
    evtchn_fifo_unmask() from drivers/xen/events/events_fifo.c from:
    
     57a:	85 c0                	test   %eax,%eax
     57c:	78 52                	js     5d0 <...>
     57e:	89 c1                	mov    %eax,%ecx
     580:	25 ff ff ff af       	and    $0xafffffff,%eax
     585:	c7 04 24 00 00 00 00 	movl   $0x0,(%rsp)
     58c:	81 e1 ff ff ff ef    	and    $0xefffffff,%ecx
     592:	89 4c 24 04          	mov    %ecx,0x4(%rsp)
     596:	89 44 24 08          	mov    %eax,0x8(%rsp)
     59a:	8b 74 24 08          	mov    0x8(%rsp),%esi
     59e:	8b 44 24 04          	mov    0x4(%rsp),%eax
     5a2:	f0 0f b1 32          	lock cmpxchg %esi,(%rdx)
     5a6:	89 04 24             	mov    %eax,(%rsp)
     5a9:	8b 04 24             	mov    (%rsp),%eax
     5ac:	39 c1                	cmp    %eax,%ecx
     5ae:	74 07                	je     5b7 <...>
     5b0:	a9 00 00 00 40       	test   $0x40000000,%eax
     5b5:	75 c3                	jne    57a <...>
     <...>
    
    to:
    
     578:	a9 00 00 00 40       	test   $0x40000000,%eax
     57d:	74 2b                	je     5aa <...>
     57f:	85 c0                	test   %eax,%eax
     581:	78 40                	js     5c3 <...>
     583:	89 c1                	mov    %eax,%ecx
     585:	25 ff ff ff af       	and    $0xafffffff,%eax
     58a:	81 e1 ff ff ff ef    	and    $0xefffffff,%ecx
     590:	89 4c 24 04          	mov    %ecx,0x4(%rsp)
     594:	89 44 24 08          	mov    %eax,0x8(%rsp)
     598:	8b 4c 24 08          	mov    0x8(%rsp),%ecx
     59c:	8b 44 24 04          	mov    0x4(%rsp),%eax
     5a0:	f0 0f b1 0a          	lock cmpxchg %ecx,(%rdx)
     5a4:	89 44 24 04          	mov    %eax,0x4(%rsp)
     5a8:	75 30                	jne    5da <...>
     <...>
     5da:	8b 44 24 04          	mov    0x4(%rsp),%eax
     5de:	eb 98                	jmp    578 <...>
    
    The new code removes the move instructions at 585:, 5a6: and 5a9:
    and the compare at 5ac:. Additionally, the compiler assumes that
    cmpxchg success is more probable and optimizes code flow accordingly.
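    
    For orientation, the two loop shapes can be sketched in user-space C.
    This is a minimal sketch under stated assumptions, not the code from
    events_fifo.c: the DEMO_* flag names and both function names are
    hypothetical, the bit positions are read off the masks in the
    disassembly above, and the GCC/Clang __atomic builtins stand in for
    the kernel primitives.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; bit positions taken from the disassembly masks. */
#define DEMO_PENDING	(1u << 31)	/* checked via the sign flag (js) */
#define DEMO_MASKED	(1u << 30)	/* test $0x40000000 */
#define DEMO_BUSY	(1u << 28)	/* cleared by and $0xefffffff */

/* cmpxchg-style loop: the returned value must be compared with 'old'. */
static bool demo_unmask_cmpxchg(uint32_t *word)
{
	uint32_t old, new, w = __atomic_load_n(word, __ATOMIC_RELAXED);

	do {
		if (!(w & DEMO_MASKED))
			return true;
		if (w & DEMO_PENDING)
			return false;

		old = w & ~DEMO_BUSY;
		new = old & ~DEMO_MASKED;

		w = old;	/* expected value; overwritten on failure */
		__atomic_compare_exchange_n(word, &w, new, false,
					    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	} while (w != old);	/* the extra compare */

	return true;
}

/* try_cmpxchg-style loop: the loop condition is the operation's result. */
static bool demo_unmask_try_cmpxchg(uint32_t *word)
{
	uint32_t new, old = __atomic_load_n(word, __ATOMIC_RELAXED);

	do {
		if (!(old & DEMO_MASKED))
			return true;
		if (old & DEMO_PENDING)
			return false;

		old &= ~DEMO_BUSY;
		new = old & ~DEMO_MASKED;
		/* on failure 'old' is refreshed with the current *word */
	} while (!__atomic_compare_exchange_n(word, &old, new, false,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_SEQ_CST));

	return true;
}
```

    Both functions spin while the busy bit stays set in *word, so the
    sketch is only meaningful when another thread can clear that bit.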
    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: linux-kernel@vger.kernel.org