    sched/idle: Micro-optimize the idle loop · 54b933c6
    Cheng Jian authored
    Move the loop-invariant calculation of 'cpu' in do_idle() out of the loop body,
    because the current CPU is always constant.
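
The change described above can be illustrated with a minimal user-space sketch. This is not the kernel's do_idle(): `fake_smp_processor_id()`, `fake_cpu_is_offline()`, `offline_mask` and `cpu_id_reads` are hypothetical stand-ins invented here so the example compiles on its own. It only shows the general pattern of hoisting a loop-invariant CPU lookup out of the loop body.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not kernel APIs. */
static int cpu_id_reads;            /* counts how often the CPU id is fetched */
static unsigned long offline_mask;  /* bit n set means CPU n is offline */

/* Pretend to read the current CPU id; always returns CPU 3 here. */
static int fake_smp_processor_id(void)
{
	cpu_id_reads++;
	return 3;
}

/* Test the CPU's bit in the offline mask. */
static bool fake_cpu_is_offline(int cpu)
{
	return (offline_mask >> cpu) & 1UL;
}

/* Before: the CPU id is re-read on every loop iteration. */
static int idle_loop_before(int iterations)
{
	int offline_hits = 0;

	for (int i = 0; i < iterations; i++) {
		if (fake_cpu_is_offline(fake_smp_processor_id()))
			offline_hits++;
	}
	return offline_hits;
}

/* After: the CPU id is loop-invariant, so it is read once up front. */
static int idle_loop_after(int iterations)
{
	int cpu = fake_smp_processor_id();
	int offline_hits = 0;

	for (int i = 0; i < iterations; i++) {
		if (fake_cpu_is_offline(cpu))
			offline_hits++;
	}
	return offline_hits;
}
```

Both functions return the same result for the same inputs; the difference is that `idle_loop_before(n)` fetches the CPU id n times, while `idle_loop_after(n)` fetches it once and keeps it in a local variable, which is what lets the compiler hold it in a register across iterations.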
    
    This improves the generated code both on x86-64 and ARM64:
    
    x86-64:
    
    Before patch (execution in loop):
    	864:       0f ae e8                lfence
    	867:       65 8b 05 c2 38 f1 7e    mov %gs:0x7ef138c2(%rip),%eax
    	86e:       89 c0                   mov %eax,%eax
    	870:       48 0f a3 05 68 19 08    bt  %rax,0x1081968(%rip)
    	877:       01
    
    After patch (execution in loop):
    	872:       0f ae e8                lfence
    	875:       4c 0f a3 25 63 19 08    bt  %r12,0x1081963(%rip)
    	87c:       01
    
    ARM64:
    
    Before patch (execution in loop):
    	c58:       d5033d9f        dsb     ld
    	c5c:       d538d080        mrs     x0, tpidr_el1
    	c60:       b8606a61        ldr     w1, [x19,x0]
    	c64:       1100fc20        add     w0, w1, #0x3f
    	c68:       7100003f        cmp     w1, #0x0
    	c6c:       1a81b000        csel    w0, w0, w1, lt
    	c70:       13067c00        asr     w0,...