x86: improve bitop code generation with clang

This uses the new ASM_INPUT_RM macro to avoid the bad code generation
issue that clang has with more generic asm inputs.

This ends up avoiding generating code like this:

    mov    %r10,(%rsp)
    tzcnt  (%rsp),%rcx

which now becomes just

    tzcnt  %r10,%rcx

and in the process ends up also removing a few unnecessary stack frames
when the only use was that pointless "asm uses memory location off stack".

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>