1. 22 May, 2024 3 commits
    • Linus Torvalds's avatar
      x86: improve bitop code generation with clang · b9b60b31
      Linus Torvalds authored
      This uses the new ASM_INPUT_RM macro to avoid the bad code generation
      issue that clang has with more generic asm inputs.
      
      This ends up avoiding generating code like this:
      
       	mov    %r10,(%rsp)
       	tzcnt  (%rsp),%rcx
      
      which now becomes just
      
       	tzcnt  %r10,%rcx
      
      and in the process ends up also removing a few unnecessary stack frames
      when the only use was that pointless "asm uses memory location off stack".
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b9b60b31
    • Linus Torvalds's avatar
      x86: improve array_index_mask_nospec() code generation · 7453b948
      Linus Torvalds authored
      Don't force the inputs to be 'unsigned long', when the comparison can
      easily be done in 32-bit if that's more appropriate.
      
      Note that while we can look at the inputs to choose an appropriate size
      for the compare instruction, the output is fixed at 'unsigned long'.
      That's not technically optimal either, since a 32-bit 'sbbl' would often
      be sufficient.
      
      But for the outgoing mask we don't know how the mask ends up being used
      (ie we have uses that have an incoming 32-bit array index, but end up
      using the mask for other things).  That said, it only costs the extra
      REX prefix to always generate the 64-bit mask.
      
      [ A 'sbbl' also always technically generates a 64-bit mask, but with the
        upper 32 bits clear: that's fine for when the incoming index that will
        be masked is already 32-bit, but not if you use the mask to mask a
        pointer afterwards, like the file table lookup does ]
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7453b948
    • Linus Torvalds's avatar
      clang: work around asm input constraint problems · dbaaabd6
      Linus Torvalds authored
      Work around clang problems with asm constraints that have multiple
      possibilities, particularly "g" and "rm".
      
      Clang seems to turn inputs like that into the most generic form, which
      is the memory input - but to make matters worse, clang won't even use a
      possible original memory location, but will spill the value to stack,
      and use the stack for the asm input.
      
      See
      
        https://github.com/llvm/llvm-project/issues/20571#issuecomment-980933442
      
      for some explanation of why clang has this strange behavior, but the end
      result is that "g" and "rm" really end up generating horrid code.
      
      Link: https://github.com/llvm/llvm-project/issues/20571
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dbaaabd6
  2. 12 May, 2024 5 commits
  3. 11 May, 2024 10 commits
  4. 10 May, 2024 20 commits
  5. 09 May, 2024 2 commits