05 Nov, 2022 (3 commits)
    • x86,pm: Force out-of-line memcpy() · b32fd8a6
      Peter Zijlstra authored
      GCC fancies inlining memcpy(), and because it cannot prove the
      destination is page-aligned (it is), it ends up generating atrocious
      code like:
      
       19e:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 1a5 <relocate_restore_code+0x25> 1a1: R_X86_64_PC32      core_restore_code-0x4
       1a5:   48 8d 78 08             lea    0x8(%rax),%rdi
       1a9:   48 89 c1                mov    %rax,%rcx
       1ac:   48 c7 c6 00 00 00 00    mov    $0x0,%rsi        1af: R_X86_64_32S       core_restore_code
       1b3:   48 83 e7 f8             and    $0xfffffffffffffff8,%rdi
       1b7:   48 29 f9                sub    %rdi,%rcx
       1ba:   48 89 10                mov    %rdx,(%rax)
       1bd:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 1c4 <relocate_restore_code+0x44> 1c0: R_X86_64_PC32      core_restore_code+0xff4
       1c4:   48 29 ce                sub    %rcx,%rsi
       1c7:   81 c1 00 10 00 00       add    $0x1000,%ecx
       1cd:   48 89 90 f8 0f 00 00    mov    %rdx,0xff8(%rax)
       1d4:   c1 e9 03                shr    $0x3,%ecx
       1d7:   f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
      
      Notably, the alignment code generates a text reference to
      core_restore_code+0xff8, for which objtool raises the objection:
      
        vmlinux.o: warning: objtool: relocate_restore_code+0x3d: relocation to !ENDBR: next_arg+0x18
      
      Applying some __assume_aligned(PAGE_SIZE) improves the code-gen to:
      
       19e:   48 89 c7                mov    %rax,%rdi
       1a1:   48 c7 c6 00 00 00 00    mov    $0x0,%rsi        1a4: R_X86_64_32S       core_restore_code
       1a8:   b9 00 02 00 00          mov    $0x200,%ecx
       1ad:   f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
      
      And resolves the problem. However, none of this is important code, and
      a much simpler solution still is to force an out-of-line memcpy() call:
      
       1a1:   ba 00 10 00 00          mov    $0x1000,%edx
       1a6:   48 c7 c6 00 00 00 00    mov    $0x0,%rsi        1a9: R_X86_64_32S       core_restore_code
       1ad:   e8 00 00 00 00          call   1b2 <relocate_restore_code+0x32> 1ae: R_X86_64_PLT32     __memcpy-0x4
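
      For illustration, a minimal standalone sketch of both approaches
      (hypothetical example, not the actual patch; the function names are
      made up, and __builtin_assume_aligned is the GCC builtin underlying
      __assume_aligned):

        #include <string.h>

        #define PAGE_SIZE 4096

        /* Option 1: promise GCC the pointers are page-aligned, so the
         * inlined copy needs none of the alignment fixup code above. */
        static void copy_page_aligned(void *dst, const void *src)
        {
                dst = __builtin_assume_aligned(dst, PAGE_SIZE);
                src = __builtin_assume_aligned(src, PAGE_SIZE);
                memcpy(dst, src, PAGE_SIZE);
        }

        /* Option 2: defeat the inlining entirely; GCC cannot expand a
         * call through a volatile function pointer, so it must emit a
         * real out-of-line memcpy() call. */
        static void *(*volatile do_memcpy)(void *, const void *, size_t) = memcpy;

        static void copy_out_of_line(void *dst, const void *src)
        {
                do_memcpy(dst, src, PAGE_SIZE);
        }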
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    • objtool: Fix weak hole vs prefix symbol · 023f2340
      Peter Zijlstra authored
      Boris (and the robot) reported that objtool grew a new complaint about
      unreachable instructions. Upon inspection it was immediately clear that
      the __weak zombie instructions had struck again.
      
      For the unwary: the linker will simply remove the symbol for
      overridden __weak symbols but leave the instructions in place, creating
      unreachable instructions -- and objtool likes to report these.
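
      A minimal sketch of the mechanism (the function name is hypothetical,
      not from the report):

        /* lib/default.c: weak default implementation */
        int __attribute__((weak)) machine_check(void)
        {
                return 0;
        }

        /* arch/x86/override.c: strong override; the linker resolves the
         * symbol to this definition and drops the weak one, but the weak
         * function's instructions stay behind in .text as a symbol-less
         * -- and therefore unreachable -- hole. */
        int machine_check(void)
        {
                return 1;
        }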
      
      Commit 4adb2368 ("objtool: Ignore extra-symbol code") was supposed
      to have dealt with that, but the new commit 9f2899fe ("objtool:
      Add option to generate prefix symbols") subtly broke that logic by
      creating unvisited symbols.
      
      Fixes: 9f2899fe ("objtool: Add option to generate prefix symbols")
      Reported-by: Borislav Petkov <bp@alien8.de>
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    • objtool: Optimize elf_dirty_reloc_sym() · 19526717
      Peter Zijlstra authored
      When moving a symbol in the symtab, its index changes, and any reloc
      referring to that symbol-table index needs to be rewritten too.
      
      In order to facilitate this, objtool simply marks the whole reloc
      section 'changed' which will cause the whole section to be
      re-generated.
      
      However, finding the relocs that use any given symbol is implemented
      rather crudely -- a full iteration of all sections and their relocs.
      Given that some builds have over 20k sections (kallsyms etc.),
      iterating all that for *each* moved symbol takes a bit of time.
      
      Instead have each symbol keep a list of relocs that reference it.
      
      This *vastly* improves build times for certain configs.
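
      A minimal sketch of the data-structure change (self-contained, with
      illustrative field names rather than objtool's exact ones):

        #include <stdbool.h>

        struct section {
                bool changed;           /* reloc section will be re-generated */
        };

        struct reloc {
                struct section *sec;    /* reloc section this entry lives in */
                struct reloc *sym_next; /* next reloc referencing the same symbol */
        };

        struct symbol {
                struct reloc *relocs;   /* head: all relocs referencing this symbol */
        };

        /* Built up once, while the relocation sections are first read: */
        static void symbol_add_reloc(struct symbol *sym, struct reloc *reloc)
        {
                reloc->sym_next = sym->relocs;
                sym->relocs = reloc;
        }

        /* Was an O(sections * relocs) scan per moved symbol; now only the
         * relocs that actually reference the symbol are visited. */
        static void elf_dirty_reloc_sym(struct symbol *sym)
        {
                for (struct reloc *reloc = sym->relocs; reloc; reloc = reloc->sym_next)
                        reloc->sec->changed = true;
        }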
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/Y2LlRA7x+8UsE1xf@hirez.programming.kicks-ass.net