1. 08 Nov, 2023 11 commits
    • LoongArch: BPF: Support signed div instructions · 2425c9e0
      Hengqi Chen authored
      Add support for signed div instructions.
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
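The difference between signed and unsigned division can be sketched as follows. This is an illustrative Python model, not the JIT code from the patch: the helper names `to_signed`, `sdiv`, and `udiv` are hypothetical, truncation toward zero is assumed to match C-style signed division, and a zero divisor is not modeled.

```python
MASK64 = (1 << 64) - 1


def to_signed(value: int, bits: int = 64) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sdiv(dst: int, src: int, bits: int = 64) -> int:
    """Signed division truncating toward zero, returned as raw register bits.

    A zero divisor is not modeled here and raises ZeroDivisionError.
    """
    a, b = to_signed(dst, bits), to_signed(src, bits)
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return quotient & ((1 << bits) - 1)


def udiv(dst: int, src: int, bits: int = 64) -> int:
    """Unsigned division on the same raw register bits."""
    mask = (1 << bits) - 1
    return (dst & mask) // (src & mask)


# The same bit pattern gives different quotients depending on signedness.
minus_13 = -13 & MASK64
signed_result = to_signed(sdiv(minus_13, 3))
unsigned_result = udiv(minus_13, 3)
```

Here `signed_result` is the truncated quotient of -13 / 3, while `unsigned_result` treats the same bits as a very large positive number.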
    • LoongArch: BPF: Support 32-bit offset jmp instructions · 9ddd2b8d
      Hengqi Chen authored
      Add support for 32-bit offset jmp instructions. Currently, we use the b
      instruction, which supports a range within ±128MB, for such jumps. This
      should be large enough for BPF progs.
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
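The ±128MB figure can be reproduced with a little arithmetic. The sketch below assumes that the `b` instruction encodes a 26-bit signed offset counted in 4-byte instructions; the constant and function names are hypothetical and are not taken from the patch.

```python
B_OFFSET_BITS = 26  # assumed width of the signed offset field of `b`
INSN_SIZE = 4       # bytes per LoongArch instruction


def b_reach_bytes() -> tuple[int, int]:
    """Return the (lowest, highest) byte offset reachable by `b`."""
    half = 1 << (B_OFFSET_BITS - 1)
    return (-half * INSN_SIZE, (half - 1) * INSN_SIZE)


def fits_b(offset_bytes: int) -> bool:
    """Check that a byte offset is instruction-aligned and within reach."""
    lowest, highest = b_reach_bytes()
    return offset_bytes % INSN_SIZE == 0 and lowest <= offset_bytes <= highest


lowest, highest = b_reach_bytes()
lowest_mib = lowest / (1024 * 1024)  # lower bound expressed in MiB
```

Under these assumptions the lower bound is exactly -128 MiB and the upper bound is one instruction short of +128 MiB.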
    • LoongArch: BPF: Support unconditional bswap instructions · 4ebf9216
      Hengqi Chen authored
      Add support for unconditional bswap instruction. Since LoongArch is
      always little-endian, just treat unconditional bswap the same as big-
      endian conversion.
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
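Why an unconditional byte swap coincides with big-endian conversion on a little-endian host can be checked with a small model. This is only a sketch: `bswap` and `to_big_endian_on_le_host` are hypothetical helper names, and Python's `int.to_bytes` / `int.from_bytes` stand in for register and memory byte order.

```python
def bswap(value: int, bits: int) -> int:
    """Reverse the byte order of a 16-, 32-, or 64-bit value."""
    nbytes = bits // 8
    value &= (1 << bits) - 1
    reversed_bytes = value.to_bytes(nbytes, "little")[::-1]
    return int.from_bytes(reversed_bytes, "little")


def to_big_endian_on_le_host(value: int, bits: int) -> int:
    """Model a to-big-endian conversion as seen by a little-endian host.

    The result is the integer whose little-endian (host) byte layout equals
    the big-endian byte layout of the input.
    """
    nbytes = bits // 8
    value &= (1 << bits) - 1
    big_endian_bytes = value.to_bytes(nbytes, "big")
    return int.from_bytes(big_endian_bytes, "little")


# On a little-endian host the two operations agree for every width.
samples = [(0x1234, 16), (0x11223344, 32), (0x0102030405060708, 64)]
agree = all(bswap(v, b) == to_big_endian_on_le_host(v, b) for v, b in samples)
```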
    • LoongArch: BPF: Support sign-extension mov instructions · f48012f1
      Hengqi Chen authored
      Add support for sign-extension mov instructions.
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
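What a sign-extending move computes can be sketched as below. The function name `movsx` and its parameters are hypothetical, chosen for illustration; the sketch models only the arithmetic on register values, not the instructions the JIT emits.

```python
def movsx(src: int, src_bits: int, dst_bits: int = 64) -> int:
    """Sign-extend the low `src_bits` of src into a `dst_bits`-wide register."""
    src &= (1 << src_bits) - 1
    sign_bit = 1 << (src_bits - 1)
    # Flipping and then subtracting the sign bit maps the unsigned field
    # onto its two's-complement value.
    signed_value = (src ^ sign_bit) - sign_bit
    return signed_value & ((1 << dst_bits) - 1)


negative_byte = movsx(0x80, 8)   # top bit set, so the upper bits fill with 1s
positive_byte = movsx(0x7F, 8)   # top bit clear, so the value is unchanged
```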
    • LoongArch: BPF: Support sign-extension load instructions · 7111afe8
      Hengqi Chen authored
      Add support for sign-extension load instructions.
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
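The contrast between a sign-extending load and a zero-extending load can be modeled on a byte buffer. This is a sketch under stated assumptions: `load_sx` and `load_zx` are hypothetical names, memory is assumed little-endian, and the destination register is assumed to be 64 bits wide.

```python
REG_MASK = (1 << 64) - 1  # assumed 64-bit destination register


def _read(memory: bytes, offset: int, size: int) -> bytes:
    """Fetch `size` bytes, refusing reads that run past the buffer."""
    raw = memory[offset:offset + size]
    if len(raw) != size:
        raise IndexError("load outside of buffer")
    return raw


def load_sx(memory: bytes, offset: int, size: int) -> int:
    """Load `size` bytes and sign-extend them to the register width."""
    value = int.from_bytes(_read(memory, offset, size), "little", signed=True)
    return value & REG_MASK


def load_zx(memory: bytes, offset: int, size: int) -> int:
    """Load `size` bytes and zero-extend them to the register width."""
    return int.from_bytes(_read(memory, offset, size), "little", signed=False)


buffer = bytes([0x80, 0xFF, 0xFF, 0x7F])
sign_extended = load_sx(buffer, 0, 1)
zero_extended = load_zx(buffer, 0, 1)
```

The two loads differ only when the top bit of the loaded field is set, as it is for the first byte of `buffer`.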
    • LoongArch: Add more instruction opcodes and emit_* helpers · add28024
      Hengqi Chen authored
      This patch adds more instruction opcodes and their corresponding emit_*
      helpers which will be used in later patches.
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch/smp: Call rcutree_report_cpu_starting() earlier · a2ccf463
      Huacai Chen authored
      rcutree_report_cpu_starting() must be called before cpu_probe() to avoid
      the following lockdep splat, which is triggered by calling __alloc_pages()
      when CONFIG_PROVE_RCU_LIST=y:
      
       =============================
       WARNING: suspicious RCU usage
       6.6.0+ #980 Not tainted
       -----------------------------
       kernel/locking/lockdep.c:3761 RCU-list traversed in non-reader section!!
       other info that might help us debug this:
       RCU used illegally from offline CPU!
       rcu_scheduler_active = 1, debug_locks = 1
       1 lock held by swapper/1/0:
        #0: 900000000c82ef98 (&pcp->lock){+.+.}-{2:2}, at: get_page_from_freelist+0x894/0x1790
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.0+ #980
       Stack : 0000000000000001 9000000004f79508 9000000004893670 9000000100310000
               90000001003137d0 0000000000000000 90000001003137d8 9000000004f79508
               0000000000000000 0000000000000001 0000000000000000 90000000048a3384
               203a656d616e2065 ca43677b3687e616 90000001002c3480 0000000000000008
               000000000000009d 0000000000000000 0000000000000001 80000000ffffe0b8
               000000000000000d 0000000000000033 0000000007ec0000 13bbf50562dad831
               9000000005140748 0000000000000000 9000000004f79508 0000000000000004
               0000000000000000 9000000005140748 90000001002bad40 0000000000000000
               90000001002ba400 0000000000000000 9000000003573ec8 0000000000000000
               00000000000000b0 0000000000000004 0000000000000000 0000000000070000
               ...
       Call Trace:
       [<9000000003573ec8>] show_stack+0x38/0x150
       [<9000000004893670>] dump_stack_lvl+0x74/0xa8
       [<900000000360d2bc>] lockdep_rcu_suspicious+0x14c/0x190
       [<900000000361235c>] __lock_acquire+0xd0c/0x2740
       [<90000000036146f4>] lock_acquire+0x104/0x2c0
       [<90000000048a955c>] _raw_spin_lock_irqsave+0x5c/0x90
       [<900000000381cd5c>] rmqueue_bulk+0x6c/0x950
       [<900000000381fc0c>] get_page_from_freelist+0xd4c/0x1790
       [<9000000003821c6c>] __alloc_pages+0x1bc/0x3e0
       [<9000000003583b40>] tlb_init+0x150/0x2a0
       [<90000000035742a0>] per_cpu_trap_init+0xf0/0x110
       [<90000000035712fc>] cpu_probe+0x3dc/0x7a0
       [<900000000357ed20>] start_secondary+0x40/0xb0
       [<9000000004897138>] smpboot_entry+0x54/0x58
      
      raw_smp_processor_id() is required in order to avoid calling into lockdep
      before RCU has declared the CPU to be watched for readers.
      
      See also commit 29368e09 ("x86/smpboot: Move rcu_cpu_starting() earlier"),
      commit de5d9dae ("s390/smp: move rcu_cpu_starting() earlier") and commit
      99f070b6 ("powerpc/smp: Call rcu_cpu_starting() earlier").
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Relax memory ordering for atomic operations · affef66b
      WANG Rui authored
      This patch relaxes the implementation while satisfying the memory ordering
      requirements for atomic operations, which will help improve performance on
      LA664+.
      
      Unixbench with full threads (8)
                                                 before       after      change
        Dhrystone 2 using register variables   203910714.2  203909539.8   0.00%
        Double-Precision Whetstone                 37930.9        37931   0.00%
        Execl Throughput                           29431.5      29545.8   0.39%
        File Copy 1024 bufsize 2000 maxblocks    6645759.5      6676320   0.46%
        File Copy 256 bufsize 500 maxblocks      2138772.4    2144182.4   0.25%
        File Copy 4096 bufsize 8000 maxblocks   11640698.4     11602703  -0.33%
        Pipe Throughput                          8849077.7    8917009.4   0.77%
        Pipe-based Context Switching             1255108.5    1287277.3   2.56%
        Process Creation                           50825.9      50442.1  -0.76%
        Shell Scripts (1 concurrent)               25795.8      25942.3   0.57%
        Shell Scripts (8 concurrent)                3812.6       3835.2   0.59%
        System Call Overhead                     9248212.6    9353348.6   1.14%
                                                                        =======
        System Benchmarks Index Score               8076.6       8114.4   0.47%
      Signed-off-by: WANG Rui <wangrui@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
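The percentage column in the Unixbench table appears to be the relative change from the "before" score to the "after" score. A minimal sketch that recomputes a few rows, assuming that definition (the function name `percent_change` is hypothetical; the scores are copied from the table):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (benchmark, before, after) rows copied from the commit message
rows = [
    ("Pipe-based Context Switching", 1255108.5, 1287277.3),
    ("Process Creation", 50825.9, 50442.1),
    ("System Benchmarks Index Score", 8076.6, 8114.4),
]

changes = {name: round(percent_change(b, a), 2) for name, b, a in rows}
```

Higher scores are better in Unixbench, so a positive change is an improvement.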
    • LoongArch: Mark __percpu functions as always inline · 71945968
      Nathan Chancellor authored
      A recent change to the optimization pipeline in LLVM reveals some
      fragility around the inlining of LoongArch's __percpu functions, which
      manifests as a BUILD_BUG() failure:
      
        In file included from kernel/sched/build_policy.c:17:
        In file included from include/linux/sched/cputime.h:5:
        In file included from include/linux/sched/signal.h:5:
        In file included from include/linux/rculist.h:11:
        In file included from include/linux/rcupdate.h:26:
        In file included from include/linux/irqflags.h:18:
        arch/loongarch/include/asm/percpu.h:97:3: error: call to '__compiletime_assert_51' declared with 'error' attribute: BUILD_BUG failed
           97 |                 BUILD_BUG();
              |                 ^
        include/linux/build_bug.h:59:21: note: expanded from macro 'BUILD_BUG'
           59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
              |                     ^
        include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
           39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
              |                                     ^
        include/linux/compiler_types.h:425:2: note: expanded from macro 'compiletime_assert'
          425 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
              |         ^
        include/linux/compiler_types.h:413:2: note: expanded from macro '_compiletime_assert'
          413 |         __compiletime_assert(condition, msg, prefix, suffix)
              |         ^
        include/linux/compiler_types.h:406:4: note: expanded from macro '__compiletime_assert'
          406 |                         prefix ## suffix();                             \
              |                         ^
        <scratch space>:86:1: note: expanded from here
           86 | __compiletime_assert_51
              | ^
        1 error generated.
      
      If these functions are not inlined (which the compiler is free to do
      even with functions marked with the standard 'inline' keyword), the
      BUILD_BUG() in the default case cannot be eliminated since the compiler
      cannot prove it is never used, resulting in a build failure due to the
      error attribute.
      
      Mark these functions as __always_inline to guarantee inlining so that
      the BUILD_BUG() only triggers when the default case genuinely cannot be
      eliminated due to an unexpected size.
      
      Cc: <stable@vger.kernel.org>
      Closes: https://github.com/ClangBuiltLinux/linux/issues/1955
      Fixes: 46859ac8 ("LoongArch: Add multi-processor (SMP) support")
      Link: https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Disable module from accessing external data directly · 21eb2bfe
      WANG Rui authored
      Modules are loaded too far from vmlinux for PC-relative addressing to
      reach external data directly, so such accesses must go through the GOT.

      When compiling modules with GCC, the option `-mdirect-extern-access` is
      disabled by default. The Clang option `-fdirect-access-external-data` is
      enabled by default, so it needs to be explicitly disabled.
      Signed-off-by: WANG Rui <wangrui@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Support PREEMPT_DYNAMIC with static keys · 80c7889d
      Huacai Chen authored
      Since commit 4e90d052 ("riscv: support PREEMPT_DYNAMIC with
      static keys"), the infrastructure is complete and we can simply select
      HAVE_PREEMPT_DYNAMIC_KEY to enable PREEMPT_DYNAMIC on LoongArch because
      we already support static keys.
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
  2. 01 Nov, 2023 1 commit
  3. 30 Oct, 2023 1 commit
  4. 28 Oct, 2023 15 commits
  5. 27 Oct, 2023 12 commits