1. 24 Jul, 2023 7 commits
    • s390/hypfs: factor out filesystem code · 3325b4d8
      Heiko Carstens authored
      The s390_hypfs filesystem is deprecated and shouldn't be used due to its
      rather odd semantics. It creates a whole directory structure with static
      file contents so a user can read a consistent state while within that
      directory.
      Writing to its update attribute will remove and rebuild nearly the whole
      filesystem, so that again a user can read a consistent state, even if
      multiple files need to be read.
      
      Given that this wastes a lot of CPU cycles, and involves a lot of code,
      binary interfaces have been added quite a couple of years ago, which simply
      pass the binary data to user space, and let user space decode the data.
      This is the preferred and only way to retrieve the data.
      
      The assumption is that there are no users of the s390_hypfs filesystem.
      However instead of just removing the code, and having to revert in case
      there are actually users, factor the filesystem code out and make it only
      available via a new config option.
      
      This config option is supposed to be disabled. If it turns out there are no
      complaints the filesystem code can be removed probably in a couple of
      years.
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/hypfs: remove open-coded PTR_ALIGN() · b7857acc
      Heiko Carstens authored
      Get rid of page_align_ptr() and use PTR_ALIGN() instead.
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
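As an illustration of what replacing an open-coded helper with PTR_ALIGN() amounts to, here is a minimal userspace sketch. The names `ptr_align_up`, `page_align_ptr_sketch` and `SKETCH_PAGE_SIZE` are hypothetical, and the 4 KiB page size is an assumption; this is not the kernel macro itself, only the same round-up arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/*
 * Hypothetical userspace stand-in for PTR_ALIGN(): round a pointer up
 * to the next multiple of `align`, which must be a power of two.
 */
static inline void *ptr_align_up(void *p, uintptr_t align)
{
    uintptr_t addr = (uintptr_t)p;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((addr + align - 1) & ~(align - 1));
}

/* What an open-coded page alignment helper boils down to. */
static inline void *page_align_ptr_sketch(void *p)
{
    return ptr_align_up(p, SKETCH_PAGE_SIZE);
}
```

An already aligned pointer is returned unchanged; any other pointer moves up to the next boundary.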
    • s390/hypfs: simplify memory allocation · 83f95671
      Heiko Carstens authored
      Simplify memory allocation for diagnose 204 memory buffer:
      
      - allocate with __vmalloc_node() to ensure page alignment
      - allocate real / physical memory area also within vmalloc area and handle
        vmalloc to real / physical address translation within diag204().
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Reviewed-by: Mete Durlu <meted@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/zcrypt: remove CEX2 and CEX3 device drivers · 5ac8c724
      Harald Freudenberger authored
      Remove the legacy device driver code for CEX2 and CEX3 cards.
      
      The last machines which are able to handle CEX2 crypto cards
      are z10 EC first available 2008 and z10 BC first available 2009.
      The last machines able to handle a CEX3 crypto card are
      z196 first available 2010 and z114 first available 2011.
      
      Please note that this does not mean CEX2 and CEX3 support is
      dropped in general. With older kernels on hardware up to the
      aforementioned machine models, these crypto cards will still
      get support from IBM.
      
      The removal of the CEX2 and CEX3 device drivers code opens up
      some simplifications, for example support for crypto cards
      without rng support can be removed also.
      Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/sthyi: enforce 4k alignment of vmalloc'ed area · 86e74965
      Heiko Carstens authored
      vmalloc() does not guarantee any alignment, unless it is explicitly
      requested with e.g. __vmalloc_node(). Using diag204() with subcode 7
      requires a 4k aligned virtual buffer. Therefore switch to __vmalloc_node().
      
      Note: with the current vmalloc() implementation callers would still get a
      4k aligned area, even though this is quite non-obvious looking at the
      code. So changing this in sthyi doesn't fix a real bug. It is just to make
      sure the code will not suffer from some obscure options, like it happened
      in the past with kmalloc() where debug options changed the assumed
      alignment of allocated memory areas.
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/diag: handle diag 204 subcode 4 address correctly · c83cd4fe
      Heiko Carstens authored
      Diagnose 204 subcode 4 requires a real (physical) address, but a
      virtual address is passed to the inline assembly.
      
      Convert the address to a physical address for only this specific case.
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390: add support for user-defined certificates · 8cf57d72
      Anastasia Eskova authored
      Enable receiving the user-defined certificates from the s390x
      hypervisor via new diagnose 0x320 calls, and make them available to the
      Linux root user as 'cert_store_key' type keys in a so-called
      'cert_store' keyring.
      
      New user-space interfaces:
      
        /sys/firmware/cert_store/refresh
      
          Writing to this attribute re-fetches certificates via DIAG 0x320
      
        /sys/firmware/cert_store/cs_status
      
          Reading from this attribute returns either of:
      
      	  "uninitialized"
      	    If no certificate has been retrieved yet
      	  "ok"
      	    If certificates have been successfully retrieved
      	  "failed (<number>)"
      	    If certificate retrieval failed with reason code <number>
      
      New debug trace areas:
      
        /sys/kernel/debug/s390dbf/cert_store_msg
      
        /sys/kernel/debug/s390dbf/cert_store_hexdump
      
      Usage example:
      
      To request the certificates available to the system, as root:
      
        $ echo 1 > /sys/firmware/cert_store/refresh
      
      Upon success the '/sys/firmware/cert_store/cs_status' contains
      the value 'ok'.
      
        $ cat /sys/firmware/cert_store/cs_status
        ok
      
      Get the ID of the keyring 'cert_store':
      
        $ keyctl search @us keyring cert_store
      OR
        $ keyctl link @us @s; keyctl request keyring cert_store
      
      Obtain list of IDs of certificates:
      
        $ keyctl rlist <cert_store keyring ID>
      
      Display certificate content as hex-dump:
      
        $ keyctl read <certificate ID>
      
      Read certificate contents as binary data:
      
        $ keyctl pipe <certificate ID> >cert_data
      
      Display certificate description:
      
        $ keyctl describe <certificate ID>
      
      The certificate description has the following format:
      
        <64 bytes certificate name in EBCDIC> ':'
        <certificate index as obtained from hypervisor> ':'
        <certificate store token obtained from hypervisor>
      
      The certificate description in /proc/keys has certificate name
      represented in ASCII.
      
      Users can read but cannot update the content of the certificate.
      Signed-off-by: Anastasia Eskova <anastasia.eskova@ibm.com>
      Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
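The three documented values of the cs_status attribute lend themselves to a small user-space check. The following is a sketch only: the enum, the function names and the assumption that the reason code is printed in decimal are all hypothetical and not taken from the commit; only the sysfs path and the three status strings come from the text above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of the documented cs_status values. */
enum cs_state { CS_UNINITIALIZED, CS_OK, CS_FAILED, CS_UNKNOWN };

/*
 * Parse one line as read from /sys/firmware/cert_store/cs_status.
 * On CS_FAILED, *reason (if non-NULL) receives the reason code.
 * Assumption: the reason code is a decimal number.
 */
static enum cs_state parse_cs_status(const char *line, unsigned int *reason)
{
    char buf[64];
    unsigned int code;
    size_t len;

    assert(line != NULL);
    len = strcspn(line, "\r\n");        /* ignore the trailing newline */
    if (len >= sizeof(buf))
        return CS_UNKNOWN;
    memcpy(buf, line, len);
    buf[len] = '\0';

    if (strcmp(buf, "uninitialized") == 0)
        return CS_UNINITIALIZED;
    if (strcmp(buf, "ok") == 0)
        return CS_OK;
    if (sscanf(buf, "failed (%u)", &code) == 1) {
        if (reason)
            *reason = code;
        return CS_FAILED;
    }
    return CS_UNKNOWN;
}

/* Read and classify the attribute; CS_UNKNOWN if it cannot be read. */
static enum cs_state read_cs_status(const char *path, unsigned int *reason)
{
    char line[64];
    enum cs_state state = CS_UNKNOWN;
    FILE *f = fopen(path, "r");

    if (!f)
        return CS_UNKNOWN;
    if (fgets(line, sizeof(line), f))
        state = parse_cs_status(line, reason);
    fclose(f);
    return state;
}
```

A caller would typically write to the refresh attribute first, then call `read_cs_status("/sys/firmware/cert_store/cs_status", &reason)` and continue with keyctl only on `CS_OK`.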
  2. 23 Jul, 2023 18 commits
    • Linux 6.5-rc3 · 6eaae198
      Linus Torvalds authored
    • Merge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 3b4e48b8
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Swapping the ring buffer for snapshotting (for things like irqsoff)
         can crash if the ring buffer is being resized. Disable swapping when
         this happens. The missed swap will be reported to the tracer
      
       - Report error if the histogram fails to be created due to an error in
         adding a histogram variable, in event_hist_trigger_parse()
      
       - Remove unused declaration of tracing_map_set_field_descr()
      
      * tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing/histograms: Return an error if we fail to add histogram to hist_vars list
        ring-buffer: Do not swap cpu_buffer during resize process
        tracing: Remove unused extern declaration tracing_map_set_field_descr()
    • Merge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 12a5336c
      Linus Torvalds authored
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix stale help text in gconfig
      
       - Support *.S files in compile_commands.json
      
       - Flatten KBUILD_CFLAGS
      
       - Fix external module builds with Rust so that temporary files are
         created in the modules directories instead of the kernel tree
      
      * tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: rust: avoid creating temporary files
        kbuild: flatten KBUILD_CFLAGS
        gen_compile_commands: add assembly files to compilation database
        kconfig: gconfig: correct program name in help text
        kconfig: gconfig: drop the Show Debug Info help text
    • kbuild: rust: avoid creating temporary files · df01b7cf
      Miguel Ojeda authored
      `rustc` outputs by default the temporary files (i.e. the ones saved
      by `-Csave-temps`, such as `*.rcgu*` files) in the current working
      directory when `-o` and `--out-dir` are not given (even if
      `--emit=x=path` is given, i.e. it does not use those for temporaries).
      
      Since out-of-tree modules are compiled from the `linux` tree,
      `rustc` then tries to create them there, which may not be accessible.
      
      Thus pass `--out-dir` explicitly, even if it is just for the temporary
      files.
      
      Similarly, do so for Rust host programs too.
      Reported-by: Raphael Nestler <raphael.nestler@gmail.com>
      Closes: https://github.com/Rust-for-Linux/linux/issues/1015
      Reported-by: Andrea Righi <andrea.righi@canonical.com>
      Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs
      Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs
      Fixes: 295d8398 ("kbuild: specify output names separately for each emission type from rustc")
      Cc: stable@vger.kernel.org
      Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
      Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 269f4a4b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Avoid pKVM finalization if KVM initialization fails
      
         - Add missing BTI instructions in the hypervisor, fixing an early
           boot failure on BTI systems
      
         - Handle MMU notifiers correctly for non hugepage-aligned memslots
      
         - Work around a bug in the architecture where hypervisor timer
           controls have UNKNOWN behavior under nested virt
      
         - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel
           BUG in cpu hotplug resulting from per-CPU accessor sanity checking
      
         - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
           consistently requesting a doorbell interrupt on vcpu_put()
      
         - Uphold RES0 sysreg behavior when emulating older PMU versions
      
         - Avoid macro expansion when initializing PMU register names,
           ensuring the tracepoints pretty-print the sysreg
      
        s390:
      
         - Two fixes for asynchronous destroy
      
        x86 fixes will come early next week"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: pv: fix index value of replaced ASCE
        KVM: s390: pv: simplify shutdown and fix race
        KVM: arm64: Fix the name of sys_reg_desc related to PMU
        KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount
        KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
        KVM: arm64: Add missing BTI instructions
        KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
        KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
        KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm
        KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
    • Merge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 15b593ba
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's
        checkpoint code"
      
      * tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
        ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
        ext4: correct inline offset when handling xattrs in inode body
        jbd2: remove __journal_try_to_free_buffer()
        jbd2: fix a race when checking checkpoint buffer busy
        jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
        jbd2: remove journal_clean_one_cp_list()
        jbd2: remove t_checkpoint_io_list
        jbd2: recheck chechpointing non-dirty buffer
    • Merge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6 · 8266f53b
      Linus Torvalds authored
      Pull smb client fix from Steve French:
       "Add minor debugging improvement.
      
        The change improves ability to read a network trace to debug problems
        on encrypted connections which are very common (e.g. using wireshark
        or tcpdump).
      
        That works today with tools like 'smbinfo keys /mnt/file' but requires
        passing in a filename on the mount (see e.g. [1]), but it often makes
        more sense to just pass in the mount point path (ie a directory not a
        filename).
      
        So this fix was needed to debug some types of problems (an obvious
        example is on an encrypted connection failing operations on an empty
        share or with no files in the root of the directory) - so you can
        simply pass in the 'smbinfo keys <mntpoint>' and get the information
        that wireshark needs"
      
      Link: https://wiki.samba.org/index.php/Wireshark_Decryption [1]
      
      * tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: update internal module version number for cifs.ko
        cifs: allow dumping keys for directories too
    • Merge tag 'kvm-s390-master-6.5-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD · 0c189708
      Paolo Bonzini authored
      
      Two fixes for asynchronous destroy
    • Merge tag 'kvmarm-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD · 675a15f4
      Paolo Bonzini authored
      
      KVM/arm64 fixes for 6.5, part #1
      
       - Avoid pKVM finalization if KVM initialization fails
      
       - Add missing BTI instructions in the hypervisor, fixing an early boot
         failure on BTI systems
      
       - Handle MMU notifiers correctly for non hugepage-aligned memslots
      
       - Work around a bug in the architecture where hypervisor timer controls
         have UNKNOWN behavior under nested virt.
      
       - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel BUG
         in cpu hotplug resulting from per-CPU accessor sanity checking.
      
       - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
         consistently requesting a doorbell interrupt on vcpu_put()
      
       - Uphold RES0 sysreg behavior when emulating older PMU versions
      
       - Avoid macro expansion when initializing PMU register names, ensuring
         the tracepoints pretty-print the sysreg.
    • tracing/histograms: Return an error if we fail to add histogram to hist_vars list · 4b8b3905
      Mohamed Khalfella authored
      Commit 6018b585 ("tracing/histograms: Add histograms to hist_vars if
      they have referenced variables") added a check to fail histogram creation
      if save_hist_vars() failed to add the histogram to the hist_vars list.
      However, that commit did not set ret to an error code before jumping to
      the histogram unregister path. Fix it.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella@purestorage.com
      
      Cc: stable@vger.kernel.org
      Fixes: 6018b585 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables")
      Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
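The bug class described in this commit, jumping to a cleanup label without first setting the return value, can be shown in isolation. This is a hedged sketch with hypothetical names (`sketch_save_vars`, `sketch_create_buggy`, `sketch_create_fixed`); it is not the code of event_hist_trigger_parse(), only the same control-flow shape.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for save_hist_vars(): fails on request. */
static int sketch_save_vars(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/* Counts how often the error path ran, so callers can observe it. */
static int sketch_unregister_calls;

/* Buggy shape: the error path runs, but ret is still 0 (success). */
static int sketch_create_buggy(int should_fail)
{
    int ret = 0;

    if (sketch_save_vars(should_fail))
        goto out_unreg;
    return ret;

out_unreg:
    sketch_unregister_calls++;
    return ret; /* reports success although creation failed */
}

/* Fixed shape: capture the error code before jumping. */
static int sketch_create_fixed(int should_fail)
{
    int ret;

    ret = sketch_save_vars(should_fail);
    if (ret)
        goto out_unreg;
    return 0;

out_unreg:
    sketch_unregister_calls++;
    assert(ret < 0); /* the error path must never report success */
    return ret;
}
```

With the buggy shape the caller sees 0 and assumes the histogram exists; with the fixed shape the failure code reaches the caller.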
    • ring-buffer: Do not swap cpu_buffer during resize process · 8a96c028
      Chen Lin authored
      When ring_buffer_swap_cpu() is called during a resize, the cpu buffer
      is swapped in the middle of the resize, leaving it in an inconsistent
      state. Continuing to run in that state results in an oops.
      
      This issue can be easily reproduced using the following two scripts:
      /tmp # cat test1.sh
      #!/bin/sh
      for i in `seq 0 100000`
      do
               echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
               sleep 0.5
               echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
               sleep 0.5
      done
      /tmp # cat test2.sh
      #!/bin/sh
      for i in `seq 0 100000`
      do
              echo irqsoff > /sys/kernel/debug/tracing/current_tracer
              sleep 1
              echo nop > /sys/kernel/debug/tracing/current_tracer
              sleep 1
      done
      /tmp # ./test1.sh &
      /tmp # ./test2.sh &
      
      A typical oops log is as follows, sometimes with other different oops logs.
      
      [  231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8
      [  231.713375] Modules linked in:
      [  231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec2 #15
      [  231.716750] Hardware name: linux,dummy-virt (DT)
      [  231.718152] Workqueue: events update_pages_handler
      [  231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  231.721171] pc : rb_update_pages+0x378/0x3f8
      [  231.722212] lr : rb_update_pages+0x25c/0x3f8
      [  231.723248] sp : ffff800082b9bd50
      [  231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
      [  231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0
      [  231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a
      [  231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000
      [  231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510
      [  231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
      [  231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558
      [  231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001
      [  231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000
      [  231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208
      [  231.744196] Call trace:
      [  231.744892]  rb_update_pages+0x378/0x3f8
      [  231.745893]  update_pages_handler+0x1c/0x38
      [  231.746893]  process_one_work+0x1f0/0x468
      [  231.747852]  worker_thread+0x54/0x410
      [  231.748737]  kthread+0x124/0x138
      [  231.749549]  ret_from_fork+0x10/0x20
      [  231.750434] ---[ end trace 0000000000000000 ]---
      [  233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [  233.721696] Mem abort info:
      [  233.721935]   ESR = 0x0000000096000004
      [  233.722283]   EC = 0x25: DABT (current EL), IL = 32 bits
      [  233.722596]   SET = 0, FnV = 0
      [  233.722805]   EA = 0, S1PTW = 0
      [  233.723026]   FSC = 0x04: level 0 translation fault
      [  233.723458] Data abort info:
      [  233.723734]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
      [  233.724176]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      [  233.724589]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
      [  233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000
      [  233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
      [  233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
      [  233.726720] Modules linked in:
      [  233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec2 #15
      [  233.727777] Hardware name: linux,dummy-virt (DT)
      [  233.728225] Workqueue: events update_pages_handler
      [  233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  233.729054] pc : rb_update_pages+0x1a8/0x3f8
      [  233.729334] lr : rb_update_pages+0x154/0x3f8
      [  233.729592] sp : ffff800082b9bd50
      [  233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
      [  233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418
      [  233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003
      [  233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58
      [  233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001
      [  233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000
      [  233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c
      [  233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0
      [  233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000
      [  233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000
      [  233.734418] Call trace:
      [  233.734593]  rb_update_pages+0x1a8/0x3f8
      [  233.734853]  update_pages_handler+0x1c/0x38
      [  233.735148]  process_one_work+0x1f0/0x468
      [  233.735525]  worker_thread+0x54/0x410
      [  233.735852]  kthread+0x124/0x138
      [  233.736064]  ret_from_fork+0x10/0x20
      [  233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060)
      [  233.736959] ---[ end trace 0000000000000000 ]---
      
      After analysis, the sequence of events leading to the error is as follows [1-5]:
      
      int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
      			int cpu_id)
      {
      	for_each_buffer_cpu(buffer, cpu) {
      		cpu_buffer = buffer->buffers[cpu];
      		//1. get cpu_buffer, aka cpu_buffer(A)
      		...
      		...
      		schedule_work_on(cpu,
      		 &cpu_buffer->update_pages_work);
      		//2. 'update_pages_work' is queued on 'cpu'; cpu_buffer(A) is passed to
      		// update_pages_handler, which does the update and then calls
      		// complete(&cpu_buffer->update_done) to wake up the resize process.
      	//---->
      		//3. Just at this moment, ring_buffer_swap_cpu is triggered and
      		//cpu_buffer(A) is swapped with cpu_buffer(B), the max_buffer.
      		//ring_buffer_swap_cpu is called as shown in the 'Call trace' below.
      
      		Call trace:
      		 dump_backtrace+0x0/0x2f8
      		 show_stack+0x18/0x28
      		 dump_stack+0x12c/0x188
      		 ring_buffer_swap_cpu+0x2f8/0x328
      		 update_max_tr_single+0x180/0x210
      		 check_critical_timing+0x2b4/0x2c8
      		 tracer_hardirqs_on+0x1c0/0x200
      		 trace_hardirqs_on+0xec/0x378
      		 el0_svc_common+0x64/0x260
      		 do_el0_svc+0x90/0xf8
      		 el0_svc+0x20/0x30
      		 el0_sync_handler+0xb0/0xb8
      		 el0_sync+0x180/0x1c0
      	//<----
      
      	/* wait for all the updates to complete */
      	for_each_buffer_cpu(buffer, cpu) {
      		cpu_buffer = buffer->buffers[cpu];
      		//4. get cpu_buffer; cpu_buffer(B) is used from here on, so the
      		//state of cpu_buffer(A) and cpu_buffer(B) is now inconsistent.
      		//For example, cpu_buffer(A)->update_done is left set to 1, so the
      		//next resize round will not block in 'wait_for_completion'.
      		if (!cpu_buffer->nr_pages_to_update)
      			continue;
      
      		if (cpu_online(cpu))
      			wait_for_completion(&cpu_buffer->update_done);
      		cpu_buffer->nr_pages_to_update = 0;
      	}
      	...
      }
      	//5. With cpu_buffer(A) and cpu_buffer(B) in an inconsistent state,
      	//continuing to run eventually triggers the oops.
      
      Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn
      Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
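One way to picture "do not swap during resize" is a counter that marks a resize in flight, with the swap path refusing to proceed while it is non-zero. The sketch below is a single-threaded userspace illustration under that assumption; `struct sketch_buffer`, the `resizing` field and all function names are hypothetical and are not claimed to match the kernel patch. In particular, the check and the swap are not atomic as a pair here, so this shows only the refusal, not a complete synchronization scheme.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for a trace buffer. */
struct sketch_buffer {
    atomic_int resizing;  /* non-zero while a resize is in flight */
    int *cpu_buffer;      /* stands in for buffers[cpu] */
};

static void sketch_resize_begin(struct sketch_buffer *b)
{
    atomic_fetch_add(&b->resizing, 1);
}

static void sketch_resize_end(struct sketch_buffer *b)
{
    atomic_fetch_sub(&b->resizing, 1);
}

/*
 * Swap the per-cpu buffers of a and b, unless either side is being
 * resized. A refused swap is reported with -EBUSY so the caller
 * (the tracer, in the commit's terms) can note the missed snapshot.
 */
static int sketch_swap_cpu(struct sketch_buffer *a, struct sketch_buffer *b)
{
    int *tmp;

    assert(a != NULL && b != NULL);
    if (atomic_load(&a->resizing) || atomic_load(&b->resizing))
        return -EBUSY;

    tmp = a->cpu_buffer;
    a->cpu_buffer = b->cpu_buffer;
    b->cpu_buffer = tmp;
    return 0;
}
```

The design choice is to drop a snapshot rather than block: a tracer such as irqsoff calls the swap from a context where waiting for the resize to finish is not an option.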
    • tracing: Remove unused extern declaration tracing_map_set_field_descr() · 1faf7e4a
      YueHaibing authored
      Since commit 08d43a5f ("tracing: Add lock-free tracing_map"),
      this is never used, so can be removed.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230722032123.24664-1-yuehaibing@huawei.com
      
      Cc: <mhiramat@kernel.org>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • kbuild: flatten KBUILD_CFLAGS · 0817d259
      Alexey Dobriyan authored
      Make it slightly easier to see which compiler options are added and
      removed (and not worry about column limit too!).
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Nicolas Schier <n.schier@avm.de>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • gen_compile_commands: add assembly files to compilation database · 1c679214
      Benjamin Gray authored
      Like C source files, tooling can find it useful to have the assembly
      source file compilation recorded.
      
      The .S extension appears to be used across all architectures.
      Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
      Reviewed-by: Fangrui Song <maskray@google.com>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • ext4: fix rbtree traversal bug in ext4_mb_use_preallocated · 9d3de7ee
      Ojaswin Mujoo authored
      During allocations, while looking for preallocations (PAs) in the per
      inode rbtree, we can't do a direct traversal of the tree because
      ext4_mb_discard_group_preallocation() can concurrently mark a PA as
      deleted, and that can cause a direct traversal to skip some entries. This
      was leading to a BUG_ON() being hit [1] when we missed a PA that could
      satisfy our request and ultimately tried to create a new PA that would
      overlap with the missed one.
      
      To make sure we handle that case while still keeping the performance of
      the rbtree, we make use of the fact that the only PA that could possibly
      overlap the original goal start is the one that satisfies the below
      conditions:
      
        1. It must have its logical start immediately to the left of
        (ie less than) original logical start.
      
        2. It must not be deleted
      
      To find this pa we use the following traversal method:
      
      1. Descend into the rbtree normally to find the immediate neighboring
      PA. Here we keep descending irrespective of if the PA is deleted or if
      it overlaps with our request etc. The goal is to find an immediately
      adjacent PA.
      
      2. If the found PA is on right of original goal, use rb_prev() to find
      the left adjacent PA.
      
      3. Check if this PA is deleted and keep moving left with rb_prev() until
      a non deleted PA is found.
      
      4. This is the PA we are looking for. Now we can check if it can satisfy
      the original request and proceed accordingly.
      
      This approach also takes care of having deleted PAs in the tree.
      
      (While we are at it, also fix a possible overflow bug in calculating the
      end of a PA)
      
      [1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/
      
      Cc: stable@kernel.org # 6.4
      Fixes: 38727786 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
      Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
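The four traversal steps in this commit message can be sketched on a simpler structure. The example below substitutes a sorted array for the rbtree (binary search plays the role of the descent, stepping the index left plays the role of rb_prev()); that substitution, the struct layout and all names are assumptions made for illustration. "Left of" is taken here to include a PA that starts exactly at the goal, which is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced preallocation record. */
struct sketch_pa {
    unsigned long lstart;  /* logical start block */
    unsigned long len;     /* length in blocks */
    bool deleted;          /* may be set concurrently in the real code */
};

/*
 * pas[] is sorted by lstart, ascending. Return the index of the nearest
 * non-deleted PA whose lstart is at or left of `goal`, or -1 if none.
 */
static long find_left_live_pa(const struct sketch_pa *pas, size_t n,
                              unsigned long goal)
{
    size_t lo = 0, hi = n;
    long i;

    assert(pas != NULL || n == 0);

    /* Steps 1 and 2: locate the left neighbour, ignoring `deleted`. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (pas[mid].lstart <= goal)
            lo = mid + 1;
        else
            hi = mid;
    }
    i = (long)lo - 1;

    /* Step 3: keep moving left past deleted entries. */
    while (i >= 0 && pas[i].deleted)
        i--;

    /* Step 4: the caller checks whether this PA satisfies the request. */
    return i;
}

/*
 * Does `pa` cover `goal`? Comparing offsets avoids computing
 * lstart + len, which could overflow.
 */
static bool pa_covers(const struct sketch_pa *pa, unsigned long goal)
{
    return goal >= pa->lstart && goal - pa->lstart < pa->len;
}
```

The point of skipping deleted entries only after the neighbour has been found is that deletion never changes which entry is adjacent to the goal, so no live candidate can be missed.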
    • ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail() · 5d5460fa
      Ojaswin Mujoo authored
      In ext4_mb_choose_next_group_best_avail(), we want the start order to be
      1 less than goal length and the min_order to be, at max, 1 more than the
      original length. This commit fixes an off by one issue that arose due to
      the fact that 1 << fls(n) > (n).
      
      After all the processing:
      
      order = 1 order below goal len
      min_order = maximum of the three:-
                   - order - trim_order
                   - 1 order below B2C(s_stripe)
                   - 1 order above original len
      
      Cc: stable@kernel.org
      Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)")
      Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
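The inequality 1 << fls(n) > n that this commit message refers to is easy to check with a worked example. The sketch below uses a portable loop as a stand-in for the kernel's fls() (1-based index of the most significant set bit, 0 for an input of 0); `sketch_fls` and `order_at_or_below` are hypothetical names, and the example stays away from inputs with the top bit set, where `1u << fls(n)` would overflow.

```c
#include <assert.h>

/*
 * Portable stand-in for fls(): 1-based position of the most
 * significant set bit, or 0 when n is 0.
 */
static int sketch_fls(unsigned int n)
{
    int bit = 0;

    while (n) {
        bit++;
        n >>= 1;
    }
    return bit;
}

/*
 * Largest order such that (1u << order) <= n. Because
 * 1u << sketch_fls(n) is always strictly greater than n, using
 * sketch_fls(n) directly as an order overshoots by one.
 */
static int order_at_or_below(unsigned int n)
{
    assert(n > 0);
    return sketch_fls(n) - 1;
}
```

For n = 8, `sketch_fls(8)` is 4 and `1u << 4` is 16, which is greater than 8, while `order_at_or_below(8)` is 3 and `1u << 3` is exactly 8.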
    • ext4: correct inline offset when handling xattrs in inode body · 6909cf5c
      Eric Whitney authored
      When run on a file system where the inline_data feature has been
      enabled, xfstests generic/269, generic/270, and generic/476 cause ext4
      to emit error messages indicating that inline directory entries are
      corrupted.  This occurs because the inline offset used to locate
      inline directory entries in the inode body is not updated when an
      xattr in that shared region is deleted and the region is shifted in
      memory to recover the space it occupied.  If the deleted xattr precedes
      the system.data attribute, which points to the inline directory entries,
      that attribute will be moved further up in the region.  The inline
      offset continues to point to whatever is located in system.data's former
      location, with unfortunate effects when used to access directory entries
      or (presumably) inline data in the inode body.
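
The failure mode described above, a cached offset going stale after a region is compacted, can be shown with a deliberately simplified model. This sketch does not reproduce the ext4 in-inode xattr layout; `struct region`, `data_off`, and `region_remove()` are hypothetical names, and the region is just a flat byte buffer of packed records.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified model: a packed byte region plus a cached offset into it. */
struct region {
	unsigned char buf[64];
	size_t used;      /* number of bytes currently in use */
	size_t data_off;  /* cached offset of the record being tracked */
};

/*
 * Remove len bytes starting at off and close the gap by shifting the
 * tail of the region. If the tracked record lay after the removed span,
 * its cached offset must move by the same amount; skipping that update
 * leaves data_off pointing at whatever now occupies the old location.
 */
static void region_remove(struct region *r, size_t off, size_t len)
{
	if (off > r->used || len > r->used - off)
		return; /* out-of-range request: leave the region untouched */

	memmove(r->buf + off, r->buf + off + len, r->used - off - len);
	r->used -= len;

	if (r->data_off > off)
		r->data_off -= len;
}
```

With a 4-byte record at offset 0 and the tracked record at offset 4, removing the first record shifts the tracked one to offset 0, and the cached offset follows it.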
      
      Cc: stable@kernel.org
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      6909cf5c
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · c2782531
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Reinstate support for little endian ELFv1 binaries, which it turns
         out still exist in the wild.
      
       - Revert a change which used asm goto for WARN_ON/__WARN_FLAGS, as it
         led to dead code generation and seemed to trigger compiler bugs in
         some edge cases.
      
       - Fix a deadlock in the pseries VAS code, between live migration and
         the driver's mmap handler.
      
       - Disable KCOV instrumentation in the powerpc KASAN code.
      
      Thanks to Andrew Donnellan, Benjamin Gray, Christophe Leroy, Haren
      Myneni, Russell Currey, and Uwe Kleine-König.
      
      * tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"
        powerpc/kasan: Disable KCOV in KASAN code
        powerpc/512x: lpbfifo: Convert to platform remove callback returning void
        powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files
        Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto"
        powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close
      c2782531
  3. 22 Jul, 2023 8 commits
    • Steve French's avatar
      cifs: update internal module version number for cifs.ko · ba61a03a
      Steve French authored
      From 2.43 to 2.44
      Signed-off-by: Steve French <stfrench@microsoft.com>
      ba61a03a
    • Shyam Prasad N's avatar
      cifs: allow dumping keys for directories too · b3edef6b
      Shyam Prasad N authored
      Dumping the enc/dec keys is a session wide operation.
      And it should not matter if the ioctl was run on
      a regular file or a directory.
      
      Currently, we obtain the tcon pointer from the
      cifs file handle. But since there's no dir open call
      in cifs, this is not populated for dirs.
      
      This change allows dumping of session keys using ioctl
      even for directories. To do this, we'll now get the
      tcon pointer from the superblock, and not from the file
      handle.
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      b3edef6b
    • Linus Torvalds's avatar
      Merge tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 295e1388
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - Fix per vma lock fault handling: add missing !(fault & VM_FAULT_ERROR)
         check to fault handler to prevent error handling for return values
         that don't indicate an error
      
       - Use kfree_sensitive() instead of kfree() in paes crypto code to clear
         memory that may contain keys before freeing it
      
       - Fix reply buffer size calculation for CCA replies in zcrypt device
         driver
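
The idea behind the kfree_sensitive() item, clearing memory that may hold key material before releasing it, can be sketched in userspace C. This is an illustration under stated assumptions, not the paes code: `wipe()` and `free_sensitive_sketch()` are hypothetical names, and unlike the kernel helper, which takes only the pointer, this stand-in needs the length passed explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Zero len bytes at p. The volatile-qualified pointer is used so the
 * stores are not discarded as dead writes ahead of the free().
 */
static void wipe(void *p, size_t len)
{
	volatile unsigned char *v = p;

	while (len--)
		*v++ = 0;
}

/*
 * Hypothetical userspace stand-in: clear the buffer, then free it.
 * A NULL pointer is accepted and ignored, as with free().
 */
static void free_sensitive_sketch(void *p, size_t len)
{
	if (!p)
		return;
	wipe(p, len);
	free(p);
}
```

A plain free() would return the buffer to the allocator with the key bytes still in place, where a later allocation could expose them; wiping first narrows that window.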
      
      * tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/zcrypt: fix reply buffer calculations for CCA replies
        s390/crypto: use kfree_sensitive() instead of kfree()
        s390/mm: fix per vma lock fault handling
      295e1388
    • Linus Torvalds's avatar
      Merge tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux · f036d67c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Fix for loop regressions (Mauricio)
      
       - Fix a potential stall with batched wakeups in sbitmap (David)
      
       - Fix for stall with recursive plug flushes (Ross)
      
       - Skip accounting of empty requests for blk-iocost (Chengming)
      
       - Remove a dead field in struct blk_mq_hw_ctx (Chengming)
      
      * tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux:
        loop: do not enforce max_loop hard limit by (new) default
        loop: deprecate autoloading callback loop_probe()
        sbitmap: fix batching wakeup
        blk-iocost: skip empty flush bio in iocost
        blk-mq: delete dead struct blk_mq_hw_ctx->queued field
        blk-mq: Fix stall due to recursive flush plug
      f036d67c
    • Linus Torvalds's avatar
      Merge tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux · bdd1d82e
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix for io-wq not always honoring REQ_F_NOWAIT, if it was set and
         punted directly (eg via DRAIN) (me)
      
       - Capability check fix (Ondrej)
      
       - Regression fix for the mmap changes that went into 6.4, which
         apparently broke IA64 (Helge)
      
      * tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux:
        ia64: mmap: Consider pgoff when searching for free mapping
        io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()
        io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
        io_uring: don't audit the capability check in io_uring_create()
      bdd1d82e
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 725d444d
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Fix moortec,mr75203 schema usage of 'multipleOf' keyword
      
       - Fix regression in systems depending on "of-display" device name
      
       - Build fix for s390 with CONFIG_PCI=n and OF_EARLY_FLATTREE=y
      
       - Drop two obsolete serial .txt bindings
      
      * tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txt
        dt-bindings: serial: Remove obsolete cavium-uart.txt
        dt-bindings: hwmon: moortec,mr75203: fix multipleOf for coefficients
        of: Preserve "of-display" device name for compatibility
        of: make OF_EARLY_FLATTREE depend on HAS_IOMEM
      725d444d
    • Linus Torvalds's avatar
      Merge tag 'regmap-fix-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 39b14286
      Linus Torvalds authored
      Pull regmap fixes from Mark Brown:
       "Three fixes here:
      
         - The issues with accounting for register and padding length on raw
           buses turn out to be quite widespread in custom buses.
      
           In order to avoid disturbing anything drop the initial fixes and
           fall back to a point fix in the SMBus code where the issue was
           originally noticed, a more substantial refactoring of the API which
           ensures that all buses make the same assumptions will follow.
      
         - The generic regcache code had been forcing on async I/O which did
           not work with the new maple tree sync code when used with SPI.
      
           Since that was mainly for the rbtree cache and the assumptions
           about hardware that drove the choice are probably not true any more
           fix this by pushing the enablement of async down into the rbtree
           code.
      
           This probably also makes cache syncs for systems faster though it's
           not the point.
      
         - The test code was triggering use of the rbtree and maple tree
           caches with dynamic allocation of nodes since all the testing is
           with RAM backed caches with no I/O performance issues.
      
           Just disable the locking in the tests to avoid triggering warnings
           when allocation debugging is turned on, it's not really what's
           being tested"
      
      * tag 'regmap-fix-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: Disable locking for RBTREE and MAPLE unit tests
        regcache: Push async I/O request down into the rbtree cache
        regmap: Account for register length in SMBus I/O limits
        regmap: Drop initial version of maximum transfer length fixes
      39b14286
    • Linus Torvalds's avatar
      Merge tag 'gpio-fixes-for-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · c0842db5
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix initial value handling for output-only pins in gpio-tps68470
      
       - fix two resource leaks in gpio-mvebu
      
      * tag 'gpio-fixes-for-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: mvebu: fix irq domain leak
        gpio: mvebu: Make use of devm_pwmchip_add
        gpio: tps68470: Make tps68470_gpio_output() always set the initial value
      c0842db5
  4. 21 Jul, 2023 7 commits