1. 17 Oct, 2024 5 commits
  2. 16 Oct, 2024 2 commits
  3. 15 Oct, 2024 4 commits
  4. 11 Oct, 2024 3 commits
  5. 10 Oct, 2024 9 commits
  6. 09 Oct, 2024 3 commits
  7. 08 Oct, 2024 2 commits
  8. 02 Oct, 2024 3 commits
    • Merge branch 'bpf: devmap: provide rxq after redirect' · bcd28cfd
      Martin KaFai Lau authored
      Florian Kauer says:
      
      ====================
      rxq contains a pointer to the device from which
      the redirect happened. Currently, a BPF program
      that is executed after a redirect via BPF_MAP_TYPE_DEVMAP*
      does not have rxq set.
      
      Add bugfix and related selftest.
      
      ---
      Changes in v4:
      - return -> goto out_close, thanks Toke
      - Link to v3: https://lore.kernel.org/r/20240909-devel-koalo-fix-ingress-ifindex-v3-0-66218191ecca@linutronix.de
      
      Changes in v3:
      - initialize skel to NULL, thanks Stanislav
      - Link to v2: https://lore.kernel.org/r/20240906-devel-koalo-fix-ingress-ifindex-v2-0-4caa12c644b4@linutronix.de
      
      Changes in v2:
      - changed fixes tag
      - added selftest
      - Link to v1: https://lore.kernel.org/r/20240905-devel-koalo-fix-ingress-ifindex-v1-1-d12a0d74c29c@linutronix.de
      ====================
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: selftests: send packet to devmap redirect XDP · 49ebeb0c
      Florian Kauer authored
      The current xdp_devmap_attach test attaches a program
      that redirects to another program via devmap.
      
      The redirected-to program is, however, never executed, so send a
      packet through it to catch any bugs that might occur during execution.
      
      Also, execute the same for a veth pair so that we
      also cover the non-generic path.
      
      Warning: Running this without the bugfix in this series
      will likely crash your system.
      Signed-off-by: Florian Kauer <florian.kauer@linutronix.de>
      Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20240911-devel-koalo-fix-ingress-ifindex-v4-2-5c643ae10258@linutronix.de
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: devmap: provide rxq after redirect · ca9984c5
      Florian Kauer authored
      rxq contains a pointer to the device from which
      the redirect happened. Currently, a BPF program
      that is executed after a redirect via BPF_MAP_TYPE_DEVMAP*
      does not have rxq set.
      
      This is particularly bad since accessing ingress_ifindex, e.g.
      
      SEC("xdp")
      int prog(struct xdp_md *pkt)
      {
              return bpf_redirect_map(&dev_redirect_map, 0, 0);
      }
      
      SEC("xdp/devmap")
      int prog_after_redirect(struct xdp_md *pkt)
      {
              bpf_printk("ifindex %i", pkt->ingress_ifindex);
              return XDP_PASS;
      }
      
      depends on access to rxq, so a NULL pointer gets dereferenced:
      
      <1>[  574.475170] BUG: kernel NULL pointer dereference, address: 0000000000000000
      <1>[  574.475188] #PF: supervisor read access in kernel mode
      <1>[  574.475194] #PF: error_code(0x0000) - not-present page
      <6>[  574.475199] PGD 0 P4D 0
      <4>[  574.475207] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
      <4>[  574.475217] CPU: 4 UID: 0 PID: 217 Comm: kworker/4:1 Not tainted 6.11.0-rc5-reduced-00859-g78080120 #23
      <4>[  574.475226] Hardware name: Intel(R) Client Systems NUC13ANHi7/NUC13ANBi7, BIOS ANRPL357.0026.2023.0314.1458 03/14/2023
      <4>[  574.475231] Workqueue: mld mld_ifc_work
      <4>[  574.475247] RIP: 0010:bpf_prog_5e13354d9cf5018a_prog_after_redirect+0x17/0x3c
      <4>[  574.475257] Code: cc cc cc cc cc cc cc 80 00 00 00 cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 66 90 55 48 89 e5 f3 0f 1e fa 48 8b 57 20 <48> 8b 52 00 8b 92 e0 00 00 00 48 bf f8 a6 d5 c4 5d a0 ff ff be 0b
      <4>[  574.475263] RSP: 0018:ffffa62440280c98 EFLAGS: 00010206
      <4>[  574.475269] RAX: ffffa62440280cd8 RBX: 0000000000000001 RCX: 0000000000000000
      <4>[  574.475274] RDX: 0000000000000000 RSI: ffffa62440549048 RDI: ffffa62440280ce0
      <4>[  574.475278] RBP: ffffa62440280c98 R08: 0000000000000002 R09: 0000000000000001
      <4>[  574.475281] R10: ffffa05dc8b98000 R11: ffffa05f577fca40 R12: ffffa05dcab24000
      <4>[  574.475285] R13: ffffa62440280ce0 R14: ffffa62440549048 R15: ffffa62440549000
      <4>[  574.475289] FS:  0000000000000000(0000) GS:ffffa05f4f700000(0000) knlGS:0000000000000000
      <4>[  574.475294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[  574.475298] CR2: 0000000000000000 CR3: 000000025522e000 CR4: 0000000000f50ef0
      <4>[  574.475303] PKRU: 55555554
      <4>[  574.475306] Call Trace:
      <4>[  574.475313]  <IRQ>
      <4>[  574.475318]  ? __die+0x23/0x70
      <4>[  574.475329]  ? page_fault_oops+0x180/0x4c0
      <4>[  574.475339]  ? skb_pp_cow_data+0x34c/0x490
      <4>[  574.475346]  ? kmem_cache_free+0x257/0x280
      <4>[  574.475357]  ? exc_page_fault+0x67/0x150
      <4>[  574.475368]  ? asm_exc_page_fault+0x26/0x30
      <4>[  574.475381]  ? bpf_prog_5e13354d9cf5018a_prog_after_redirect+0x17/0x3c
      <4>[  574.475386]  bq_xmit_all+0x158/0x420
      <4>[  574.475397]  __dev_flush+0x30/0x90
      <4>[  574.475407]  veth_poll+0x216/0x250 [veth]
      <4>[  574.475421]  __napi_poll+0x28/0x1c0
      <4>[  574.475430]  net_rx_action+0x32d/0x3a0
      <4>[  574.475441]  handle_softirqs+0xcb/0x2c0
      <4>[  574.475451]  do_softirq+0x40/0x60
      <4>[  574.475458]  </IRQ>
      <4>[  574.475461]  <TASK>
      <4>[  574.475464]  __local_bh_enable_ip+0x66/0x70
      <4>[  574.475471]  __dev_queue_xmit+0x268/0xe40
      <4>[  574.475480]  ? selinux_ip_postroute+0x213/0x420
      <4>[  574.475491]  ? alloc_skb_with_frags+0x4a/0x1d0
      <4>[  574.475502]  ip6_finish_output2+0x2be/0x640
      <4>[  574.475512]  ? nf_hook_slow+0x42/0xf0
      <4>[  574.475521]  ip6_finish_output+0x194/0x300
      <4>[  574.475529]  ? __pfx_ip6_finish_output+0x10/0x10
      <4>[  574.475538]  mld_sendpack+0x17c/0x240
      <4>[  574.475548]  mld_ifc_work+0x192/0x410
      <4>[  574.475557]  process_one_work+0x15d/0x380
      <4>[  574.475566]  worker_thread+0x29d/0x3a0
      <4>[  574.475573]  ? __pfx_worker_thread+0x10/0x10
      <4>[  574.475580]  ? __pfx_worker_thread+0x10/0x10
      <4>[  574.475587]  kthread+0xcd/0x100
      <4>[  574.475597]  ? __pfx_kthread+0x10/0x10
      <4>[  574.475606]  ret_from_fork+0x31/0x50
      <4>[  574.475615]  ? __pfx_kthread+0x10/0x10
      <4>[  574.475623]  ret_from_fork_asm+0x1a/0x30
      <4>[  574.475635]  </TASK>
      <4>[  574.475637] Modules linked in: veth br_netfilter bridge stp llc iwlmvm x86_pkg_temp_thermal iwlwifi efivarfs nvme nvme_core
      <4>[  574.475662] CR2: 0000000000000000
      <4>[  574.475668] ---[ end trace 0000000000000000 ]---
      
      Therefore, provide it to the program by setting rxq properly.
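
The failure mode and the fix can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel patch itself: the three structs are cut-down stand-ins for their kernel namesakes, and `model_ingress_ifindex` and `model_run_devmap_prog` are hypothetical names invented for this illustration. It assumes that reading `ingress_ifindex` amounts to following `rxq->dev->ifindex`, which is consistent with the faulting load in the oops above. Where the kernel would dereference NULL, the model returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures; only the fields
 * needed for this illustration are modelled. */
struct net_device { int ifindex; };
struct xdp_rxq_info { struct net_device *dev; };
struct xdp_buff { struct xdp_rxq_info *rxq; };

/* Models what reading ingress_ifindex is assumed to boil down to:
 * xdp->rxq->dev->ifindex. The NULL checks exist only in this model;
 * the kernel has no such guard and would oops instead. */
static int model_ingress_ifindex(const struct xdp_buff *xdp)
{
    assert(xdp != NULL);
    if (xdp->rxq == NULL || xdp->rxq->dev == NULL)
        return -1;
    return xdp->rxq->dev->ifindex;
}

/* Hypothetical helper: prepare an xdp_buff for a frame received on
 * rx_dev and run the "program" on it. With set_rxq == 0 the buffer
 * is left without rxq (the buggy state described above); with
 * set_rxq != 0 rxq points at the receiving device (the intent of
 * the fix). */
static int model_run_devmap_prog(struct net_device *rx_dev, int set_rxq)
{
    struct xdp_rxq_info rxq = { .dev = rx_dev };
    struct xdp_buff xdp = { .rxq = set_rxq ? &rxq : NULL };

    return model_ingress_ifindex(&xdp);
}
```

Calling `model_run_devmap_prog(&dev, 0)` reproduces the unset-rxq case and yields -1, while `model_run_devmap_prog(&dev, 1)` yields `dev.ifindex`.
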
      
      Fixes: cb261b59 ("bpf: Run devmap xdp_prog on flush instead of bulk enqueue")
      Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Florian Kauer <florian.kauer@linutronix.de>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Link: https://lore.kernel.org/r/20240911-devel-koalo-fix-ingress-ifindex-v4-1-5c643ae10258@linutronix.de
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
  9. 01 Oct, 2024 4 commits
  10. 25 Sep, 2024 1 commit
    • bpf: Use raw_spinlock_t in ringbuf · 8b62645b
      Wander Lairson Costa authored
      The function __bpf_ringbuf_reserve is invoked from a tracepoint, which
      disables preemption. On PREEMPT_RT kernels spinlock_t is a sleeping
      lock, so using it in this context can lead to a "sleep in atomic"
      warning. This issue is illustrated in the example below:
      
      BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
      in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556208, name: test_progs
      preempt_count: 1, expected: 0
      RCU nest depth: 1, expected: 1
      INFO: lockdep is turned off.
      Preemption disabled at:
      [<ffffd33a5c88ea44>] migrate_enable+0xc0/0x39c
      CPU: 7 PID: 556208 Comm: test_progs Tainted: G
      Hardware name: Qualcomm SA8775P Ride (DT)
      Call trace:
       dump_backtrace+0xac/0x130
       show_stack+0x1c/0x30
       dump_stack_lvl+0xac/0xe8
       dump_stack+0x18/0x30
       __might_resched+0x3bc/0x4fc
       rt_spin_lock+0x8c/0x1a4
       __bpf_ringbuf_reserve+0xc4/0x254
       bpf_ringbuf_reserve_dynptr+0x5c/0xdc
       bpf_prog_ac3d15160d62622a_test_read_write+0x104/0x238
       trace_call_bpf+0x238/0x774
       perf_call_bpf_enter.isra.0+0x104/0x194
       perf_syscall_enter+0x2f8/0x510
       trace_sys_enter+0x39c/0x564
       syscall_trace_enter+0x220/0x3c0
       do_el0_svc+0x138/0x1dc
       el0_svc+0x54/0x130
       el0t_64_sync_handler+0x134/0x150
       el0t_64_sync+0x17c/0x180
      
      Switch the spinlock to raw_spinlock_t to avoid this error.
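
The reasoning behind the switch can be sketched with a toy user-space model. Nothing here is kernel code: every `model_*` name is hypothetical, the "locks" do no real locking, and the only behaviour modelled is the rule that a sleeping lock must not be taken while preemption is disabled. It assumes a PREEMPT_RT-style configuration, where spinlock_t may sleep and raw_spinlock_t keeps spinning.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy scheduler state: a value above zero means preemption is disabled. */
static int model_preempt_count;
/* Counts how often a sleeping lock was taken in atomic context. */
static int model_sleep_in_atomic_warnings;

static void model_preempt_disable(void) { model_preempt_count++; }

static void model_preempt_enable(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

/* Models spinlock_t on PREEMPT_RT: it may sleep, so taking it with
 * preemption disabled is flagged, as in the splat above. */
static void model_rt_spin_lock(void)
{
    if (model_preempt_count > 0)
        model_sleep_in_atomic_warnings++;
}

/* Models raw_spinlock_t: it busy-waits and never sleeps, so it is
 * acceptable in atomic context and records no warning. */
static void model_raw_spin_lock(void) { }

/* Hypothetical stand-in for a reserve path entered from a tracepoint,
 * i.e. with preemption already disabled. Returns the number of
 * warnings this one call triggered. */
static int model_reserve_from_tracepoint(bool use_raw_lock)
{
    int before = model_sleep_in_atomic_warnings;

    model_preempt_disable();
    if (use_raw_lock)
        model_raw_spin_lock();
    else
        model_rt_spin_lock();
    model_preempt_enable();

    return model_sleep_in_atomic_warnings - before;
}
```

In this model `model_reserve_from_tracepoint(false)` reports one warning and `model_reserve_from_tracepoint(true)` reports none, which mirrors the effect the commit describes.
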
      
      Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it")
      Reported-by: Brian Grech <bgrech@redhat.com>
      Signed-off-by: Wander Lairson Costa <wander.lairson@gmail.com>
      Signed-off-by: Wander Lairson Costa <wander@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20240920190700.617253-1-wander@redhat.com
  11. 24 Sep, 2024 4 commits
    • Merge tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs · 684a64bf
      Linus Torvalds authored
      Pull NFS client updates from Anna Schumaker:
       "New Features:
         - Add a 'noalignwrite' mount option for lock-less 'lost writes' prevention
         - Add support for the LOCALIO protocol extension
      
        Bugfixes:
         - Fix memory leak in error path of nfs4_do_reclaim()
         - Simplify and guarantee lock owner uniqueness
         - Fix -Wformat-truncation warning
         - Fix folio refcounts by using folio_attach_private()
         - Fix failing the mount system call when the server is down
         - Fix detection of "Proxying of Times" server support
      
        Cleanups:
         - Annotate struct nfs_cache_array with __counted_by()
         - Remove unnecessary NULL checks before kfree()
         - Convert RPC_TASK_* constants to an enum
         - Remove obsolete or misleading comments and declarations"
      
      * tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (41 commits)
        nfs: Fix `make htmldocs` warnings in the localio documentation
        nfs: add "NFS Client and Server Interlock" section to localio.rst
        nfs: add FAQ section to Documentation/filesystems/nfs/localio.rst
        nfs: add Documentation/filesystems/nfs/localio.rst
        nfs: implement client support for NFS_LOCALIO_PROGRAM
        nfs/localio: use dedicated workqueues for filesystem read and write
        pnfs/flexfiles: enable localio support
        nfs: enable localio for non-pNFS IO
        nfs: add LOCALIO support
        nfs: pass struct nfsd_file to nfs_init_pgio and nfs_init_commit
        nfsd: implement server support for NFS_LOCALIO_PROGRAM
        nfsd: add LOCALIO support
        nfs_common: prepare for the NFS client to use nfsd_file for LOCALIO
        nfs_common: add NFS LOCALIO auxiliary protocol enablement
        SUNRPC: replace program list with program array
        SUNRPC: add svcauth_map_clnt_to_svc_cred_local
        SUNRPC: remove call_allocate() BUG_ONs
        nfsd: add nfsd_serv_try_get and nfsd_serv_put
        nfsd: add nfsd_file_acquire_local()
        nfsd: factor out __fh_verify to allow NULL rqstp to be passed
        ...
    • Merge tag 'fuse-update-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · f7fccaa7
      Linus Torvalds authored
      Pull fuse updates from Miklos Szeredi:
      
       - Add support for idmapped fuse mounts (Alexander Mikhalitsyn)
      
       - Add optimization when checking for writeback (yangyun)
      
       - Add tracepoints (Josef Bacik)
      
       - Clean up writeback code (Joanne Koong)
      
       - Clean up request queuing (me)
      
       - Misc fixes
      
      * tag 'fuse-update-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (32 commits)
        fuse: use exclusive lock when FUSE_I_CACHE_IO_MODE is set
        fuse: clear FR_PENDING if abort is detected when sending request
        fs/fuse: convert to use invalid_mnt_idmap
        fs/mnt_idmapping: introduce an invalid_mnt_idmap
        fs/fuse: introduce and use fuse_simple_idmap_request() helper
        fs/fuse: fix null-ptr-deref when checking SB_I_NOIDMAP flag
        fuse: allow O_PATH fd for FUSE_DEV_IOC_BACKING_OPEN
        virtio_fs: allow idmapped mounts
        fuse: allow idmapped mounts
        fuse: warn if fuse_access is called when idmapped mounts are allowed
        fuse: handle idmappings properly in ->write_iter()
        fuse: support idmapped ->rename op
        fuse: support idmapped ->set_acl
        fuse: drop idmap argument from __fuse_get_acl
        fuse: support idmapped ->setattr op
        fuse: support idmapped ->permission inode op
        fuse: support idmapped getattr inode op
        fuse: support idmap for mkdir/mknod/symlink/create/tmpfile
        fuse: support idmapped FUSE_EXT_GROUPS
        fuse: add an idmap argument to fuse_simple_request
        ...
    • Merge tag 'exfat-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat · 4165cee7
      Linus Torvalds authored
      Pull exfat updates from Namjae Jeon:
      
       - Clean up unnecessary code now that ->valid_size is supported
      
       - buffered-IO fallback is no longer needed when using direct-IO
      
       - Move ->valid_size extension from mmap to ->page_mkwrite. This
         reduces the overhead caused by unnecessary zero-out during mmap.
      
       - Fix memleaks from exfat_load_bitmap() and exfat_create_upcase_table()
      
       - Add sops->shutdown and ioctl
      
       - Add Yuezhang Mo as a reviewer
      
      * tag 'exfat-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
        MAINTAINERS: exfat: add myself as reviewer
        exfat: resolve memory leak from exfat_create_upcase_table()
        exfat: move extend valid_size into ->page_mkwrite()
        exfat: fix memory leak in exfat_load_bitmap()
        exfat: Implement sops->shutdown and ioctl
        exfat: do not fallback to buffered write
        exfat: drop ->i_size_ondisk
    • Merge tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 79952bdc
      Linus Torvalds authored
      Pull f2fs updates from Jaegeuk Kim:
       "The main changes include converting major IO paths to use folio, and
        adding various knobs to control GC more flexibly for Zoned devices.
      
        In addition, there are several patches to address corner cases of
        atomic file operations and to better support file pinning on zoned
        devices.
      
        Enhancement:
         - add knobs to tune foreground/background GCs for Zoned devices
         - convert IO paths to use folio
         - reduce expensive checkpoint trigger frequency
         - allow F2FS_IPU_NOCACHE for pinned file
         - forcibly migrate to secure space for zoned device file pinning
         - get rid of buffer_head use
         - add write priority option based on zone UFS
         - get rid of online repair on corrupted directory
      
        Bug fixes:
         - fix to don't panic system for no free segment fault injection
         - fix to don't set SB_RDONLY in f2fs_handle_critical_error()
         - avoid unused block when dio write in LFS mode
         - compress: don't redirty sparse cluster during {,de}compress
         - check discard support for conventional zones
         - atomic: prevent atomic file from being dirtied before commit
         - atomic: fix to check atomic_file in f2fs ioctl interfaces
         - atomic: fix to forbid dio in atomic_file
         - atomic: fix to truncate pagecache before on-disk metadata truncation
         - atomic: create COW inode from parent dentry
         - atomic: fix to avoid racing w/ GC
         - atomic: require FMODE_WRITE for atomic write ioctls
         - fix to wait page writeback before setting gcing flag
         - fix to avoid racing in between read and OPU dio write, dio completion
         - fix several potential integer overflows in file offsets and dir_block_index
         - fix to avoid use-after-free in f2fs_stop_gc_thread()
      
        As usual, there are several code clean-ups and refactorings"
      
      * tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
        f2fs: allow F2FS_IPU_NOCACHE for pinned file
        f2fs: forcibly migrate to secure space for zoned device file pinning
        f2fs: remove unused parameters
        f2fs: fix to don't panic system for no free segment fault injection
        f2fs: fix to don't set SB_RDONLY in f2fs_handle_critical_error()
        f2fs: add valid block ratio not to do excessive GC for one time GC
        f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent
        f2fs: do FG_GC when GC boosting is required for zoned devices
        f2fs: increase BG GC migration window granularity when boosted for zoned devices
        f2fs: add reserved_segments sysfs node
        f2fs: introduce migration_window_granularity
        f2fs: make BG GC more aggressive for zoned devices
        f2fs: avoid unused block when dio write in LFS mode
        f2fs: fix to check atomic_file in f2fs ioctl interfaces
        f2fs: get rid of online repaire on corrupted directory
        f2fs: prevent atomic file from being dirtied before commit
        f2fs: get rid of page->index
        f2fs: convert read_node_page() to use folio
        f2fs: convert __write_node_page() to use folio
        f2fs: convert f2fs_write_data_page() to use folio
        ...