1. 22 Jan, 2024 1 commit
  2. 21 Jan, 2024 2 commits
    • idpf: distinguish vports by the dev_port attribute · 359724fa
      Michal Schmidt authored
      idpf registers multiple netdevs (virtual ports) for one PCI function,
      but it does not provide a way for userspace to distinguish them with
      sysfs attributes. Per Documentation/ABI/testing/sysfs-class-net, it is
      a bug not to set dev_port for independent ports on the same PCI bus,
      device and function.
      
      Without dev_port set, systemd-udevd's default naming policy attempts
      to assign the same name ("ens2f0") to all four idpf netdevs on my test
      system and obviously fails, leaving three of them with the initial
      eth<N> name.
      
      With this patch, systemd-udevd is able to assign unique names to the
      netdevs (e.g. "ens2f0", "ens2f0d1", "ens2f0d2", "ens2f0d3").
      
      The Intel-provided out-of-tree idpf driver already sets dev_port. In
      this patch I chose to do it in the same place in the idpf_cfg_netdev
      function.
      
      Fixes: 0fe45467 ("idpf: add create vport and netdev configuration")
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
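
The naming outcome described in this commit can be illustrated with a toy Python model. This is not systemd code: it assumes a slot-based naming scheme in which a `d<dev_port>` suffix is appended only when dev_port is nonzero, and the function name `slot_name` and its parameters are hypothetical.

```python
def slot_name(slot: int, function: int, dev_port: int) -> str:
    """Toy model of a slot-based predictable interface name (assumed scheme)."""
    name = f"ens{slot}f{function}"
    # Assumption: the dev_port suffix is only added for nonzero values.
    if dev_port > 0:
        name += f"d{dev_port}"
    return name


# Without dev_port set, every vport reports 0 and all four names collide.
unset = [slot_name(2, 0, 0) for _ in range(4)]

# With dev_port set per vport, each netdev gets a distinct name.
named = [slot_name(2, 0, port) for port in range(4)]

if __name__ == "__main__":
    print(unset)
    print(named)
```

In this model the first list contains the same name four times, which mirrors the collision the commit message describes, while the second list matches the example names given above.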
    • udp: fix busy polling · a54d51fb
      Eric Dumazet authored
      Generic sk_busy_loop_end() only looks at sk->sk_receive_queue
      for presence of packets.
      
      Problem is that for UDP sockets after blamed commit, some packets
      could be present in another queue: udp_sk(sk)->reader_queue
      
      In some cases, a busy poller could spin until timeout expiration,
      even if some packets are available in udp_sk(sk)->reader_queue.
      
      v3: - make sk_busy_loop_end() nicer (Willem)
      
      v2: - add a READ_ONCE(sk->sk_family) in sk_is_inet() to avoid KCSAN splats.
          - add a sk_is_inet() check in sk_is_udp() (Willem feedback)
          - add a sk_is_inet() check in sk_is_tcp().
      
      Fixes: 2276f58a ("udp: use a separate rx queue for packet reception")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
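
A minimal Python sketch of the condition described in this commit follows. It is a toy model, not the kernel implementation: the class `ToyUdpSock` and both function names are hypothetical, and only the two queue names come from the commit message.

```python
from collections import deque


class ToyUdpSock:
    """Toy stand-in for a UDP socket with its two receive-side queues."""

    def __init__(self):
        self.sk_receive_queue = deque()
        self.reader_queue = deque()


def busy_loop_end_generic(sk: ToyUdpSock) -> bool:
    # Models the generic check: only sk_receive_queue is inspected.
    return bool(sk.sk_receive_queue)


def busy_loop_end_udp_aware(sk: ToyUdpSock) -> bool:
    # Models a check that also considers the UDP reader queue.
    return bool(sk.sk_receive_queue) or bool(sk.reader_queue)


sk = ToyUdpSock()
sk.reader_queue.append(b"datagram")  # packet sits only in reader_queue

if __name__ == "__main__":
    print(busy_loop_end_generic(sk), busy_loop_end_udp_aware(sk))
```

With a packet present only in `reader_queue`, the generic check keeps reporting that polling should continue, which corresponds to the spin-until-timeout behaviour described above.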
  3. 20 Jan, 2024 11 commits
    • llc: Drop support for ETH_P_TR_802_2. · e3f9bed9
      Kuniyuki Iwashima authored
      syzbot reported an uninit-value bug below. [0]
      
      llc supports ETH_P_802_2 (0x0004) and used to support ETH_P_TR_802_2
      (0x0011), and syzbot abused the latter to trigger the bug.
      
        write$tun(r0, &(0x7f0000000040)={@val={0x0, 0x11}, @val, @mpls={[], @llc={@snap={0xaa, 0x1, ')', "90e5dd"}}}}, 0x16)
      
      llc_conn_handler() initialises local variables {saddr,daddr}.mac
      based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes
      them to __llc_lookup().
      
      However, the initialisation is done only when skb->protocol is
      htons(ETH_P_802_2), otherwise, __llc_lookup_established() and
      __llc_lookup_listener() will read garbage.
      
      The missing initialisation existed prior to commit 211ed865
      ("net: delete all instances of special processing for token ring").
      
      It removed the part to kick out the token ring stuff but forgot to
      close the door allowing ETH_P_TR_802_2 packets to sneak into llc_rcv().
      
      Let's remove llc_tr_packet_type and complete the deprecation.
      
      [0]:
      BUG: KMSAN: uninit-value in __llc_lookup_established+0xe9d/0xf90
       __llc_lookup_established+0xe9d/0xf90
       __llc_lookup net/llc/llc_conn.c:611 [inline]
       llc_conn_handler+0x4bd/0x1360 net/llc/llc_conn.c:791
       llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206
       __netif_receive_skb_one_core net/core/dev.c:5527 [inline]
       __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5641
       netif_receive_skb_internal net/core/dev.c:5727 [inline]
       netif_receive_skb+0x58/0x660 net/core/dev.c:5786
       tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555
       tun_get_user+0x53af/0x66d0 drivers/net/tun.c:2002
       tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
       call_write_iter include/linux/fs.h:2020 [inline]
       new_sync_write fs/read_write.c:491 [inline]
       vfs_write+0x8ef/0x1490 fs/read_write.c:584
       ksys_write+0x20f/0x4c0 fs/read_write.c:637
       __do_sys_write fs/read_write.c:649 [inline]
       __se_sys_write fs/read_write.c:646 [inline]
       __x64_sys_write+0x93/0xd0 fs/read_write.c:646
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Local variable daddr created at:
       llc_conn_handler+0x53/0x1360 net/llc/llc_conn.c:783
       llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206
      
      CPU: 1 PID: 5004 Comm: syz-executor994 Not tainted 6.6.0-syzkaller-14500-g1c410411 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      
      Fixes: 211ed865 ("net: delete all instances of special processing for token ring")
      Reported-by: syzbot+b5ad66046b913bc04c6f@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=b5ad66046b913bc04c6f
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240119015515.61898-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • llc: make llc_ui_sendmsg() more robust against bonding changes · dad555c8
      Eric Dumazet authored
      syzbot was able to trick llc_ui_sendmsg(), allocating an skb with no
      headroom, but subsequently trying to push 14 bytes of Ethernet header [1]
      
      Like some others, llc_ui_sendmsg() releases the socket lock before
      calling sock_alloc_send_skb().
      Then it acquires it again, but does not redo all the sanity checks
      that were performed.
      
      This fix:
      
      - Uses LL_RESERVED_SPACE() to reserve space.
      - Checks all conditions again once the socket lock is held again.
      - Does not count the Ethernet header against the MTU limit.
      
      [1]
      
      skbuff: skb_under_panic: text:ffff800088baa334 len:1514 put:14 head:ffff0000c9c37000 data:ffff0000c9c36ff2 tail:0x5dc end:0x6c0 dev:bond0
      
       kernel BUG at net/core/skbuff.c:193 !
      Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 6875 Comm: syz-executor.0 Not tainted 6.7.0-rc8-syzkaller-00101-g0802e17d9aca-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
      pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
       pc : skb_panic net/core/skbuff.c:189 [inline]
       pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
       lr : skb_panic net/core/skbuff.c:189 [inline]
       lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
      sp : ffff800096f97000
      x29: ffff800096f97010 x28: ffff80008cc8d668 x27: dfff800000000000
      x26: ffff0000cb970c90 x25: 00000000000005dc x24: ffff0000c9c36ff2
      x23: ffff0000c9c37000 x22: 00000000000005ea x21: 00000000000006c0
      x20: 000000000000000e x19: ffff800088baa334 x18: 1fffe000368261ce
      x17: ffff80008e4ed000 x16: ffff80008a8310f8 x15: 0000000000000001
      x14: 1ffff00012df2d58 x13: 0000000000000000 x12: 0000000000000000
      x11: 0000000000000001 x10: 0000000000ff0100 x9 : e28a51f1087e8400
      x8 : e28a51f1087e8400 x7 : ffff80008028f8d0 x6 : 0000000000000000
      x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff800082b78714
      x2 : 0000000000000001 x1 : 0000000100000000 x0 : 0000000000000089
      Call trace:
        skb_panic net/core/skbuff.c:189 [inline]
        skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
        skb_push+0xf0/0x108 net/core/skbuff.c:2451
        eth_header+0x44/0x1f8 net/ethernet/eth.c:83
        dev_hard_header include/linux/netdevice.h:3188 [inline]
        llc_mac_hdr_init+0x110/0x17c net/llc/llc_output.c:33
        llc_sap_action_send_xid_c+0x170/0x344 net/llc/llc_s_ac.c:85
        llc_exec_sap_trans_actions net/llc/llc_sap.c:153 [inline]
        llc_sap_next_state net/llc/llc_sap.c:182 [inline]
        llc_sap_state_process+0x1ec/0x774 net/llc/llc_sap.c:209
        llc_build_and_send_xid_pkt+0x12c/0x1c0 net/llc/llc_sap.c:270
        llc_ui_sendmsg+0x7bc/0xb1c net/llc/af_llc.c:997
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg net/socket.c:745 [inline]
        sock_sendmsg+0x194/0x274 net/socket.c:767
        splice_to_socket+0x7cc/0xd58 fs/splice.c:881
        do_splice_from fs/splice.c:933 [inline]
        direct_splice_actor+0xe4/0x1c0 fs/splice.c:1142
        splice_direct_to_actor+0x2a0/0x7e4 fs/splice.c:1088
        do_splice_direct+0x20c/0x348 fs/splice.c:1194
        do_sendfile+0x4bc/0xc70 fs/read_write.c:1254
        __do_sys_sendfile64 fs/read_write.c:1322 [inline]
        __se_sys_sendfile64 fs/read_write.c:1308 [inline]
        __arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1308
        __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
        invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
        el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
        do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
        el0_svc+0x54/0x158 arch/arm64/kernel/entry-common.c:678
        el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696
        el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:595
      Code: aa1803e6 aa1903e7 a90023f5 94792f6a (d4210000)
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-and-tested-by: syzbot+2a7024e9502df538e8ef@syzkaller.appspotmail.com
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240118183625.4007013-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
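
The numbers in the skb_under_panic line quoted in this commit can be cross-checked with a short Python sketch. It rests on two assumptions that the commit message does not state: that the reported `len` and `data` are the values after the failed push, and that `tail` is an offset from `head`. The helper name `parse_fields` is hypothetical.

```python
import re

PANIC_LINE = (
    "skbuff: skb_under_panic: text:ffff800088baa334 len:1514 put:14 "
    "head:ffff0000c9c37000 data:ffff0000c9c36ff2 tail:0x5dc end:0x6c0 dev:bond0"
)


def parse_fields(line: str) -> dict:
    """Collect the key:value pairs from the panic line."""
    return dict(re.findall(r"(\w+):(\S+)", line))


fields = parse_fields(PANIC_LINE)
head = int(fields["head"], 16)
data = int(fields["data"], 16)
tail = int(fields["tail"], 16)
put = int(fields["put"])
length = int(fields["len"])

# How far the data pointer ended up below the start of the buffer.
underflow = head - data
# Headroom that was available before the push (assumed post-push values).
headroom_before_push = (data + put) - head

if __name__ == "__main__":
    print(underflow, headroom_before_push, length - put, tail)
```

Under those assumptions the data pointer sits exactly `put` bytes below `head`, which is consistent with an skb that was allocated with no headroom before the Ethernet header was pushed.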
    • vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING · 6c21660f
      Lin Ma authored
      In the vlan_changelink function, a loop is used to parse the nested
      attributes IFLA_VLAN_EGRESS_QOS and IFLA_VLAN_INGRESS_QOS in order to
      obtain the struct ifla_vlan_qos_mapping. These two nested attributes are
      checked in the vlan_validate_qos_map function, which calls
      nla_validate_nested_deprecated with the vlan_map_policy.
      
      However, this deprecated validator applies a LIBERAL strictness, allowing
      the presence of an attribute with the type IFLA_VLAN_QOS_UNSPEC.
      Consequently, the loop in vlan_changelink may parse an attribute of type
      IFLA_VLAN_QOS_UNSPEC and believe it carries a payload of
      struct ifla_vlan_qos_mapping, which is not necessarily true.
      
      To address this issue and ensure compatibility, this patch introduces two
      type checks that skip attributes whose type is not IFLA_VLAN_QOS_MAPPING.
      
      Fixes: 07b5b17e ("[VLAN]: Use rtnl_link API")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240118130306.1644001-1-linma@zju.edu.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
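
The type check described in this commit can be sketched in Python over a hand-built nested attribute buffer. This is a toy model rather than the kernel's netlink code: the helper names (`pack_attr`, `iter_attrs`, `parse_qos_mappings`) are hypothetical, the attribute header layout (16-bit length, 16-bit type, 4-byte alignment) and the numeric values 0 and 1 for the two types are stated here as assumptions, and the payload length guard is part of the model only.

```python
import struct

IFLA_VLAN_QOS_UNSPEC = 0   # assumed numeric value
IFLA_VLAN_QOS_MAPPING = 1  # assumed numeric value
NLA_HDRLEN = 4
MAPPING_SIZE = 8           # two 32-bit fields: from, to


def nla_align(n: int) -> int:
    return (n + 3) & ~3


def pack_attr(nla_type: int, payload: bytes) -> bytes:
    """Build one attribute: header, payload, padding to 4 bytes."""
    pad = nla_align(len(payload)) - len(payload)
    return struct.pack("=HH", NLA_HDRLEN + len(payload), nla_type) + payload + b"\0" * pad


def iter_attrs(buf: bytes):
    """Yield (type, payload) for each attribute in a nested buffer."""
    off = 0
    while off + NLA_HDRLEN <= len(buf):
        nla_len, nla_type = struct.unpack_from("=HH", buf, off)
        if nla_len < NLA_HDRLEN or off + nla_len > len(buf):
            break
        yield nla_type, buf[off + NLA_HDRLEN:off + nla_len]
        off += nla_align(nla_len)


def parse_qos_mappings(buf: bytes) -> list:
    mappings = []
    for nla_type, payload in iter_attrs(buf):
        # Skip anything that is not a mapping, e.g. an UNSPEC attribute.
        if nla_type != IFLA_VLAN_QOS_MAPPING:
            continue
        if len(payload) < MAPPING_SIZE:  # model-only guard
            continue
        mappings.append(struct.unpack_from("=II", payload))
    return mappings


nested = pack_attr(IFLA_VLAN_QOS_UNSPEC, b"\x01") + pack_attr(
    IFLA_VLAN_QOS_MAPPING, struct.pack("=II", 3, 5)
)

if __name__ == "__main__":
    print(parse_qos_mappings(nested))
```

The UNSPEC attribute here carries a one-byte payload, so treating it as a mapping would read past its data; skipping on type avoids that without rejecting the message as a whole.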
    • Merge branch 'bnxt_en-bug-fixes' · 9b697956
      Jakub Kicinski authored
      Michael Chan says:
      
      ====================
      bnxt_en: Bug fixes
      
      This series contains 5 miscellaneous fixes.  The fixes include adding
      delay for FLR, buffer memory leak, RSS table size calculation,
      ethtool self test kernel warning, and mqprio crash.
      ====================
      
      Link: https://lore.kernel.org/r/20240117234515.226944-1-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Fix possible crash after creating sw mqprio TCs · 467739ba
      Michael Chan authored
      The driver relies on netdev_get_num_tc() to get the number of HW
      offloaded mqprio TCs to allocate and free TX rings.  This won't
      work and can potentially crash the system if software mqprio or
      taprio TCs have been set up.  netdev_get_num_tc() will return the
      number of software TCs and it may cause the driver to allocate or
      free more TX rings than it should.  Fix it by adding a bp->num_tc
      field to store the number of HW offload mqprio TCs for the device.
      Use bp->num_tc instead of netdev_get_num_tc().
      
      This fixes a crash like this:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000000
      PGD 42b8404067 P4D 0
      Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 120 PID: 8661 Comm: ifconfig Kdump: loaded Tainted: G           OE     5.18.16 #1
      Hardware name: Lenovo ThinkSystem SR650 V3/SB27A92818, BIOS ESE114N-2.12 04/25/2023
      RIP: 0010:bnxt_hwrm_cp_ring_alloc_p5+0x10/0x90 [bnxt_en]
      Code: 41 5c 41 5d 41 5e c3 cc cc cc cc 41 8b 44 24 08 66 89 03 eb c6 e8 b0 f1 7d db 0f 1f 44 00 00 41 56 41 55 41 54 55 48 89 fd 53 <48> 8b 06 48 89 f3 48 81 c6 28 01 00 00 0f b6 96 13 ff ff ff 44 8b
      RSP: 0018:ff65907660d1fa88 EFLAGS: 00010202
      RAX: 0000000000000010 RBX: ff4dde1d907e4980 RCX: f400000000000000
      RDX: 0000000000000010 RSI: 0000000000000000 RDI: ff4dde1d907e4980
      RBP: ff4dde1d907e4980 R08: 000000000000000f R09: 0000000000000000
      R10: ff4dde5f02671800 R11: 0000000000000008 R12: 0000000088888889
      R13: 0500000000000000 R14: 00f0000000000000 R15: ff4dde5f02671800
      FS:  00007f4b126b5740(0000) GS:ff4dde9bff600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 000000416f9c6002 CR4: 0000000000771ee0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <TASK>
       bnxt_hwrm_ring_alloc+0x204/0x770 [bnxt_en]
       bnxt_init_chip+0x4d/0x680 [bnxt_en]
       ? bnxt_poll+0x1a0/0x1a0 [bnxt_en]
       __bnxt_open_nic+0xd2/0x740 [bnxt_en]
       bnxt_open+0x10b/0x220 [bnxt_en]
       ? raw_notifier_call_chain+0x41/0x60
       __dev_open+0xf3/0x1b0
       __dev_change_flags+0x1db/0x250
       dev_change_flags+0x21/0x60
       devinet_ioctl+0x590/0x720
       ? avc_has_extended_perms+0x1b7/0x420
       ? _copy_from_user+0x3a/0x60
       inet_ioctl+0x189/0x1c0
       ? wp_page_copy+0x45a/0x6e0
       sock_do_ioctl+0x42/0xf0
       ? ioctl_has_perm.constprop.0.isra.0+0xbd/0x120
       sock_ioctl+0x1ce/0x2e0
       __x64_sys_ioctl+0x87/0xc0
       do_syscall_64+0x59/0x90
       ? syscall_exit_work+0x103/0x130
       ? syscall_exit_to_user_mode+0x12/0x30
       ? do_syscall_64+0x69/0x90
       ? exc_page_fault+0x62/0x150
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
      Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20240117234515.226944-6-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Prevent kernel warning when running offline self test · c20f4821
      Michael Chan authored
      We call bnxt_half_open_nic() to set up the chip partially to run
      loopback tests.  The rings and buffers are initialized normally
      so that we can transmit and receive packets in loopback mode.
      That means page pool buffers are allocated for the aggregation ring
      just like the normal case.  NAPI is not needed because we are just
      polling for the loopback packets.
      
      When we're done with the loopback tests, we call bnxt_half_close_nic()
      to clean up.  When freeing the page pools, we hit a WARN_ON()
      in page_pool_unlink_napi() because the NAPI state linked to the
      page pool is uninitialized.
      
      The simplest way to avoid this warning is just to initialize the
      NAPIs during half open and delete the NAPIs during half close.
      Trying to skip the page pool initialization or skip linking of
      NAPI during half open will be more complicated.
      
      This fix avoids this warning:
      
      WARNING: CPU: 4 PID: 46967 at net/core/page_pool.c:946 page_pool_unlink_napi+0x1f/0x30
      CPU: 4 PID: 46967 Comm: ethtool Tainted: G S      W          6.7.0-rc5+ #22
      Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.3.8 08/31/2021
      RIP: 0010:page_pool_unlink_napi+0x1f/0x30
      Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8b 47 18 48 85 c0 74 1b 48 8b 50 10 83 e2 01 74 08 8b 40 34 83 f8 ff 74 02 <0f> 0b 48 c7 47 18 00 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90
      RSP: 0018:ffa000003d0dfbe8 EFLAGS: 00010246
      RAX: ff110003607ce640 RBX: ff110010baf5d000 RCX: 0000000000000008
      RDX: 0000000000000000 RSI: ff110001e5e522c0 RDI: ff110010baf5d000
      RBP: ff11000145539b40 R08: 0000000000000001 R09: ffffffffc063f641
      R10: ff110001361eddb8 R11: 000000000040000f R12: 0000000000000001
      R13: 000000000000001c R14: ff1100014553a080 R15: 0000000000003fc0
      FS:  00007f9301c4f740(0000) GS:ff1100103fd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f91344fa8f0 CR3: 00000003527cc005 CR4: 0000000000771ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? __warn+0x81/0x140
       ? page_pool_unlink_napi+0x1f/0x30
       ? report_bug+0x102/0x200
       ? handle_bug+0x44/0x70
       ? exc_invalid_op+0x13/0x60
       ? asm_exc_invalid_op+0x16/0x20
       ? bnxt_free_ring.isra.123+0xb1/0xd0 [bnxt_en]
       ? page_pool_unlink_napi+0x1f/0x30
       page_pool_destroy+0x3e/0x150
       bnxt_free_mem+0x441/0x5e0 [bnxt_en]
       bnxt_half_close_nic+0x2a/0x40 [bnxt_en]
       bnxt_self_test+0x21d/0x450 [bnxt_en]
       __dev_ethtool+0xeda/0x2e30
       ? native_queued_spin_lock_slowpath+0x17f/0x2b0
       ? __link_object+0xa1/0x160
       ? _raw_spin_unlock_irqrestore+0x23/0x40
       ? __create_object+0x5f/0x90
       ? __kmem_cache_alloc_node+0x317/0x3c0
       ? dev_ethtool+0x59/0x170
       dev_ethtool+0xa7/0x170
       dev_ioctl+0xc3/0x530
       sock_do_ioctl+0xa8/0xf0
       sock_ioctl+0x270/0x310
       __x64_sys_ioctl+0x8c/0xc0
       do_syscall_64+0x3e/0xf0
       entry_SYSCALL_64_after_hwframe+0x6e/0x76
      
      Fixes: 294e39e0 ("bnxt: hook NAPIs to page pools")
      Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
      Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20240117234515.226944-5-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Fix RSS table entries calculation for P5_PLUS chips · 523384a6
      Michael Chan authored
      The existing formula used in the driver to calculate the number of RSS
      table entries is to round up the number of RX rings to the next integer
      multiple of 64 (e.g. 64, 128, 192, ...).  This is incorrect.  The valid
      values supported by the chip are 64, 128, 256, 512 only (powers of 2
      starting from 64).  When the number of RX rings is greater than 128, the
      entry size will likely be wrong.  Firmware will round down the invalid
      value (e.g. 192 rounded down to 128) provided by the driver, causing some
      RSS rings to not receive any packets.
      
      We already have an existing function bnxt_calc_nr_ring_pages() to
      do this calculation.  Use it in bnxt_get_nr_rss_ctxs() to calculate the
      number of RSS contexts correctly for P5_PLUS chips.
      Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Fixes: 7b3af4f7 ("bnxt_en: Add RSS support for 57500 chips.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20240117234515.226944-4-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
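
The difference between the two rounding rules in this commit can be worked through with a short Python sketch. It is only an arithmetic illustration, not the driver's bnxt_calc_nr_ring_pages(); the three function names are hypothetical, and the firmware round-down is modelled from the single example (192 to 128) given in the commit message.

```python
VALID_SIZES = (64, 128, 256, 512)  # table sizes the commit lists as valid


def entries_multiple_of_64(rx_rings: int) -> int:
    """Old rule: round up to the next multiple of 64."""
    return ((rx_rings + 63) // 64) * 64


def entries_power_of_two(rx_rings: int) -> int:
    """Corrected rule: smallest valid size that holds all rings."""
    for size in VALID_SIZES:
        if rx_rings <= size:
            return size
    raise ValueError("more RX rings than the largest valid table size")


def firmware_round_down(entries: int) -> int:
    """Model of the firmware rounding an invalid value down."""
    return max(size for size in VALID_SIZES if size <= entries)


rx_rings = 130
old = entries_multiple_of_64(rx_rings)   # value the old formula requests
effective = firmware_round_down(old)     # what the firmware would use
new = entries_power_of_two(rx_rings)     # value under the corrected rule

if __name__ == "__main__":
    print(old, effective, new)
```

With 130 RX rings the old rule requests a size that the model's firmware reduces to fewer entries than there are rings, which is how some rings end up receiving no packets; the corrected rule picks a size that covers all of them.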
    • bnxt_en: Fix memory leak in bnxt_hwrm_get_rings() · 2ad8e573
      Michael Chan authored
      bnxt_hwrm_get_rings() can abort and return error when there are not
      enough ring resources.  It aborts without releasing the HWRM DMA buffer,
      causing a dma_pool_destroy warning when the driver is unloaded:
      
      bnxt_en 0000:99:00.0: dma_pool_destroy bnxt_hwrm, 000000005b089ba8 busy
      
      Fixes: f1e50b27 ("bnxt_en: Fix trimming of P5 RX and TX rings")
      Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20240117234515.226944-3-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Wait for FLR to complete during probe · 3c1069fa
      Michael Chan authored
      The first message to firmware may fail if the device is undergoing FLR.
      The driver has some recovery logic for this failure scenario but we must
      wait 100 msec for FLR to complete before proceeding.  Otherwise the
      recovery will always fail.
      
      Fixes: ba02629f ("bnxt_en: log firmware status on firmware init failure")
      Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20240117234515.226944-2-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: make sure init the accept_queue's spinlocks once · 198bc90e
      Zhengchao Shao authored
      When I run syz's reproduction C program locally, it causes the following
      issue:
      pvqspinlock: lock 0xffff9d181cd5c660 has corrupted value 0x0!
      WARNING: CPU: 19 PID: 21160 at __pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508)
      Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      RIP: 0010:__pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508)
      Code: 73 56 3a ff 90 c3 cc cc cc cc 8b 05 bb 1f 48 01 85 c0 74 05 c3 cc cc cc cc 8b 17 48 89 fe 48 c7 c7
      30 20 ce 8f e8 ad 56 42 ff <0f> 0b c3 cc cc cc cc 0f 0b 0f 1f 40 00 90 90 90 90 90 90 90 90 90
      RSP: 0018:ffffa8d200604cb8 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9d1ef60e0908
      RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9d1ef60e0900
      RBP: ffff9d181cd5c280 R08: 0000000000000000 R09: 00000000ffff7fff
      R10: ffffa8d200604b68 R11: ffffffff907dcdc8 R12: 0000000000000000
      R13: ffff9d181cd5c660 R14: ffff9d1813a3f330 R15: 0000000000001000
      FS:  00007fa110184640(0000) GS:ffff9d1ef60c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000000 CR3: 000000011f65e000 CR4: 00000000000006f0
      Call Trace:
      <IRQ>
        _raw_spin_unlock (kernel/locking/spinlock.c:186)
        inet_csk_reqsk_queue_add (net/ipv4/inet_connection_sock.c:1321)
        inet_csk_complete_hashdance (net/ipv4/inet_connection_sock.c:1358)
        tcp_check_req (net/ipv4/tcp_minisocks.c:868)
        tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2260)
        ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205)
        ip_local_deliver_finish (net/ipv4/ip_input.c:234)
        __netif_receive_skb_one_core (net/core/dev.c:5529)
        process_backlog (./include/linux/rcupdate.h:779)
        __napi_poll (net/core/dev.c:6533)
        net_rx_action (net/core/dev.c:6604)
        __do_softirq (./arch/x86/include/asm/jump_label.h:27)
        do_softirq (kernel/softirq.c:454 kernel/softirq.c:441)
      </IRQ>
      <TASK>
        __local_bh_enable_ip (kernel/softirq.c:381)
        __dev_queue_xmit (net/core/dev.c:4374)
        ip_finish_output2 (./include/net/neighbour.h:540 net/ipv4/ip_output.c:235)
        __ip_queue_xmit (net/ipv4/ip_output.c:535)
        __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
        tcp_rcv_synsent_state_process (net/ipv4/tcp_input.c:6469)
        tcp_rcv_state_process (net/ipv4/tcp_input.c:6657)
        tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1929)
        __release_sock (./include/net/sock.h:1121 net/core/sock.c:2968)
        release_sock (net/core/sock.c:3536)
        inet_wait_for_connect (net/ipv4/af_inet.c:609)
        __inet_stream_connect (net/ipv4/af_inet.c:702)
        inet_stream_connect (net/ipv4/af_inet.c:748)
        __sys_connect (./include/linux/file.h:45 net/socket.c:2064)
        __x64_sys_connect (net/socket.c:2073 net/socket.c:2070 net/socket.c:2070)
        do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82)
        entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
        RIP: 0033:0x7fa10ff05a3d
        Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89
        c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48
        RSP: 002b:00007fa110183de8 EFLAGS: 00000202 ORIG_RAX: 000000000000002a
        RAX: ffffffffffffffda RBX: 0000000020000054 RCX: 00007fa10ff05a3d
        RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003
        RBP: 00007fa110183e20 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000202 R12: 00007fa110184640
        R13: 0000000000000000 R14: 00007fa10fe8b060 R15: 00007fff73e23b20
      </TASK>
      
      The issue triggering process is analyzed as follows:
      Thread A                                       Thread B
      tcp_v4_rcv	//receive ack TCP packet       inet_shutdown
        tcp_check_req                                  tcp_disconnect //disconnect sock
        ...                                              tcp_set_state(sk, TCP_CLOSE)
          inet_csk_complete_hashdance                ...
            inet_csk_reqsk_queue_add                 inet_listen  //start listen
              spin_lock(&queue->rskq_lock)             inet_csk_listen_start
              ...                                        reqsk_queue_alloc
              ...                                          spin_lock_init
              spin_unlock(&queue->rskq_lock)	//warning
      
      When the socket receives the ACK packet during the three-way handshake,
      it takes the spinlock. If the user then shuts down the socket and
      immediately calls listen() on it, the spinlock is initialized again
      while it is still held. When the first thread goes to release the
      spinlock, a warning is generated. The same issue applies to
      fastopenq.lock.

      Move the spinlock initialization to inet_create and inet_accept to make
      sure the accept_queue's spinlocks are initialized only once.
      
      Fixes: fff1f300 ("tcp: add a spinlock to protect struct request_sock_queue")
      Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
      Reported-by: Ming Shu <sming56@aliyun.com>
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240118012019.1751966-1-shaozhengchao@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
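
The interleaving in the Thread A / Thread B diagram of this commit can be replayed deterministically with a toy Python model. It does not use real threads or kernel primitives: `ToySpinlock` and both sequence functions are hypothetical, and the lock is reduced to a single integer so that re-initialization while held becomes visible at unlock time.

```python
class ToySpinlock:
    """Toy lock whose state is one integer: 0 = free, 1 = held."""

    def __init__(self):
        self.init()

    def init(self):
        # Models spin_lock_init(): unconditionally resets the state.
        self.value = 0

    def lock(self):
        if self.value != 0:
            raise RuntimeError("toy model does not handle contention")
        self.value = 1

    def unlock(self):
        if self.value != 1:
            raise RuntimeError(f"lock has corrupted value {self.value:#x}")
        self.value = 0


def racy_sequence() -> str:
    lock = ToySpinlock()
    lock.lock()        # Thread A: takes the queue lock
    lock.init()        # Thread B: re-listen path re-initializes the lock
    try:
        lock.unlock()  # Thread A: releases a lock that now looks free
    except RuntimeError as err:
        return str(err)
    return "ok"


def init_once_sequence() -> str:
    lock = ToySpinlock()  # initialized once, at socket creation
    lock.lock()
    lock.unlock()
    return "ok"


if __name__ == "__main__":
    print(racy_sequence())
    print(init_once_sequence())
```

The racy sequence ends with an unlock that observes a zero value, analogous to the "corrupted value 0x0" warning quoted above, while initializing once avoids the reset entirely.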
    • selftests: bonding: Increase timeout to 1200s · b01f15a7
      Benjamin Poirier authored
      When tests are run by runner.sh, bond_options.sh gets killed before
      it can complete:
      
      make -C tools/testing/selftests run_tests TARGETS="drivers/net/bonding"
      	[...]
      	# timeout set to 120
      	# selftests: drivers/net/bonding: bond_options.sh
      	# TEST: prio (active-backup miimon primary_reselect 0)                [ OK ]
      	# TEST: prio (active-backup miimon primary_reselect 1)                [ OK ]
      	# TEST: prio (active-backup miimon primary_reselect 2)                [ OK ]
      	# TEST: prio (active-backup arp_ip_target primary_reselect 0)         [ OK ]
      	# TEST: prio (active-backup arp_ip_target primary_reselect 1)         [ OK ]
      	# TEST: prio (active-backup arp_ip_target primary_reselect 2)         [ OK ]
      	#
      	not ok 7 selftests: drivers/net/bonding: bond_options.sh # TIMEOUT 120 seconds
      
      This test includes many sleep statements, at least some of which are
      related to timers in the operation of the bonding driver itself. Increase
      the test timeout to allow the test to complete.
      
      I ran the test in slightly different VMs (including one without HW
      virtualization support) and got runtimes of 13m39.760s, 13m31.238s, and
      13m2.956s. Use a ~1.5x "safety factor" and set the timeout to 1200s.
      
      Fixes: 42a8d4aa ("selftests: bonding: add bonding prio option test")
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Closes: https://lore.kernel.org/netdev/20240116104402.1203850a@kernel.org/#t
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20240118001233.304759-1-bpoirier@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
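
The timeout arithmetic in this commit can be checked with a few lines of Python. The runtimes and the two timeout values are taken from the commit message; the helper name `to_seconds` is hypothetical.

```python
import re

RUNTIMES = ["13m39.760s", "13m31.238s", "13m2.956s"]  # from the commit message
OLD_TIMEOUT = 120    # seconds, the value shown in the failing run
NEW_TIMEOUT = 1200   # seconds, the value chosen by the commit


def to_seconds(runtime: str) -> float:
    """Convert a 'XmY.ZZZs' runtime string into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", runtime)
    if match is None:
        raise ValueError(f"unexpected runtime format: {runtime!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


slowest = max(to_seconds(r) for r in RUNTIMES)
safety_factor = NEW_TIMEOUT / slowest

if __name__ == "__main__":
    print(f"slowest run: {slowest:.2f}s, factor: {safety_factor:.2f}")
```

Every measured run is several times longer than the old 120 second limit, and the new limit leaves a margin of a little under 1.5x over the slowest run, in line with the "~1.5x" figure quoted above.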
  4. 19 Jan, 2024 15 commits
    • net/smc: fix illegal rmb_desc access in SMC-D connection dump · dbc153fd
      Wen Gu authored
      A crash was found when dumping SMC-D connections. It can be reproduced
      by the following steps:
      
      - run nginx/wrk test:
        smc_run nginx
        smc_run wrk -t 16 -c 1000 -d <duration> -H 'Connection: Close' <URL>
      
      - continuously dump SMC-D connections in parallel:
        watch -n 1 'smcss -D'
      
       BUG: kernel NULL pointer dereference, address: 0000000000000030
       CPU: 2 PID: 7204 Comm: smcss Kdump: loaded Tainted: G	E      6.7.0+ #55
       RIP: 0010:__smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
       Call Trace:
        <TASK>
        ? __die+0x24/0x70
        ? page_fault_oops+0x66/0x150
        ? exc_page_fault+0x69/0x140
        ? asm_exc_page_fault+0x26/0x30
        ? __smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
        ? __kmalloc_node_track_caller+0x35d/0x430
        ? __alloc_skb+0x77/0x170
        smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
        smc_diag_dump+0x26/0x60 [smc_diag]
        netlink_dump+0x19f/0x320
        __netlink_dump_start+0x1dc/0x300
        smc_diag_handler_dump+0x6a/0x80 [smc_diag]
        ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
        sock_diag_rcv_msg+0x121/0x140
        ? __pfx_sock_diag_rcv_msg+0x10/0x10
        netlink_rcv_skb+0x5a/0x110
        sock_diag_rcv+0x28/0x40
        netlink_unicast+0x22a/0x330
        netlink_sendmsg+0x1f8/0x420
        __sock_sendmsg+0xb0/0xc0
        ____sys_sendmsg+0x24e/0x300
        ? copy_msghdr_from_user+0x62/0x80
        ___sys_sendmsg+0x7c/0xd0
        ? __do_fault+0x34/0x160
        ? do_read_fault+0x5f/0x100
        ? do_fault+0xb0/0x110
        ? __handle_mm_fault+0x2b0/0x6c0
        __sys_sendmsg+0x4d/0x80
        do_syscall_64+0x69/0x180
        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      
      The connection may still be in the process of being established
      when we dump it. Assume the connection has been registered in a
      link group by smc_conn_create() but its rmb_desc has not yet been
      initialized by smc_buf_create(); the dump then makes an illegal
      access to conn->rmb_desc. Fix this by checking conn->rmb_desc
      before the dump.
      
      Fixes: 4b1b7d3b ("net/smc: add SMC-D diag support")
      Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
      Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
      Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dbc153fd
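
      The race described in the SMC-D commit above can be illustrated
      with a minimal userspace sketch. All types and names here are
      hypothetical stand-ins, not the kernel's structures or the actual
      patch; the point is only that a dumper must tolerate a connection
      that is registered but whose buffer descriptor is not set yet.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mock_buf_desc {
	unsigned int token;
};

struct mock_connection {
	void *lgr;                      /* set once registered in a link group */
	struct mock_buf_desc *rmb_desc; /* set later, when buffers are created */
};

/*
 * Returns 1 and stores the token if the connection can be dumped,
 * or 0 if it is still being established and must be skipped.
 */
static int mock_dump_token(const struct mock_connection *conn,
			   unsigned int *token)
{
	if (!conn->lgr || !conn->rmb_desc)
		return 0; /* no descriptor yet: do not dereference it */

	*token = conn->rmb_desc->token;
	return 1;
}
```

      Without the rmb_desc test, the half-established case would
      dereference a NULL pointer, which is the kind of fault shown in
      the trace.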
    • Linus Torvalds's avatar
      Merge tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 736b5545
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bpf and netfilter.
      
        Previous releases - regressions:
      
         - Revert "net: rtnetlink: Enslave device before bringing it up",
           breaks the case inverse to the one it was trying to fix
      
         - net: dsa: fix oob access in DSA's netdevice event handler, which
           dereferenced netdev_priv() before checking it is a DSA port
      
         - sched: track device in tcf_block_get/put_ext() only for clsact
           binder types
      
         - net: tls, fix WARNING in __sk_msg_free when record becomes full
           during splice and MORE hint set
      
         - sfp-bus: fix SFP mode detect from bitrate
      
         - drv: stmmac: prevent DSA tags from breaking COE
      
        Previous releases - always broken:
      
         - bpf: fix no forward progress in bpf_iter_udp if output buffer is
           too small
      
         - bpf: reject variable offset alu on registers with a type of
           PTR_TO_FLOW_KEYS to prevent oob access
      
         - netfilter: tighten input validation
      
         - net: add more sanity check in virtio_net_hdr_to_skb()
      
         - rxrpc: fix use of Don't Fragment flag on RESPONSE packets, avoid
           infinite loop
      
         - amt: do not use the portion of skb->cb area which may get clobbered
      
         - mptcp: improve validation of the MPTCPOPT_MP_JOIN MPTCP option
      
        Misc:
      
         - spring cleanup of inactive maintainers"
      
      * tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
        i40e: Include types.h to some headers
        ipv6: mcast: fix data-race in ipv6_mc_down / mld_ifc_work
        selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes
        selftests: mlxsw: qos_pfc: Remove wrong description
        mlxsw: spectrum_router: Register netdevice notifier before nexthop
        mlxsw: spectrum_acl_tcam: Fix stack corruption
        mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
        mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
        ethtool: netlink: Add missing ethnl_ops_begin/complete
        selftests: bonding: Add more missing config options
        selftests: netdevsim: add a config file
        libbpf: warn on unexpected __arg_ctx type when rewriting BTF
        selftests/bpf: add tests confirming type logic in kernel for __arg_ctx
        bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
        bpf: extract bpf_ctx_convert_map logic and make it more reusable
        libbpf: feature-detect arg:ctx tag support in kernel
        ipvs: avoid stat macros calls from preemptible context
        netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description
        netfilter: nf_tables: skip dead set elements in netlink dump
        netfilter: nf_tables: do not allow mismatch field size and set key length
        ...
      736b5545
    • Linus Torvalds's avatar
      Merge tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · ed8d8453
      Linus Torvalds authored
      Pull i2c updates from Wolfram Sang:
       "This removes the currently unused CLASS_DDC support (controllers set
        the flag, but there is no client to use it).
      
        Also, CLASS_SPD support gets simplified to prepare removal in the
        future. Class based instantiation is not recommended these days
        anyhow.
      
        Furthermore, I2C core now creates a debugfs directory per I2C adapter.
        Current bus driver users were converted to use it.
      
        Finally, quite some driver updates. Standing out are patches for the
        wmt-driver which is refactored to support more variants.
      
        This is the rebased pull request where a large series for the
        designware driver was dropped"
      
      * tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
        MAINTAINERS: use proper email for my I2C work
        i2c: stm32f7: add support for stm32mp25 soc
        i2c: stm32f7: perform I2C_ISR read once at beginning of event isr
        dt-bindings: i2c: document st,stm32mp25-i2c compatible
        i2c: stm32f7: simplify status messages in case of errors
        i2c: stm32f7: perform most of irq job in threaded handler
        i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq
        i2c: i801: Add lis3lv02d for Dell XPS 15 7590
        i2c: i801: Add lis3lv02d for Dell Precision 3540
        i2c: wmt: Reduce redundant: REG_CR setting
        i2c: wmt: Reduce redundant: function parameter
        i2c: wmt: Reduce redundant: clock mode setting
        i2c: wmt: Reduce redundant: wait event complete
        i2c: wmt: Reduce redundant: bus busy check
        i2c: mux: reg: Remove class-based device auto-detection support
        i2c: make i2c_bus_type const
        dt-bindings: at24: add ROHM BR24G04
        eeprom: at24: use of_match_ptr()
        i2c: cpm: Remove linux,i2c-index conversion from be32
        i2c: imx: Make SDA actually optional for bus recovering
        ...
      ed8d8453
    • Linus Torvalds's avatar
      Merge tag 'rtc-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 378de6df
      Linus Torvalds authored
      Pull RTC updates from Alexandre Belloni:
       "There are three new drivers this cycle. Also the cmos driver is
        getting fixes for longstanding wakeup issues on AMD.
      
        New drivers:
         - Analog Devices MAX31335
         - Nuvoton ma35d1
         - Texas Instruments TPS6594 PMIC RTC
      
        Drivers:
         - cmos: use ACPI alarm instead of HPET on recent AMD platforms
         - nuvoton: add NCT3015Y-R and NCT3018Y-R support
         - rv8803: proper suspend/resume and wakeup-source support"
      
      * tag 'rtc-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (26 commits)
        rtc: nuvoton: Compatible with NCT3015Y-R and NCT3018Y-R
        rtc: da9063: Use dev_err_probe()
        rtc: da9063: Use device_get_match_data()
        rtc: da9063: Make IRQ as optional
        rtc: max31335: Fix comparison in max31335_volatile_reg()
        rtc: max31335: use regmap_update_bits_check
        rtc: max31335: remove unecessary locking
        rtc: max31335: add driver support
        dt-bindings: rtc: max31335: add max31335 bindings
        rtc: rv8803: add wakeup-source support
        rtc: ac100: remove misuses of kernel-doc
        rtc: class: Remove usage of the deprecated ida_simple_xx() API
        rtc: MAINTAINERS: drop Alessandro Zummo
        rtc: ma35d1: remove hardcoded UIE support
        dt-bindings: rtc: qcom-pm8xxx: fix inconsistent example
        rtc: rv8803: Add power management support
        rtc: ds3232: avoid unused-const-variable warning
        rtc: lpc24xx: add missing dependency
        rtc: tps6594: Add driver for TPS6594 RTC
        rtc: Add driver for Nuvoton ma35d1 rtc controller
        ...
      378de6df
    • Linus Torvalds's avatar
      Merge tag 'input-for-v6.8-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 0f289bdd
      Linus Torvalds authored
      Pull input updates from Dmitry Torokhov:
      
       - a new driver for Adafruit Seesaw gamepad device
      
       - Zforce touchscreen will handle standard device properties for axis
         swap/inversion
      
       - handling of advanced sensitivity settings in Microchip CAP11xx
         capacitive sensor driver
      
       - more drivers have been converted to use newer gpiod API
      
       - support for dedicated wakeup IRQs in gpio-keys driver
      
       - support for slider gestures and OTP variants in iqs269a driver
      
       - atkbd will report keyboard version as 0xab83 in cases when GET ID
         command was skipped (to deal with problematic firmware on newer
         laptops), restoring the previous behavior
      
       - other assorted cleanups and changes
      
      * tag 'input-for-v6.8-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (44 commits)
        Input: atkbd - use ab83 as id when skipping the getid command
        Input: driver for Adafruit Seesaw Gamepad
        dt-bindings: input: bindings for Adafruit Seesaw Gamepad
        Input: da9063_onkey - avoid explicitly setting input's parent
        Input: da9063_onkey - avoid using OF-specific APIs
        Input: iqs269a - add support for OTP variants
        dt-bindings: input: iqs269a: Add bindings for OTP variants
        Input: iqs269a - add support for slider gestures
        dt-bindings: input: iqs269a: Add bindings for slider gestures
        Input: gpio-keys - filter gpio_keys -EPROBE_DEFER error messages
        Input: zforce_ts - accept standard touchscreen properties
        dt-bindings: touchscreen: neonode,zforce: Use standard properties
        dt-bindings: touchscreen: convert neonode,zforce to json-schema
        dt-bindings: input: convert drv266x to json-schema
        Input: da9063 - use dev_err_probe()
        Input: da9063 - drop redundant prints in probe()
        Input: da9063 - simplify obtaining OF match data
        Input: as5011 - convert to GPIO descriptor
        Input: omap-keypad - drop optional GPIO support
        Input: tca6416-keypad - drop unused include
        ...
      0f289bdd
    • Linus Torvalds's avatar
      Merge tag 'phy-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · 33a9caa4
      Linus Torvalds authored
      Pull phy updates from Vinod Koul:
       "New Support:
      
         - Qualcomm SM8650 UFS, PCIe and USB/DP Combo PHY, eUSB2 PHY, SDX75
           USB3, X1E80100 USB3 support
      
         - Mediatek MT8195 support
      
         - Rockchip RK3128 usb2 support
      
         - TI SGMII mode for J784S4
      
        Updates:
      
         - Qualcomm v7 register offsets updates
      
         - Mediatek tphy support for force phy mode switch"
      
      * tag 'phy-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (34 commits)
        phy: ti: j721e-wiz: Add SGMII support in WIZ driver for J784S4
        phy: ti: gmii-sel: Enable SGMII mode for J784S4
        phy: qcom-qmp-usb: Add Qualcomm X1E80100 USB3 PHY support
        dt-bindings: phy: qcom,sc8280xp-qmp-usb3-uni: Add X1E80100 USB PHY binding
        phy: qcom-qmp-combo: Add x1e80100 USB/DP combo phys
        dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp-phy: Document X1E80100 compatible
        dt-bindings: phy: qcom: snps-eusb2: Document the X1E80100 compatible
        phy: mediatek: tphy: add support force phy mode switch
        dt-bindings: phy: mediatek: tphy: add a property for force-mode switch
        phy: phy-can-transceiver: insert space after include
        phy: qualcomm: phy-qcom-qmp-ufs: Rectify SM8550 UFS HS-G4 PHY Settings
        dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp-phy: fix path to header
        phy: renesas: phy-rcar-gen2: use select for GENERIC_PHY
        phy: qcom-qmp: qserdes-txrx: Add v7 register offsets
        phy: qcom-qmp: qserdes-txrx: Add V6 N4 register offsets
        phy: qcom-qmp: qserdes-com: Add v7 register offsets
        phy: qcom-qmp: pcs-usb: Add v7 register offsets
        phy: qcom-qmp: pcs: Add v7 register offsets
        phy: qcom-qmp: qserdes-txrx: Add some more v6.20 register offsets
        phy: qcom-qmp: qserdes-com: Add some more v6 register offsets
        ...
      33a9caa4
    • Linus Torvalds's avatar
      Merge tag 'soundwire-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · 4d5d604c
      Linus Torvalds authored
      Pull soundwire updates from Vinod Koul:
      
       - Core: add concept of controller_id to deal with clear Controller /
         Manager hierarchy
      
       - bunch of qcom driver refactoring for qcom_swrm_stream_alloc_ports(),
         qcom_swrm_stream_alloc_ports() and setting controller id to hw master
         id
      
      * tag 'soundwire-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
        soundwire: amd: drop bus freq calculation and set 'max_clk_freq'
        soundwire: generic_bandwidth_allocation use bus->params.max_dr_freq
        soundwire: qcom: set controller id to hw master id
        soundwire: fix initializing sysfs for same devices on different buses
        soundwire: bus: introduce controller_id
        soundwire: stream: constify sdw_port_config when adding devices
        soundwire: qcom: move sconfig in qcom_swrm_stream_alloc_ports() out of critical section
        soundwire: qcom: drop unneeded qcom_swrm_stream_alloc_ports() cleanup
      4d5d604c
    • Linus Torvalds's avatar
      Merge tag 'gpio-fixes-for-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 34551358
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
       "Apart from some regular driver fixes there's a relatively big revert
        of the locking changes that were introduced to GPIOLIB in this merge
        window.
      
        This is because it turned out that some legacy GPIO interfaces - that
        need to translate a number from the global GPIO numberspace to the
        address of the relevant descriptor, thus running a GPIO device lookup
        and taking the GPIO device list lock - are still used in old code from
        atomic context resulting in "scheduling while atomic" errors.
      
        I'll try to make the read-only part of the list access entirely
        lockless using SRCU but this will take some time so let's go back to
        the old global spinlock for now.
      
        Summary:
      
         - revert the changes aiming to use a read-write semaphore to protect
           the list of GPIO devices due to calls to legacy API taking that
           lock from atomic context in old code
      
         - fix inverted logic in DEFINE_FREE() for GPIO device references
      
         - check the return value of bgpio_init() in gpio-mlxbf3
      
         - fix node address in the DT bindings example for gpio-xilinx
      
         - fix signedness bug in gpio-rtd
      
         - fix kernel-doc warnings in gpio-en7523"
      
      * tag 'gpio-fixes-for-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpiolib: revert the attempt to protect the GPIO device list with an rwsem
        gpio: EN7523: fix kernel-doc warnings
        gpiolib: Fix scope-based gpio_device refcounting
        gpio: mlxbf3: add an error code check in mlxbf3_gpio_probe
        dt-bindings: gpio: xilinx: Fix node address in gpio
        gpio: rtd: Fix signedness bug in probe
      34551358
    • Linus Torvalds's avatar
      Merge tag 'pwm/for-6.8-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/ukleinek/linux · 5c935069
      Linus Torvalds authored
      Pull pwm fixes from Uwe Kleine-König:
      
       - fix a duplicate cleanup in an error path introduced in
         this merge window
      
       - fix an out-of-bounds access
      
         In practise it doesn't happen - otherwise someone would have noticed
         since v5.17-rc1 I guess - because the device tree binding for the two
         drivers using of_pwm_single_xlate() only have args->args_count == 1.
      
         A device-tree that doesn't conform to the respective bindings could
         trigger that easily however.
      
       - correct the request callback of the jz4740 pwm driver which used
         dev_err_probe() long after .probe() completed.
      
         This is conceptually wrong because dev_err_probe() might call
         device_set_deferred_probe_reason() which is nonsensical after the
         driver is bound.
      
      * tag 'pwm/for-6.8-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/ukleinek/linux:
        pwm: jz4740: Don't use dev_err_probe() in .request()
        pwm: Fix out-of-bounds access in of_pwm_single_xlate()
        pwm: bcm2835: Remove duplicate call to clk_rate_exclusive_put()
      5c935069
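
      The out-of-bounds item in the pwm summary above comes down to
      reading an argument cell without checking how many cells were
      supplied. A minimal sketch of that guard follows. The struct, the
      flag value, and the cell layout (flags in cell 1) are assumptions
      made for illustration and are not taken from the actual patch.

```c
#include <stdbool.h>

#define MOCK_MAX_ARGS      4
#define MOCK_FLAG_INVERTED 0x1u

/* Hypothetical stand-in for a parsed phandle argument list. */
struct mock_phandle_args {
	int args_count;                   /* number of valid cells */
	unsigned int args[MOCK_MAX_ARGS]; /* cells past args_count are undefined */
};

/* Report the inverted flag only if the flags cell was really supplied. */
static bool mock_wants_inverted(const struct mock_phandle_args *a)
{
	if (a->args_count < 2)
		return false; /* cell 1 is not valid: do not read it */

	return (a->args[1] & MOCK_FLAG_INVERTED) != 0;
}
```

      With args_count == 1, as in the conforming bindings mentioned
      above, the flags cell is never read.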
    • Linus Torvalds's avatar
      Merge tag 'backlight-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · 21c91bb9
      Linus Torvalds authored
      Pull backlight updates from Lee Jones:
       "New Drivers:
         - Add support for Monolithic Power Systems MP3309C WLED Step-up Converter
      
        Fix-ups:
         - Use/convert to new/better APIs/helpers/MACROs instead of
           hand-rolling implementations
         - Device Tree Binding updates
         - Demote non-kerneldoc header comments
         - Improve error handling; return proper error values, simplify, avoid
           duplicates, etc
         - Convert over to the new (kinda) GPIOD API
      
        Bug Fixes:
         - Fix uninitialised local variable"
      
      * tag 'backlight-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
        backlight: hx8357: Convert to agnostic GPIO API
        backlight: ili922x: Add an error code check in ili922x_write()
        backlight: ili922x: Drop kernel-doc for local macros
        backlight: mp3309c: Fix uninitialized local variable
        backlight: pwm_bl: Use dev_err_probe
        backlight: mp3309c: Add support for MPS MP3309C
        dt-bindings: backlight: mp3309c: Remove two required properties
      21c91bb9
    • Linus Torvalds's avatar
      Merge tag 'dma-mapping-6.8-2024-01-18' of git://git.infradead.org/users/hch/dma-mapping · 17e232b6
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
      
       - fix kerneldoc warnings (Randy Dunlap)
      
       - better bounds checking in swiotlb (ZhangPeng)
      
      * tag 'dma-mapping-6.8-2024-01-18' of git://git.infradead.org/users/hch/dma-mapping:
        dma-debug: fix kernel-doc warnings
        swiotlb: check alloc_size before the allocation of a new memory pool
      17e232b6
    • Linus Torvalds's avatar
      Merge tag 'memblock-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock · 77c9622d
      Linus Torvalds authored
      Pull memblock update from Mike Rapoport:
       "Code readability improvement.
      
        Use NUMA_NO_NODE instead of -1 as return value of
        memblock_search_pfn_nid() to improve code readability
        and consistency with the callers of that function"
      
      * tag 'memblock-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
        memblock: Return NUMA_NO_NODE instead of -1 to improve code readability
      77c9622d
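
      The readability point of the memblock change above can be shown
      with a small userspace sketch. The region table and function are
      hypothetical stand-ins for memblock_search_pfn_nid(); the constant
      mirrors the kernel's NUMA_NO_NODE, which is defined as -1.

```c
#include <stddef.h>

/* Named sentinel instead of a bare -1; mirrors the kernel's NUMA_NO_NODE. */
#define MOCK_NUMA_NO_NODE (-1)

/* Hypothetical memory region: PFN range [start_pfn, end_pfn) on node nid. */
struct mock_region {
	unsigned long start_pfn;
	unsigned long end_pfn;
	int nid;
};

/* Return the node owning pfn, or MOCK_NUMA_NO_NODE if no region covers it. */
static int mock_search_pfn_nid(const struct mock_region *regions, size_t count,
			       unsigned long pfn)
{
	for (size_t i = 0; i < count; i++) {
		if (pfn >= regions[i].start_pfn && pfn < regions[i].end_pfn)
			return regions[i].nid;
	}
	return MOCK_NUMA_NO_NODE;
}
```

      A caller can then compare the result against the named constant
      rather than against a bare -1.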
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 0b7359cc
      Linus Torvalds authored
      Pull virtio updates from Michael Tsirkin:
      
       - vdpa/mlx5: support for resumable vqs
      
       - virtio_scsi: mq_poll support
      
       - virtio_pmem: support SHMEM_REGION
      
       - virtio_balloon: stay awake while adjusting balloon
      
       - virtio: support for no-reset virtio PCI PM
      
       - Fixes, cleanups
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vdpa/mlx5: Add mkey leak detection
        vdpa/mlx5: Introduce reference counting to mrs
        vdpa/mlx5: Use vq suspend/resume during .set_map
        vdpa/mlx5: Mark vq state for modification in hw vq
        vdpa/mlx5: Mark vq addrs for modification in hw vq
        vdpa/mlx5: Introduce per vq and device resume
        vdpa/mlx5: Allow modifying multiple vq fields in one modify command
        vdpa/mlx5: Expose resumable vq capability
        vdpa: Block vq property changes in DRIVER_OK
        vdpa: Track device suspended state
        scsi: virtio_scsi: Add mq_poll support
        virtio_pmem: support feature SHMEM_REGION
        virtio_balloon: stay awake while adjusting balloon
        vdpa: Remove usage of the deprecated ida_simple_xx() API
        virtio: Add support for no-reset virtio PCI PM
        virtio_net: fix missing dma unmap for resize
        vhost-vdpa: account iommu allocations
        vdpa: Fix an error handling path in eni_vdpa_probe()
      0b7359cc
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-v6.8-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · da3c45c7
      Linus Torvalds authored
      Pull hwmon fix from Guenter Roeck:
       "Fix crash seen when instantiating npcm750-pwm-fan"
      
      * tag 'hwmon-for-v6.8-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (npcm750-pwm-fan) Fix crash observed when instantiating nuvoton,npcm750-pwm-fan
      da3c45c7
    • Linus Torvalds's avatar
      Merge tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl · db5ccb9e
      Linus Torvalds authored
      Pull CXL (Compute Express Link) updates from Dan Williams:
       "The bulk of this update is support for enumerating the performance
        capabilities of CXL memory targets and connecting that to a platform
        CXL memory QoS class. Some follow-on work remains to hook up this data
        into core-mm policy, but that is saved for v6.9.
      
        The next significant update is unifying how CXL event records (things
        like background scrub errors) are processed between so called
        "firmware first" and native error record retrieval. The CXL driver
        handler that processes the record retrieved from the device mailbox is
        now the handler for that same record format coming from an EFI/ACPI
        notification source.
      
        This also contains miscellaneous feature updates, like Get Timestamp,
        and other fixups.
      
        Summary:
      
         - Add support for parsing the Coherent Device Attribute Table (CDAT)
      
         - Add support for calculating a platform CXL QoS class from CDAT data
      
         - Unify the tracing of EFI CXL Events with native CXL Events.
      
         - Add Get Timestamp support
      
         - Miscellaneous cleanups and fixups"
      
      * tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (41 commits)
        cxl/core: use sysfs_emit() for attr's _show()
        cxl/pci: Register for and process CPER events
        PCI: Introduce cleanup helpers for device reference counts and locks
        acpi/ghes: Process CXL Component Events
        cxl/events: Create a CXL event union
        cxl/events: Separate UUID from event structures
        cxl/events: Remove passing a UUID to known event traces
        cxl/events: Create common event UUID defines
        cxl/events: Promote CXL event structures to a core header
        cxl: Refactor to use __free() for cxl_root allocation in cxl_endpoint_port_probe()
        cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge()
        cxl: Fix device reference leak in cxl_port_perf_data_calculate()
        cxl: Convert find_cxl_root() to return a 'struct cxl_root *'
        cxl: Introduce put_cxl_root() helper
        cxl/port: Fix missing target list lock
        cxl/port: Fix decoder initialization when nr_targets > interleave_ways
        cxl/region: fix x9 interleave typo
        cxl/trace: Pass UUID explicitly to event traces
        cxl/region: use %pap format to print resource_size_t
        cxl/region: Add dev_dbg() detail on failure to allocate HPA space
        ...
      db5ccb9e
  5. 18 Jan, 2024 11 commits
    • Linus Torvalds's avatar
      Merge tag 'vfio-v6.8-rc1' of https://github.com/awilliam/linux-vfio · 244aefb1
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Add debugfs support, initially used for reporting device migration
         state (Longfang Liu)
      
       - Fixes and support for migration dirty tracking across multiple IOVA
         regions in the pds-vfio-pci driver (Brett Creeley)
      
       - Improved IOMMU allocation accounting visibility (Pasha Tatashin)
      
       - Virtio infrastructure and a new virtio-vfio-pci variant driver, which
         provides emulation of legacy virtio interfaces on modern virtio
         hardware for virtio-net VF devices where the PF driver exposes
         support for legacy admin queues, i.e. an emulated IO BAR on an SR-IOV
         VF to provide driver ABI compatibility to legacy devices (Yishai
         Hadas & Feng Liu)
      
       - Migration fixes for the hisi-acc-vfio-pci variant driver (Shameer
         Kolothum)
      
       - Kconfig dependency fix for new virtio-vfio-pci variant driver (Arnd
         Bergmann)
      
      * tag 'vfio-v6.8-rc1' of https://github.com/awilliam/linux-vfio: (22 commits)
        vfio/virtio: fix virtio-pci dependency
        hisi_acc_vfio_pci: Update migration data pointer correctly on saving/resume
        vfio/virtio: Declare virtiovf_pci_aer_reset_done() static
        vfio/virtio: Introduce a vfio driver over virtio devices
        vfio/pci: Expose vfio_pci_core_iowrite/read##size()
        vfio/pci: Expose vfio_pci_core_setup_barmap()
        virtio-pci: Introduce APIs to execute legacy IO admin commands
        virtio-pci: Initialize the supported admin commands
        virtio-pci: Introduce admin commands
        virtio-pci: Introduce admin command sending function
        virtio-pci: Introduce admin virtqueue
        virtio: Define feature bit for administration virtqueue
        vfio/type1: account iommu allocations
        vfio/pds: Add multi-region support
        vfio/pds: Move seq/ack bitmaps into region struct
        vfio/pds: Pass region info to relevant functions
        vfio/pds: Move and rename region specific info
        vfio/pds: Only use a single SGL for both seq and ack
        vfio/pds: Fix calculations in pds_vfio_dirty_sync
        MAINTAINERS: Add vfio debugfs interface doc link
        ...
      244aefb1
    • Linus Torvalds's avatar
      Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd · 86c4d58a
      Linus Torvalds authored
      Pull iommufd updates from Jason Gunthorpe:
       "This brings the first of three planned user IO page table invalidation
        operations:
      
         - IOMMU_HWPT_INVALIDATE allows invalidating the IOTLB integrated into
           the iommu itself. The Intel implementation will also generate an
           ATC invalidation to flush the device IOTLB as it unambiguously
           knows the device, but other HW will not.
      
        It goes along with the prior PR to implement userspace IO page tables
        (aka nested translation for VMs) to allow Intel to have full
        functionality for simple cases. An Intel implementation of the
        operation is provided.
      
        Also fix a small bug in the selftest mock iommu driver probe"
      
      * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
        iommufd/selftest: Check the bus type during probe
        iommu/vt-d: Add iotlb flush for nested domain
        iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
        iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
        iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
        iommufd/selftest: Add mock_domain_cache_invalidate_user support
        iommu: Add iommu_copy_struct_from_user_array helper
        iommufd: Add IOMMU_HWPT_INVALIDATE
        iommu: Add cache_invalidate_user op
      86c4d58a
    • Linus Torvalds's avatar
      Merge tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 0dde2bf6
      Linus Torvalds authored
      Pull iommu updates from Joerg Roedel:
       "Core changes:
         - Fix race conditions in device probe path
         - Retire IOMMU bus_ops
         - Support for passing custom allocators to page table drivers
         - Clean up Kconfig around IOMMU_SVA
         - Support for sharing SVA domains with all devices bound to a mm
         - Firmware data parsing cleanup
         - Tracing improvements for iommu-dma code
         - Some smaller fixes and cleanups
      
        ARM-SMMU drivers:
         - Device-tree binding updates:
            - Add additional compatible strings for Qualcomm SoCs
            - Document Adreno clocks for Qualcomm's SM8350 SoC
         - SMMUv2:
            - Implement support for the ->domain_alloc_paging() callback
            - Ensure Secure context is restored following suspend of Qualcomm
              SMMU implementation
         - SMMUv3:
            - Disable stalling mode for the "quiet" context descriptor
            - Minor refactoring and driver cleanups
      
        Intel VT-d driver:
         - Cleanup and refactoring
      
        AMD IOMMU driver:
         - Improve IO TLB invalidation logic
         - Small cleanups and improvements
      
        Rockchip IOMMU driver:
         - DT binding update to add Rockchip RK3588
      
        Apple DART driver:
         - Apple M1 USB4/Thunderbolt DART support
         - Cleanups
      
        Virtio IOMMU driver:
         - Add support for iotlb_sync_map
         - Enable deferred IO TLB flushes"
      
      * tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
        iommu: Don't reserve 0-length IOVA region
        iommu/vt-d: Move inline helpers to header files
        iommu/vt-d: Remove unused vcmd interfaces
        iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through()
        iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly
        iommu/sva: Fix memory leak in iommu_sva_bind_device()
        dt-bindings: iommu: rockchip: Add Rockchip RK3588
        iommu/dma: Trace bounce buffer usage when mapping buffers
        iommu/arm-smmu: Convert to domain_alloc_paging()
        iommu/arm-smmu: Pass arm_smmu_domain to internal functions
        iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED
        iommu/arm-smmu: Convert to a global static identity domain
        iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()
        iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
        iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
        iommu/arm-smmu-v3: Add a type for the STE
        iommu/arm-smmu-v3: disable stall for quiet_cd
        iommu/qcom: restore IOMMU state if needed
        iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
        iommu/arm-smmu-qcom: Add missing GMU entry to match table
        ...
      0dde2bf6
    • Linus Torvalds's avatar
      Merge tag 'percpu-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu · e7ded275
      Linus Torvalds authored
      Pull percpu updates from Dennis Zhou:
       "Enable percpu page allocator for RISC-V.
      
        There are RISC-V configurations with sparse NUMA configurations and
        small vmalloc space causing dynamic percpu allocations to fail as the
        backing chunk stride is too far apart"
      
      * tag 'percpu-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
        riscv: Enable pcpu page first chunk allocator
        mm: Introduce flush_cache_vmap_early()
      e7ded275
    • Linus Torvalds's avatar
      Merge tag 'eventfs-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 24f3a63e
      Linus Torvalds authored
      Pull eventfs updates from Steven Rostedt:
      
       - Remove "lookup" parameter of create_dir_dentry() and
         create_file_dentry(). These functions were called by lookup and the
         readdir logic, where readdir needed it to up the ref count of the
         dentry but the lookup did not. A "lookup" parameter was passed in to
         tell it what to do, but this complicated the code. It is better to
         just always up the ref count and require the caller to decrement it,
         even for lookup.
      
       - Modify the .iterate_shared callback to not use the dcache_readdir()
         logic and just handle what gets displayed by that one function. This
         removes the need for eventfs to hijack the file->private_data from
         the dcache_readdir() "cursor" pointer, and makes the code a bit more
         sane
      
       - Use the root and instance inodes for default ownership. Instead of
         walking the dentry tree and updating each dentry gid, use the
         getattr(), setattr() and permission() callbacks to set the ownership
         and permissions using the root or instance as the default
      
       - Some other optimizations with the eventfs iterate_shared logic
      
       - Hard-code the inodes for eventfs to the same number for files, and
         the same number for directories
      
       - Have getdent() not create dentries/inodes in iterate_shared() as now
         it has hard-coded inode numbers
      
       - Use kcalloc() instead of kzalloc() on a list of elements
      
       - Fix seq_buf warning and make static work properly.
      
      * tag 'eventfs-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        seq_buf: Make DECLARE_SEQ_BUF() usable
        eventfs: Use kcalloc() instead of kzalloc()
        eventfs: Do not create dentries nor inodes in iterate_shared
        eventfs: Have the inodes all for files and directories all be the same
        eventfs: Shortcut eventfs_iterate() by skipping entries already read
        eventfs: Read ei->entries before ei->children in eventfs_iterate()
        eventfs: Do ctx->pos update for all iterations in eventfs_iterate()
        eventfs: Have eventfs_iterate() stop immediately if ei->is_freed is set
        tracefs/eventfs: Use root and instance inodes as default ownership
        eventfs: Stop using dcache_readdir() for getdents()
        eventfs: Remove "lookup" parameter from create_dir/file_dentry()
      24f3a63e
    • Linus Torvalds's avatar
      Merge tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · a2ded784
      Linus Torvalds authored
      Pull tracing updates from Steven Rostedt:
      
       - Allow kernel trace instance creation to specify what events are
         created
      
         Inside the kernel, a subsystem may create a tracing instance that it
         can use to send events to user space. This sub-system may not care
         about the thousands of events that exist in eventfs. Allow the
         sub-system to specify what sub-systems of events it cares about, and
         only those events are exposed to this instance.
      
       - Allow the ring buffer to be broken up into bigger sub-buffers than
         just the architecture page size.
      
         A new tracefs file called "buffer_subbuf_size_kb" is created. The
         user can now specify a minimum size the sub-buffer may be in
         kilobytes. Note that the implementation currently makes the
         sub-buffer size a power of 2 pages (1, 2, 4, 8, 16, ...) but the user
         only writes in kilobyte size, and the sub-buffer will be updated to
         the next size that can accommodate it. If the user writes in
         10, it will change the size to be 4 pages on x86 (16K), as that is
         the next available size that can hold 10K.
      
       - Update the debug output when a corrupt time is detected in the ring
         buffer. If the ring buffer detects inconsistent timestamps, there's a
         debug config option that will dump the contents of the meta data of
         the sub-buffer that is used for debugging. Add some more information
         to this dump that helps with debugging.
      
       - Add more timestamp debugging checks (only triggers when the config is
         enabled)
      
       - Increase the trace_seq iterator to 2 page sizes.
      
       - Allow strings written into trace_marker to be larger. Up to just
         under 2 page sizes (based on what trace_seq can hold).
      
       - Increase the trace_marker_raw write to be as big as a sub-buffer can
         hold.
      
       - Remove 32 bit time stamp logic, now that the rb_time_cmpxchg() has
         been removed.
      
       - More selftests were added.
      
       - Some code clean ups as well.
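
The rounding described for "buffer_subbuf_size_kb" above can be sketched as follows. This is a minimal illustration, not the kernel's code: the function name and the 4096-byte default page size are assumptions, and it ignores any per-sub-buffer metadata the real implementation may account for.

```python
def subbuf_pages_for_kb(requested_kb: int, page_size: int = 4096) -> int:
    """Return the smallest power-of-two page count that holds requested_kb.

    Hypothetical helper: models only the "power of 2 pages" rounding
    described in the pull message, nothing else.
    """
    if requested_kb <= 0:
        raise ValueError("requested size must be positive")
    needed_bytes = requested_kb * 1024
    pages = 1
    # Double the page count until the sub-buffer is large enough.
    while pages * page_size < needed_bytes:
        pages *= 2
    return pages


if __name__ == "__main__":
    pages = subbuf_pages_for_kb(10)
    print(f"10 KB request -> {pages} pages ({pages * 4096 // 1024}K)")
```

With 4K pages, a request of 10 lands on 4 pages (16K), matching the x86 example in the message; with a larger page size a single page may already be enough.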
      
      * tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (29 commits)
        ring-buffer: Remove stale comment from ring_buffer_size()
        tracing histograms: Simplify parse_actions() function
        tracing/selftests: Remove exec permissions from trace_marker.tc test
        ring-buffer: Use subbuf_order for buffer page masking
        tracing: Update subbuffer with kilobytes not page order
        ringbuffer/selftest: Add basic selftest to test changing subbuf order
        ring-buffer: Add documentation on the buffer_subbuf_order file
        ring-buffer: Just update the subbuffers when changing their allocation order
        ring-buffer: Keep the same size when updating the order
        tracing: Stop the tracing while changing the ring buffer subbuf size
        tracing: Update snapshot order along with main buffer order
        ring-buffer: Make sure the spare sub buffer used for reads has same size
        ring-buffer: Do no swap cpu buffers if order is different
        ring-buffer: Clear pages on error in ring_buffer_subbuf_order_set() failure
        ring-buffer: Read and write to ring buffers with custom sub buffer size
        ring-buffer: Set new size of the ring buffer sub page
        ring-buffer: Add interface for configuring trace sub buffer size
        ring-buffer: Page size per ring buffer
        ring-buffer: Have ring_buffer_print_page_header() be able to access ring_buffer_iter
        ring-buffer: Check if absolute timestamp goes backwards
        ...
      a2ded784
    • Linus Torvalds's avatar
      Merge tag 'probes-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 5b890ad4
      Linus Torvalds authored
      Pull probes update from Masami Hiramatsu:
      
       - Update the Kprobes trace event to show the actual function name in
         notrace-symbol warning.
      
         Instead of using the user specified symbol name, use "%ps" printk
         format to show the actual symbol at the probe address. Since a kprobe
         event accepts an offset from a symbol that is bigger than the symbol
         size, the user specified symbol may not be the actual probed symbol.
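
Why the user specified symbol can differ from the probed one is easy to see with a toy symbol table. The sketch below is an illustration only: the addresses, the function names and the helpers `resolve()` and `probe_target()` are all hypothetical, and the lookup merely mimics what resolving an address to its containing symbol does.

```python
import bisect

# Hypothetical symbol table: (start address, name), sorted by address.
SYMBOLS = [(0x1000, "func_a"), (0x1040, "func_b"), (0x10C0, "func_c")]


def resolve(addr: int) -> str:
    """Return the name of the symbol whose range contains addr."""
    starts = [start for start, _ in SYMBOLS]
    index = bisect.bisect_right(starts, addr) - 1
    if index < 0:
        raise LookupError(f"no symbol at or below {addr:#x}")
    return SYMBOLS[index][1]


def probe_target(symbol: str, offset: int) -> str:
    """Resolve "symbol+offset" to the symbol that actually holds the address."""
    base = {name: start for start, name in SYMBOLS}[symbol]
    return resolve(base + offset)


if __name__ == "__main__":
    # func_a is 0x40 bytes long here, so func_a+0x50 falls inside func_b.
    print(probe_target("func_a", 0x10))
    print(probe_target("func_a", 0x50))
```

A warning that prints the name the user typed would blame func_a in the second case, while the address really belongs to func_b.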
      
      * tag 'probes-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        trace/kprobe: Display the actual notrace function when rejecting a probe
      5b890ad4
    • Linus Torvalds's avatar
      Merge tag 's390-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 302d1858
      Linus Torvalds authored
      Pull more s390 updates from Alexander Gordeev:
      
       - do not enable by default the support of 31-bit Enterprise Systems
         Architecture (ESA) ELF binaries
      
       - drop automatic CONFIG_KEXEC selection, and set CONFIG_KEXEC=y
         explicitly for defconfig and debug_defconfig only
      
       - fix zpci_get_max_io_size() to allow PCI block stores where normal PCI
         stores were used otherwise
      
       - remove unneeded tsk variable in do_exception() fault handler
      
       - __load_fpu_regs() is only called from the core kernel code.
         Therefore, remove the unneeded EXPORT_SYMBOL.
      
       - remove leftover comment from s390_fpregs_set() callback
      
       - a few cleanups to Processor Activity Instrumentation (PAI) code (which
         perf framework is based on)
      
       - replace Wenjia Zhang with Thorsten Winkler as s390 Inter-User
         Communication Vehicle (IUCV) networking maintainer
      
       - Fix all scenarios where queues previously removed from a guest's
         Adjunct-Processor (AP) configuration do not re-appear in a reset
         state when they are subsequently made available to a guest again
      
      * tag 's390-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/vfio-ap: do not reset queue removed from host config
        s390/vfio-ap: reset queues associated with adapter for queue unbound from driver
        s390/vfio-ap: reset queues filtered from the guest's AP config
        s390/vfio-ap: let on_scan_complete() callback filter matrix and update guest's APCB
        s390/vfio-ap: loop over the shadow APCB when filtering guest's AP configuration
        s390/vfio-ap: always filter entire AP matrix
        s390/net: add Thorsten Winkler as maintainer
        s390/pai_ext: split function paiext_push_sample
        s390/pai_ext: rework function paiext_copy argments
        s390/pai: rework paiXXX_start and paiXXX_stop functions
        s390/pai_crypto: split function paicrypt_push_sample
        s390/pai: rework paixxxx_getctr interface
        s390/ptrace: remove leftover comment
        s390/fpu: remove __load_fpu_regs() export
        s390/mm,fault: remove not needed tsk variable
        s390/pci: fix max size calculation in zpci_memcpy_toio()
        s390/kexec: do not automatically select KEXEC option
        s390/compat: change default for CONFIG_COMPAT to "n"
      302d1858
    • Linus Torvalds's avatar
      Merge tag 'x86_tdx_for_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b4442cad
      Linus Torvalds authored
      Pull x86 TDX updates from Dave Hansen:
       "This contains the initial host-side TDX support so that
        KVM can run TDX-protected guests. This does not include the actual
        KVM-side support which will come from the KVM folks. The TDX host
        interactions with kexec also need to be ironed out before this is
        ready for prime time, so this code is currently Kconfig'd off when
        kexec is on.
      
        The majority of the code here is the kernel telling the TDX module
        which memory to protect and handing some additional memory over to it
        to use to store TDX module metadata. That sounds pretty simple, but
        the TDX architecture is rather flexible and it takes quite a bit of
        back-and-forth to say, "just protect all memory, please."
      
        There is also some code tacked on near the end of the series to handle
        a hardware erratum. The erratum can make software bugs such as a
        kernel write to TDX-protected memory cause a machine check and
        masquerade as a real hardware failure. The erratum handling watches
        out for these and tries to provide nicer user errors"
      
      * tag 'x86_tdx_for_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
        x86/virt/tdx: Make TDX host depend on X86_MCE
        x86/virt/tdx: Disable TDX host support when kexec is enabled
        Documentation/x86: Add documentation for TDX host support
        x86/mce: Differentiate real hardware #MCs from TDX erratum ones
        x86/cpu: Detect TDX partial write machine check erratum
        x86/virt/tdx: Handle TDX interaction with sleep and hibernation
        x86/virt/tdx: Initialize all TDMRs
        x86/virt/tdx: Configure global KeyID on all packages
        x86/virt/tdx: Configure TDX module with the TDMRs and global KeyID
        x86/virt/tdx: Designate reserved areas for all TDMRs
        x86/virt/tdx: Allocate and set up PAMTs for TDMRs
        x86/virt/tdx: Fill out TDMRs to cover all TDX memory regions
        x86/virt/tdx: Add placeholder to construct TDMRs to cover all TDX memory regions
        x86/virt/tdx: Get module global metadata for module initialization
        x86/virt/tdx: Use all system memory when initializing TDX module as TDX memory
        x86/virt/tdx: Add skeleton to enable TDX on demand
        x86/virt/tdx: Add SEAMCALL error printing for module initialization
        x86/virt/tdx: Handle SEAMCALL no entropy error in common code
        x86/virt/tdx: Make INTEL_TDX_HOST depend on X86_X2APIC
        x86/virt/tdx: Define TDX supported page sizes as macros
        ...
      b4442cad
    • Linus Torvalds's avatar
      Merge tag 'x86_sgx_for_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ba7dd857
      Linus Torvalds authored
      Pull x86 SGX updates from Dave Hansen:
       "This time, these are entirely confined to SGX selftests fixes.
      
        The mini SGX enclave built by the selftests has garnered some
        attention because it stands alone and does not need the sizable
        infrastructure of the official SGX SDK. I think that's why folks are
        suddenly interested in cleaning it up.
      
         - Clean up selftest compilation issues, mostly from non-gcc compilers
      
         - Avoid building selftests when not on x86"
      
      * tag 'x86_sgx_for_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        selftests/sgx: Skip non X86_64 platform
        selftests/sgx: Remove incomplete ABI sanitization code in test enclave
        selftests/sgx: Discard unsupported ELF sections
        selftests/sgx: Ensure expected location of test enclave buffer
        selftests/sgx: Ensure test enclave buffer is entirely preserved
        selftests/sgx: Fix linker script asserts
        selftests/sgx: Handle relocations in test enclave
        selftests/sgx: Produce static-pie executable for test enclave
        selftests/sgx: Remove redundant enclave base address save/restore
        selftests/sgx: Specify freestanding environment for enclave compilation
        selftests/sgx: Separate linker options
        selftests/sgx: Include memory clobber for inline asm in test enclave
        selftests/sgx: Fix uninitialized pointer dereferences in encl_get_entry
        selftests/sgx: Fix uninitialized pointer dereference in error path
      ba7dd857
    • Jakub Kicinski's avatar
      Merge tag 'nf-24-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 925781a4
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following batch contains Netfilter fixes for net. Slightly larger
      than usual because this batch includes several patches to tighten the
      nf_tables control plane to reject inconsistent configuration:
      
      1) Restrict NFTA_SET_POLICY to NFT_SET_POL_PERFORMANCE and
         NFT_SET_POL_MEMORY.
      
      2) Bail out if a nf_tables expression registers more than 16 netlink
         attributes which is what struct nft_expr_info allows.
      
      3) Bail out if NFT_EXPR_STATEFUL provides no .clone interface, remove
         existing fallback to memcpy() when cloning which might accidentally
         duplicate memory reference to the same object.
      
      4) Fix br_netfilter interaction with neighbour layer. This requires
         three preparation patches:
      
         - Use nf_bridge_get_physinif() in nfnetlink_log
         - Use nf_bridge_info_exists() to check if br_netfilter context
           is available in nf_queue.
         - Pass net to nf_bridge_get_physindev()
      
         And finally, the fix which replaces physindev with physinif
         in nf_bridge_info.
      
         Patches from Pavel Tikhomirov.
      
      5) Catch-all deactivation happens in the transaction, hence this
         oneliner to check for the next generation. This bug was uncovered after
         the removal of the _BUSY bit, which happened in set elements back in
         summer 2023.
      
      6) Ensure set (total) key length size and concat field length description
         is consistent, otherwise bail out.
      
      7) Skip set element with the _DEAD flag on from the netlink dump path.
         A test occasionally shows that the dump is mismatching because GC might
         lose race to get rid of this element while a netlink dump is in
         progress.
      
      8) Reject NFT_SET_CONCAT for field_count < 1.
      
      9) Use IP6_INC_STATS in ipvs to fix preemption BUG splat, patch
         from Fedor Pchelkin.
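
Items 6 and 8 above amount to a consistency check on the set description. The following is a rough sketch under stated assumptions, not the nf_tables implementation: `check_set_description()` and its parameters are hypothetical names, and lengths are plain byte counts with no register rounding.

```python
def check_set_description(key_len: int, field_lens: list[int], concat: bool) -> None:
    """Raise ValueError when the description is inconsistent.

    Hypothetical model of two rules from the list above:
    - a concatenated set needs at least one described field (item 8)
    - described field lengths must add up to the total key length (item 6)
    """
    if concat and len(field_lens) < 1:
        raise ValueError("concatenation requested without a field description")
    if field_lens and sum(field_lens) != key_len:
        raise ValueError(
            f"field lengths sum to {sum(field_lens)}, key length is {key_len}"
        )


if __name__ == "__main__":
    check_set_description(8, [4, 4], concat=True)  # consistent, accepted
    try:
        check_set_description(8, [4, 2], concat=True)
    except ValueError as err:
        print("rejected:", err)
```

Rejecting such descriptions up front means later code never has to cope with a key whose fields do not cover it.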
      
      * tag 'nf-24-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        ipvs: avoid stat macros calls from preemptible context
        netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description
        netfilter: nf_tables: skip dead set elements in netlink dump
        netfilter: nf_tables: do not allow mismatch field size and set key length
        netfilter: nf_tables: check if catch-all set element is active in next generation
        netfilter: bridge: replace physindev with physinif in nf_bridge_info
        netfilter: propagate net to nf_bridge_get_physindev
        netfilter: nf_queue: remove excess nf_bridge variable
        netfilter: nfnetlink_log: use proper helper for fetching physinif
        netfilter: nft_limit: do not ignore unsupported flags
        netfilter: nf_tables: bail out if stateful expression provides no .clone
        netfilter: nf_tables: validate .maxattr at expression registration
        netfilter: nf_tables: reject invalid set policy
      ====================
      
      Link: https://lore.kernel.org/r/20240118161726.14838-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      925781a4