1. 04 Oct, 2023 11 commits
    • netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure · 08738827
      Florian Westphal authored
      nft_rbtree_gc_elem() walks back and removes the end interval element that
      comes before the expired element.
      
      There is a small chance that we've cached this element as 'rbe_ge'.
      If this happens, we hold and test a pointer that has been queued for
      freeing.
      
      It also causes spurious insertion failures:
      
      $ cat test-testcases-sets-0044interval_overlap_0.1/testout.log
      Error: Could not process rule: File exists
      add element t s {  0 -  2 }
                         ^^^^^^
      Failed to insert  0 -  2 given:
      table ip t {
              set s {
                      type inet_service
                      flags interval,timeout
                      timeout 2s
                      gc-interval 2s
              }
      }
      
      The set (rbtree) is empty. The 'failure' doesn't happen on next attempt.
      
      Reason is that when we try to insert, the tree may hold an expired
      element that collides with the range we're adding.
      While we do evict/erase this element, we can trip over this check:
      
      if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new))
            return -ENOTEMPTY;
      
      rbe_ge was erased by the synchronous gc, so this check should not have
      been done.  The next attempt won't find it, so a retry results in
      successful insertion.
      
      Restart in-kernel to avoid such spurious errors.
      
      Such restarts are rare, unless userspace intentionally adds very large
      numbers of elements with very short timeouts while setting a huge
      gc interval.
      
      Even in this case, this cannot loop forever: on each retry an existing
      element has been removed.
      
      As the caller is holding the transaction mutex, it's impossible
      for a second entity to add more expiring elements to the tree.
      
      After this it also becomes feasible to remove the async gc worker
      and perform all garbage collection from the commit path.
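
The restart described above can be sketched in plain C. This is a toy model, not the kernel code: a flat array stands in for the rbtree, and every `toy_*` name is hypothetical. It only illustrates the idea that a pass which evicts an expired element returns -EAGAIN and is retried in place, so the caller never sees the spurious error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy set: a flat array stands in for the rbtree. */
struct toy_elem {
	unsigned int start, end;	/* inclusive range */
	bool expired;			/* timeout already elapsed */
	bool in_use;
};

#define TOY_SET_SIZE 8

struct toy_set {
	struct toy_elem elems[TOY_SET_SIZE];
};

/*
 * One insertion pass. If an expired element overlaps the new range it is
 * evicted and -EAGAIN is returned, because state gathered so far may
 * refer to the element that was just removed.
 */
int toy_insert_once(struct toy_set *set, unsigned int start, unsigned int end)
{
	struct toy_elem *free_slot = NULL;
	size_t i;

	for (i = 0; i < TOY_SET_SIZE; i++) {
		struct toy_elem *e = &set->elems[i];

		if (!e->in_use) {
			if (!free_slot)
				free_slot = e;
			continue;
		}
		if (e->end < start || e->start > end)
			continue;		/* no overlap */
		if (e->expired) {
			e->in_use = false;	/* synchronous gc */
			return -EAGAIN;		/* restart the walk */
		}
		return -EEXIST;			/* live overlap: a real failure */
	}
	if (!free_slot)
		return -ENOSPC;
	*free_slot = (struct toy_elem){ start, end, false, true };
	return 0;
}

/* Retry in place instead of reporting a spurious error to the caller. */
int toy_insert(struct toy_set *set, unsigned int start, unsigned int end)
{
	int err;

	do {
		err = toy_insert_once(set, start, end);
	} while (err == -EAGAIN);	/* each retry removed one element */

	return err;
}
```

Because every -EAGAIN pass removes one element and nothing else can add elements meanwhile, the loop is bounded by the number of expired elements in the set.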
      
      Fixes: c9e6978e ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection")
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • netfilter: nf_tables: Deduplicate nft_register_obj audit logs · 0d880dc6
      Phil Sutter authored
      When adding/updating an object, the transaction handler already emits
      suitable audit log entries, so the one in nft_obj_notify() is redundant.
      To fix that (and retain the audit logging from objects' 'update'
      callback), introduce an "audit log free" variant for internal use.
      
      Fixes: c520292f ("audit: log nftables configuration change events once per table")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
      Acked-by: Paul Moore <paul@paul-moore.com> (Audit)
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • selftests: netfilter: Extend nft_audit.sh · 203bb9d3
      Phil Sutter authored
      Add tests for sets and elements and deletion of all kinds. Also
      reorder rule reset tests: By moving the bulk rule add command up, the
      two 'reset rules' tests become identical.
      
      While at it, fix a failing bulk rule add test's error status getting
      lost due to its use in a pipe. Avoid this by using a temporary file.
      
      Headings in diff output for failing tests contain no useful data, strip
      them.
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • selftests: netfilter: test for sctp collision processing in nf_conntrack · cf791b22
      Xin Long authored
      This patch adds a test case to reproduce the SCTP DATA chunk retransmission
      timeout issue caused by the improper SCTP collision processing in netfilter
      nf_conntrack_proto_sctp.
      
      In this test, the client sends an INIT chunk, but the INIT_ACK reply from
      the server is delayed until the server sends an INIT chunk to start a new
      connection from its side. After the connection is complete from the
      server side, the delayed INIT_ACK arrives in nf_conntrack_proto_sctp.
      
      The delayed INIT_ACK should be dropped in nf_conntrack_proto_sctp instead
      of updating the vtag with the out-of-date init_tag; otherwise, the vtag
      in DATA chunks later sent by the client doesn't match the vtag in the
      conntrack entry and the DATA chunks get dropped.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp · 8e56b063
      Xin Long authored
      In Scenario A and B below, as the delayed INIT_ACK always changes the peer
      vtag, SCTP ct with the incorrect vtag may cause packet loss.
      
      Scenario A: INIT_ACK is delayed until the peer receives its own INIT_ACK
      
        192.168.1.2 > 192.168.1.1: [INIT] [init tag: 1328086772]
          192.168.1.1 > 192.168.1.2: [INIT] [init tag: 1414468151]
          192.168.1.2 > 192.168.1.1: [INIT ACK] [init tag: 1328086772]
        192.168.1.1 > 192.168.1.2: [INIT ACK] [init tag: 1650211246] *
        192.168.1.2 > 192.168.1.1: [COOKIE ECHO]
          192.168.1.1 > 192.168.1.2: [COOKIE ECHO]
          192.168.1.2 > 192.168.1.1: [COOKIE ACK]
      
      Scenario B: INIT_ACK is delayed until the peer completes its own handshake
      
        192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
          192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
          192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
          192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
          192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
        192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] *
      
      This patch fixes it as below:
      
      In SCTP_CID_INIT processing:
      - clear ct->proto.sctp.init[!dir] if ct->proto.sctp.init[dir] &&
        ct->proto.sctp.init[!dir]. (Scenario E)
      - set ct->proto.sctp.init[dir].
      
      In SCTP_CID_INIT_ACK processing:
      - drop it if !ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] &&
        ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario B, Scenario C)
      - drop it if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] &&
        ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario A)
      
      In SCTP_CID_COOKIE_ACK processing:
      - clear ct->proto.sctp.init[dir] and ct->proto.sctp.init[!dir].
        (Scenario D)
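
The two INIT_ACK drop rules listed above can be written as one predicate. This is a minimal sketch, assuming a trimmed-down state struct. `toy_sctp_ct` and `toy_init_ack_is_stale` are hypothetical names, and only the `init[]` and `vtag[]` fields from the text are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down copy of the per-connection SCTP state. */
struct toy_sctp_ct {
	bool init[2];		/* INIT seen from this direction */
	uint32_t vtag[2];	/* expected verification tag per direction */
};

/* Returns true when an INIT_ACK travelling in direction 'dir' is stale. */
bool toy_init_ack_is_stale(const struct toy_sctp_ct *ct, int dir,
			   uint32_t init_tag)
{
	bool tag_differs = ct->vtag[!dir] != init_tag;

	/*
	 * Rule tagged "Scenario B, Scenario C" in the text: no INIT is
	 * pending from the other direction. A zero vtag (fresh entry
	 * created by this INIT_ACK) is let through.
	 */
	if (!ct->init[!dir] && ct->vtag[!dir] && tag_differs)
		return true;

	/* Rule tagged "Scenario A": both directions sent an INIT. */
	if (ct->init[dir] && ct->init[!dir] && tag_differs)
		return true;

	return false;
}
```

In an ordinary handshake only `init[!dir]` is set when the INIT_ACK arrives, so neither rule fires and the packet passes.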
      
      Also, it's important to allow the ct state to move forward with cookie_echo
      and cookie_ack from the opposite dir for the collision scenarios.
      
      There are also other Scenarios where it should allow the packet through,
      addressed by the processing above:
      
      Scenario C: new CT is created by INIT_ACK.
      
      Scenario D: start INIT on the existing ESTABLISHED ct.
      
      Scenario E: start INIT after the old collision on the existing ESTABLISHED
      ct.
      
        192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
        192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
        (both sides are stopped, then start a new connection again hours later)
        192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 242308742]
      
      Fixes: 9fb9cbb1 ("[NETFILTER]: Add nf_conntrack subsystem.")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • netfilter: nft_payload: rebuild vlan header on h_proto access · af84f9e4
      Florian Westphal authored
      nft can perform merging of adjacent payload requests.
      This means that:
      
      ether saddr 00:11 ... ether type 8021ad ...
      
      is a single payload expression, for 8 bytes, starting at the
      ethernet source offset.
      
      Check that offset+length is fully within the source/destination mac
      addresses.
      
      This bug prevents 'ether type' from matching the correct h_proto in case
      vlan tag got stripped.
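
The boundary test can be illustrated with a small helper. This is a sketch under assumptions: `toy_ethhdr` is a local stand-in for the kernel's Ethernet header layout and `toy_need_vlan_rebuild` is a hypothetical name, not the function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's struct ethhdr (14 bytes, no padding). */
struct toy_ethhdr {
	uint8_t  h_dest[6];
	uint8_t  h_source[6];
	uint16_t h_proto;
};

/*
 * A merged payload load needs the rebuilt VLAN header as soon as it reads
 * past the two MAC addresses, i.e. when offset + len crosses into h_proto.
 */
bool toy_need_vlan_rebuild(unsigned int offset, unsigned int len)
{
	return offset + len > offsetof(struct toy_ethhdr, h_proto);
}
```

The merged "ether saddr ... ether type" load from the text starts at offset 6 and covers 8 bytes, so it ends at byte 14 and crosses the 12-byte boundary.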
      
      Fixes: de6843be ("netfilter: nft_payload: rebuild vlan header when needed")
      Reported-by: David Ward <david.ward@ll.mit.edu>
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • ibmveth: Remove condition to recompute TCP header checksum. · 51e7a666
      David Wilder authored
      In some OVS environments the TCP pseudo header checksum may need to be
      recomputed. Currently this is only done when the interface instance is
      configured for "Trunk Mode". We found the issue also occurs in some
      Kubernetes environments. These environments do not use "Trunk Mode",
      therefore the condition is removed.
      
      Performance tests with this change show only a fractional decrease in
      throughput (< 0.2%).
      
      Fixes: 7525de25 ("ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.")
      Signed-off-by: David Wilder <dwilder@us.ibm.com>
      Reviewed-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dmaengine: ti: k3-udma-glue: clean up k3_udma_glue_tx_get_irq() return · f9a1d321
      Dan Carpenter authored
      The k3_udma_glue_tx_get_irq() function currently returns negative error
      codes or zero on error, and positive values for success.  This
      complicates life for the callers who need to propagate the error code.
      Also GCC will not warn about unsigned comparisons when you check:
      
      	if (unsigned_irq <= 0)
      
      All the callers have been fixed now but let's just make this easy going
      forward.
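
The pitfall with the unsigned comparison can be shown in a few lines. This is an illustrative sketch with hypothetical `toy_*` names; the getter models the cleaned-up convention where every failure is a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical getter that returns a negative errno on failure. */
int toy_get_irq(bool fail)
{
	return fail ? -ENODEV : 42;
}

/* Buggy pattern: the result is stored in an unsigned variable first. */
bool toy_unsigned_check_sees_error(bool fail)
{
	unsigned int irq = toy_get_irq(fail);	/* -ENODEV wraps to a huge value */

	return irq <= 0;	/* only true when irq == 0 */
}

/* Safer pattern: keep the value signed until it has been checked. */
bool toy_signed_check_sees_error(bool fail)
{
	int ret = toy_get_irq(fail);

	return ret < 0;
}
```

With a failing getter the unsigned check reports no error, while the signed check catches it.
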
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Roger Quadros <rogerq@kernel.org>
      Acked-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ti: icssg-prueth: Fix signedness bug in prueth_init_tx_chns() · a325f174
      Dan Carpenter authored
      The "tx_chn->irq" variable is unsigned so the error checking does not
      work correctly.
      
      Fixes: 128d5874 ("net: ti: icssg-prueth: Add ICSSG ethernet driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Roger Quadros <rogerq@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns() · 37d4f555
      Dan Carpenter authored
      This accidentally returns success, but it should return a negative error
      code.
      
      Fixes: 93a76530 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Roger Quadros <rogerq@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vringh: don't use vringh_kiov_advance() in vringh_iov_xfer() · 7aed44ba
      Stefano Garzarella authored
      In the while loop of vringh_iov_xfer(), `partlen` could be 0 if one of
      the `iov` has 0 length.
      In this case, we should skip the iov and go to the next one.
      But calling vringh_kiov_advance() with 0 length does not cause the
      advancement, since it returns immediately if asked to advance by 0 bytes.
      
      Let's restore the code that was there before commit b8c06ad4
      ("vringh: implement vringh_kiov_advance()"), avoiding using
      vringh_kiov_advance().
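
The difference between the loop-based helper and the open-coded step can be sketched with a toy cursor. All `toy_*` names are hypothetical and the struct is far simpler than the real vringh iterator. It only shows why a zero-byte advance leaves a zero-length element in place.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_IOV 4

/* Hypothetical cursor over a list of buffer lengths. */
struct toy_kiov {
	size_t len[TOY_MAX_IOV];	/* bytes left in each element */
	size_t i;			/* current element */
	size_t used;			/* number of valid elements */
};

/* Loop-based helper: does nothing at all when asked to advance by 0. */
void toy_advance(struct toy_kiov *iov, size_t n)
{
	while (n && iov->i < iov->used) {
		size_t part = iov->len[iov->i] < n ? iov->len[iov->i] : n;

		iov->len[iov->i] -= part;
		if (!iov->len[iov->i])
			iov->i++;
		n -= part;
	}
}

/*
 * Open-coded step: consumes 'part' bytes of the current element and
 * always moves on once that element is drained, even when part == 0.
 * The caller must ensure i < used and part <= len[i].
 */
void toy_step(struct toy_kiov *iov, size_t part)
{
	iov->len[iov->i] -= part;
	if (!iov->len[iov->i])
		iov->i++;
}
```

With element lengths {0, 4}, `toy_advance(&iov, 0)` leaves the cursor on element 0, whereas `toy_step(&iov, 0)` moves it to element 1.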
      
      Fixes: b8c06ad4 ("vringh: implement vringh_kiov_advance()")
      Cc: stable@vger.kernel.org
      Reported-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 03 Oct, 2023 7 commits
    • rswitch: Fix PHY station management clock setting · a0c55bba
      Yoshihiro Shimoda authored
      Fix the MPIC.PSMCS value following the programming example in the
      section 6.4.2 Management Data Clock (MDC) Setting, Ethernet MAC IP,
      S4 Hardware User Manual Rev.1.00.
      
      The value is calculated by
          MPIC.PSMCS = clk[MHz] / (MDC frequency[MHz] * 2) - 1
      with the input clock frequency from clk_get_rate() and MDC frequency
      of 2.5MHz. Otherwise, this driver cannot communicate with PHYs on the
      R-Car S4 Starter Kit board.
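
The formula can be evaluated in integer arithmetic by scaling both frequencies to 100 kHz units. This is a sketch only: `toy_psmcs` is a hypothetical helper, and the 320 MHz input used in the note below is an assumed example value, not a figure from the manual.

```c
#include <assert.h>

/* MDC target frequency from the text: 2.5 MHz, kept in units of 100 kHz. */
#define TOY_MDC_100KHZ 25UL

/*
 * MPIC.PSMCS = clk[MHz] / (MDC[MHz] * 2) - 1, evaluated in integer
 * arithmetic. Assumes clk_hz is at least 5 MHz so the result cannot wrap.
 */
unsigned long toy_psmcs(unsigned long clk_hz)
{
	return clk_hz / 100000UL / (TOY_MDC_100KHZ * 2) - 1;
}
```

For an assumed 320 MHz input clock this gives 3200 / 50 - 1 = 63.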
      
      Fixes: 3590918b ("net: ethernet: renesas: Add support for "Ethernet Switch"")
      Reported-by: Tam Nguyen <tam.nguyen.xa@renesas.com>
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230926123054.3976752-1-yoshihiro.shimoda.uh@renesas.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: nfc: llcp: Add lock when modifying device list · dfc7f7a9
      Jeremy Cline authored
      The device list needs its associated lock held when modifying it, or the
      list could become corrupted, as syzbot discovered.
      
      Reported-and-tested-by: syzbot+c1d0a03d305972dbbe14@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=c1d0a03d305972dbbe14
      Signed-off-by: Jeremy Cline <jeremy@jcline.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Fixes: 6709d4b7 ("net: nfc: Fix use-after-free caused by nfc_llcp_find_local")
      Link: https://lore.kernel.org/r/20230908235853.1319596-1-jeremy@jcline.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ethtool: plca: fix plca enable data type while parsing the value · 8957261c
      Parthiban Veerasooran authored
      The ETHTOOL_A_PLCA_ENABLED data type is u8. But while parsing the
      value from the attribute, nla_get_u32() is used in the plca_update_sint()
      function instead of nla_get_u8(). So the plca_cfg.enabled variable is
      updated with some garbage value instead of 0 or 1, and plca is always
      enabled even though it is disabled through the ethtool application. This
      bug has been fixed by parsing the values based on the attribute's type in
      the policy.
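
The effect of reading a one-byte attribute with a four-byte accessor can be reproduced outside the kernel. This is a sketch with hypothetical `toy_*` helpers standing in for the netlink accessors. The three bytes after the payload are set to arbitrary non-zero values to model whatever happens to follow the u8 in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a one-byte attribute payload as what it is. */
uint8_t toy_get_u8(const unsigned char *payload)
{
	return payload[0];
}

/* Read the same payload as if it were four bytes wide (the bug). */
uint32_t toy_get_u32(const unsigned char *payload)
{
	uint32_t v;

	memcpy(&v, payload, sizeof(v));	/* also copies the 3 bytes after the u8 */
	return v;
}
```

With a payload of `{ 0x00, 0xAB, 0xCD, 0xEF }` the u8 read yields 0 (disabled), but the u32 read is non-zero on any byte order and would be treated as enabled.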
      
      Fixes: 8580e16c ("net/ethtool: add netlink interface for the PLCA RS")
      Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230908044548.5878-1-Parthiban.Veerasooran@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info · eea03d18
      Gustavo A. R. Silva authored
      The flexible structure (a structure that contains a flexible-array member
      at the end) `qed_ll2_tx_packet` is nested within the second layer of
      `struct qed_ll2_info`:
      
      struct qed_ll2_tx_packet {
      	...
              /* Flexible Array of bds_set determined by max_bds_per_packet */
              struct {
                      struct core_tx_bd *txq_bd;
                      dma_addr_t tx_frag;
                      u16 frag_len;
              } bds_set[];
      };
      
      struct qed_ll2_tx_queue {
      	...
      	struct qed_ll2_tx_packet cur_completing_packet;
      };
      
      struct qed_ll2_info {
      	...
      	struct qed_ll2_tx_queue tx_queue;
              struct qed_ll2_cbs cbs;
      };
      
      The problem is that member `cbs` in `struct qed_ll2_info` is placed just
      after an object of type `struct qed_ll2_tx_queue`, which is in itself
      an implicit flexible structure, which by definition ends in a flexible
      array member, in this case `bds_set`. This causes an undefined behavior
      bug at run-time when dynamic memory is allocated for `bds_set`, which
      could lead to a serious issue if `cbs` in `struct qed_ll2_info` is
      overwritten by the contents of `bds_set`. Notice that the type of `cbs`
      is a structure full of function pointers (and a cookie :) ):
      
      include/linux/qed/qed_ll2_if.h:
      typedef
      void (*qed_ll2_complete_rx_packet_cb)(void *cxt,
                                            struct qed_ll2_comp_rx_data *data);
      
      typedef
      void (*qed_ll2_release_rx_packet_cb)(void *cxt,
                                           u8 connection_handle,
                                           void *cookie,
                                           dma_addr_t rx_buf_addr,
                                           bool b_last_packet);
      
      typedef
      void (*qed_ll2_complete_tx_packet_cb)(void *cxt,
                                            u8 connection_handle,
                                            void *cookie,
                                            dma_addr_t first_frag_addr,
                                            bool b_last_fragment,
                                            bool b_last_packet);
      
      typedef
      void (*qed_ll2_release_tx_packet_cb)(void *cxt,
                                           u8 connection_handle,
                                           void *cookie,
                                           dma_addr_t first_frag_addr,
                                           bool b_last_fragment, bool b_last_packet);
      
      typedef
      void (*qed_ll2_slowpath_cb)(void *cxt, u8 connection_handle,
                                  u32 opaque_data_0, u32 opaque_data_1);
      
      struct qed_ll2_cbs {
              qed_ll2_complete_rx_packet_cb rx_comp_cb;
              qed_ll2_release_rx_packet_cb rx_release_cb;
              qed_ll2_complete_tx_packet_cb tx_comp_cb;
              qed_ll2_release_tx_packet_cb tx_release_cb;
              qed_ll2_slowpath_cb slowpath_cb;
              void *cookie;
      };
      
      Fix this by moving the declaration of `cbs` to the middle of its
      containing structure `qed_ll2_info`, preventing it from being
      overwritten by the contents of `bds_set` at run-time.
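
The overlap can be made visible with `offsetof` on a miniature of the three nested structures. This is a sketch with hypothetical `toy_*` types. It relies on the GNU C extension that lets a structure ending in a flexible array member be embedded in another structure, so it is meant for GCC or Clang and may draw warnings under pedantic flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the three nested structures in the text. */
struct toy_tx_packet {
	void *cookie;
	struct {
		uint64_t tx_frag;
	} bds_set[];			/* flexible array member */
};

struct toy_tx_queue {
	unsigned long flags;
	struct toy_tx_packet cur_completing_packet;	/* must stay last */
};

struct toy_info_bad {
	struct toy_tx_queue tx_queue;
	void *cbs_cookie;		/* sits where bds_set[0] would go */
};

struct toy_info_fixed {
	void *cbs_cookie;		/* moved ahead of the flexible part */
	struct toy_tx_queue tx_queue;
};

/* Byte offset of bds_set[0] relative to the start of a toy_tx_queue. */
size_t toy_bds_offset_in_queue(void)
{
	return offsetof(struct toy_tx_queue, cur_completing_packet) +
	       offsetof(struct toy_tx_packet, bds_set);
}

/* True when bds_set[0] starts exactly at cbs_cookie in the bad layout. */
bool toy_bad_layout_overlaps(void)
{
	return offsetof(struct toy_info_bad, tx_queue) +
	       toy_bds_offset_in_queue() ==
	       offsetof(struct toy_info_bad, cbs_cookie);
}

/* True when cbs_cookie lies at or beyond bds_set[0] in the fixed layout. */
bool toy_fixed_layout_overlaps(void)
{
	return offsetof(struct toy_info_fixed, cbs_cookie) >=
	       offsetof(struct toy_info_fixed, tx_queue) +
	       toy_bds_offset_in_queue();
}
```

In the bad layout the first `bds_set` entry and `cbs_cookie` share an address, so writing entries overwrites the callbacks. In the fixed layout the callbacks sit before the flexible part and are out of reach.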
      
      This bug was introduced in 2017, when `bds_set` was converted to a
      one-element array, and started to be used as a Variable Length Object
      (VLO) at run-time.
      
      Fixes: f5823fe6 ("qed: Add ll2 option to limit the number of bds per packet")
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/ZQ+Nz8DfPg56pIzr@work
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg · e9c65989
      Shigeru Yoshida authored
      syzbot reported the following uninit-value access issue:
      
      =====================================================
      BUG: KMSAN: uninit-value in smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:975 [inline]
      BUG: KMSAN: uninit-value in smsc75xx_bind+0x5c9/0x11e0 drivers/net/usb/smsc75xx.c:1482
      CPU: 0 PID: 8696 Comm: kworker/0:3 Not tainted 5.8.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: usb_hub_wq hub_event
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x21c/0x280 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:975 [inline]
       smsc75xx_bind+0x5c9/0x11e0 drivers/net/usb/smsc75xx.c:1482
       usbnet_probe+0x1152/0x3f90 drivers/net/usb/usbnet.c:1737
       usb_probe_interface+0xece/0x1550 drivers/usb/core/driver.c:374
       really_probe+0xf20/0x20b0 drivers/base/dd.c:529
       driver_probe_device+0x293/0x390 drivers/base/dd.c:701
       __device_attach_driver+0x63f/0x830 drivers/base/dd.c:807
       bus_for_each_drv+0x2ca/0x3f0 drivers/base/bus.c:431
       __device_attach+0x4e2/0x7f0 drivers/base/dd.c:873
       device_initial_probe+0x4a/0x60 drivers/base/dd.c:920
       bus_probe_device+0x177/0x3d0 drivers/base/bus.c:491
       device_add+0x3b0e/0x40d0 drivers/base/core.c:2680
       usb_set_configuration+0x380f/0x3f10 drivers/usb/core/message.c:2032
       usb_generic_driver_probe+0x138/0x300 drivers/usb/core/generic.c:241
       usb_probe_device+0x311/0x490 drivers/usb/core/driver.c:272
       really_probe+0xf20/0x20b0 drivers/base/dd.c:529
       driver_probe_device+0x293/0x390 drivers/base/dd.c:701
       __device_attach_driver+0x63f/0x830 drivers/base/dd.c:807
       bus_for_each_drv+0x2ca/0x3f0 drivers/base/bus.c:431
       __device_attach+0x4e2/0x7f0 drivers/base/dd.c:873
       device_initial_probe+0x4a/0x60 drivers/base/dd.c:920
       bus_probe_device+0x177/0x3d0 drivers/base/bus.c:491
       device_add+0x3b0e/0x40d0 drivers/base/core.c:2680
       usb_new_device+0x1bd4/0x2a30 drivers/usb/core/hub.c:2554
       hub_port_connect drivers/usb/core/hub.c:5208 [inline]
       hub_port_connect_change drivers/usb/core/hub.c:5348 [inline]
       port_event drivers/usb/core/hub.c:5494 [inline]
       hub_event+0x5e7b/0x8a70 drivers/usb/core/hub.c:5576
       process_one_work+0x1688/0x2140 kernel/workqueue.c:2269
       worker_thread+0x10bc/0x2730 kernel/workqueue.c:2415
       kthread+0x551/0x590 kernel/kthread.c:292
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
      
      Local variable ----buf.i87@smsc75xx_bind created at:
       __smsc75xx_read_reg drivers/net/usb/smsc75xx.c:83 [inline]
       smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:968 [inline]
       smsc75xx_bind+0x485/0x11e0 drivers/net/usb/smsc75xx.c:1482
       __smsc75xx_read_reg drivers/net/usb/smsc75xx.c:83 [inline]
       smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:968 [inline]
       smsc75xx_bind+0x485/0x11e0 drivers/net/usb/smsc75xx.c:1482
      
      This issue is caused because usbnet_read_cmd() reads fewer bytes than requested
      (zero bytes in the reproducer). In this case, 'buf' is not properly filled.
      
      This patch fixes the issue by returning -ENODATA if usbnet_read_cmd() reads
      fewer bytes than requested.
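
The short-read check can be sketched as a wrapper around a transfer hook. This is an illustration under assumptions: `toy_read_fn`, `toy_read_reg` and the two fake devices are hypothetical, and the hook merely stands in for a call that returns the number of bytes read or a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical transfer hook: returns bytes read or a negative errno. */
typedef int (*toy_read_fn)(void *buf, size_t len);

/* Treat any short transfer as an error so 'buf' is never used half-filled. */
int toy_read_reg(toy_read_fn read_cmd, uint32_t *data)
{
	uint32_t buf;
	int ret = read_cmd(&buf, sizeof(buf));

	if (ret < 0)
		return ret;		/* propagate the transport error */
	if ((size_t)ret < sizeof(buf))
		return -ENODATA;	/* short read: buf is not fully set */

	*data = buf;
	return 0;
}

/* Fake device that transfers nothing, as in the reproducer. */
int toy_read_nothing(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return 0;
}

/* Fake device that fills the whole buffer. */
int toy_read_full(void *buf, size_t len)
{
	memset(buf, 0x11, len);
	return (int)len;
}
```

`toy_read_reg(toy_read_nothing, &v)` returns -ENODATA and leaves `v` untouched, while the full read stores the value and returns 0.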
      
      Fixes: d0cad871 ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver")
      Reported-and-tested-by: syzbot+6966546b78d050bb0b5d@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=6966546b78d050bb0b5d
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230923173549.3284502-1-syoshida@redhat.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ipv6: tcp: add a missing nf_reset_ct() in 3WHS handling · 9593c7cb
      Ilya Maximets authored
      Commit b0e214d2 ("netfilter: keep conntrack reference until
      IPsecv6 policy checks are done") is a direct copy of the old
      commit b59c2701 ("[NETFILTER]: Keep conntrack reference until
      IPsec policy checks are done") but for IPv6.  However, it also
      copies a bug that this old commit had.  That is: when the third
      packet of 3WHS connection establishment contains payload, it is
      added into socket receive queue without the XFRM check and the
      drop of connection tracking context.
      
      That leads to nf_conntrack module being impossible to unload as
      it waits for all the conntrack references to be dropped while
      the packet release is deferred in per-cpu cache indefinitely, if
      not consumed by the application.
      
      The issue for IPv4 was fixed in commit 6f0012e3 ("tcp: add a
      missing nf_reset_ct() in 3WHS handling") by adding a missing XFRM
      check and correctly dropping the conntrack context.  However, the
      issue was introduced to IPv6 code afterwards.  Fixing it the
      same way for IPv6 now.
      
      Fixes: b0e214d2 ("netfilter: keep conntrack reference until IPsecv6 policy checks are done")
      Link: https://lore.kernel.org/netdev/d589a999-d4dd-2768-b2d5-89dec64a4a42@ovn.org/
      Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
      Acked-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230922210530.2045146-1-i.maximets@ovn.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ipv4/fib: send notify when delete source address routes · 4b2b6060
      Hangbin Liu authored
      After deleting an interface address in fib_del_ifaddr(), the function
      scans the fib_info list for stray entries and calls fib_flush() and
      fib_table_flush(). Then the stray entries will be deleted silently and no
      RTM_DELROUTE notification will be sent.
      
      This lack of notification can make routing daemons, or monitors like
      `ip monitor route`, miss the routing changes. e.g.
      
      + ip link add dummy1 type dummy
      + ip link add dummy2 type dummy
      + ip link set dummy1 up
      + ip link set dummy2 up
      + ip addr add 192.168.5.5/24 dev dummy1
      + ip route add 7.7.7.0/24 dev dummy2 src 192.168.5.5
      + ip -4 route
      7.7.7.0/24 dev dummy2 scope link src 192.168.5.5
      192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5
      + ip monitor route
      + ip addr del 192.168.5.5/24 dev dummy1
      Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5
      Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5
      Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5
      
      As Ido pointed out, fib_table_flush() isn't only called when an address
      is deleted, but also when an interface is deleted or put down. The lack
      of notification in these cases is deliberate. And commit 7c6bb7d2
      ("net/ipv6: Add knob to skip DELROUTE message on device down") introduced
      a sysctl to make IPv6 behave like IPv4 in this regard. So we can't send
      the route delete notification blindly in fib_table_flush().
      
      To fix this issue, let's add a new flag in "struct fib_info" to track the
      deleted preferred source address routes, and only send notifications for
      them.
      
      After update:
      + ip monitor route
      + ip addr del 192.168.5.5/24 dev dummy1
      Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5
      Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5
      Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5
      Deleted 7.7.7.0/24 dev dummy2 scope link src 192.168.5.5
      Suggested-by: Thomas Haller <thaller@redhat.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20230922075508.848925-1-liuhangbin@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  3. 02 Oct, 2023 3 commits
    • sky2: Make sure there is at least one frag_addr available · 6a70e5cb
      Kees Cook authored
      In the pathological case of building sky2 with 16k PAGE_SIZE, the
      frag_addr[] array would never be used, so the original code was correct
      that size should be 0. But the compiler now gets upset with 0 size arrays
      in places where it hasn't eliminated the code that might access such an
      array (it can't figure out that in this case an rx skb with fragments
      would never be created). To keep the compiler happy, make sure there is
      at least 1 frag_addr in struct rx_ring_info:
      
         In file included from include/linux/skbuff.h:28,
                          from include/net/net_namespace.h:43,
                          from include/linux/netdevice.h:38,
                          from drivers/net/ethernet/marvell/sky2.c:18:
         drivers/net/ethernet/marvell/sky2.c: In function 'sky2_rx_unmap_skb':
         include/linux/dma-mapping.h:416:36: warning: array subscript i is outside array bounds of 'dma_addr_t[0]' {aka 'long long unsigned int[]'} [-Warray-bounds=]
           416 | #define dma_unmap_page(d, a, s, r) dma_unmap_page_attrs(d, a, s, r, 0)
               |                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         drivers/net/ethernet/marvell/sky2.c:1257:17: note: in expansion of macro 'dma_unmap_page'
          1257 |                 dma_unmap_page(&pdev->dev, re->frag_addr[i],
               |                 ^~~~~~~~~~~~~~
         In file included from drivers/net/ethernet/marvell/sky2.c:41:
         drivers/net/ethernet/marvell/sky2.h:2198:25: note: while referencing 'frag_addr'
          2198 |         dma_addr_t      frag_addr[ETH_JUMBO_MTU >> PAGE_SHIFT];
               |                         ^~~~~~~~~
      
      With CONFIG_PAGE_SIZE_16KB=y, PAGE_SHIFT == 14, so:
      
        #define ETH_JUMBO_MTU   9000
      
      causes "ETH_JUMBO_MTU >> PAGE_SHIFT" to be 0. Use "?: 1" to solve this build warning.
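
The arithmetic behind the "?: 1" fallback can be checked with a small helper. This is a sketch: `toy_frag_slots` is a hypothetical function that takes the page shift as a parameter instead of using PAGE_SHIFT, and the two-operand `?:` form is a GNU C extension accepted by GCC and Clang.

```c
#include <assert.h>

#define TOY_ETH_JUMBO_MTU 9000u

/*
 * Number of page-sized fragment slots for a jumbo frame, never less than 1.
 * "a ?: b" is the GNU C shorthand for "a ? a : b".
 */
unsigned int toy_frag_slots(unsigned int page_shift)
{
	return (TOY_ETH_JUMBO_MTU >> page_shift) ?: 1;
}
```

With 4k pages (shift 12) the shift alone gives 2 slots. With 16k pages (shift 14) it gives 0, and the fallback raises that to 1 so the array is never zero-sized.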
      
      Cc: Mirko Lindner <mlindner@marvell.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: netdev@vger.kernel.org
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202309191958.UBw1cjXk-lkp@intel.com/
      Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Avoid EEPROM timeout when EEPROM is absent · 6ccf50d4
      Fabio Estevam authored
      Since commit 23d775f1 ("net: dsa: mv88e6xxx: Wait for EEPROM done
      before HW reset") the following error is seen on an imx8mn board with
      a 88E6320 switch:
      
      mv88e6085 30be0000.ethernet-1:00: Timeout waiting for EEPROM done
      
      This board does not have an EEPROM attached to the switch though.
      
      This problem is well explained by Andrew Lunn:
      
      "If there is an EEPROM, and the EEPROM contains a lot of data, it could
      be that when we perform a hardware reset towards the end of probe, it
      interrupts an I2C bus transaction, leaving the I2C bus in a bad state,
      and future reads of the EEPROM do not work.
      
      The work around for this was to poll the EEInt status and wait for it
      to go true before performing the hardware reset.
      
      However, we have discovered that for some boards which do not have an
      EEPROM, EEInt never indicates complete. As a result,
      mv88e6xxx_g1_wait_eeprom_done() spins for a second and then prints a
      warning.
      
      We probably need a different solution than calling
      mv88e6xxx_g1_wait_eeprom_done(). The datasheet for 6352 documents the
      EEPROM Command register:
      
      bit 15 is:
      
        EEPROM Unit Busy. This bit must be set to a one to start an EEPROM
        operation (see EEOp below). Only one EEPROM operation can be
        executing at one time so this bit must be zero before setting it to
        a one.  When the requested EEPROM operation completes this bit will
        automatically be cleared to a zero. The transition of this bit from
        a one to a zero can be used to generate an interrupt (the EEInt in
        Global 1, offset 0x00).
      
      and more interesting is bit 11:
      
        Register Loader Running. This bit is set to one whenever the
        register loader is busy executing instructions contained in the
        EEPROM."
      
      Change to using mv88e6xxx_g2_eeprom_wait() to fix the timeout error
      when the EEPROM chip is not present.
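
      A polling loop over the two bits quoted from the datasheet might look
      roughly like the sketch below. The bit positions come from the text
      above; the function names, the accessor callback and the poll budget
      are hypothetical and do not reproduce the driver's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define EEPROM_CMD_BUSY    (1u << 15) /* bit 15: EEPROM Unit Busy */
#define EEPROM_CMD_RUNNING (1u << 11) /* bit 11: Register Loader Running */

/* Hypothetical register accessor supplied by the caller. */
typedef uint16_t (*read_cmd_fn)(void *ctx);

/* Poll until neither bit is set; return 0 on success, -1 on timeout. */
static inline int eeprom_wait_sketch(read_cmd_fn read_cmd, void *ctx,
				     int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		uint16_t cmd = read_cmd(ctx);

		if (!(cmd & (EEPROM_CMD_BUSY | EEPROM_CMD_RUNNING)))
			return 0;
	}
	return -1;
}

/* Test double: reports "running" for the first *ctx reads, then idle. */
static inline uint16_t fake_read_cmd(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return EEPROM_CMD_RUNNING;
	}
	return 0;
}
```

      Unlike waiting for an interrupt flag, this kind of check presumably
      returns at once when no EEPROM operation was ever started.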
      
      Fixes: 23d775f1 ("net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset")
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Fabio Estevam <festevam@denx.de>
      Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ccf50d4
    • Dinghao Liu's avatar
      ptp: ocp: Fix error handling in ptp_ocp_device_init · caa0578c
      Dinghao Liu authored
      When device_add() fails, ptp_ocp_dev_release() will be called
      after put_device(). Therefore, it seems that the
      ptp_ocp_dev_release() before put_device() is redundant.
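
      The redundancy can be illustrated with a toy reference-counted
      object. All names below are hypothetical stand-ins; the sketch only
      models "dropping the last reference runs the release callback" and
      is not the driver's code.

```c
#include <assert.h>

struct dev_sketch {
	int refcount;
	int releases;                          /* times release() ran */
	void (*release)(struct dev_sketch *);
};

static inline void release_sketch(struct dev_sketch *d)
{
	d->releases++;
}

/* Dropping the last reference runs the release callback. */
static inline void put_sketch(struct dev_sketch *d)
{
	if (--d->refcount == 0)
		d->release(d);
}

/* Error path with an explicit release before the put. */
static inline int error_path_old(struct dev_sketch *d)
{
	d->release(d);
	put_sketch(d);
	return d->releases;
}

/* Error path that relies on the put alone. */
static inline int error_path_new(struct dev_sketch *d)
{
	put_sketch(d);
	return d->releases;
}
```

      In the first variant the release logic runs twice for a single
      object; in the second it runs exactly once.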
      
      Fixes: 773bda96 ("ptp: ocp: Expose various resources on the timecard.")
      Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
      Reviewed-by: Vadim Feodrenko <vadim.fedorenko@linux.dev>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      caa0578c
  4. 01 Oct, 2023 8 commits
    • Jordan Rife's avatar
      net: prevent address rewrite in kernel_bind() · c889a99a
      Jordan Rife authored
      Similar to the change in commit 0bdf3993 ("net: Avoid address
      overwrite in kernel_connect"), BPF hooks run on bind may rewrite the
      address passed to kernel_bind(). This change
      
      1) Makes a copy of the bind address in kernel_bind() to insulate
         callers.
      2) Replaces direct calls to sock->ops->bind() in net with kernel_bind()
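
      Item 1 amounts to a defensive copy. A userspace sketch of the idea,
      with a made-up address struct and a stand-in for the rewriting hook
      (none of these names exist in the kernel):

```c
#include <assert.h>
#include <string.h>

struct addr_sketch {
	unsigned short port;
	unsigned char  ip[4];
};

/* Stand-in for a bind hook that rewrites the address it is handed. */
static inline int rewriting_bind(struct addr_sketch *addr)
{
	addr->port = 9999;
	return 0;
}

/* Unsafe: the hook sees, and may modify, the caller's own buffer. */
static inline int bind_direct(struct addr_sketch *addr)
{
	return rewriting_bind(addr);
}

/* Safe: hand the lower layer a private copy, as item 1 describes. */
static inline int bind_with_copy(const struct addr_sketch *addr)
{
	struct addr_sketch copy;

	memcpy(&copy, addr, sizeof(copy));
	return rewriting_bind(&copy);
}
```

      Taking the address as a const pointer in the wrapper also documents
      the contract: the caller's buffer is never written.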
      
      Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
      Fixes: 4fbac77d ("bpf: Hooks for sys_bind")
      Cc: stable@vger.kernel.org
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Jordan Rife <jrife@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c889a99a
    • Jordan Rife's avatar
      net: prevent rewrite of msg_name in sock_sendmsg() · 86a7e0b6
      Jordan Rife authored
      Callers of sock_sendmsg(), and similarly kernel_sendmsg(), in kernel
      space may observe their value of msg_name change in cases where BPF
      sendmsg hooks rewrite the send address. This has been confirmed to break
      NFS mounts running in UDP mode and has the potential to break other
      systems.
      
      This patch:
      
      1) Creates a new function called __sock_sendmsg() with same logic as the
         old sock_sendmsg() function.
      2) Replaces calls to sock_sendmsg() made by __sys_sendto() and
         __sys_sendmsg() with __sock_sendmsg() to avoid an unnecessary copy,
         as these system calls are already protected.
      3) Modifies sock_sendmsg() so that it makes a copy of msg_name if
         present before passing it down the stack to insulate callers from
         changes to the send address.
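
      Step 3 can be sketched as "swap in a local copy, call down, restore
      the pointer". The struct, the 128-byte buffer and the function names
      below are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct msg_sketch {
	void  *msg_name;     /* destination address, may be NULL */
	size_t msg_namelen;
};

/* Stand-in for the lower layer: a hook overwrites the first address byte. */
static inline int inner_sendmsg(struct msg_sketch *msg)
{
	if (msg->msg_name && msg->msg_namelen > 0)
		((unsigned char *)msg->msg_name)[0] = 0xff;
	return 0;
}

/* Wrapper: point msg_name at a local copy for the call, then restore it. */
static inline int sendmsg_with_copy(struct msg_sketch *msg)
{
	unsigned char local[128];
	void *saved = msg->msg_name;
	int ret;

	if (saved) {
		if (msg->msg_namelen > sizeof(local))
			return -1;
		memcpy(local, saved, msg->msg_namelen);
		msg->msg_name = local;
	}
	ret = inner_sendmsg(msg);
	msg->msg_name = saved;  /* caller sees its original pointer again */
	return ret;
}
```

      Callers that already own a private copy can skip the wrapper and
      call the inner function directly, which is the point of step 2.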
      
      Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
      Fixes: 1cedee13 ("bpf: Hooks for sys_sendmsg")
      Cc: stable@vger.kernel.org
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Jordan Rife <jrife@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      86a7e0b6
    • Jordan Rife's avatar
      net: replace calls to sock->ops->connect() with kernel_connect() · 26297b4c
      Jordan Rife authored
      commit 0bdf3993 ("net: Avoid address overwrite in kernel_connect")
      ensured that kernel_connect() will not overwrite the address parameter
      in cases where BPF connect hooks perform an address rewrite. This change
      replaces direct calls to sock->ops->connect() in net with kernel_connect()
      to make these calls safe.
      
      Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
      Fixes: d74bad4e ("bpf: Hooks for sys_connect")
      Cc: stable@vger.kernel.org
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Jordan Rife <jrife@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      26297b4c
    • David Howells's avatar
      ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data() · 9d4c7580
      David Howells authored
      Including the transhdrlen in length is a problem when the packet is
      partially filled (e.g. something like send(MSG_MORE) happened previously)
      when appending to an IPv4 or IPv6 packet as we don't want to repeat the
      transport header or account for it twice.  This can happen under some
      circumstances, such as splicing into an L2TP socket.
      
      The symptom observed is a warning in __ip6_append_data():
      
          WARNING: CPU: 1 PID: 5042 at net/ipv6/ip6_output.c:1800 __ip6_append_data.isra.0+0x1be8/0x47f0 net/ipv6/ip6_output.c:1800
      
      that occurs when MSG_SPLICE_PAGES is used to append more data to an already
      partially occupied skbuff.  The warning occurs when 'copy' is larger than
      the amount of data in the message iterator.  This is because the requested
      length includes the transport header length when it shouldn't.  This can be
      triggered by, for example:
      
              sfd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_L2TP);
              bind(sfd, ...); // ::1
              connect(sfd, ...); // ::1 port 7
              send(sfd, buffer, 4100, MSG_MORE);
              sendfile(sfd, dfd, NULL, 1024);
      
      Fix this by only adding transhdrlen into the length if the write queue is
      empty in l2tp_ip6_sendmsg(), analogously to how UDP does things.
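
      The length rule reduces to one conditional. append_length() is a
      hypothetical helper that only illustrates the accounting; it is not
      the function changed by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length to request from the append routine for one send call.
 * The transport header is counted only when the write queue is empty,
 * i.e. when this call starts a new packet. */
static inline size_t append_length(size_t payload_len, size_t transhdrlen,
				   bool write_queue_empty)
{
	if (!write_queue_empty)
		transhdrlen = 0;
	return payload_len + transhdrlen;
}
```

      For the reproducer above, the first send() starts the packet and
      includes the header; the following sendfile() appends to a non-empty
      queue and must request only its 1024 payload bytes.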
      
      l2tp_ip_sendmsg() looks like it won't suffer from this problem as it builds
      the UDP packet itself.
      
      Fixes: a32e0eec ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
      Reported-by: syzbot+62cbf263225ae13ff153@syzkaller.appspotmail.com
      Link: https://lore.kernel.org/r/0000000000001c12b30605378ce8@google.com/
      Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: David Ahern <dsahern@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: netdev@vger.kernel.org
      cc: bpf@vger.kernel.org
      cc: syzkaller-bugs@googlegroups.com
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9d4c7580
    • Eric Dumazet's avatar
      neighbour: fix data-races around n->output · 5baa0433
      Eric Dumazet authored
      n->output field can be read locklessly, while a writer
      might change the pointer concurrently.
      
      Add missing annotations to prevent load-store tearing.
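
      A userspace analogue of such annotations, using C11 relaxed atomics
      in place of the kernel's own primitives (a substitution made here
      for portability; the struct and function names are hypothetical):

```c
#include <assert.h>
#include <stdatomic.h>

typedef int (*output_fn)(int);

struct neigh_sketch {
	_Atomic(output_fn) output;
};

static inline int slow_output(int x) { return x + 1; }
static inline int fast_output(int x) { return x + 2; }

/* Writer: one indivisible store of the new pointer. */
static inline void set_output(struct neigh_sketch *n, output_fn fn)
{
	atomic_store_explicit(&n->output, fn, memory_order_relaxed);
}

/* Lockless reader: load the pointer once, then call through the copy. */
static inline int call_output(struct neigh_sketch *n, int x)
{
	output_fn fn = atomic_load_explicit(&n->output, memory_order_relaxed);

	return fn(x);
}
```

      Loading into a local before the call matters: it keeps the compiler
      from re-reading the field and observing two different pointers.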
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5baa0433
    • Eric Dumazet's avatar
      net: fix possible store tearing in neigh_periodic_work() · 25563b58
      Eric Dumazet authored
      While looking at a related syzbot report involving neigh_periodic_work(),
      I found that I forgot to add an annotation when deleting an
      RCU protected item from a list.
      
      Readers use rcu_dereference(*np), so the writer side needs either
      rcu_assign_pointer() or WRITE_ONCE() to prevent store tearing.
      
      I use rcu_assign_pointer() to have lockdep support,
      this was the choice made in neigh_flush_dev().
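
      The writer-side rule can be sketched in userspace with C11 atomics:
      a release store stands in for rcu_assign_pointer() and acquire loads
      stand in for the reader's dereference. This is an analogy under
      those assumptions, with hypothetical names, not kernel RCU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node_sketch {
	int key;
	struct node_sketch *_Atomic next;
};

/* Writer side: unlink *np by publishing its successor in a single store. */
static inline void unlink_node(struct node_sketch *_Atomic *np)
{
	struct node_sketch *victim =
		atomic_load_explicit(np, memory_order_relaxed);
	struct node_sketch *succ =
		atomic_load_explicit(&victim->next, memory_order_relaxed);

	atomic_store_explicit(np, succ, memory_order_release);
}

/* Reader side: count nodes, loading each link exactly once. */
static inline int count_nodes(struct node_sketch *_Atomic *head)
{
	int n = 0;
	struct node_sketch *p =
		atomic_load_explicit(head, memory_order_acquire);

	while (p) {
		n++;
		p = atomic_load_explicit(&p->next, memory_order_acquire);
	}
	return n;
}
```

      A concurrent reader sees either the old link or the new one, never a
      half-written pointer. Freeing the unlinked node safely is a separate
      problem that this sketch leaves out.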
      
      Fixes: 767e97e1 ("neigh: RCU conversion of struct neighbour")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      25563b58
    • David S. Miller's avatar
      Merge tag 'for-net-2023-09-20' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · c15cd642
      David S. Miller authored
      bluetooth pull request for net:
      
       - Fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER
       - Fix handling of listen for ISO unicast
       - Fix build warnings
       - Fix leaking content of local_codecs
       - Add shutdown function for QCA6174
       - Delete unused hci_req_prepare_suspend() declaration
       - Fix hci_link_tx_to RCU lock usage
       - Avoid redundant authentication
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c15cd642
    • Clark Wang's avatar
      net: stmmac: platform: fix the incorrect parameter · 6b09edc1
      Clark Wang authored
      The second parameter of stmmac_pltfr_init() needs the pointer of
      "struct plat_stmmacenet_data". So, correct the parameter typo when calling the
      function.
      
      Otherwise, it may cause this alignment exception when doing suspend/resume.
      [   49.067201] CPU1 is up
      [   49.135258] Internal error: SP/PC alignment exception: 000000008a000000 [#1] PREEMPT SMP
      [   49.143346] Modules linked in: soc_imx9 crct10dif_ce polyval_ce nvmem_imx_ocotp_fsb_s400 polyval_generic layerscape_edac_mod snd_soc_fsl_asoc_card snd_soc_imx_audmux snd_soc_imx_card snd_soc_wm8962 el_enclave snd_soc_fsl_micfil rtc_pcf2127 rtc_pcf2131 flexcan can_dev snd_soc_fsl_xcvr snd_soc_fsl_sai imx8_media_dev(C) snd_soc_fsl_utils fuse
      [   49.173393] CPU: 0 PID: 565 Comm: sh Tainted: G         C         6.5.0-rc4-next-20230804-05047-g5781a6249dae #677
      [   49.183721] Hardware name: NXP i.MX93 11X11 EVK board (DT)
      [   49.189190] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [   49.196140] pc : 0x80800052
      [   49.198931] lr : stmmac_pltfr_resume+0x34/0x50
      [   49.203368] sp : ffff800082f8bab0
      [   49.206670] x29: ffff800082f8bab0 x28: ffff0000047d0ec0 x27: ffff80008186c170
      [   49.213794] x26: 0000000b5e4ff1ba x25: ffff800081e5fa74 x24: 0000000000000010
      [   49.220918] x23: ffff800081fe0000 x22: 0000000000000000 x21: 0000000000000000
      [   49.228042] x20: ffff0000001b4010 x19: ffff0000001b4010 x18: 0000000000000006
      [   49.235166] x17: ffff7ffffe007000 x16: ffff800080000000 x15: 0000000000000000
      [   49.242290] x14: 00000000000000fc x13: 0000000000000000 x12: 0000000000000000
      [   49.249414] x11: 0000000000000001 x10: 0000000000000a60 x9 : ffff800082f8b8c0
      [   49.256538] x8 : 0000000000000008 x7 : 0000000000000001 x6 : 000000005f54a200
      [   49.263662] x5 : 0000000001000000 x4 : ffff800081b93680 x3 : ffff800081519be0
      [   49.270786] x2 : 0000000080800052 x1 : 0000000000000000 x0 : ffff0000001b4000
      [   49.277911] Call trace:
      [   49.280346]  0x80800052
      [   49.282781]  platform_pm_resume+0x2c/0x68
      [   49.286785]  dpm_run_callback.constprop.0+0x74/0x134
      [   49.291742]  device_resume+0x88/0x194
      [   49.295391]  dpm_resume+0x10c/0x230
      [   49.298866]  dpm_resume_end+0x18/0x30
      [   49.302515]  suspend_devices_and_enter+0x2b8/0x624
      [   49.307299]  pm_suspend+0x1fc/0x348
      [   49.310774]  state_store+0x80/0x104
      [   49.314258]  kobj_attr_store+0x18/0x2c
      [   49.318002]  sysfs_kf_write+0x44/0x54
      [   49.321659]  kernfs_fop_write_iter+0x120/0x1ec
      [   49.326088]  vfs_write+0x1bc/0x300
      [   49.329485]  ksys_write+0x70/0x104
      [   49.332874]  __arm64_sys_write+0x1c/0x28
      [   49.336783]  invoke_syscall+0x48/0x114
      [   49.340527]  el0_svc_common.constprop.0+0xc4/0xe4
      [   49.345224]  do_el0_svc+0x38/0x98
      [   49.348526]  el0_svc+0x2c/0x84
      [   49.351568]  el0t_64_sync_handler+0x100/0x12c
      [   49.355910]  el0t_64_sync+0x190/0x194
      [   49.359567] Code: ???????? ???????? ???????? ???????? (????????)
      [   49.365644] ---[ end trace 0000000000000000 ]---
      
      Fixes: 97117eb5 ("net: stmmac: platform: provide stmmac_pltfr_init()")
      Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b09edc1
  5. 28 Sep, 2023 1 commit
    • Michal Schmidt's avatar
      ice: always add legacy 32byte RXDID in supported_rxdids · c070e51d
      Michal Schmidt authored
      When the PF and VF drivers both support flexible rx descriptors and have
      negotiated the VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC capability, the VF driver
      queries the PF for the list of supported descriptor formats
      (VIRTCHNL_OP_GET_SUPPORTED_RXDIDS). The PF driver is supposed to set the
      supported_rxdids bits that correspond to the descriptor formats the
      firmware implements. The legacy 32-byte rx desc format is always
      supported, even though it is not expressed in GLFLXP_RXDID_FLAGS.
      
      The ice driver does not advertise the legacy 32-byte rx desc support,
      which leads to this failure to bring up the VF using the Intel
      out-of-tree iavf driver:
       iavf 0000:41:01.0: PF does not list support for default Rx descriptor format
       ...
       iavf 0000:41:01.0: PF returned error -5 (VIRTCHNL_STATUS_ERR_PARAM) to our request 6
      
      The in-tree iavf driver does not expose this bug, because it does not
      yet implement VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC.
      
      The ice driver must always set the ICE_RXDID_LEGACY_1 bit in
      supported_rxdids. The Intel out-of-tree ice driver and the ice driver in
      DPDK both do this.
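
      In bitmap terms the rule is "whatever the flags registers report,
      OR in the legacy bit". A minimal sketch, where the descriptor ID
      value and all names are assumptions for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed descriptor ID for the legacy 32-byte format; treat the value as
 * illustrative and check the driver headers for the real enum. */
#define RXDID_LEGACY_1_SKETCH 1

/* Build the bitmap reported to the VF from the IDs that the flags
 * registers expose, then always add the legacy 32-byte format. */
static inline uint64_t supported_rxdids_sketch(const int *flex_ids, int n)
{
	uint64_t bitmap = 0;

	for (int i = 0; i < n; i++)
		bitmap |= UINT64_C(1) << flex_ids[i];

	bitmap |= UINT64_C(1) << RXDID_LEGACY_1_SKETCH;
	return bitmap;
}
```

      Even with an empty list of flexible formats, the result still
      advertises the legacy descriptor, which is what the VF checks for.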
      
      I copied this piece of the code and the comment text from the Intel
      out-of-tree driver.
      
      Fixes: e753df8f ("ice: Add support Flex RXD")
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Link: https://lore.kernel.org/r/20230920115439.61172-1-mschmidt@redhat.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      c070e51d
  6. 22 Sep, 2023 1 commit
  7. 21 Sep, 2023 9 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 27bbf45e
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from netfilter and bpf.
      
        Current release - regressions:
      
         - bpf: adjust size_index according to the value of KMALLOC_MIN_SIZE
      
         - netfilter: fix entries val in rule reset audit log
      
         - eth: stmmac: fix incorrect rxq|txq_stats reference
      
        Previous releases - regressions:
      
         - ipv4: fix null-deref in ipv4_link_failure
      
         - netfilter:
            - fix several GC related issues
            - fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
      
         - eth: team: fix null-ptr-deref when team device type is changed
      
         - eth: i40e: fix VF VLAN offloading when port VLAN is configured
      
         - eth: ionic: fix 16bit math issue when PAGE_SIZE >= 64KB
      
        Previous releases - always broken:
      
         - core: fix ETH_P_1588 flow dissector
      
         - mptcp: fix several connection hang-up conditions
      
         - bpf:
            - avoid deadlock when using queue and stack maps from NMI
            - add override check to kprobe multi link attach
      
         - hsr: properly parse HSRv1 supervisor frames.
      
         - eth: igc: fix infinite initialization loop with early XDP redirect
      
         - eth: octeon_ep: fix tx dma unmap len values in SG
      
         - eth: hns3: fix GRE checksum offload issue"
      
      * tag 'net-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
        sfc: handle error pointers returned by rhashtable_lookup_get_insert_fast()
        igc: Expose tx-usecs coalesce setting to user
        octeontx2-pf: Do xdp_do_flush() after redirects.
        bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI
        net: ena: Flush XDP packets on error.
        net/handshake: Fix memory leak in __sock_create() and sock_alloc_file()
        net: hinic: Fix warning-hinic_set_vlan_fliter() warn: variable dereferenced before check 'hwdev'
        netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
        netfilter: nf_tables: fix memleak when more than 255 elements expired
        netfilter: nf_tables: disable toggling dormant table state more than once
        vxlan: Add missing entries to vxlan_get_size()
        net: rds: Fix possible NULL-pointer dereference
        team: fix null-ptr-deref when team device type is changed
        net: bridge: use DEV_STATS_INC()
        net: hns3: add 5ms delay before clear firmware reset irq source
        net: hns3: fix fail to delete tc flower rules during reset issue
        net: hns3: only enable unicast promisc when mac table full
        net: hns3: fix GRE checksum offload issue
        net: hns3: add cmdq check for vf periodic service task
        net: stmmac: fix incorrect rxq|txq_stats reference
        ...
      27bbf45e
    • Linus Torvalds's avatar
      Merge tag 'v6.6-rc3.vfs.ctime.revert' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · b5cbe7c0
      Linus Torvalds authored
      Pull finegrained timestamp reverts from Christian Brauner:
       "Earlier this week we sent a few minor fixes for the multi-grained
        timestamp work in [1]. While we were polishing those up after Linus
        realized that there might be a nicer way to fix them we received a
        regression report in [2] that fine grained timestamps break gnulib
        tests and thus possibly other tools.
      
        The kernel will elide fine-grain timestamp updates when no one is
        actively querying for them to avoid performance impacts. So a sequence
        like write(f1) stat(f2) write(f2) stat(f2) write(f1) stat(f1) may
        result in the f1 timestamp being older than the final f2 timestamp,
        even though f1 was written last, because the second write to f1
        didn't update the timestamp.
      
        Such plotholes can lead to subtle bugs when programs compare
        timestamps. For example, the nap() function in [2] will estimate that
        it needs to wait one ns on a fine-grain timestamp enabled filesystem
        between subsequent calls to observe a timestamp change. But in general
        we don't update timestamps with more than one jiffy if we think that
        no one is actively querying for fine-grain timestamps to avoid
        performance impacts.
      
        While discussing various fixes the decision was to go back to the
        drawing board and ultimately to explore a solution that involves only
        exposing such fine-grained timestamps to nfs internally and never to
        userspace.
      
        As there are multiple solutions discussed the honest thing to do here
        is not to fix this up or disable it but to cleanly revert. The general
        infrastructure will probably come back but there is no reason to keep
        this code in mainline.
      
        The general changes to timestamp handling are valid and a good cleanup
        that will stay. The revert is fully bisectable"
      
      Link: https://lore.kernel.org/all/20230918-hirte-neuzugang-4c2324e7bae3@brauner [1]
      Link: https://lore.kernel.org/all/bf0524debb976627693e12ad23690094e4514303.camel@linuxfromscratch.org [2]
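
      The ordering anomaly described in the pull message can be reproduced
      with a toy model. It assumes, as a simplification, that a write
      takes a fine-grained stamp only if the previous stamp was observed
      by stat(), and a coarse tick-aligned stamp otherwise. The 4 ms tick
      and every name below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TICK_NS 4000000 /* assumed coarse clock granularity, 4 ms */

struct file_sketch {
	int64_t ctime_ns;
	bool    queried;  /* stamp was observed since the last update */
};

/* write(): fine-grained stamp only if the old one was observed. */
static inline void write_sketch(struct file_sketch *f, int64_t now_ns)
{
	if (f->queried)
		f->ctime_ns = now_ns;
	else
		f->ctime_ns = now_ns - now_ns % TICK_NS; /* coarse stamp */
	f->queried = false;
}

/* stat(): report the stamp and remember that it was observed. */
static inline int64_t stat_sketch(struct file_sketch *f)
{
	f->queried = true;
	return f->ctime_ns;
}
```

      Replaying write(f1) stat(f2) write(f2) stat(f2) write(f1) stat(f1)
      within one tick leaves f1 with the coarse stamp and f2 with a later
      fine-grained one, although f1 was written last.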
      
      * tag 'v6.6-rc3.vfs.ctime.revert' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        Revert "fs: add infrastructure for multigrain timestamps"
        Revert "btrfs: convert to multigrain timestamps"
        Revert "ext4: switch to multigrain timestamps"
        Revert "xfs: switch to multigrain timestamps"
        Revert "tmpfs: add support for multigrain timestamps"
      b5cbe7c0
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 7bdfc1af
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - A fix for breakpoint handling which was using get_user() while atomic
      
       - Fix the Power10 HASHCHK handler which was using get_user() while
         atomic
      
       - A few build fixes for issues caused by recent changes
      
      Thanks to Benjamin Gray, Christophe Leroy, Kajol Jain, and Naveen N Rao.
      
      * tag 'powerpc-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/dexcr: Move HASHCHK trap handler
        powerpc/82xx: Select FSL_SOC
        powerpc: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION and FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
        powerpc/watchpoints: Annotate atomic context in more places
        powerpc/watchpoint: Disable pagefaults when getting user instruction
        powerpc/watchpoints: Disable preemption in thread_change_pc()
        powerpc/perf/hv-24x7: Update domain value check
      7bdfc1af
    • Linus Torvalds's avatar
      Merge tag 'for-linus-6.6a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 88a174a9
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - remove some unused functions in the Xen event channel handling
      
       - fix a regression (introduced during the merge window) when booting as
         Xen PV guest
      
       - small cleanup removing another strncpy() instance
      
      * tag 'for-linus-6.6a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/efi: refactor deprecated strncpy
        x86/xen: allow nesting of same lazy mode
        x86/xen: move paravirt lazy code
        arm/xen: remove lazy mode related definitions
        xen: simplify evtchn_do_upcall() call maze
      88a174a9
    • Linus Torvalds's avatar
      Merge tag 'fixes-2023-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock · fb8b1b93
      Linus Torvalds authored
      Pull memblock test fixes from Mike Rapoport:
       "Fix several compilation errors and warnings in memblock tests"
      
      * tag 'fixes-2023-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
        memblock tests: fix warning ‘struct seq_file’ declared inside parameter list
        memblock tests: fix warning: "__ALIGN_KERNEL" redefined
        memblock tests: Fix compilation errors.
      fb8b1b93
    • Linus Torvalds's avatar
      Merge tag 'sound-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2af5acba
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A large collection of fixes around this time.
      
        All small and mostly trivial fixes.
      
         - Lots of fixes for the new -Wformat-truncation warnings
      
         - A fix in ALSA rawmidi core regression and UMP handling
      
         - Series of Cirrus codec fixes
      
         - ASoC Intel and Realtek codec fixes
      
         - Usual HD- and USB-audio quirks and AMD ASoC quirks"
      
      * tag 'sound-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (64 commits)
        ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support
        ALSA: hda: cs35l56: Use the new RUNTIME_PM_OPS() macro
        ALSA: usb-audio: scarlett_gen2: Fix another -Wformat-truncation warning
        ALSA: rawmidi: Fix NULL dereference at proc read
        ASoC: SOF: core: Only call sof_ops_free() on remove if the probe was successful
        ASoC: SOF: Intel: MTL: Reduce the DSP init timeout
        ASoC: cs42l43: Add shared IRQ flag for shutters
        ASoC: imx-audmix: Fix return error with devm_clk_get()
        ASoC: hdaudio.c: Add missing check for devm_kstrdup
        ALSA: riptide: Fix -Wformat-truncation warning for longname string
        ALSA: cs4231: Fix -Wformat-truncation warning for longname string
        ALSA: ad1848: Fix -Wformat-truncation warning for longname string
        ALSA: hda: generic: Check potential mixer name string truncation
        ALSA: cmipci: Fix -Wformat-truncation warning
        ALSA: firewire: Fix -Wformat-truncation warning for MIDI stream names
        ALSA: firewire: Fix -Wformat-truncation warning for longname string
        ALSA: xen: Fix -Wformat-truncation warning
        ALSA: opti9x: Fix -Wformat-truncation warning
        ALSA: es1688: Fix -Wformat-truncation warning
        ALSA: cs4236: Fix -Wformat-truncation warning
        ...
      2af5acba
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-v6.6-rc3' of... · b300c0fd
      Linus Torvalds authored
      Merge tag 'hwmon-for-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fix from Guenter Roeck:
       "One patch to drop a non-existent alarm attribute in the nct6775 driver"
      
      * tag 'hwmon-for-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (nct6775) Fix non-existent ALARM warning
      b300c0fd
    • Paolo Abeni's avatar
      Merge tag 'nf-23-09-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · ecf43926
      Paolo Abeni authored
      Florian Westphal says:
      
      ====================
      netfilter updates for net
      
      The following three patches fix regressions in the netfilter subsystem:
      
      1. Reject attempts to repeatedly toggle the 'dormant' flag in a single
         transaction.  Doing so makes nf_tables lose track of the real state
         vs. the desired state.  This ends with an attempt to unregister hooks
         that were never registered in the first place, which yields a splat.
      
      2. Fix element counting in the new nftables garbage collection infra
         that came with 6.5:  More than 255 expired elements wraps a counter
         which results in memory leak.
      
      3. Since 6.4 ipset can BUG when a set is renamed while a CREATE command
         is in progress, fix from Jozsef Kadlecsik.
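
      Item 2 describes a classic narrow-counter wrap. The sketch below
      assumes an 8-bit counter, which is inferred from the "255" in the
      description rather than taken from the nf_tables source; the
      function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Count n expired elements in an 8-bit counter, as a too-narrow field
 * would. */
static inline uint8_t count_narrow(unsigned int n)
{
	uint8_t count = 0;

	for (unsigned int i = 0; i < n; i++)
		count++;          /* wraps from 255 back to 0 */
	return count;
}

/* Same loop with a counter wide enough for the workload. */
static inline unsigned int count_wide(unsigned int n)
{
	unsigned int count = 0;

	for (unsigned int i = 0; i < n; i++)
		count++;
	return count;
}
```

      Any code that later frees "count" elements would miss the rest once
      the counter has wrapped, which matches the leak described above.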
      
      * tag 'nf-23-09-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
        netfilter: nf_tables: fix memleak when more than 255 elements expired
        netfilter: nf_tables: disable toggling dormant table state more than once
      ====================
      
      Link: https://lore.kernel.org/r/20230920084156.4192-1-fw@strlen.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      ecf43926
    • Edward Cree's avatar
      sfc: handle error pointers returned by rhashtable_lookup_get_insert_fast() · fc21f083
      Edward Cree authored
      Several places in TC offload code assumed that the return from
       rhashtable_lookup_get_insert_fast() was always either NULL or a valid
       pointer to an existing entry, but in fact that function can return an
       error pointer.  In that case, perform the usual cleanup of the newly
       created entry, then pass up the error, rather than attempting to take a
       reference on the old entry.
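
      The three possible outcomes can be modelled in userspace. The
      helpers below re-create the error-pointer convention with
      hypothetical names; they mimic the idea behind the kernel macros
      but are not those macros.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Small negative errno values are encoded at the top of the address
 * range, where no valid object can live. */
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Three-way handling of a lookup-or-insert result:
 *  NULL       -> new entry inserted, return 0
 *  error ptr  -> clean up the new entry (elided), pass the error up
 *  valid ptr  -> existing entry found, take a reference, return 1 */
static inline long handle_insert_result(void *old, int *old_refcount)
{
	if (is_err(old))
		return ptr_err(old);
	if (old) {
		(*old_refcount)++;
		return 1;
	}
	return 0;
}
```

      Checking for the error pointer first is the important part: an
      error pointer is non-NULL, so a plain "if (old)" test would treat
      it as an existing entry.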
      
      Fixes: d902e1a7 ("sfc: bare bones TC offload on EF100")
      Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
      Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
      Link: https://lore.kernel.org/r/20230919183949.59392-1-edward.cree@amd.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      fc21f083