1. 03 May, 2023 23 commits
  2. 01 May, 2023 11 commits
    • Merge branch 'rxrpc-timeout-fixes' · fb7cba61
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Timeout handling fixes
      
      Here are three patches to fix timeout handling in AF_RXRPC:
      
       (1) The hard call timeout should be interpreted in seconds, not
           milliseconds.
      
       (2) Allow a waiting call to be aborted (thereby cancelling the call) in
           the case a signal interrupts sendmsg() and leaves it hanging until it
           is granted a channel on a connection.
      
       (3) Kernel-generated calls get the timer started on them even if they're
           still waiting to be attached to a connection.  If the timer expires
           before the wait is complete and a conn is attached, an oops will
           occur.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix timeout of a call that hasn't yet been granted a channel · db099c62
      David Howells authored
      afs_make_call() calls rxrpc_kernel_begin_call() to begin a call (which may
      get stalled in the background waiting for a connection to become
      available); it then calls rxrpc_kernel_set_max_life() to set the timeouts -
      but that starts the call timer so the call timer might then expire before
      we get a connection assigned - leading to the following oops if the call
      stalled:
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000000
      	...
      	CPU: 1 PID: 5111 Comm: krxrpcio/0 Not tainted 6.3.0-rc7-build3+ #701
      	RIP: 0010:rxrpc_alloc_txbuf+0xc0/0x157
      	...
      	Call Trace:
      	 <TASK>
      	 rxrpc_send_ACK+0x50/0x13b
      	 rxrpc_input_call_event+0x16a/0x67d
      	 rxrpc_io_thread+0x1b6/0x45f
      	 ? _raw_spin_unlock_irqrestore+0x1f/0x35
      	 ? rxrpc_input_packet+0x519/0x519
      	 kthread+0xe7/0xef
      	 ? kthread_complete_and_exit+0x1b/0x1b
      	 ret_from_fork+0x22/0x30
      
      Fix this by noting the timeouts in struct rxrpc_call when the call is
      created.  The timer will be started when the first packet is transmitted.
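
The approach described above (record the timeouts at creation, arm the timer only on first transmission) can be sketched in plain C. This is a minimal userspace model, not the kernel code: `struct demo_call` and every `demo_*` helper are hypothetical stand-ins, and time is a bare tick counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rxrpc_call. */
struct demo_call {
	bool     has_conn;       /* a connection/channel has been assigned */
	bool     timer_armed;    /* the call timer is running */
	uint64_t hard_timo;      /* hard lifetime limit, in ticks */
	uint64_t expect_term_by; /* absolute expiry; valid once armed */
};

/* Creation only notes the limit; it never starts the timer. */
static void demo_call_create(struct demo_call *call, uint64_t hard_timo)
{
	call->has_conn = false;
	call->timer_armed = false;
	call->hard_timo = hard_timo;
	call->expect_term_by = 0;
}

/* Called once the call has been granted a channel on a connection. */
static void demo_call_attach(struct demo_call *call)
{
	call->has_conn = true;
}

/*
 * Transmit path: refuses to send without a connection, and arms the
 * timer on the first packet only. Returns true if a packet was "sent".
 */
static bool demo_call_transmit(struct demo_call *call, uint64_t now)
{
	if (!call->has_conn)
		return false;
	if (!call->timer_armed) {
		call->expect_term_by = now + call->hard_timo;
		call->timer_armed = true;
	}
	return true;
}
```

In this model the timer cannot expire while the call is still waiting for a connection, because nothing arms it before `demo_call_attach()` has run.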
      
      It shouldn't be possible to trigger this directly from userspace through
      AF_RXRPC, as sendmsg() will return EBUSY if the call is still in the
      waiting-for-conn state after dropping out of the wait due to a signal.
      
      Fixes: 9d35d880 ("rxrpc: Move client call connection to the I/O thread")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: linux-afs@lists.infradead.org
      cc: netdev@vger.kernel.org
      cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Make it so that a waiting process can be aborted · 0eb362d2
      David Howells authored
      When sendmsg() creates an rxrpc call, it queues it to wait for a connection
      and channel to be assigned and then waits before it can start shovelling
      data as the encrypted DATA packet content includes a summary of the
      connection parameters.
      
      However, sendmsg() may get interrupted before a connection gets assigned
      and further sendmsg() calls will fail with EBUSY until an assignment is
      made.
      
      Fix this so that the call can at least be aborted without failing on
      EBUSY.  We have to be careful here as sendmsg() mustn't be allowed to start
      the call timer if the call doesn't yet have a connection assigned as an
      oops may follow shortly thereafter.
      
      Fixes: 540b1c48 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: linux-afs@lists.infradead.org
      cc: netdev@vger.kernel.org
      cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix hard call timeout units · 0d098d83
      David Howells authored
      The hard call timeout is specified in the RXRPC_SET_CALL_TIMEOUT cmsg in
      seconds, so fix the point at which sendmsg() applies it to the call to
      convert to jiffies from seconds, not milliseconds.
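
To make the size of the unit error concrete, here is a small userspace sketch of the two conversions. `DEMO_HZ` and both helpers are assumptions made for illustration; they are not the kernel's `HZ` or `msecs_to_jiffies()`, only simplified equivalents.

```c
#include <stdint.h>

#define DEMO_HZ 250u /* assumed tick rate; the real value is a build-time choice */

/* Seconds to ticks: the conversion the hard call timeout needs. */
static uint64_t demo_secs_to_ticks(uint32_t secs)
{
	return (uint64_t)secs * DEMO_HZ;
}

/* Milliseconds to ticks, rounding up: what results if the same number
 * is mistakenly treated as milliseconds. */
static uint64_t demo_msecs_to_ticks(uint32_t msecs)
{
	return ((uint64_t)msecs * DEMO_HZ + 999u) / 1000u;
}
```

Under these assumptions a 30 second limit is 7500 ticks, whereas reading the same 30 as milliseconds gives 8 ticks, so the call would be timed out roughly a thousand times too early.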
      
      Fixes: a158bdd3 ("rxrpc: Fix timeout of a call that hasn't yet been granted a channel")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: linux-afs@lists.infradead.org
      cc: netdev@vger.kernel.org
      cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atlantic: Define aq_pm_ops conditionally on CONFIG_PM · 4f163bf8
      Tom Rix authored
      For s390, gcc with W=1 reports
      drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c:458:32: error:
        'aq_pm_ops' defined but not used [-Werror=unused-const-variable=]
        458 | static const struct dev_pm_ops aq_pm_ops = {
            |                                ^~~~~~~~~
      
      The only use of aq_pm_ops is conditional on CONFIG_PM.
      The definition of aq_pm_ops and its functions should also
      be conditional on CONFIG_PM.
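
The pattern being described, guarding a definition with the same symbol that guards its only use, looks roughly like the following. This is a generic sketch, not the driver source: `struct demo_pm_ops`, `demo_ops` and the `demo_*` functions are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pm_ops. */
struct demo_pm_ops {
	int (*suspend)(void);
	int (*resume)(void);
};

#ifdef CONFIG_PM
/* Callbacks and the ops table exist only when power management is built in. */
static int demo_suspend(void) { return 0; }
static int demo_resume(void)  { return 0; }

static const struct demo_pm_ops demo_ops = {
	.suspend = demo_suspend,
	.resume  = demo_resume,
};
#endif

/* The single user of the table, conditional on the same symbol. */
static const struct demo_pm_ops *demo_get_pm_ops(void)
{
#ifdef CONFIG_PM
	return &demo_ops;
#else
	return NULL;
#endif
}
```

With `CONFIG_PM` undefined, neither the table nor its callbacks are emitted, so there is nothing left for an unused-variable warning to flag.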
      Signed-off-by: Tom Rix <trix@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: Fix module EEPROM reporting for QSFP modules · 281900a9
      Andy Moreton authored
      The sfc driver does not report QSFP module EEPROM contents correctly
      as only the first page is fetched from hardware.
      
      Commit 0e1a2a3e ("ethtool: Add SFF-8436 and SFF-8636 max EEPROM
      length definitions") added ETH_MODULE_SFF_8436_MAX_LEN for the overall
      size of the EEPROM info, so use that to report the full EEPROM contents.
      
      Fixes: 9b17010d ("sfc: Add ethtool -m support for QSFP modules")
      Signed-off-by: Andy Moreton <andy.moreton@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'r8152-fixes' · f858e2fd
      David S. Miller authored
      Hayes Wang says:
      
      ====================
      r8152: fix 2.5G devices
      
      v3:
      For patch #2, modify the comment.
      
      v2:
      For patch #1, Remove inline for fc_pause_on_auto() and fc_pause_off_auto(),
      and update the commit message.
      
      For patch #2, define the magic value for OCP register 0xa424.
      
      v1:
      These patches are used to fix some issues of RTL8156.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: move setting r8153b_rx_agg_chg_indicate() · cce8334f
      Hayes Wang authored
      Move the r8153b_rx_agg_chg_indicate() call for 2.5G devices. It has to
      be made after enabling tx/rx; otherwise, the coalescing settings have
      no effect.
      
      Fixes: 195aae32 ("r8152: support new chips")
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: fix the poor throughput for 2.5G devices · 61b0ad6f
      Hayes Wang authored
      Fix the poor throughput for 2.5G devices, when changing the speed from
      auto mode to force mode. This patch is used to notify the MAC when the
      mode is changed.
      
      Fixes: 195aae32 ("r8152: support new chips")
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: fix flow control issue of RTL8156A · 8ceda6d5
      Hayes Wang authored
      Flow control misbehaves if the device sends a pause frame and tx/rx is
      disabled before a release frame is sent. This causes packet loss.
      
      Set PLA_RX_FIFO_FULL and PLA_RX_FIFO_EMPTY to zeros before disabling the
      tx/rx. And, toggle FC_PATCH_TASK before enabling tx/rx to reset the flow
      control patch and timer. Then, the hardware could clear the state and
      the flow control becomes normal after enabling tx/rx.
      
      Besides, remove inline for fc_pause_on_auto() and fc_pause_off_auto().
      
      Fixes: 195aae32 ("r8152: support new chips")
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: act_mirred: Add carrier check · 526f28bd
      Victor Nogueira authored
      There are cases where the device is administratively UP, but operationally
      down. For example, we have a physical device (Nvidia ConnectX-6 Dx, 25Gbps)
      whose cable was pulled out; here is its ip link output:
      
      5: ens2f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
          link/ether b8:ce:f6:4b:68:35 brd ff:ff:ff:ff:ff:ff
          altname enp179s0f1np1
      
      As you can see, it's administratively UP but operationally down.
      In this case, sending a packet to this port caused a nasty kernel hang (so
      nasty that we were unable to capture it). Aborting a transmit based on
      operational status (in addition to administrative status) fixes the issue.
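
The check described above can be modelled in a few lines of C. This is a userspace sketch under stated assumptions: `struct demo_netdev`, `DEMO_IFF_UP` and `demo_can_redirect()` are hypothetical stand-ins for the kernel's net device, the `IFF_UP` flag and the carrier test (`netif_carrier_ok()` in the kernel).

```c
#include <stdbool.h>

#define DEMO_IFF_UP 0x1u /* stand-in for the administrative IFF_UP flag */

/* Hypothetical, minimal view of a network device. */
struct demo_netdev {
	unsigned int flags;      /* administrative state bits */
	bool         carrier_ok; /* operational state: link carrier present */
};

/* Transmit only when the device is both administratively up and has carrier. */
static bool demo_can_redirect(const struct demo_netdev *dev)
{
	if (!(dev->flags & DEMO_IFF_UP))
		return false; /* administratively down */
	if (!dev->carrier_ok)
		return false; /* NO-CARRIER: up, but operationally down */
	return true;
}
```

The `ens2f1` device shown in the ip link output corresponds to the middle case: the UP flag is set, but there is no carrier, so the packet is dropped instead of being sent.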
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Victor Nogueira <victor@mojatatu.com>
      v1->v2: Add fixes tag
      v2->v3: Remove blank line between tags + add change log, suggested by Leon
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 28 Apr, 2023 6 commits
    • net: dsa: mv88e6xxx: add mv88e6321 rsvd2cpu · 66863178
      Angelo Dureghello authored
      Add rsvd2cpu capability for mv88e6321 model, to allow proper bpdu
      processing.
      Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
      Fixes: 51c901a7 ("net: dsa: mv88e6xxx: distinguish Global 2 Rsvd2CPU")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: fix skb hash for some RST packets · dc6456e9
      Antoine Tenart authored
      The skb hash comes from sk->sk_txhash when using TCP, except for some
      IPv6 RST packets. This is because in tcp_v6_send_reset when not in
      TIME_WAIT the hash is taken from sk->sk_hash, while it should come from
      sk->sk_txhash as those two hashes are not computed the same way.
      
      Packetdrill script to test the above:
      
         0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
        +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
        +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
      
        +0 > (flowlabel 0x1) S 0:0(0) <...>
      
        // Wrong ack seq, trigger a rst.
        +0 < S. 0:0(0) ack 0 win 4000
      
        // Check the flowlabel matches prior one from SYN.
        +0 > (flowlabel 0x1) R 0:0(0) <...>
      
      Fixes: 9258b8b1 ("ipv6: tcp: send consistent autoflowlabel in RST packets")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: srv6: make srv6_end_dt46_l3vpn_test more robust · 46ef24c6
      Andrea Mayer authored
      On some distributions, the rp_filter is automatically set (=1) by
      default on a netdev basis (also on VRFs).
      In an SRv6 End.DT46 behavior, decapsulated IPv4 packets are routed using
      the table associated with the VRF bound to that tunnel. During lookup
      operations, the rp_filter can lead to packet loss when activated on the
      VRF.
      Therefore, we chose to make this selftest more robust by explicitly
      disabling the rp_filter during tests (as it is automatically set by some
      Linux distributions).
      
      Fixes: 03a0b567 ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
      Reported-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
      Tested-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sit: update dev->needed_headroom in ipip6_tunnel_bind_dev() · c88f8d5c
      Cong Wang authored
      When a tunnel device is bound with the underlying device, its
      dev->needed_headroom needs to be updated properly. IPv4 tunnels
      already do the same in ip_tunnel_bind_dev(). Otherwise we may
      not have enough header room for skb, especially after commit
      b17f709a ("gue: TX support for using remote checksum offload option").
      
      Fixes: 32b8a8e5 ("sit: add IPv4 over IPv4 support")
      Reported-by: Palash Oswal <oswalpalash@gmail.com>
      Link: https://lore.kernel.org/netdev/CAGyP=7fDcSPKu6nttbGwt7RXzE3uyYxLjCSE97J64pRxJP8jPA@mail.gmail.com/
      Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: cls_api: remove block_cb from driver_list before freeing · da94a778
      Vlad Buslov authored
      Error handler of tcf_block_bind() frees the whole bo->cb_list on error.
      However, by that time the flow_block_cb instances are already in the driver
      list because driver ndo_setup_tc() callback is called before that up the
      call chain in tcf_block_offload_cmd(). This leaves dangling pointers to
      freed objects in the list and causes use-after-free[0]. Fix it by also
      removing flow_block_cb instances from driver_list before deallocating them.
      
      [0]:
      [  279.868433] ==================================================================
      [  279.869964] BUG: KASAN: slab-use-after-free in flow_block_cb_setup_simple+0x631/0x7c0
      [  279.871527] Read of size 8 at addr ffff888147e2bf20 by task tc/2963
      
      [  279.873151] CPU: 6 PID: 2963 Comm: tc Not tainted 6.3.0-rc6+ #4
      [  279.874273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [  279.876295] Call Trace:
      [  279.876882]  <TASK>
      [  279.877413]  dump_stack_lvl+0x33/0x50
      [  279.878198]  print_report+0xc2/0x610
      [  279.878987]  ? flow_block_cb_setup_simple+0x631/0x7c0
      [  279.879994]  kasan_report+0xae/0xe0
      [  279.880750]  ? flow_block_cb_setup_simple+0x631/0x7c0
      [  279.881744]  ? mlx5e_tc_reoffload_flows_work+0x240/0x240 [mlx5_core]
      [  279.883047]  flow_block_cb_setup_simple+0x631/0x7c0
      [  279.884027]  tcf_block_offload_cmd.isra.0+0x189/0x2d0
      [  279.885037]  ? tcf_block_setup+0x6b0/0x6b0
      [  279.885901]  ? mutex_lock+0x7d/0xd0
      [  279.886669]  ? __mutex_unlock_slowpath.constprop.0+0x2d0/0x2d0
      [  279.887844]  ? ingress_init+0x1c0/0x1c0 [sch_ingress]
      [  279.888846]  tcf_block_get_ext+0x61c/0x1200
      [  279.889711]  ingress_init+0x112/0x1c0 [sch_ingress]
      [  279.890682]  ? clsact_init+0x2b0/0x2b0 [sch_ingress]
      [  279.891701]  qdisc_create+0x401/0xea0
      [  279.892485]  ? qdisc_tree_reduce_backlog+0x470/0x470
      [  279.893473]  tc_modify_qdisc+0x6f7/0x16d0
      [  279.894344]  ? tc_get_qdisc+0xac0/0xac0
      [  279.895213]  ? mutex_lock+0x7d/0xd0
      [  279.896005]  ? __mutex_lock_slowpath+0x10/0x10
      [  279.896910]  rtnetlink_rcv_msg+0x5fe/0x9d0
      [  279.897770]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
      [  279.898672]  ? __sys_sendmsg+0xb5/0x140
      [  279.899494]  ? do_syscall_64+0x3d/0x90
      [  279.900302]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [  279.901337]  ? kasan_save_stack+0x2e/0x40
      [  279.902177]  ? kasan_save_stack+0x1e/0x40
      [  279.903058]  ? kasan_set_track+0x21/0x30
      [  279.903913]  ? kasan_save_free_info+0x2a/0x40
      [  279.904836]  ? ____kasan_slab_free+0x11a/0x1b0
      [  279.905741]  ? kmem_cache_free+0x179/0x400
      [  279.906599]  netlink_rcv_skb+0x12c/0x360
      [  279.907450]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
      [  279.908360]  ? netlink_ack+0x1550/0x1550
      [  279.909192]  ? rhashtable_walk_peek+0x170/0x170
      [  279.910135]  ? kmem_cache_alloc_node+0x1af/0x390
      [  279.911086]  ? _copy_from_iter+0x3d6/0xc70
      [  279.912031]  netlink_unicast+0x553/0x790
      [  279.912864]  ? netlink_attachskb+0x6a0/0x6a0
      [  279.913763]  ? netlink_recvmsg+0x416/0xb50
      [  279.914627]  netlink_sendmsg+0x7a1/0xcb0
      [  279.915473]  ? netlink_unicast+0x790/0x790
      [  279.916334]  ? iovec_from_user.part.0+0x4d/0x220
      [  279.917293]  ? netlink_unicast+0x790/0x790
      [  279.918159]  sock_sendmsg+0xc5/0x190
      [  279.918938]  ____sys_sendmsg+0x535/0x6b0
      [  279.919813]  ? import_iovec+0x7/0x10
      [  279.920601]  ? kernel_sendmsg+0x30/0x30
      [  279.921423]  ? __copy_msghdr+0x3c0/0x3c0
      [  279.922254]  ? import_iovec+0x7/0x10
      [  279.923041]  ___sys_sendmsg+0xeb/0x170
      [  279.923854]  ? copy_msghdr_from_user+0x110/0x110
      [  279.924797]  ? ___sys_recvmsg+0xd9/0x130
      [  279.925630]  ? __perf_event_task_sched_in+0x183/0x470
      [  279.926656]  ? ___sys_sendmsg+0x170/0x170
      [  279.927529]  ? ctx_sched_in+0x530/0x530
      [  279.928369]  ? update_curr+0x283/0x4f0
      [  279.929185]  ? perf_event_update_userpage+0x570/0x570
      [  279.930201]  ? __fget_light+0x57/0x520
      [  279.931023]  ? __switch_to+0x53d/0xe70
      [  279.931846]  ? sockfd_lookup_light+0x1a/0x140
      [  279.932761]  __sys_sendmsg+0xb5/0x140
      [  279.933560]  ? __sys_sendmsg_sock+0x20/0x20
      [  279.934436]  ? fpregs_assert_state_consistent+0x1d/0xa0
      [  279.935490]  do_syscall_64+0x3d/0x90
      [  279.936300]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [  279.937311] RIP: 0033:0x7f21c814f887
      [  279.938085] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
      [  279.941448] RSP: 002b:00007fff11efd478 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  279.942964] RAX: ffffffffffffffda RBX: 0000000064401979 RCX: 00007f21c814f887
      [  279.944337] RDX: 0000000000000000 RSI: 00007fff11efd4e0 RDI: 0000000000000003
      [  279.945660] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
      [  279.947003] R10: 00007f21c8008708 R11: 0000000000000246 R12: 0000000000000001
      [  279.948345] R13: 0000000000409980 R14: 000000000047e538 R15: 0000000000485400
      [  279.949690]  </TASK>
      
      [  279.950706] Allocated by task 2960:
      [  279.951471]  kasan_save_stack+0x1e/0x40
      [  279.952338]  kasan_set_track+0x21/0x30
      [  279.953165]  __kasan_kmalloc+0x77/0x90
      [  279.954006]  flow_block_cb_setup_simple+0x3dd/0x7c0
      [  279.955001]  tcf_block_offload_cmd.isra.0+0x189/0x2d0
      [  279.956020]  tcf_block_get_ext+0x61c/0x1200
      [  279.956881]  ingress_init+0x112/0x1c0 [sch_ingress]
      [  279.957873]  qdisc_create+0x401/0xea0
      [  279.958656]  tc_modify_qdisc+0x6f7/0x16d0
      [  279.959506]  rtnetlink_rcv_msg+0x5fe/0x9d0
      [  279.960392]  netlink_rcv_skb+0x12c/0x360
      [  279.961216]  netlink_unicast+0x553/0x790
      [  279.962044]  netlink_sendmsg+0x7a1/0xcb0
      [  279.962906]  sock_sendmsg+0xc5/0x190
      [  279.963702]  ____sys_sendmsg+0x535/0x6b0
      [  279.964534]  ___sys_sendmsg+0xeb/0x170
      [  279.965343]  __sys_sendmsg+0xb5/0x140
      [  279.966132]  do_syscall_64+0x3d/0x90
      [  279.966908]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      [  279.968407] Freed by task 2960:
      [  279.969114]  kasan_save_stack+0x1e/0x40
      [  279.969929]  kasan_set_track+0x21/0x30
      [  279.970729]  kasan_save_free_info+0x2a/0x40
      [  279.971603]  ____kasan_slab_free+0x11a/0x1b0
      [  279.972483]  __kmem_cache_free+0x14d/0x280
      [  279.973337]  tcf_block_setup+0x29d/0x6b0
      [  279.974173]  tcf_block_offload_cmd.isra.0+0x226/0x2d0
      [  279.975186]  tcf_block_get_ext+0x61c/0x1200
      [  279.976080]  ingress_init+0x112/0x1c0 [sch_ingress]
      [  279.977065]  qdisc_create+0x401/0xea0
      [  279.977857]  tc_modify_qdisc+0x6f7/0x16d0
      [  279.978695]  rtnetlink_rcv_msg+0x5fe/0x9d0
      [  279.979562]  netlink_rcv_skb+0x12c/0x360
      [  279.980388]  netlink_unicast+0x553/0x790
      [  279.981214]  netlink_sendmsg+0x7a1/0xcb0
      [  279.982043]  sock_sendmsg+0xc5/0x190
      [  279.982827]  ____sys_sendmsg+0x535/0x6b0
      [  279.983703]  ___sys_sendmsg+0xeb/0x170
      [  279.984510]  __sys_sendmsg+0xb5/0x140
      [  279.985298]  do_syscall_64+0x3d/0x90
      [  279.986076]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      [  279.987532] The buggy address belongs to the object at ffff888147e2bf00
                      which belongs to the cache kmalloc-192 of size 192
      [  279.989747] The buggy address is located 32 bytes inside of
                      freed 192-byte region [ffff888147e2bf00, ffff888147e2bfc0)
      
      [  279.992367] The buggy address belongs to the physical page:
      [  279.993430] page:00000000550f405c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x147e2a
      [  279.995182] head:00000000550f405c order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      [  279.996713] anon flags: 0x200000000010200(slab|head|node=0|zone=2)
      [  279.997878] raw: 0200000000010200 ffff888100042a00 0000000000000000 dead000000000001
      [  279.999384] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
      [  280.000894] page dumped because: kasan: bad access detected
      
      [  280.002386] Memory state around the buggy address:
      [  280.003338]  ffff888147e2be00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  280.004781]  ffff888147e2be80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [  280.006224] >ffff888147e2bf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  280.007700]                                ^
      [  280.008592]  ffff888147e2bf80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [  280.010035]  ffff888147e2c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  280.011564] ==================================================================
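
The fix described in this commit (unlink each callback from the driver's list before freeing it) can be illustrated with a small userspace model. Everything here is a hypothetical stand-in: `struct demo_list` imitates the kernel's intrusive list, and `struct demo_block_cb` imitates an object that sits on two lists at once, as the flow_block_cb instances do.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Minimal circular doubly-linked list, modelled on the kernel's list_head. */
struct demo_list { struct demo_list *prev, *next; };

static void demo_list_init(struct demo_list *h) { h->prev = h->next = h; }
static bool demo_list_empty(const struct demo_list *h) { return h->next == h; }

static void demo_list_add_tail(struct demo_list *n, struct demo_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void demo_list_del(struct demo_list *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Hypothetical callback object linked on two lists simultaneously. */
struct demo_block_cb {
	struct demo_list list;        /* must stay first: linked on the bind list */
	struct demo_list driver_list; /* linked on the driver's own list */
};

/* Models the driver callback: allocate and link on both lists. */
static struct demo_block_cb *demo_cb_alloc(struct demo_list *cb_list,
					   struct demo_list *drv_list)
{
	struct demo_block_cb *cb = malloc(sizeof(*cb));

	if (!cb)
		return NULL;
	demo_list_add_tail(&cb->list, cb_list);
	demo_list_add_tail(&cb->driver_list, drv_list);
	return cb;
}

/* Error unwind: unlink from BOTH lists before freeing. */
static void demo_unwind(struct demo_list *cb_list)
{
	while (!demo_list_empty(cb_list)) {
		/* 'list' is the first member, so this cast is valid. */
		struct demo_block_cb *cb = (struct demo_block_cb *)cb_list->next;

		demo_list_del(&cb->list);
		demo_list_del(&cb->driver_list); /* omitting this leaves a dangling entry */
		free(cb);
	}
}
```

If the second `demo_list_del()` is dropped, the driver list keeps pointing at freed memory, which is the situation the KASAN report above shows.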
      
      Fixes: 59094b1e ("net: sched: use flow block API")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix skb_copy_ubufs() vs BIG TCP · 7e692df3
      Eric Dumazet authored
      David Ahern reported crashes in skb_copy_ubufs() caused by TCP tx zerocopy
      using hugepages, and skb length bigger than ~68 KB.
      
      skb_copy_ubufs() assumed it could copy all payload using up to
      MAX_SKB_FRAGS order-0 pages.
      
      This assumption broke when BIG TCP was able to put up to 512 KB per skb.
      
      We did not hit this bug at Google because we use CONFIG_MAX_SKB_FRAGS=45
      and limit gso_max_size to 180000.
      
      A solution is to use higher order pages if needed.
      
      v2: add missing __GFP_COMP, or we leak memory.
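
The sizing argument can be worked through with a short calculation. The sketch below assumes a 4 KiB page and a fragment limit of 17 (both labelled `DEMO_*` because they are illustrative constants, not values read from a kernel configuration) and picks the smallest page order at which that many pages can hold a given payload.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE     4096u /* assumed page size */
#define DEMO_MAX_SKB_FRAGS 17u   /* assumed fragment limit */

/*
 * Smallest allocation order such that DEMO_MAX_SKB_FRAGS pages of
 * (DEMO_PAGE_SIZE << order) bytes can hold 'len' bytes of payload.
 */
static unsigned int demo_copy_order(size_t len)
{
	unsigned int order = 0;

	while (((size_t)DEMO_PAGE_SIZE << order) * DEMO_MAX_SKB_FRAGS < len)
		order++;
	return order;
}
```

With these constants, 17 order-0 pages hold 69632 bytes (68 KiB), which matches the ~68 KB threshold mentioned above; one byte more needs order 1, and a 512 KB skb needs order 3. The same arithmetic with 45 fragments gives 184320 bytes, which covers a gso_max_size of 180000 without leaving order 0.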
      
      Fixes: 7c4e983c ("net: allow gso_max_size to exceed 65536")
      Reported-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/netdev/c70000f6-baa4-4a05-46d0-4b3e0dc1ccc8@gmail.com/T/
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Coco Li <lixiaoyan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>