1. 19 Nov, 2020 8 commits
  2. 18 Nov, 2020 17 commits
  3. 17 Nov, 2020 15 commits
    • net/tls: Fix wrong record sn in async mode of device resync · 138559b9
      Tariq Toukan authored
      In async_resync mode, we log the TCP seq of records until the async request
      is completed.  Later, in case one of the logged seqs matches the resync
      request, we return it, together with its record serial number.  Before this
      fix, we mistakenly returned the serial number of the current record
      instead.
      
      Fixes: ed9b7646 ("net/tls: Add asynchronous resync")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Boris Pismenny <borisp@nvidia.com>
      Link: https://lore.kernel.org/r/20201115131448.2702-1-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
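
The async resync behaviour described in the net/tls commit above can be sketched as a small lookup over logged entries. This is a simplified userspace model, not the kernel's data layout: `struct resync_log`, `resync_log_add` and `resync_match` are hypothetical names, and storing the record serial number next to each TCP seq is an assumption made for clarity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RESYNC_LOG_MAX 8 /* hypothetical capacity */

/* One logged record: its TCP sequence number and its record serial number. */
struct resync_entry {
    uint32_t tcp_seq;
    uint64_t rcd_sn;
};

struct resync_log {
    struct resync_entry e[RESYNC_LOG_MAX];
    size_t n;
};

/* Log a record while the async request is pending; false when the log is full. */
bool resync_log_add(struct resync_log *log, uint32_t tcp_seq, uint64_t rcd_sn)
{
    if (log->n == RESYNC_LOG_MAX)
        return false;
    log->e[log->n].tcp_seq = tcp_seq;
    log->e[log->n].rcd_sn = rcd_sn;
    log->n++;
    return true;
}

/*
 * Resolve a resync request. On a match, report the serial number that was
 * logged with the matching seq, not the serial number of whichever record
 * is being processed when the request completes.
 */
bool resync_match(const struct resync_log *log, uint32_t req_seq,
                  uint64_t *rcd_sn_out)
{
    for (size_t i = 0; i < log->n; i++) {
        if (log->e[i].tcp_seq == req_seq) {
            *rcd_sn_out = log->e[i].rcd_sn;
            return true;
        }
    }
    return false;
}
```

The bug class the commit describes corresponds to writing the current record's serial number into `rcd_sn_out` instead of the logged one.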
    • netdevsim: set .owner to THIS_MODULE · a5bbcbf2
      Taehee Yoo authored
      If .owner is not set to THIS_MODULE, the module can be removed while
      its debugfs files are in use, which eventually causes a kernel panic.
      
      Fixes: 82c93a87 ("netdevsim: implement couple of testing devlink health reporters")
      Fixes: 424be63a ("netdevsim: add UDP tunnel port offload support")
      Fixes: 4418f862 ("netdevsim: implement support for devlink region and snapshots")
      Fixes: d3cbb907 ("netdevsim: add ACL trap reporting cookie as a metadata")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20201115103041.30701-1-ap420073@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • enetc: Workaround for MDIO register access issue · fd5736bf
      Alex Marginean authored
      Due to a hardware issue, an access to MDIO registers
      that is concurrent with other ENETC register accesses
      may lead to the MDIO access being dropped or corrupted.
      The workaround introduces locking for all register accesses
      to the ENETC register space.  To reduce performance impact,
      a readers-writers locking scheme has been implemented.
      The writer in this case is the MDIO access code (irrelevant
      whether that MDIO access is a register read or write), and
      the reader is any access code to non-MDIO ENETC registers.
      Also, the datapath functions acquire the read lock fewer times
      and use _hot accessors.  All the rest of the code uses the _wa
      accessors which lock every register access.
      MDIO support was introduced by
      commit ebfcb23d ("enetc: Add ENETC PF level external MDIO support"),
      but due to subsequent refactoring this patch is applicable only on
      top of a later commit.
      
      Fixes: 6517798d ("enetc: Make MDIO accessors more generic and export to include/linux/fsl")
      Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Link: https://lore.kernel.org/r/20201112182608.26177-1-claudiu.manoil@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/mlx5: fix error return code in mlx5e_tc_nic_init() · 68ec32da
      Wang Hai authored
      Fix to return a negative error code from the error handling
      case instead of 0, as done elsewhere in this function.
      
      Fixes: aedd133d ("net/mlx5e: Support CT offload for tc nic flows")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
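
The class of bug fixed by "fix error return code" commits like the one above can be illustrated with a generic init routine. This is a minimal sketch with hypothetical names (`struct model_ctx`, `model_ctx_init`), not the mlx5 code: an error path jumps to cleanup, and the fix is to assign a negative errno to `err` before the jump so the function does not return 0.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical context with two resources set up in sequence. */
struct model_ctx {
    void *table;
    void *aux;
};

/* fail_aux simulates a failure of the second setup step. */
int model_ctx_init(struct model_ctx *c, int fail_aux)
{
    int err = 0;

    c->table = malloc(16);
    if (!c->table)
        return -ENOMEM;

    c->aux = fail_aux ? NULL : malloc(16);
    if (!c->aux) {
        /*
         * Without this assignment err would still hold 0 from the
         * earlier successful step, and the caller would see success.
         */
        err = -ENOMEM;
        goto err_free_table;
    }

    return 0;

err_free_table:
    free(c->table);
    c->table = NULL;
    return err;
}
```

A caller checking `if (model_ctx_init(&c, ...))` only detects the failure when the error path returns a nonzero value, which is why a stale 0 is harmful.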
    • net/mlx5: E-Switch, Fail mlx5_esw_modify_vport_rate if qos disabled · 5b8631c7
      Eli Cohen authored
      Avoid calling mlx5_esw_modify_vport_rate() if qos is not enabled and
      avoid unnecessary syndrome messages from firmware.
      
      Fixes: fcb64c0f ("net/mlx5: E-Switch, add ingress rate support")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Disable QoS when min_rates on all VFs are zero · 470b7475
      Vladyslav Tarasiuk authored
      Currently, when QoS is enabled for a VF and any min_rate is configured,
      the driver sets the bw_share value to at least 1 and does not allow
      setting it to 0 to make the minimal rate unlimited. This means there is
      always a minimal rate configured for every VF, even if the user tries
      to remove it.
      
      In order to make QoS disable possible, check whether all vports have
      configured min_rate = 0. If this is true, set their bw_share to 0 to
      disable min_rate limitations.
      
      Fixes: c9497c98 ("net/mlx5: Add support for setting VF min rate")
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
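
The rule described in the commit above (clamp bw_share to at least 1, except when every vport has min_rate = 0) can be sketched as follows. The scaling formula, the `MODEL_MIN_BW_SHARE` floor and the function name are assumptions for illustration; the commit message does not spell out how the driver derives bw_share from min_rate.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MIN_BW_SHARE 1u /* hypothetical floor, per "at least 1" above */

/*
 * Fill bw_share[] from min_rate[]. If every min_rate is 0, every share is
 * set to 0, which removes the minimal-rate limit. Otherwise each share is
 * scaled against the largest min_rate and clamped to the floor.
 */
void model_normalize_min_rate(const uint32_t *min_rate, uint32_t *bw_share,
                              size_t n, uint32_t max_share)
{
    uint32_t max_rate = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (min_rate[i] > max_rate)
            max_rate = min_rate[i];

    for (i = 0; i < n; i++) {
        uint64_t share;

        if (max_rate == 0) {
            bw_share[i] = 0; /* all min_rates are zero: no limit */
            continue;
        }
        share = (uint64_t)min_rate[i] * max_share / max_rate;
        if (share < MODEL_MIN_BW_SHARE)
            share = MODEL_MIN_BW_SHARE;
        bw_share[i] = (uint32_t)share;
    }
}
```

In this model a vport with min_rate = 0 still gets a share of 1 while any other vport has a nonzero min_rate; only the all-zero case produces shares of 0.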
    • net/mlx5: Clear bw_share upon VF disable · 1ce5fc72
      Vladyslav Tarasiuk authored
      Currently, if the user disables VFs that have min and max rates
      configured, those rates are cleared. However, the QoS data is not
      cleared and is restored on the next VF enable, placing limits on the
      minimal rate for a given VF when the user expects none.
      
      To match cleared vport->info struct with QoS-related min and max rates
      upon VF disable, clear vport->qos struct too.
      
      Fixes: 556b9d16 ("net/mlx5: Clear VF's configuration on disabling SRIOV")
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Add handling of port type in rule deletion · 8cbcc5ef
      Michael Guralnik authored
      Handle destruction of rules with port destination type to enable
      full destruction of flow.
      
      Without this handling, deletion of TX rules fails.
      Dmesg of the flow destruction failure:
      
      [  203.714146] mlx5_core 0000:00:0b.0: mlx5_cmd_check:753:(pid 342): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x144b7a)
      [  210.547387] ------------[ cut here ]------------
      [  210.548663] refcount_t: decrement hit 0; leaking memory.
      [  210.550651] WARNING: CPU: 4 PID: 342 at lib/refcount.c:31 refcount_warn_saturate+0x5c/0x110
      [  210.550654] Modules linked in: mlx5_ib mlx5_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core
      [  210.550675] CPU: 4 PID: 342 Comm: test Not tainted 5.8.0-rc2+ #116
      [  210.550678] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [  210.550680] RIP: 0010:refcount_warn_saturate+0x5c/0x110
      [  210.550685] Code: c6 d1 1b 01 00 0f 84 ad 00 00 00 5b 5d c3 80 3d b5 d1 1b 01 00 75 f4 48 c7 c7 20 d1 15 82 c6 05 a5 d1 1b 01 01 e8 a7 eb af ff <0f> 0b eb dd 80 3d 99 d1 1b 01 00 75 d4 48 c7 c7 c0 cf 15 82 c6 05
      [  210.550687] RSP: 0018:ffff8881642e77e8 EFLAGS: 00010282
      [  210.550691] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
      [  210.550694] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102c85ceef
      [  210.550696] RBP: ffff888161720428 R08: ffffffff8124c10e R09: ffffed103243beae
      [  210.550698] R10: ffff8881921df56b R11: ffffed103243bead R12: ffff8881841b4180
      [  210.550701] R13: ffff888161720428 R14: ffff8881616d0000 R15: ffff888161720380
      [  210.550704] FS:  00007fc27f025740(0000) GS:ffff888192000000(0000) knlGS:0000000000000000
      [  210.550706] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  210.550708] CR2: 0000557e4b41a6a0 CR3: 0000000002415004 CR4: 0000000000360ea0
      [  210.550711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  210.550713] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  210.550715] Call Trace:
      [  210.550717]  mlx5_del_flow_rules+0x484/0x490 [mlx5_core]
      [  210.550720]  ? mlx5_cmd_set_fte+0xa80/0xa80 [mlx5_core]
      [  210.550722]  mlx5_ib_destroy_flow+0x17f/0x280 [mlx5_ib]
      [  210.550724]  uverbs_free_flow+0x4c/0x90 [ib_uverbs]
      [  210.550726]  destroy_hw_idr_uobject+0x41/0xb0 [ib_uverbs]
      [  210.550728]  uverbs_destroy_uobject+0xaa/0x390 [ib_uverbs]
      [  210.550731]  __uverbs_cleanup_ufile+0x129/0x1b0 [ib_uverbs]
      [  210.550733]  ? uverbs_destroy_uobject+0x390/0x390 [ib_uverbs]
      [  210.550735]  uverbs_destroy_ufile_hw+0x78/0x190 [ib_uverbs]
      [  210.550737]  ib_uverbs_close+0x36/0x140 [ib_uverbs]
      [  210.550739]  __fput+0x181/0x380
      [  210.550741]  task_work_run+0x88/0xd0
      [  210.550743]  do_exit+0x5f6/0x13b0
      [  210.550745]  ? sched_clock_cpu+0x30/0x140
      [  210.550747]  ? is_current_pgrp_orphaned+0x70/0x70
      [  210.550750]  ? lock_downgrade+0x360/0x360
      [  210.550752]  ? mark_held_locks+0x1d/0x90
      [  210.550754]  do_group_exit+0x8a/0x140
      [  210.550756]  get_signal+0x20a/0xf50
      [  210.550758]  do_signal+0x8c/0xbe0
      [  210.550760]  ? hrtimer_nanosleep+0x1d8/0x200
      [  210.550762]  ? nanosleep_copyout+0x50/0x50
      [  210.550764]  ? restore_sigcontext+0x320/0x320
      [  210.550766]  ? __hrtimer_init+0xf0/0xf0
      [  210.550768]  ? timespec64_add_safe+0x150/0x150
      [  210.550770]  ? mark_held_locks+0x1d/0x90
      [  210.550772]  ? lockdep_hardirqs_on_prepare+0x14c/0x240
      [  210.550774]  __prepare_exit_to_usermode+0x119/0x170
      [  210.550776]  do_syscall_64+0x65/0x300
      [  210.550778]  ? trace_hardirqs_off+0x10/0x120
      [  210.550781]  ? mark_held_locks+0x1d/0x90
      [  210.550783]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
      [  210.550785]  ? lockdep_hardirqs_on+0x112/0x190
      [  210.550787]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  210.550789] RIP: 0033:0x7fc27f1cd157
      [  210.550791] Code: Bad RIP value.
      [  210.550793] RSP: 002b:00007ffd4db27ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000023
      [  210.550798] RAX: fffffffffffffdfc RBX: ffffffffffffff80 RCX: 00007fc27f1cd157
      [  210.550800] RDX: 00007fc27f025740 RSI: 00007ffd4db27eb0 RDI: 00007ffd4db27eb0
      [  210.550803] RBP: 0000000000000016 R08: 0000000000000000 R09: 000000000000000e
      [  210.550805] R10: 00007ffd4db27dc7 R11: 0000000000000246 R12: 0000000000400c00
      [  210.550808] R13: 00007ffd4db285f0 R14: 0000000000000000 R15: 0000000000000000
      [  210.550809] irq event stamp: 49399
      [  210.550812] hardirqs last  enabled at (49399): [<ffffffff81172d36>] console_unlock+0x556/0x6f0
      [  210.550815] hardirqs last disabled at (49398): [<ffffffff81172897>] console_unlock+0xb7/0x6f0
      [  210.550818] softirqs last  enabled at (48706): [<ffffffff81e0037b>] __do_softirq+0x37b/0x60c
      [  210.550820] softirqs last disabled at (48697): [<ffffffff81c00e2f>] asm_call_on_stack+0xf/0x20
      [  210.550822] ---[ end trace ad18c0e6fa846454 ]---
      [  210.581862] mlx5_core 0000:00:0c.0: mlx5_destroy_flow_table:2132:(pid 342): Flow table 262150 wasn't destroyed, refcount > 1
      
      Fixes: a7ee18bd ("RDMA/mlx5: Allow creating a matcher for a NIC TX flow table")
      Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix check if netdev is bond slave · 219b3267
      Maor Dickman authored
      The bond events handler uses bond_slave_get_rtnl to check whether a net
      device is a bond slave. bond_slave_get_rtnl returns the RCU rx_handler
      pointer from the netdev, which exists for bond slaves but also for
      devices that are attached to a Linux bridge, so using it as an
      indication of a bond slave is wrong.
      
      Fix by using netif_is_lag_port instead.
      
      Fixes: 7e51891a ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Raed Salem <raeds@nvidia.com>
      Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
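
Why a non-NULL handler pointer is a poor bond-slave test, as argued in the commit above, can be shown with a userspace model. Everything here is hypothetical (`struct model_netdev`, the `MODEL_IFF_*` constants, both helpers); the sketch assumes that a LAG-port check is based on device flags, which the commit message itself does not describe.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for the kernel's device flags. */
#define MODEL_IFF_SLAVE   0x1u
#define MODEL_IFF_BONDING 0x2u

struct model_netdev {
    unsigned int flags;
    void *rx_handler_data; /* set by bonding, but also by a bridge */
};

/* Old-style check: true for any device that has a handler attached. */
bool model_has_rx_handler(const struct model_netdev *dev)
{
    return dev->rx_handler_data != NULL;
}

/* Flag-based check: true only for a device marked as a bonding slave. */
bool model_is_lag_port(const struct model_netdev *dev)
{
    return (dev->flags & MODEL_IFF_SLAVE) &&
           (dev->flags & MODEL_IFF_BONDING);
}
```

A bridge port in this model has `rx_handler_data` set but neither flag, so the pointer check misclassifies it while the flag check does not.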
    • net/mlx5e: Fix IPsec packet drop by mlx5e_tc_update_skb · 6248ce99
      Huy Nguyen authored
      Both TC and IPsec crypto offload use metadata_regB to store
      private information. Since TC does not use bit 31 of regB, IPsec
      will use bit 31 as the IPsec packet marker. The IPsec's regB usage
      is changed to:
      Bit31: IPsec marker
      Bit30-24: IPsec syndrome
      Bit23-0: IPsec obj id
      
      Fixes: b2ac7541 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload")
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Raed Salem <raeds@nvidia.com>
      Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
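
The regB layout listed in the commit above (bit 31 marker, bits 30-24 syndrome, bits 23-0 object id) can be expressed with a few helpers. The macro and function names are hypothetical and are not taken from the driver; only the bit positions come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_REGB_MARKER_SHIFT   31u
#define MODEL_REGB_SYNDROME_SHIFT 24u
#define MODEL_REGB_SYNDROME_MASK  0x7Fu       /* 7 bits: 30-24 */
#define MODEL_REGB_OBJ_ID_MASK    0x00FFFFFFu /* 24 bits: 23-0 */

/* Build a regB value for an IPsec packet; out-of-range bits are dropped. */
static inline uint32_t model_regb_pack(uint32_t syndrome, uint32_t obj_id)
{
    return (1u << MODEL_REGB_MARKER_SHIFT) |
           ((syndrome & MODEL_REGB_SYNDROME_MASK) << MODEL_REGB_SYNDROME_SHIFT) |
           (obj_id & MODEL_REGB_OBJ_ID_MASK);
}

/* Bit 31 distinguishes IPsec metadata from other users of regB. */
static inline bool model_regb_is_ipsec(uint32_t regb)
{
    return (regb >> MODEL_REGB_MARKER_SHIFT) & 1u;
}

static inline uint32_t model_regb_syndrome(uint32_t regb)
{
    return (regb >> MODEL_REGB_SYNDROME_SHIFT) & MODEL_REGB_SYNDROME_MASK;
}

static inline uint32_t model_regb_obj_id(uint32_t regb)
{
    return regb & MODEL_REGB_OBJ_ID_MASK;
}
```

Since TC does not use bit 31, a receiver can test `model_regb_is_ipsec()` first and only then interpret the remaining bits as IPsec fields.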
    • net/mlx5e: Set IPsec WAs only in IP's non checksum partial case. · 5cfb540e
      Huy Nguyen authored
      The IP checksum-partial case still requires the L4 csum flag on the
      Ethernet WQE. Apply the IPsec WAs only in the non-checksum-partial
      case (for example, an ICMP packet).
      
      Fixes: 5be01904 ("net/mlx5e: IPsec: Add Connect-X IPsec Tx data path offload")
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Raed Salem <raeds@nvidia.com>
      Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix refcount leak on kTLS RX resync · ea636098
      Maxim Mikityanskiy authored
      On resync, the driver calls inet_lookup_established
      (__inet6_lookup_established) that increases sk_refcnt of the socket. To
      decrease it, the driver set skb->destructor to sock_edemux. However, it
      didn't work well, because the TCP stack also sets this destructor for
      early demux, and the refcount gets decreased only once, while increased
      two times (in mlx5e and in the TCP stack). It leads to a socket leak, a
      TLS context leak, which in the end leads to calling tls_dev_del twice:
      on socket close and on driver unload, which in turn leads to a crash.
      
      This commit fixes the refcount leak by calling sock_gen_put right away
      after using the socket, thus fixing all the subsequent issues.
      
      Fixes: 0419d8c9 ("net/mlx5e: kTLS, Add kTLS RX resync support")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
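
The reference imbalance described in the commit above can be reproduced in a small userspace model. All names are hypothetical stand-ins (`struct model_sock`, `model_lookup`, `model_early_demux`, and so on); the model only mirrors the sequence in the commit message: two lookups each take a reference, but a single destructor call drops only one.

```c
#include <stddef.h>

struct model_sock { int refcnt; };

struct model_skb {
    struct model_sock *sk;
    void (*destructor)(struct model_skb *skb);
};

/* A lookup returns the socket with an extra reference held. */
static struct model_sock *model_lookup(struct model_sock *sk)
{
    sk->refcnt++;
    return sk;
}

static void model_sock_put(struct model_sock *sk) { sk->refcnt--; }

static void model_edemux(struct model_skb *skb) { model_sock_put(skb->sk); }

/* Stack side: takes its own reference and installs the same destructor. */
static void model_early_demux(struct model_skb *skb, struct model_sock *sk)
{
    skb->sk = model_lookup(sk);
    skb->destructor = model_edemux;
}

static void model_free_skb(struct model_skb *skb)
{
    if (skb->destructor)
        skb->destructor(skb); /* runs once, drops one reference */
    skb->sk = NULL;
    skb->destructor = NULL;
}

/* Old scheme: rely on the skb destructor to drop the driver's reference. */
int model_resync_leaky(struct model_sock *sk)
{
    struct model_skb skb = { NULL, NULL };

    skb.sk = model_lookup(sk);
    skb.destructor = model_edemux;
    model_early_demux(&skb, sk); /* overwrites the destructor setup */
    model_free_skb(&skb);
    return sk->refcnt;
}

/* Fixed scheme: drop the driver's reference right after using the socket. */
int model_resync_balanced(struct model_sock *sk)
{
    struct model_skb skb = { NULL, NULL };
    struct model_sock *found = model_lookup(sk);

    /* ... the resync would use 'found' here ... */
    model_sock_put(found);
    model_early_demux(&skb, sk);
    model_free_skb(&skb);
    return sk->refcnt;
}
```

Starting from a refcount of 1, the leaky variant ends at 2 and the balanced variant ends at 1, which is the difference between a leaked socket and a clean release.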
    • tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate · 1b9e2a8c
      Ryan Sharpelletti authored
      During loss recovery, retransmitted packets are forced to use TCP
      timestamps to calculate the RTT samples, which have a millisecond
      granularity. BBR is designed using a microsecond granularity. As a
      result, multiple RTT samples could be truncated to the same RTT value
      during loss recovery. This is problematic, as BBR will not enter
      PROBE_RTT if the RTT sample is <= the current min_rtt sample, meaning
      that if there are persistent losses, PROBE_RTT will constantly be
      pushed off and potentially never re-entered. This patch makes sure
      that BBR enters PROBE_RTT by checking if RTT sample is < the current
      min_rtt sample, rather than <=.
      
      The Netflix transport/TCP team discovered this bug in the Linux TCP
      BBR code during lab tests.
      
      Fixes: 0f8782ea ("tcp_bbr: add BBR congestion control")
      Signed-off-by: Ryan Sharpelletti <sharpelletti@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Link: https://lore.kernel.org/r/20201116174412.1433277-1-sharpelletti.kdev@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
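
The effect of `<=` versus `<` in the BBR commit above can be seen in a toy min-RTT filter. This is a simplified model under stated assumptions, not the tcp_bbr code: the struct, the function names, the abstract time ticks and the idea that an expired filter window is what triggers PROBE_RTT are all illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_minrtt {
    uint32_t min_rtt_us;
    uint32_t stamp; /* time at which min_rtt_us was last refreshed */
};

/*
 * Feed one RTT sample. Returns true when the filter window had expired at
 * 'now', which in this model is the moment PROBE_RTT would be entered.
 * 'strict' selects '<' (patched behaviour) instead of '<='.
 */
bool model_update_min_rtt(struct model_minrtt *f, uint32_t rtt_us,
                          uint32_t now, uint32_t win, bool strict)
{
    bool expired = now > f->stamp + win;
    bool refresh = strict ? rtt_us < f->min_rtt_us
                          : rtt_us <= f->min_rtt_us;

    if (refresh || expired) {
        f->min_rtt_us = rtt_us;
        f->stamp = now;
    }
    return expired;
}

/*
 * Count window expirations when every sample is truncated to the same
 * value as the current minimum, as can happen with millisecond timestamps.
 */
int model_probe_rtt_entries(bool strict, uint32_t ticks, uint32_t win)
{
    struct model_minrtt f = { 10000u, 0u };
    int entries = 0;

    for (uint32_t now = 1; now <= ticks; now++)
        if (model_update_min_rtt(&f, 10000u, now, win, strict))
            entries++;
    return entries;
}
```

With `<=`, each equal sample refreshes the stamp, so the window never expires and `model_probe_rtt_entries(false, 25, 10)` returns 0; with `<`, equal samples leave the stamp alone and the window expires periodically.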
    • net: ftgmac100: Fix crash when removing driver · 3d517945
      Joel Stanley authored
      When removing the driver we would hit BUG_ON(!list_empty(&dev->ptype_specific))
      in net/core/dev.c due to still having the NC-SI packet handler
      registered.
      
       # echo 1e660000.ethernet > /sys/bus/platform/drivers/ftgmac100/unbind
        ------------[ cut here ]------------
        kernel BUG at net/core/dev.c:10254!
        Internal error: Oops - BUG: 0 [#1] SMP ARM
        CPU: 0 PID: 115 Comm: sh Not tainted 5.10.0-rc3-next-20201111-00007-g02e0365710c4 #46
        Hardware name: Generic DT based system
        PC is at netdev_run_todo+0x314/0x394
        LR is at cpumask_next+0x20/0x24
        pc : [<806f5830>]    lr : [<80863cb0>]    psr: 80000153
        sp : 855bbd58  ip : 00000001  fp : 855bbdac
        r10: 80c03d00  r9 : 80c06228  r8 : 81158c54
        r7 : 00000000  r6 : 80c05dec  r5 : 80c05d18  r4 : 813b9280
        r3 : 813b9054  r2 : 8122c470  r1 : 00000002  r0 : 00000002
        Flags: Nzcv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
        Control: 00c5387d  Table: 85514008  DAC: 00000051
        Process sh (pid: 115, stack limit = 0x7cb5703d)
       ...
        Backtrace:
        [<806f551c>] (netdev_run_todo) from [<80707eec>] (rtnl_unlock+0x18/0x1c)
         r10:00000051 r9:854ed710 r8:81158c54 r7:80c76bb0 r6:81158c10 r5:8115b410
         r4:813b9000
        [<80707ed4>] (rtnl_unlock) from [<806f5db8>] (unregister_netdev+0x2c/0x30)
        [<806f5d8c>] (unregister_netdev) from [<805a8180>] (ftgmac100_remove+0x20/0xa8)
         r5:8115b410 r4:813b9000
        [<805a8160>] (ftgmac100_remove) from [<805355e4>] (platform_drv_remove+0x34/0x4c)
      
      Fixes: bd466c3f ("net/faraday: Support NCSI mode")
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Link: https://lore.kernel.org/r/20201117024448.1170761-1-joel@jms.id.au
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: b44: fix error return code in b44_init_one() · 7b027c24
      Zhang Changzhong authored
      Fix to return a negative error code from the error handling
      case instead of 0, as done elsewhere in this function.
      
      Fixes: 39a6f4bc ("b44: replace the ssb_dma API with the generic DMA API")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/1605582131-36735-1-git-send-email-zhangchangzhong@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>