1. 09 Aug, 2019 8 commits
  2. 08 Aug, 2019 2 commits
    • inet: frags: re-introduce skb coalescing for local delivery · 891584f4
      Guillaume Nault authored
      Before commit d4289fcc ("net: IP6 defrag: use rbtrees for IPv6
      defrag"), a netperf UDP_STREAM test[0] using big IPv6 datagrams (thus
      generating many fragments) and running over an IPsec tunnel, reported
      more than 6Gbps throughput. After that patch, the same test gets only
      9Mbps when receiving on a be2net NIC (the driver can make a big
      difference here; for example, ixgbe doesn't seem to be affected).
      
      By reusing the IPv4 defragmentation code, IPv6 lost fragment coalescing
      (IPv4 fragment coalescing was dropped by commit 14fe22e3 ("Revert
      "ipv4: use skb coalescing in defragmentation"")).
      
      Without fragment coalescing, be2net runs out of Rx ring entries and
      starts to drop frames (ethtool reports rx_drops_no_frags errors). Since
      the netperf traffic is only composed of UDP fragments, any lost packet
      prevents reassembly of the full datagram. Therefore, fragments which
      have no possibility to ever get reassembled pile up in the reassembly
      queue, until the memory accounting exceeds the threshold. At that point
      no fragment is accepted anymore, which effectively discards all
      netperf traffic.
      
      When the reassembly timeout expires, some stale fragments are removed from
      the reassembly queue, so a few packets can be received, reassembled
      and delivered to the netperf receiver. But the nic still drops frames
      and soon the reassembly queue gets filled again with stale fragments.
      These long time frames where no datagram can be received explain why
      the performance drop is so significant.
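
The stall described above can be reproduced with a small Python model. This is only a sketch under stated assumptions: the class name, the threshold, and the fragment sizes are hypothetical and are not taken from the kernel code. It shows how datagrams that each miss one fragment fill the memory budget until every later fragment is rejected, and how a timeout lets traffic through again.

```python
from dataclasses import dataclass, field


@dataclass
class ReassemblyQueue:
    """Toy model of a fragment reassembly queue with memory accounting."""

    high_thresh: int                              # hypothetical limit, in bytes
    mem: int = 0                                  # bytes held by pending fragments
    pending: dict = field(default_factory=dict)   # datagram id -> fragment indexes

    def receive(self, dgram_id, frag_index, frag_count, frag_size):
        """Return 'delivered', 'queued' or 'rejected' for one fragment."""
        if self.mem + frag_size > self.high_thresh:
            return "rejected"                     # over the limit: nothing is accepted
        frags = self.pending.setdefault(dgram_id, set())
        frags.add(frag_index)
        self.mem += frag_size
        if len(frags) == frag_count:
            # Datagram complete: release its memory and deliver it.
            self.mem -= frag_count * frag_size
            del self.pending[dgram_id]
            return "delivered"
        return "queued"

    def expire_all(self):
        """Model the reassembly timeout: drop every stale fragment."""
        self.pending.clear()
        self.mem = 0


def run(queue, datagrams, frag_count, frag_size, lost_index):
    """Send datagrams, dropping fragment `lost_index` of each (None = no loss)."""
    results = []
    for dgram_id in range(datagrams):
        for i in range(frag_count):
            if i == lost_index:
                continue                          # the NIC dropped this frame
            results.append(queue.receive(dgram_id, i, frag_count, frag_size))
    return results
```

Running `run()` with one lost fragment per datagram never yields a `'delivered'` result, and once `mem` reaches `high_thresh` every further fragment is `'rejected'` until `expire_all()` is called.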
      
      Re-introducing fragment coalescing is enough to restore the initial
      performance (6.6Gbps with be2net): the driver no longer drops frames
      (no more rx_drops_no_frags errors) and the reassembly engine
      works at full speed.
      
      This patch is quite conservative and only coalesces skbs for local
      IPv4 and IPv6 delivery (in order to avoid changing skb geometry when
      forwarding). Coalescing could be extended in the future if need be, as
      more scenarios would probably benefit from it.
      
      [0]: Test configuration
      Sender:
      ip xfrm policy flush
      ip xfrm state flush
      ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1
      ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir in tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow
      ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1
      ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir out tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow
      netserver -D -L fc00:2::1
      
      Receiver:
      ip xfrm policy flush
      ip xfrm state flush
      ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1
      ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir in tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow
      ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1
      ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir out tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow
      netperf -H fc00:2::1 -f k -P 0 -L fc00:1::1 -l 60 -t UDP_STREAM -I 99,5 -i 5,5 -T5,5 -6
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'batadv-net-for-davem-20190808' of git://git.open-mesh.org/linux-merge · f6649feb
      David S. Miller authored
      Simon Wunderlich says:
      
      ====================
      Here are some batman-adv bugfixes:
      
       - Fix netlink dumping of all mcast_flags buckets, by Sven Eckelmann
      
       - Fix deletion of RTR(4|6) mcast list entries, by Sven Eckelmann
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 07 Aug, 2019 1 commit
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 33920f1e
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Yeah I should have sent a pull request last week, so there is a lot
        more here than usual:
      
         1) Fix memory leak in ebtables compat code, from Wenwen Wang.
      
         2) Several kTLS bug fixes from Jakub Kicinski (circular close on
            disconnect etc.)
      
         3) Force slave speed check on link state recovery in bonding 802.3ad
            mode, from Thomas Falcon.
      
         4) Clear RX descriptor bits before assigning buffers to them in
            stmmac, from Jose Abreu.
      
         5) Several missing of_node_put() calls, mostly wrt. for_each_*() OF
            loops, from Nishka Dasgupta.
      
         6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean.
      
         7) Need to hold sock across skb->destructor invocation, from Cong
            Wang.
      
         8) IP header length needs to be validated in ipip tunnel xmit, from
            Haishuang Yan.
      
         9) Use after free in ip6 tunnel driver, also from Haishuang Yan.
      
        10) Do not use MSI interrupts on r8169 chips before RTL8168d, from
            Heiner Kallweit.
      
        11) Upon bridge device init failure, we need to delete the local fdb.
            From Nikolay Aleksandrov.
      
        12) Handle errors from of_get_mac_address() properly in stmmac, from
            Martin Blumenstingl.
      
        13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef
            Kadlecsik.
      
        14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with
            some devices, so revert. From Johannes Berg.
      
        15) Fix deadlock in rxrpc, from David Howells.
      
        16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue
            Haibing.
      
        17) Fix mvpp2 crash on module removal, from Matteo Croce.
      
        18) Fix race in genphy_update_link, from Heiner Kallweit.
      
        19) bpf_xdp_adjust_head() stopped working with generic XDP when we
            fixed generic XDP to support stacked devices properly, fix from
            Jesper Dangaard Brouer.
      
        20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from
            David Ahern.
      
        21) Several memory leaks in new sja1105 driver, from Vladimir Oltean"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits)
        net: dsa: sja1105: Fix memory leak on meta state machine error path
        net: dsa: sja1105: Fix memory leak on meta state machine normal path
        net: dsa: sja1105: Really fix panic on unregistering PTP clock
        net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well
        net: dsa: sja1105: Fix broken learning with vlan_filtering disabled
        net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus()
        net: sched: sample: allow accessing psample_group with rtnl
        net: sched: police: allow accessing police->params with rtnl
        net: hisilicon: Fix dma_map_single failed on arm64
        net: hisilicon: fix hip04-xmit never return TX_BUSY
        net: hisilicon: make hip04_tx_reclaim non-reentrant
        tc-testing: updated vlan action tests with batch create/delete
        net sched: update vlan action for batched events operations
        net: stmmac: tc: Do not return a fragment entry
        net: stmmac: Fix issues when number of Queues >= 4
        net: stmmac: xgmac: Fix XGMAC selftests
        be2net: disable bh with spin_lock in be_process_mcc
        net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
        net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs
        net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
        ...
  4. 06 Aug, 2019 29 commits
    • Merge branch 'sja1105-fixes' · feac1d68
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Fixes for SJA1105 DSA: FDBs, Learning and PTP
      
      This is an assortment of functional fixes for the sja1105 switch driver
      targeted for the "net" tree (although they apply on net-next just as
      well).
      
      Patch 1/5 ("net: dsa: sja1105: Fix broken learning with vlan_filtering
      disabled") repairs a breakage introduced in the early development stages
      of the driver: support for traffic from the CPU has broken "normal"
      frame forwarding (based on DMAC) - there is connectivity through the
      switch only because all frames are flooded.
      I debated whether this patch qualifies as a fix, since it puts the
      switch into a mode it has never operated in before (aka SVL). But
      "normal" forwarding did use to work before the "Traffic support for
      SJA1105 DSA driver" patchset, and arguably this patch should have been
      part of that.
      Also, it would be strange for this feature to be broken in the 5.2 LTS.
      
      Patch 2/5 ("net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as
      well") is a simplification of a previous FDB-related patch that is
      currently in the 5.3 rc's.
      
      Patches 3/5 - 5/5 fix various crashes found while running linuxptp over the
      switch ports for extended periods of time, or in conjunction with other
      error conditions. The fixed-up commits were all introduced in 5.2.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Fix memory leak on meta state machine error path · 93fa8587
      Vladimir Oltean authored
      When RX timestamping is enabled and two link-local (non-meta) frames are
      received in a row, this constitutes an error.
      
      The tagger is always caching the last link-local frame, in an attempt to
      merge it with the meta follow-up frame when that arrives. To recover
      from the above error condition, the initial cached link-local frame is
      dropped and the second frame in a row is cached (in anticipation of the
      second meta frame).
      
      However, when dropping the initial link-local frame, its backing memory
      was being leaked.
      
      Fixes: f3097be2 ("net: dsa: sja1105: Add a state machine for RX timestamping")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Fix memory leak on meta state machine normal path · f163fed2
      Vladimir Oltean authored
      After a meta frame is received, it is associated with the cached
      sp->data->stampable_skb from the DSA tagger private structure.
      
      Cached means its refcount is incremented with skb_get() in order for
      dsa_switch_rcv() to not free it when the tagger .rcv returns NULL.
      
      The mistake is that skb_unref() is not the correct function to use. It
      will correctly decrement the refcount (which will go back to zero) but
      the skb memory will not be freed.  That is the job of kfree_skb(), which
      also calls skb_unref().
      
      But it turns out that freeing the cached stampable_skb is in fact not
      necessary.  It is still a perfectly valid skb, and now it is even
      annotated with the partial RX timestamp.  So remove the skb_copy()
      altogether and simply pass the stampable_skb with a refcount of 1
      (incremented by us, decremented by dsa_switch_rcv) up the stack.
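
The difference between the two calls can be illustrated with a toy reference-count model in Python. It is a sketch only: the `Skb` class and the `old_path()`/`new_path()` helpers are hypothetical stand-ins, and the three functions merely borrow the kernel names to mirror the behaviour described in this message.

```python
class Skb:
    """Toy stand-in for a socket buffer with a reference count."""

    def __init__(self):
        self.users = 1        # a freshly allocated buffer has one reference
        self.freed = False


def skb_get(skb):
    """Take an extra reference and return the same buffer."""
    skb.users += 1
    return skb


def skb_unref(skb):
    """Drop one reference; return True if that was the last one.

    This only adjusts the count, it never releases the memory.
    """
    skb.users -= 1
    return skb.users == 0


def kfree_skb(skb):
    """Drop one reference and release the buffer if it was the last."""
    if skb_unref(skb):
        skb.freed = True


def old_path():
    """Leaky sequence: the last reference is dropped without freeing."""
    skb = skb_get(Skb())      # tagger caches the frame: users == 2
    kfree_skb(skb)            # caller drops its reference: users == 1
    skb_unref(skb)            # users == 0, yet nothing frees the buffer
    return skb


def new_path():
    """Fixed sequence: the cached buffer is handed on with one reference."""
    skb = skb_get(Skb())      # users == 2
    kfree_skb(skb)            # users == 1
    return skb                # whoever consumes it calls kfree_skb() later
```

After `old_path()` the buffer has zero users but `freed` is still `False`, which is the leak; after `new_path()` a single later `kfree_skb()` releases it.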
      
      Fixes: f3097be2 ("net: dsa: sja1105: Add a state machine for RX timestamping")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Really fix panic on unregistering PTP clock · 6cb0abbd
      Vladimir Oltean authored
      The IS_ERR_OR_NULL(priv->clock) check inside
      sja1105_ptp_clock_unregister() is preventing cancel_delayed_work_sync
      from actually being run.
      
      Additionally, sja1105_ptp_clock_unregister() does not actually get run,
      when placed in sja1105_remove(). The DSA switch gets torn down, but the
      sja1105 module does not get unregistered. So sja1105_ptp_clock_unregister
      needs to be moved to sja1105_teardown, to be symmetrical with
      sja1105_ptp_clock_register which is called from the DSA sja1105_setup.
      
      It is strange to fix a "fixes" patch, but the probe failure can only be
      seen when the attached PHY does not respond to MDIO (an issue whose cause
      I can't pinpoint) and it goes away after I power-cycle the board.
      This time the patch was validated on a failing board, and the kernel
      panic from the fixed commit's message can no longer be seen.
      
      Fixes: 29dd908d ("net: dsa: sja1105: Cancel PTP delayed work on unregister")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well · 4b7da3d8
      Vladimir Oltean authored
      It looks like the FDB dump taken from first-generation switches also
      contains information on whether entries are static or not. So use that
      instead of searching through the driver's tables.
      
      Fixes: d7637782 ("net: dsa: sja1105: Implement is_static for FDB entries on E/T")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Fix broken learning with vlan_filtering disabled · 6d7c7d94
      Vladimir Oltean authored
      When put under a bridge with vlan_filtering 0, the SJA1105 ports will
      flood all traffic as if learning was broken. This is because learning
      interferes with the rx_vid's configured by dsa_8021q as unique pvid's.
      
      So learning technically still *does* work, it's just that the learnt
      entries never get matched due to their unique VLAN ID.
      
      The setting that saves the day is Shared VLAN Learning, which on this
      switch family works exactly as desired: VLAN tagging still works
      (untagged traffic gets the correct pvid) and FDB entries are still
      populated with the correct contents including VID. Also, a frame cannot
      violate the forwarding domain restrictions enforced by its classified
      VLAN. It is just that the VID is ignored when looking up the FDB for
      taking a forwarding decision (selecting the egress port).
      
      This patch activates SVL, and the result is that frames with a learnt
      DMAC are no longer flooded in the scenario described above.
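
A toy forwarding database in Python makes the effect of SVL concrete. This is a sketch under assumptions: the `Fdb` class, the MAC address, and the per-port pvid values 400 and 401 are hypothetical and are not the values dsa_8021q actually assigns.

```python
FLOOD = "flood"


class Fdb:
    """Toy forwarding database keyed by (MAC, VID)."""

    def __init__(self, shared_vlan_learning):
        self.svl = shared_vlan_learning
        self.entries = {}                 # (mac, vid) -> egress port

    def learn(self, mac, vid, port):
        """Record the source MAC together with the VID it was classified to."""
        self.entries[(mac, vid)] = port

    def lookup(self, mac, vid):
        """Return the egress port for a destination MAC, or FLOOD on a miss."""
        if self.svl:
            # Shared VLAN Learning: the VID is ignored for the lookup only.
            for (entry_mac, _vid), port in self.entries.items():
                if entry_mac == mac:
                    return port
            return FLOOD
        # Independent learning: MAC and VID must both match.
        return self.entries.get((mac, vid), FLOOD)


def forward(fdb):
    """Host A sits behind port 0 (pvid 400), host B behind port 1 (pvid 401)."""
    host_a = "00:00:00:00:00:0a"
    fdb.learn(host_a, vid=400, port=0)    # A transmits, its address is learnt
    return fdb.lookup(host_a, vid=401)    # B replies, classified to its own pvid
```

`forward(Fdb(shared_vlan_learning=False))` returns `FLOOD` because the learnt entry carries a different VID, while `forward(Fdb(shared_vlan_learning=True))` returns port 0.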
      
      Now exactly *because* SVL works as desired, we have to revisit some
      earlier patches:
      
      - It is no longer necessary to manipulate the VID of the 'bridge fdb
        {add,del}' command when vlan_filtering is off. This is because now,
        SVL is enabled for that case, so the actual VID does not matter*.
      
      - It is still desirable to hide dsa_8021q VID's in the FDB dump
        callback. But right now the dump callback should no longer hide
        duplicates (one per each front panel port's pvid, plus one for the
        VLAN that the CPU port is going to tag a TX frame with), because there
        shouldn't be any (the switch will match a single FDB entry no matter
        its VID anyway).
      
      * Not really... It's no longer necessary to transform a 'bridge fdb add'
        into 5 fdb add operations, but the user might still add a fdb entry with
        any vid, and all of them would appear as duplicates in 'bridge fdb
        show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we
        can prune the duplicates at insertion time.
      
      ** The VID of 0 is better than 1 because it is always guaranteed to be
         in the ports' hardware filter. DSA also avoids putting the VID inside
         the netlink response message towards the bridge driver when we return
         this particular VID, which makes it suitable for FDB entries learnt
         with vlan_filtering off.
      
      Fixes: 227d07a0 ("net: dsa: sja1105: Add support for traffic through standalone ports")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus() · f26e0cca
      Nishka Dasgupta authored
      Each iteration of for_each_available_child_of_node() puts the previous
      node, but in the case of a return from the middle of the loop, there
      is no put, thus causing a memory leak. Hence add an of_node_put() before
      the return.
      Additionally, the local variable ports in the function
      qca8k_setup_mdio_bus() takes the return value of of_get_child_by_name(),
      which gets a node but does not put it. If the function returns without
      putting ports, it may cause a memory leak. Hence put ports before the
      mid-loop return statement, and also outside the loop after its last usage
      in this function.
      Issues found with Coccinelle.
      Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'flow_offload-action-fixes' · 443bfb4a
      David S. Miller authored
      Vlad Buslov says:
      
      ====================
      action fixes for flow_offload infra compatibility
      
      Fix rcu warnings due to usage of action helpers that expect rcu read lock
      protection from rtnl-protected context of flow_offload infra.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: sample: allow accessing psample_group with rtnl · 67cbf7de
      Vlad Buslov authored
      Recently implemented support for the sample action in the flow_offload
      infra leads to the following RCU usage warning:
      
      [ 1938.234856] =============================
      [ 1938.234858] WARNING: suspicious RCU usage
      [ 1938.234863] 5.3.0-rc1+ #574 Not tainted
      [ 1938.234866] -----------------------------
      [ 1938.234869] include/net/tc_act/tc_sample.h:47 suspicious rcu_dereference_check() usage!
      [ 1938.234872]
                     other info that might help us debug this:
      
      [ 1938.234875]
                     rcu_scheduler_active = 2, debug_locks = 1
      [ 1938.234879] 1 lock held by tc/19540:
      [ 1938.234881]  #0: 00000000b03cb918 (rtnl_mutex){+.+.}, at: tc_new_tfilter+0x47c/0x970
      [ 1938.234900]
                     stack backtrace:
      [ 1938.234905] CPU: 2 PID: 19540 Comm: tc Not tainted 5.3.0-rc1+ #574
      [ 1938.234908] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 1938.234911] Call Trace:
      [ 1938.234922]  dump_stack+0x85/0xc0
      [ 1938.234930]  tc_setup_flow_action+0xed5/0x2040
      [ 1938.234944]  fl_hw_replace_filter+0x11f/0x2e0 [cls_flower]
      [ 1938.234965]  fl_change+0xd24/0x1b30 [cls_flower]
      [ 1938.234990]  tc_new_tfilter+0x3e0/0x970
      [ 1938.235021]  ? tc_del_tfilter+0x720/0x720
      [ 1938.235028]  rtnetlink_rcv_msg+0x389/0x4b0
      [ 1938.235038]  ? netlink_deliver_tap+0x95/0x400
      [ 1938.235044]  ? rtnl_dellink+0x2d0/0x2d0
      [ 1938.235053]  netlink_rcv_skb+0x49/0x110
      [ 1938.235063]  netlink_unicast+0x171/0x200
      [ 1938.235073]  netlink_sendmsg+0x224/0x3f0
      [ 1938.235091]  sock_sendmsg+0x5e/0x60
      [ 1938.235097]  ___sys_sendmsg+0x2ae/0x330
      [ 1938.235111]  ? __handle_mm_fault+0x12cd/0x19e0
      [ 1938.235125]  ? __handle_mm_fault+0x12cd/0x19e0
      [ 1938.235138]  ? find_held_lock+0x2b/0x80
      [ 1938.235147]  ? do_user_addr_fault+0x22d/0x490
      [ 1938.235160]  __sys_sendmsg+0x59/0xa0
      [ 1938.235178]  do_syscall_64+0x5c/0xb0
      [ 1938.235187]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1938.235192] RIP: 0033:0x7ff9a4d597b8
      [ 1938.235197] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 1938.235200] RSP: 002b:00007ffcfe381c48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 1938.235205] RAX: ffffffffffffffda RBX: 000000005d4497f9 RCX: 00007ff9a4d597b8
      [ 1938.235208] RDX: 0000000000000000 RSI: 00007ffcfe381cb0 RDI: 0000000000000003
      [ 1938.235211] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
      [ 1938.235214] R10: 0000000000404ec2 R11: 0000000000000246 R12: 0000000000000001
      [ 1938.235217] R13: 0000000000480640 R14: 0000000000000012 R15: 0000000000000001
      
      Change tcf_sample_psample_group() helper to allow using it from both rtnl
      and rcu protected contexts.
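
The warning comes down to which protection a helper accepts as proof that the pointer may be read. The Python model below is a sketch under assumptions: `Context`, `check_rcu_only()` and `check_rcu_or_rtnl()` are hypothetical names that stand in for the check before and after the change, and they are not kernel APIs.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Which protections the caller holds when it reads the pointer."""

    rcu_read_lock_held: bool = False
    rtnl_held: bool = False


def check_rcu_only(ctx):
    """Old behaviour: only an RCU read-side section satisfies the check."""
    return ctx.rcu_read_lock_held


def check_rcu_or_rtnl(ctx):
    """New behaviour: either an RCU read-side section or rtnl is enough."""
    return ctx.rcu_read_lock_held or ctx.rtnl_held


# The flow_offload path in the trace holds rtnl_mutex but no RCU read lock.
offload_path = Context(rtnl_held=True)
# A datapath caller is assumed to run inside an RCU read-side section.
datapath = Context(rcu_read_lock_held=True)
```

`check_rcu_only(offload_path)` fails, which corresponds to the splat above, while `check_rcu_or_rtnl()` passes for both callers and still fails when neither protection is held.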
      
      Fixes: a7a7be60 ("net/sched: add sample action to the hardware intermediate representation")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: police: allow accessing police->params with rtnl · c4bd4869
      Vlad Buslov authored
      Recently implemented support for the police action in the flow_offload
      infra leads to the following RCU usage warning:
      
      [ 1925.881092] =============================
      [ 1925.881094] WARNING: suspicious RCU usage
      [ 1925.881098] 5.3.0-rc1+ #574 Not tainted
      [ 1925.881100] -----------------------------
      [ 1925.881104] include/net/tc_act/tc_police.h:57 suspicious rcu_dereference_check() usage!
      [ 1925.881106]
                     other info that might help us debug this:
      
      [ 1925.881109]
                     rcu_scheduler_active = 2, debug_locks = 1
      [ 1925.881112] 1 lock held by tc/18591:
      [ 1925.881115]  #0: 00000000b03cb918 (rtnl_mutex){+.+.}, at: tc_new_tfilter+0x47c/0x970
      [ 1925.881124]
                     stack backtrace:
      [ 1925.881127] CPU: 2 PID: 18591 Comm: tc Not tainted 5.3.0-rc1+ #574
      [ 1925.881130] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 1925.881132] Call Trace:
      [ 1925.881138]  dump_stack+0x85/0xc0
      [ 1925.881145]  tc_setup_flow_action+0x1771/0x2040
      [ 1925.881155]  fl_hw_replace_filter+0x11f/0x2e0 [cls_flower]
      [ 1925.881175]  fl_change+0xd24/0x1b30 [cls_flower]
      [ 1925.881200]  tc_new_tfilter+0x3e0/0x970
      [ 1925.881231]  ? tc_del_tfilter+0x720/0x720
      [ 1925.881243]  rtnetlink_rcv_msg+0x389/0x4b0
      [ 1925.881250]  ? netlink_deliver_tap+0x95/0x400
      [ 1925.881257]  ? rtnl_dellink+0x2d0/0x2d0
      [ 1925.881264]  netlink_rcv_skb+0x49/0x110
      [ 1925.881275]  netlink_unicast+0x171/0x200
      [ 1925.881284]  netlink_sendmsg+0x224/0x3f0
      [ 1925.881299]  sock_sendmsg+0x5e/0x60
      [ 1925.881305]  ___sys_sendmsg+0x2ae/0x330
      [ 1925.881309]  ? task_work_add+0x43/0x50
      [ 1925.881314]  ? fput_many+0x45/0x80
      [ 1925.881329]  ? __lock_acquire+0x248/0x1930
      [ 1925.881342]  ? find_held_lock+0x2b/0x80
      [ 1925.881347]  ? task_work_run+0x7b/0xd0
      [ 1925.881359]  __sys_sendmsg+0x59/0xa0
      [ 1925.881375]  do_syscall_64+0x5c/0xb0
      [ 1925.881381]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1925.881384] RIP: 0033:0x7feb245047b8
      [ 1925.881388] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 1925.881391] RSP: 002b:00007ffc2d2a5788 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 1925.881395] RAX: ffffffffffffffda RBX: 000000005d4497ed RCX: 00007feb245047b8
      [ 1925.881398] RDX: 0000000000000000 RSI: 00007ffc2d2a57f0 RDI: 0000000000000003
      [ 1925.881400] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
      [ 1925.881403] R10: 0000000000404ec2 R11: 0000000000000246 R12: 0000000000000001
      [ 1925.881406] R13: 0000000000480640 R14: 0000000000000012 R15: 0000000000000001
      
      Change tcf_police_rate_bytes_ps() and tcf_police_tcfp_burst() helpers to
      allow using them from both rtnl and rcu protected contexts.
      
      Fixes: 8c8cfc6e ("net/sched: add police action to the hardware intermediate representation")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hisilicon-fixes' · 2b0dfc17
      David S. Miller authored
      Jiangfeng Xiao says:
      
      ====================
      net: hisilicon: Fix a few problems with hip04_eth
      
      Several problems were found while using the hip04_eth driver.
      This series fixes the hip04_tx_reclaim reentrancy problem,
      the problem that hip04_mac_start_xmit never
      returns NETDEV_TX_BUSY,
      and the dma_map_single failure on the arm64 platform.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hisilicon: Fix dma_map_single failed on arm64 · 96a50c0d
      Jiangfeng Xiao authored
      On the arm64 platform, executing "ifconfig eth0 up" will fail,
      returning "ifconfig: SIOCSIFFLAGS: Input/output error."
      
      Because ndev->dev is not initialized, dma_map_single->get_dma_ops->
      dummy_dma_ops->__dummy_map_page returns DMA_ERROR_CODE directly. So
      pass the device of the platform_device as the first parameter of
      dma_map_single instead.
      Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hisilicon: fix hip04-xmit never return TX_BUSY · f2243b82
      Jiangfeng Xiao authored
      TX_DESC_NUM is 256. In tx_count, the maximum value of
      mod(TX_DESC_NUM - 1) is 254, so the variable "count" in
      the hip04_mac_start_xmit function is never equal to
      (TX_DESC_NUM - 1), and hip04_mac_start_xmit never
      returns NETDEV_TX_BUSY.
      
      tx_count is modified to use mod(TX_DESC_NUM) so that
      the maximum value of tx_count can reach
      (TX_DESC_NUM - 1), and hip04_mac_start_xmit can then return
      NETDEV_TX_BUSY.
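
The arithmetic can be checked with a few lines of Python. This is a sketch that assumes the count is the head/tail distance reduced modulo a constant, which is how this message reads; the `tx_count()` signature with an explicit `modulus` argument and the `can_report_busy()` helper are hypothetical.

```python
TX_DESC_NUM = 256


def tx_count(head, tail, modulus):
    """In-flight descriptors, computed with 32-bit unsigned wrap-around."""
    return ((head - tail) & 0xFFFFFFFF) % modulus


def can_report_busy(modulus):
    """True if some head/tail distance makes the count equal TX_DESC_NUM - 1."""
    return any(
        tx_count(head, 0, modulus) == TX_DESC_NUM - 1
        for head in range(2 * TX_DESC_NUM)
    )


# Largest count each modulus can ever produce.
old_max = max(tx_count(h, 0, TX_DESC_NUM - 1) for h in range(2 * TX_DESC_NUM))
new_max = max(tx_count(h, 0, TX_DESC_NUM) for h in range(2 * TX_DESC_NUM))
```

With the old modulus `old_max` is 254, so the comparison against `TX_DESC_NUM - 1` can never succeed; with the new modulus `new_max` is 255 and the busy condition becomes reachable.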
      Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hisilicon: make hip04_tx_reclaim non-reentrant · 1a2c070a
      Jiangfeng Xiao authored
      If hip04_tx_reclaim is interrupted while it is running
      and __napi_schedule then goes on to execute
      hip04_rx_poll->hip04_tx_reclaim, reentrancy occurs
      and an oops is generated. So the interrupt needs to be masked
      while hip04_tx_reclaim runs.
      
      The kernel oops exception stack is as follows:
      
      Unable to handle kernel NULL pointer dereference
      at virtual address 00000050
      pgd = c0003000
      [00000050] *pgd=80000000a04003, *pmd=00000000
      Internal error: Oops: 206 [#1] SMP ARM
      Modules linked in: hip04_eth mtdblock mtd_blkdevs mtd
      ohci_platform ehci_platform ohci_hcd ehci_hcd
      vfat fat sd_mod usb_storage scsi_mod usbcore usb_common
      CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O    4.4.185 #1
      Hardware name: Hisilicon A15
      task: c0a250e0 task.stack: c0a00000
      PC is at hip04_tx_reclaim+0xe0/0x17c [hip04_eth]
      LR is at hip04_tx_reclaim+0x30/0x17c [hip04_eth]
      pc : [<bf30c3a4>]    lr : [<bf30c2f4>]    psr: 600e0313
      sp : c0a01d88  ip : 00000000  fp : c0601f9c
      r10: 00000000  r9 : c3482380  r8 : 00000001
      r7 : 00000000  r6 : 000000e1  r5 : c3482000  r4 : 0000000c
      r3 : f2209800  r2 : 00000000  r1 : 00000000  r0 : 00000000
      Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
      Control: 32c5387d  Table: 03d28c80  DAC: 55555555
      Process swapper/0 (pid: 0, stack limit = 0xc0a00190)
      Stack: (0xc0a01d88 to 0xc0a02000)
      [<bf30c3a4>] (hip04_tx_reclaim [hip04_eth]) from [<bf30d2e0>]
                                                      (hip04_rx_poll+0x88/0x368 [hip04_eth])
      [<bf30d2e0>] (hip04_rx_poll [hip04_eth]) from [<c04c2d9c>] (net_rx_action+0x114/0x34c)
      [<c04c2d9c>] (net_rx_action) from [<c021eed8>] (__do_softirq+0x218/0x318)
      [<c021eed8>] (__do_softirq) from [<c021f284>] (irq_exit+0x88/0xac)
      [<c021f284>] (irq_exit) from [<c0240090>] (msa_irq_exit+0x11c/0x1d4)
      [<c0240090>] (msa_irq_exit) from [<c02677e0>] (__handle_domain_irq+0x110/0x148)
      [<c02677e0>] (__handle_domain_irq) from [<c0201588>] (gic_handle_irq+0xd4/0x118)
      [<c0201588>] (gic_handle_irq) from [<c0551700>] (__irq_svc+0x40/0x58)
      Exception stack(0xc0a01f30 to 0xc0a01f78)
      1f20:                                     c0ae8b40 00000000 00000000 00000000
      1f40: 00000002 ffffe000 c0601f9c 00000000 ffffffff c0a2257c c0a22440 c0831a38
      1f60: c0a01ec4 c0a01f80 c0203714 c0203718 600e0213 ffffffff
      [<c0551700>] (__irq_svc) from [<c0203718>] (arch_cpu_idle+0x20/0x3c)
      [<c0203718>] (arch_cpu_idle) from [<c025bfd8>] (cpu_startup_entry+0x244/0x29c)
      [<c025bfd8>] (cpu_startup_entry) from [<c054b0d8>] (rest_init+0xc8/0x10c)
      [<c054b0d8>] (rest_init) from [<c0800c58>] (start_kernel+0x468/0x514)
      Code: a40599e5 016086e2 018088e2 7660efe6 (503090e5)
      ---[ end trace 1db21d6d09c49d74 ]---
      Kernel panic - not syncing: Fatal exception in interrupt
      CPU3: stopping
      CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D    O    4.4.185 #1
      Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Fix-batched-event-generation-for-vlan-action' · 5b0bce24
      David S. Miller authored
      Roman Mashak says:
      
      ====================
      Fix batched event generation for vlan action
      
      When adding or deleting a batch of entries, the kernel sends up to
      TCA_ACT_MAX_PRIO (defined to 32 in kernel) entries in an event to user
      space. However it does not consider that the action sizes may vary and
      require different skb sizes.
      
      For example, consider the following script adding 32 entries with all
      supported vlan parameters (in order to maximize netlink messages size):
      
      % cat tc-batch.sh
      TC="sudo /mnt/iproute2.git/tc/tc"
      
      $TC actions flush action vlan
      for i in `seq 1 $1`;
      do
         cmd="action vlan push protocol 802.1q id 4094 priority 7 pipe \
                     index $i cookie aabbccddeeff112233445566778800a1 "
         args=$args$cmd
      done
      $TC actions add $args
      %
      % ./tc-batch.sh 32
      Error: Failed to fill netlink attributes while adding TC action.
      We have an error talking to the kernel
      %
      
      Patch 1 adds a callback in tc_action_ops of the vlan action, which
      calculates the action size and passes it to
      tcf_add_notify()/tcf_del_notify().
      
      Patch 2 updates the TDC test suite with relevant vlan test cases.
      ====================
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b0bce24
    • Roman Mashak's avatar
      tc-testing: updated vlan action tests with batch create/delete · 8571deb0
      Roman Mashak authored
      Update TDC tests with cases verifying the ability of TC to install or
      delete batches of vlan actions.
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8571deb0
    • Roman Mashak's avatar
      net sched: update vlan action for batched events operations · b35475c5
      Roman Mashak authored
      Add get_fill_size() routine used to calculate the action size
      when building a batch of events.
      
      Fixes: c7e2b968 ("sched: introduce vlan action")
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b35475c5
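The size calculation described above can be sketched in user space as follows. This is only an illustration, not the code from the patch: `attr_total_size()` and `vlan_fill_size()` are hypothetical names, the list of attributes (parameters block, VLAN ID, protocol, priority) is taken from the options used in the tc-batch.sh example, and the size of the parameters block is passed in by the caller rather than assumed. The only facts relied on are that a netlink attribute has a 4-byte header and is padded to a 4-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_HDRLEN  4u /* netlink attribute header: 16-bit length + 16-bit type */
#define ATTR_ALIGNTO 4u /* netlink attributes are padded to 4-byte boundaries */

/* Round len up to the next multiple of ATTR_ALIGNTO. */
static size_t attr_align(size_t len)
{
	return (len + ATTR_ALIGNTO - 1) & ~(size_t)(ATTR_ALIGNTO - 1);
}

/* Space one attribute with the given payload occupies in a message. */
static size_t attr_total_size(size_t payload)
{
	return attr_align(ATTR_HDRLEN + payload);
}

/*
 * Hypothetical per-action size estimate: one parameters block plus the
 * three optional vlan attributes. parms_size is supplied by the caller.
 */
static size_t vlan_fill_size(size_t parms_size)
{
	return attr_total_size(parms_size)
	     + attr_total_size(sizeof(uint16_t))  /* VLAN ID */
	     + attr_total_size(sizeof(uint16_t))  /* VLAN protocol */
	     + attr_total_size(sizeof(uint8_t));  /* VLAN priority */
}
```

With an estimate like this available per action, the code building a batched event can size the skb for the sum of the actions instead of assuming every action fits a fixed budget.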
    • Linus Torvalds's avatar
      Merge tag 'mips_fixes_5.3_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 76d7961f
      Linus Torvalds authored
      Pull MIPS fixes from Paul Burton:
       "A few MIPS fixes for 5.3:
      
         - Various switch fall through annotations to fixup warnings & errors
           resulting from -Wimplicit-fallthrough.
      
         - A fix for systems (at least jazz) using an i8253 PIT as clocksource
           when it's not suitably configured.
      
         - Set struct cacheinfo's cpu_map_populated field to true, indicating
           that we filled in cache info detected from cop0 registers &
           avoiding complaints about that info being (intentionally) missing
           in devicetree"
      
      * tag 'mips_fixes_5.3_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: BCM63XX: Mark expected switch fall-through
        MIPS: OProfile: Mark expected switch fall-throughs
        MIPS: Annotate fall-through in Cavium Octeon code
        MIPS: Annotate fall-through in kvm/emulate.c
        mips: fix cacheinfo
        MIPS: kernel: only use i8253 clocksource with periodic clockevent
      76d7961f
    • David S. Miller's avatar
      Merge branch 'stmmac-fixes' · 3abd24a1
      David S. Miller authored
      Jose Abreu says:
      
      ====================
      net: stmmac: Fixes for -net
      
      Couple of fixes for -net. More info in commit log.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3abd24a1
    • Jose Abreu's avatar
      net: stmmac: tc: Do not return a fragment entry · 4a6a1385
      Jose Abreu authored
      Do not try to return a fragment entry from the TC list. Otherwise we may
      not properly clean up allocated entries.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4a6a1385
    • Jose Abreu's avatar
      net: stmmac: Fix issues when number of Queues >= 4 · e8df7e8c
      Jose Abreu authored
      When queues >= 4 we use different registers but we were not subtracting
      the offset of 4. Fix this.
      
      Found out by Coverity.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e8df7e8c
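A minimal sketch of the class of bug described above, under an assumed register layout: each 32-bit map register holds four 8-bit per-queue fields, so queues 0-3 live in the first register and queues 4-7 in the second. The layout, the names `locate_queue_field()` and `set_queue_field()`, and the field width are hypothetical and not taken from the stmmac driver. Without the subtraction, queue 5 would produce a shift of 40, which is out of range for a 32-bit register.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUES_PER_REG 4u /* assumed: four queue fields per register */
#define FIELD_BITS     8u /* assumed: each field is 8 bits wide */

struct queue_field {
	unsigned int reg_index; /* 0 for queues 0-3, 1 for queues 4-7 */
	unsigned int shift;     /* bit offset of the field inside that register */
};

/* Work out which register and which bit offset belong to a queue. */
static struct queue_field locate_queue_field(unsigned int queue)
{
	struct queue_field f;

	assert(queue < 2 * QUEUES_PER_REG);
	f.reg_index = queue / QUEUES_PER_REG;
	/* The step that was missing: drop the offset of 4 for the second register. */
	if (queue >= QUEUES_PER_REG)
		queue -= QUEUES_PER_REG;
	f.shift = queue * FIELD_BITS;
	return f;
}

/* Return reg_value with the 8-bit field at shift replaced by chan. */
static uint32_t set_queue_field(uint32_t reg_value, unsigned int shift, uint8_t chan)
{
	uint32_t mask = (uint32_t)0xff << shift;

	return (reg_value & ~mask) | ((uint32_t)chan << shift);
}
```

For queue 5 this yields register index 1 and a shift of 8, the same position queue 1 occupies in the first register.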
    • Jose Abreu's avatar
      net: stmmac: xgmac: Fix XGMAC selftests · 0efedbf1
      Jose Abreu authored
      Fixup the XGMAC selftests by correctly finishing the implementation of
      set_filter callback.
      
      Result:
      $ ethtool -t enp4s0
      The test result is PASS
      The test extra info:
       1. MAC Loopback         	 0
       2. PHY Loopback         	 -95
       3. MMC Counters         	 -95
       4. EEE                  	 -95
       5. Hash Filter MC       	 0
       6. Perfect Filter UC    	 0
       7. MC Filter            	 0
       8. UC Filter            	 0
       9. Flow Control         	 0
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0efedbf1
    • David S. Miller's avatar
      Merge tag 'wireless-drivers-for-davem-2019-08-06' of... · 0574f2ed
      David S. Miller authored
      Merge tag 'wireless-drivers-for-davem-2019-08-06' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 5.3
      
      Second set of fixes for 5.3. Lots of iwlwifi fixes have accumulated,
      and they make up most of the patches in this pull request. Only the
      most notable iwlwifi fixes are listed below.
      
      mwifiex
      
      * fix a regression related to WPA1 networks since v5.3-rc1
      
      iwlwifi
      
      * fix use-after-free issues
      
      * fix DMA mapping API usage errors
      
      * fix frame drop occurring due to reorder buffer handling in
        RSS in certain conditions
      
      * fix rate scale locking issues
      
      * disable TX A-MSDU on older NICs as it causes problems and was
        never supposed to be supported
      
      * new PCI IDs
      
      * GEO_TX_POWER_LIMIT API issue that many people were hitting
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0574f2ed
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · f4eb1423
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - functional regression fix for some of the Logitech unifying devices,
         from Hans de Goede
      
       - race condition fix in hid-sony for bug severely affecting
         Valve/Android deployments, from Roderick Colenbrander
      
       - several fixes for issues found by syzbot/kasan, from Oliver Neukum
         and Hillf Danton
      
       - functional regression fix for Wacom Cintiq device, from Aaron
         Armstrong Skomra
      
       - a few other assorted device-specific quirks
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: sony: Fix race condition between rumble and device remove.
        HID: hiddev: do cleanup in failure of opening a device
        HID: hiddev: avoid opening a disconnected device
        HID: input: fix a4tech horizontal wheel custom usage
        HID: Add quirk for HP X1200 PIXART OEM mouse
        HID: holtek: test for sanity of intfdata
        HID: wacom: fix bit shift for Cintiq Companion 2
        HID: quirks: Set the INCREMENT_USAGE_ON_DUPLICATE quirk on Saitek X52
        HID: logitech-dj: Really fix return value of logi_dj_recv_query_hidpp_devices
        HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT
        HID: logitech-dj: add the Powerplay receiver
        HID: logitech-hidpp: add USB PID for a few more supported mice
        HID: logitech-dj: rename "gaming" receiver to "lightspeed"
      f4eb1423
    • Denis Kirjanov's avatar
      be2net: disable bh with spin_lock in be_process_mcc · d0d006a4
      Denis Kirjanov authored
      be_process_mcc() is invoked in 3 different places, always with BHs
      disabled except in the be_poll function. Since be_poll is invoked
      from softirq context, where BHs are already disabled, disabling them
      again won't hurt.
      
      v1->v2: added explanation to the patch
      v2->v3: add a missing call from be_cmds.c
      Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0d006a4
    • Christophe JAILLET's avatar
      net: cxgb3_main: Fix a resource leak in a error path in 'init_one()' · debea2cd
      Christophe JAILLET authored
      A call to 'kfree_skb()' is missing in the error handling path of
      'init_one()'.
      This is already present in 'remove_one()' but is missing here.
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      debea2cd
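The pattern behind this kind of fix can be sketched with ordinary heap buffers standing in for the skb. Everything here is hypothetical (`device_init()`, `device_remove()`, the `fail_late` switch and the allocation counter exist only for illustration); the point is that every resource released in the remove path also needs a matching release on the error path of the init function.

```c
#include <stdlib.h>

static int live_allocs; /* buffers currently allocated, for leak checking */

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

struct device {
	void *regs;   /* first resource */
	void *buffer; /* second resource, stands in for the leaked skb */
};

/* fail_late simulates a failure after both resources were acquired. */
static int device_init(struct device *dev, int fail_late)
{
	dev->regs = tracked_alloc(64);
	if (!dev->regs)
		goto out;

	dev->buffer = tracked_alloc(128);
	if (!dev->buffer)
		goto out_free_regs;

	if (fail_late)
		goto out_free_buffer;

	return 0;

out_free_buffer:
	tracked_free(dev->buffer); /* the release that is easy to forget */
	dev->buffer = NULL;
out_free_regs:
	tracked_free(dev->regs);
	dev->regs = NULL;
out:
	return -1;
}

/* Normal teardown releases the same resources in reverse order. */
static void device_remove(struct device *dev)
{
	tracked_free(dev->buffer);
	tracked_free(dev->regs);
	dev->buffer = NULL;
	dev->regs = NULL;
}
```

Unwinding through labels in reverse order of acquisition keeps the error path and the remove path visibly symmetric, which makes a missing release easier to spot in review.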
    • Chen-Yu Tsai's avatar
      net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs · 5c4e2e1a
      Chen-Yu Tsai authored
      The sun4i-emac uses the "phy" property to find the PHY it's supposed to
      use. This property was deprecated in favor of "phy-handle" in commit
      8c5b0944 ("dt-bindings: net: sun4i-emac: Convert the binding to a
      schemas").
      
      Add support for this new property name, and fall back to the old one in
      case the device tree hasn't been updated.
      Signed-off-by: Chen-Yu Tsai <wens@csie.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5c4e2e1a
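The lookup order described above (new property name first, deprecated name as a fallback) can be sketched with a plain table standing in for a device tree node. The `struct property` table and the functions `find_property()` and `find_phy()` are hypothetical; a real driver would go through the kernel's device tree helpers instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node's properties. */
struct property {
	const char *name;   /* property name, e.g. "phy-handle" */
	const char *target; /* label of the node the property points at */
};

/* Return the target of the named property, or NULL if it is absent. */
static const char *find_property(const struct property *props, size_t n,
				 const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(props[i].name, name) == 0)
			return props[i].target;
	return NULL;
}

/* Prefer "phy-handle"; fall back to the deprecated "phy" property. */
static const char *find_phy(const struct property *props, size_t n)
{
	const char *phy = find_property(props, n, "phy-handle");

	if (!phy)
		phy = find_property(props, n, "phy");
	return phy;
}
```

Trying the new name first means an updated device tree is honoured even if it still carries the old property, while device trees that were never updated keep working.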
    • Linus Torvalds's avatar
      Merge branch 'x86/grand-schemozzle' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4368c4bc
      Linus Torvalds authored
      Pull pti updates from Thomas Gleixner:
       "The performance deterioration department is not proud at all to
        present yet another set of speculation fences to mitigate the next
        chapter in the 'what could possibly go wrong' story.
      
        The new vulnerability belongs to the Spectre class and affects GS
        based data accesses and has therefore been dubbed 'Grand Schemozzle'
        for secret communication purposes. It's officially listed as
        CVE-2019-1125.
      
        Conditional branches in the entry paths which contain a SWAPGS
        instruction (interrupts and exceptions) can be mis-speculated which
        results in speculative accesses with a wrong GS base.
      
        This can happen on entry from user mode through a mis-speculated
        branch which takes the entry from kernel mode path and therefore does
        not execute the SWAPGS instruction. The following speculative accesses
        are done with user GS base.
      
        On entry from kernel mode the mis-speculated branch executes the
        SWAPGS instruction in the entry from user mode path which has the same
        effect that the following GS based accesses are done with user GS
        base.
      
        If there is a disclosure gadget available in these code paths the
        mis-speculated data access can be leaked through the usual side
        channels.
      
        The entry from user mode issue affects all CPUs which have speculative
        execution. The entry from kernel mode issue affects only Intel CPUs
        which can speculate through SWAPGS. On CPUs from other vendors SWAPGS
        has semantics which prevent that.
      
        SMAP mitigates both problems but only when the CPU is not affected by
        the Meltdown vulnerability.
      
        The mitigation is to issue LFENCE instructions in the entry from
        kernel mode path for all affected CPUs and on the affected Intel CPUs
        also in the entry from user mode path unless PTI is enabled because
        the CR3 write is serializing.
      
        The fences are as usual enabled conditionally and can be completely
        disabled on the kernel command line. The Spectre V1 documentation is
        updated accordingly.
      
        A big "Thank You!" goes to Josh for doing the heavy lifting for this
        round of hardware misfeature 'repair'. Of course also "Thank You!" to
        everybody else who contributed in one way or the other"
      
      * 'x86/grand-schemozzle' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Documentation: Add swapgs description to the Spectre v1 documentation
        x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
        x86/entry/64: Use JMP instead of JMPQ
        x86/speculation: Enable Spectre v1 swapgs mitigations
        x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
      4368c4bc
    • Roderick Colenbrander's avatar
      HID: sony: Fix race condition between rumble and device remove. · e0f6974a
      Roderick Colenbrander authored
      Valve reported a kernel crash on Ubuntu 18.04 when disconnecting a DS4
      gamepad while rumble is enabled. This issue is reproducible with a
      frequency of 1 in 3 times in the game Borderlands 2 when using an
      automatic weapon, which triggers many rumble operations.
      
      We found the issue to be a race condition between sony_remove and the
      final device destruction by the HID / input system. The problem was
      that sony_remove didn't clean some of its work_item state in
      "struct sony_sc". After sony_remove work, the corresponding evdev
      node was around for sufficient time for applications to still queue
      rumble work after "sony_remove".
      
      On pre-4.19 kernels the race condition caused a kernel crash due to a
      NULL-pointer dereference as "sc->output_report_dmabuf" got freed during
      sony_remove. On newer kernels this crash doesn't happen due to the buffer
      now being allocated using devm_kzalloc. However we can still queue work,
      while the driver is in an undefined state.
      
      This patch fixes the described problem, by guarding the work_item
      "state_worker" with an initialized variable, which we are setting back
      to 0 on cleanup.
      Signed-off-by: Roderick Colenbrander <roderick.colenbrander@sony.com>
      CC: stable@vger.kernel.org
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      e0f6974a
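The guard described above can be sketched in user space, with a pthread mutex standing in for the driver's lock and a counter standing in for the work queue. `struct controller` and its functions are hypothetical names and do not mirror the hid-sony code; the sketch only shows the idea of checking an "initialized" flag under a lock before accepting work, and clearing that flag first on removal.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct controller {
	pthread_mutex_t lock;    /* stands in for the driver's lock */
	bool worker_initialized; /* true only while queuing work is valid */
	int queued;              /* number of work items accepted */
};

static void controller_init(struct controller *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->worker_initialized = true;
	c->queued = 0;
}

/* Accept work only while the worker is marked initialized. */
static bool controller_schedule_work(struct controller *c)
{
	bool accepted = false;

	pthread_mutex_lock(&c->lock);
	if (c->worker_initialized) {
		c->queued++;
		accepted = true;
	}
	pthread_mutex_unlock(&c->lock);
	return accepted;
}

/* On removal, clear the flag first so late callers are rejected. */
static void controller_remove(struct controller *c)
{
	pthread_mutex_lock(&c->lock);
	c->worker_initialized = false;
	pthread_mutex_unlock(&c->lock);
	/* A real driver would cancel any pending work at this point. */
}
```

Because the flag is read and written under the same lock, a request that arrives after removal has started is either counted before the flag is cleared or rejected, never accepted afterwards.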