1. 28 Dec, 2016 6 commits
    • Daniel Jurgens's avatar
      net/mlx5: Cancel recovery work in remove flow · 689a248d
      Daniel Jurgens authored
      If there is pending delayed work for health recovery it must be canceled
      if the device is being unloaded.
      
      Fixes: 05ac2c0b ("net/mlx5: Fix race between PCI error handlers and health work")
      Signed-off-by: default avatarDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      689a248d
    • Noa Osherovich's avatar
      net/mlx5: Check FW limitations on log_max_qp before setting it · 883371c4
      Noa Osherovich authored
      When setting HCA capabilities, set log_max_qp to be the minimum
      between the selected profile's value and the HCA limitation.
      
      Fixes: 938fe83c ('net/mlx5_core: New device capabilities...')
      Signed-off-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      883371c4
    • Or Gerlitz's avatar
      net/mlx5: Disable RoCE on the e-switch management port under switchdev mode · 9da34cd3
      Or Gerlitz authored
      Under the switchdev/offloads mode, packets that don't match any
      e-switch steering rule are sent towards the e-switch management
      port. We use a NIC HW steering rule set per vport (uplink and VFs)
      to make them be received into the host OS through the respective
      vport representor netdevice.
      
      Currnetly such missed RoCE packets will not get to this NIC steering
      rule, and hence VF RoCE will not work over the slow path of the offloads
      mode. This is b/c these packets will be matched by a steering rule added
      by the firmware that serves RoCE traffic set on the PF NIC vport which
      is also the e-switch management port under SRIOV.
      
      Disabling RoCE on the e-switch management vport when we are in the offloads
      mode, will signal to the firmware to remove their RoCE rule, and then the
      missed RoCE packets will be matched by the representor NIC steering rule
      as any other missed packets.
      
      To achieve that, we disable RoCE on the PF vport. We do that by removing
      (hot-unplugging) the IB device instance associated with the PF. This is
      also required by our current model where the PF serves as the uplink
      representor and hence only SW switching (TC, bridge, OVS) applications
      and slow path vport mlx5e net-device should be running over that vport.
      
      Fixes: c930a3ad ('net/mlx5e: Add devlink based SRIOV mode changes')
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: default avatarHadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9da34cd3
    • Paul Blakey's avatar
      net/sched: cls_flower: Fix missing addr_type in classify · 0df0f207
      Paul Blakey authored
      Since we now use a non zero mask on addr_type, we are matching on its
      value (IPV4/IPV6). So before this fix, matching on enc_src_ip/enc_dst_ip
      failed in SW/classify path since its value was zero.
      This patch sets the proper value of addr_type for encapsulated packets.
      
      Fixes: 970bfcd0 ('net/sched: cls_flower: Use mask for addr_type')
      Signed-off-by: default avatarPaul Blakey <paulb@mellanox.com>
      Reviewed-by: default avatarHadar Hen Zion <hadarh@mellanox.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0df0f207
    • Florian Fainelli's avatar
      net: stmmac: Fix race between stmmac_drv_probe and stmmac_open · 57016590
      Florian Fainelli authored
      There is currently a small window during which the network device registered by
      stmmac can be made visible, yet all resources, including and clock and MDIO bus
      have not had a chance to be set up, this can lead to the following error to
      occur:
      
      [  473.919358] stmmaceth 0000:01:00.0 (unnamed net_device) (uninitialized):
                      stmmac_dvr_probe: warning: cannot get CSR clock
      [  473.919382] stmmaceth 0000:01:00.0: no reset control found
      [  473.919412] stmmac - user ID: 0x10, Synopsys ID: 0x42
      [  473.919429] stmmaceth 0000:01:00.0: DMA HW capability register supported
      [  473.919436] stmmaceth 0000:01:00.0: RX Checksum Offload Engine supported
      [  473.919443] stmmaceth 0000:01:00.0: TX Checksum insertion supported
      [  473.919451] stmmaceth 0000:01:00.0 (unnamed net_device) (uninitialized):
                      Enable RX Mitigation via HW Watchdog Timer
      [  473.921395] libphy: PHY stmmac-1:00 not found
      [  473.921417] stmmaceth 0000:01:00.0 eth0: Could not attach to PHY
      [  473.921427] stmmaceth 0000:01:00.0 eth0: stmmac_open: Cannot attach to
                      PHY (error: -19)
      [  473.959710] libphy: stmmac: probed
      [  473.959724] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 0 IRQ POLL
                      (stmmac-1:00) active
      [  473.959728] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 1 IRQ POLL
                      (stmmac-1:01)
      [  473.959731] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 2 IRQ POLL
                      (stmmac-1:02)
      [  473.959734] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 3 IRQ POLL
                      (stmmac-1:03)
      
      Fix this by making sure that register_netdev() is the last thing being done,
      which guarantees that the clock and the MDIO bus are available.
      
      Fixes: 4bfcbd7a ("stmmac: Move the mdio_register/_unregister in probe/remove")
      Reported-by: default avatarKweh, Hock Leong <hock.leong.kweh@intel.com>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      57016590
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8f18e4d0
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Various ipvlan fixes from Eric Dumazet and Mahesh Bandewar.
      
          The most important is to not assume the packet is RX just because
          the destination address matches that of the device. Such an
          assumption causes problems when an interface is put into loopback
          mode.
      
       2) If we retry when creating a new tc entry (because we dropped the
          RTNL mutex in order to load a module, for example) we end up with
          -EAGAIN and then loop trying to replay the request. But we didn't
          reset some state when looping back to the top like this, and if
          another thread meanwhile inserted the same tc entry we were trying
          to, we re-link it creating an enless loop in the tc chain. Fix from
          Daniel Borkmann.
      
       3) There are two different WRITE bits in the MDIO address register for
          the stmmac chip, depending upon the chip variant. Due to a bug we
          could set them both, fix from Hock Leong Kweh.
      
       4) Fix mlx4 bug in XDP_TX handling, from Tariq Toukan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: stmmac: fix incorrect bit set in gmac4 mdio addr register
        r8169: add support for RTL8168 series add-on card.
        net: xdp: remove unused bfp_warn_invalid_xdp_buffer()
        openvswitch: upcall: Fix vlan handling.
        ipv4: Namespaceify tcp_tw_reuse knob
        net: korina: Fix NAPI versus resources freeing
        net, sched: fix soft lockup in tc_classify
        net/mlx4_en: Fix user prio field in XDP forward
        tipc: don't send FIN message from connectionless socket
        ipvlan: fix multicast processing
        ipvlan: fix various issues in ipvlan_process_multicast()
      8f18e4d0
  2. 27 Dec, 2016 7 commits
  3. 26 Dec, 2016 5 commits
    • Al Viro's avatar
      arm64: don't pull uaccess.h into *.S · b4b8664d
      Al Viro authored
      Split asm-only parts of arm64 uaccess.h into a new header and use that
      from *.S.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      b4b8664d
    • Florian Fainelli's avatar
      net: korina: Fix NAPI versus resources freeing · e6afb1ad
      Florian Fainelli authored
      Commit beb0babf ("korina: disable napi on close and restart")
      introduced calls to napi_disable() that were missing before,
      unfortunately this leaves a small window during which NAPI has a chance
      to run, yet we just freed resources since korina_free_ring() has been
      called:
      
      Fix this by disabling NAPI first then freeing resource, and make sure
      that we also cancel the restart task before doing the resource freeing.
      
      Fixes: beb0babf ("korina: disable napi on close and restart")
      Reported-by: default avatarAlexandros C. Couloumbis <alex@ozo.com>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e6afb1ad
    • Daniel Borkmann's avatar
      net, sched: fix soft lockup in tc_classify · 628185cf
      Daniel Borkmann authored
      Shahar reported a soft lockup in tc_classify(), where we run into an
      endless loop when walking the classifier chain due to tp->next == tp
      which is a state we should never run into. The issue only seems to
      trigger under load in the tc control path.
      
      What happens is that in tc_ctl_tfilter(), thread A allocates a new
      tp, initializes it, sets tp_created to 1, and calls into tp->ops->change()
      with it. In that classifier callback we had to unlock/lock the rtnl
      mutex and returned with -EAGAIN. One reason why we need to drop there
      is, for example, that we need to request an action module to be loaded.
      
      This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning
      after we loaded and found the requested action, we need to redo the
      whole request so we don't race against others. While we had to unlock
      rtnl in that time, thread B's request was processed next on that CPU.
      Thread B added a new tp instance successfully to the classifier chain.
      When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN
      and destroying its tp instance which never got linked, we goto replay
      and redo A's request.
      
      This time when walking the classifier chain in tc_ctl_tfilter() for
      checking for existing tp instances we had a priority match and found
      the tp instance that was created and linked by thread B. Now calling
      again into tp->ops->change() with that tp was successful and returned
      without error.
      
      tp_created was never cleared in the second round, thus kernel thinks
      that we need to link it into the classifier chain (once again). tp and
      *back point to the same object due to the match we had earlier on. Thus
      for thread B's already public tp, we reset tp->next to tp itself and
      link it into the chain, which eventually causes the mentioned endless
      loop in tc_classify() once a packet hits the data path.
      
      Fix is to clear tp_created at the beginning of each request, also when
      we replay it. On the paths that can cause -EAGAIN we already destroy
      the original tp instance we had and on replay we really need to start
      from scratch. It seems that this issue was first introduced in commit
      12186be7 ("net_cls: fix unconfigured struct tcf_proto keeps chaining
      and avoid kernel panic when we use cls_cgroup").
      
      Fixes: 12186be7 ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
      Reported-by: default avatarShahar Klein <shahark@mellanox.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Tested-by: default avatarShahar Klein <shahark@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      628185cf
    • Linus Torvalds's avatar
      Linux 4.10-rc1 · 7ce7d89f
      Linus Torvalds authored
      7ce7d89f
    • Larry Finger's avatar
      powerpc: Fix build warning on 32-bit PPC · 8ae679c4
      Larry Finger authored
      I am getting the following warning when I build kernel 4.9-git on my
      PowerBook G4 with a 32-bit PPC processor:
      
          AS      arch/powerpc/kernel/misc_32.o
        arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]
      
      This problem is evident after commit 989cea5c ("kbuild: prevent
      lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an
      error that has been in the code since 2005 when this source file was
      created.  That was with commit 9994a338 ("powerpc: Introduce
      entry_{32,64}.S, misc_{32,64}.S, systbl.S").
      
      The offending line does not make a lot of sense.  This error does not
      seem to cause any errors in the executable, thus I am not recommending
      that it be applied to any stable versions.
      
      Thanks to Nicholas Piggin for suggesting this solution.
      
      Fixes: 9994a338 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
      Signed-off-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8ae679c4
  4. 25 Dec, 2016 22 commits