1. 26 Sep, 2018 17 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 105bc130
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2018-09-25
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      The main changes are:
      
      1) Allow for RX stack hardening by implementing the kernel's flow
         dissector in BPF. Idea was originally presented at netconf 2017 [0].
         Quote from merge commit:
      
           [...] Because of the rigorous checks of the BPF verifier, this
           provides significant security guarantees. In particular, the BPF
           flow dissector cannot get inside of an infinite loop, as with
           CVE-2013-4348, because BPF programs are guaranteed to terminate.
           It cannot read outside of packet bounds, because all memory accesses
           are checked. Also, with BPF the administrator can decide which
           protocols to support, reducing potential attack surface. Rarely
           encountered protocols can be excluded from dissection and the
           program can be updated without kernel recompile or reboot if a
           bug is discovered. [...]
      
         Also, a sample flow dissector has been implemented in BPF as part
         of this work, from Petar and Willem.
      
         [0] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf
      
      2) Add support for bpftool to list currently active attachment
         points of BPF networking programs, providing a quick overview
         similar to bpftool's perf subcommand, from Yonghong.
      
      3) Fix a verifier pruning instability bug where a union member
         from the register state was not cleared properly leading to
         branches not being pruned despite them being valid candidates,
         from Alexei.
      
      4) Various smaller fast-path optimizations in XDP's map redirect
         code, from Jesper.
      
      5) Enable bpftool to recognize BPF_MAP_TYPE_REUSEPORT_SOCKARRAY
         maps, from Roman.
      
      6) Remove a duplicate check in libbpf that probes for function
         storage, from Taeung.
      
      7) Fix an issue in test_progs by avoiding checks on errno, since
         on success its value should not be checked, from Mauricio.
      
      8) Fix unused variable warning in bpf_getsockopt() helper when
         CONFIG_INET is not configured, from Anders.
      
      9) Fix a compilation failure in the BPF sample code's use of
         bpf_flow_keys, from Prashant.
      
      10) Minor cleanups in BPF code, from Yue and Zhong.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      105bc130
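
      A minimal sketch of what a BPF flow dissector program of the kind
      described in item 1 above can look like. This is not the sample merged
      with this series; the section name, the skb->flow_keys field and the
      struct bpf_flow_keys members used below follow my reading of the UAPI
      added by this work and should be treated as assumptions.

        /* Hedged sketch: dissects plain IPv4 only; anything else is simply
         * not dissected (the reduced attack surface mentioned above). */
        #include <linux/bpf.h>
        #include <linux/if_ether.h>
        #include <linux/ip.h>
        #include "bpf_helpers.h"      /* SEC() macro, as used by samples/bpf */
        #include "bpf_endian.h"       /* bpf_htons() */

        SEC("flow_dissector")
        int ipv4_only_dissector(struct __sk_buff *skb)
        {
                struct bpf_flow_keys *keys = skb->flow_keys;
                void *data_end = (void *)(long)skb->data_end;
                void *data = (void *)(long)skb->data;
                struct iphdr *iph;

                if (keys->n_proto != bpf_htons(ETH_P_IP))
                        return BPF_DROP;        /* unsupported protocol */

                iph = data + keys->nhoff;
                if ((void *)(iph + 1) > data_end)
                        return BPF_DROP;        /* verifier-mandated bounds check */

                keys->addr_proto = ETH_P_IP;
                keys->ipv4_src = iph->saddr;
                keys->ipv4_dst = iph->daddr;
                keys->ip_proto = iph->protocol;
                keys->thoff = keys->nhoff + (iph->ihl << 2);
                return BPF_OK;
        }

        char _license[] SEC("license") = "GPL";
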
    • net: dsa: lantiq_gswip: Depend on HAS_IOMEM · 3475372f
      Hauke Mehrtens authored
      The driver uses devm_ioremap_resource(), which is only available when
      CONFIG_HAS_IOMEM is set, so make the driver depend on this config
      option. User mode Linux does not have CONFIG_HAS_IOMEM set and the
      driver was failing on this architecture.
      
      Fixes: 14fceff4 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3475372f
    • Merge branch 'net-phy-Eliminate-unnecessary-soft' · 921f432c
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: phy: Eliminate unnecessary soft
      
      This patch series eliminates unnecessary software resets of the PHY.
      This should hopefully not break anybody's hardware, but I would
      appreciate testing to make sure this is the case.
      
      Sorry for the long email list; I wanted to make sure I reached out to
      all the people who made changes to the Marvell PHY driver.
      
      Thank you!
      
      Changes since RFT:
      
      - added Tested-by tags from Wang, Dongsheng, Andrew, Chris and Clemens
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      921f432c
    • net: phy: marvell: Avoid unnecessary soft reset · d6ab9336
      Florian Fainelli authored
      The BMCR.RESET bit on the Marvell PHYs has a special meaning in that
      it commits the register writes into the HW for it to latch and be
      configured appropriately. Doing software resets causes link drops, and
      this is an unnecessary disruption if nothing has changed.
      
      Determine from marvell_set_polarity()'s return code whether the register value
      was changed and if it was, propagate that to the logic that hits the software
      reset bit.
      
      This avoids an unnecessary soft reset if the PHY is configured in the
      same state it was in previously; this also eliminates the need for a
      m88e1111_config_aneg() function, since it is now the same as
      marvell_config_aneg().
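
      A minimal sketch of the pattern described above, assuming
      marvell_set_polarity() returns a positive value when it actually
      changed the register; illustrative only, not the upstream diff.

        /* Sketch (as if in drivers/net/phy/marvell.c): only hit the
         * BMCR reset when a register was actually modified. */
        #include <linux/phy.h>

        static int marvell_config_aneg_sketch(struct phy_device *phydev)
        {
                int changed;

                changed = marvell_set_polarity(phydev, phydev->mdix_ctrl);
                if (changed < 0)
                        return changed;

                /* ... further register setup would OR into 'changed' here ... */

                if (changed > 0)
                        return genphy_soft_reset(phydev);  /* commit new settings */

                return 0;  /* nothing changed: skip the link-dropping reset */
        }
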
      Tested-by: Wang, Dongsheng <dongsheng.wang@hxt-semitech.com>
      Tested-by: Chris Healy <cphealy@gmail.com>
      Tested-by: Andrew Lunn <andrew@lunn.ch>
      Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6ab9336
    • net: phy: Stop with excessive soft reset · 6e2d85ec
      Florian Fainelli authored
      While consolidating the PHY reset in phy_init_hw() into an unconditional
      BMCR soft-reset, I became quite trigger happy with those. This was later
      deactivated for the Generic PHY driver, on the premise that a prior
      software entity (e.g. a bootloader) might have applied workarounds, in
      commit 0878fff1 ("net: phy: Do not perform software reset for
      Generic PHY").
      
      Since we have a hook to wire-up a soft_reset callback, just use that and
      get rid of the call to genphy_soft_reset() entirely. This speeds up
      initialization and link establishment for most PHYs out there that do
      not require a reset.
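
      A rough before/after sketch of the phy_init_hw() behaviour described
      above; illustrative only, not the exact upstream code.

        /* Sketch: phy_init_hw() now only resets through the driver-provided
         * hook; the unconditional genphy_soft_reset() fallback is gone. */
        #include <linux/phy.h>

        static int phy_init_hw_sketch(struct phy_device *phydev)
        {
                int ret = 0;

                if (phydev->drv && phydev->drv->soft_reset)
                        ret = phydev->drv->soft_reset(phydev);
                /* previously: else ret = genphy_soft_reset(phydev); */
                if (ret < 0)
                        return ret;

                if (phydev->drv && phydev->drv->config_init)
                        ret = phydev->drv->config_init(phydev);
                return ret;
        }
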
      
      Fixes: 87aa9f9c ("net: phy: consolidate PHY reset in phy_init_hw()")
      Tested-by: Wang, Dongsheng <dongsheng.wang@hxt-semitech.com>
      Tested-by: Chris Healy <cphealy@gmail.com>
      Tested-by: Andrew Lunn <andrew@lunn.ch>
      Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6e2d85ec
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 71f9b61c
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2018-09-25
      
      This series contains updates to i40e and xsk.
      
      Mariusz fixes an issue where the VF link state was not being updated
      properly when the PF went down or came up.  He also cleaned up the
      promiscuous configuration during a VF reset.
      
      Patryk simplifies the code a bit by using the already declared PF and
      HW variables rather than going through the VSI pointers.  He also
      cleaned up the message length parameter passed to several virtchnl
      functions, since it was not being used (or needed).
      
      Harshitha fixes two potential race conditions when trying to change VF
      settings by creating a helper function to validate that the VF is
      enabled and that the VSI is set up.
      
      Sergey corrects a double "link down" message by adding a check for
      whether the link is up or going down.
      
      Björn addresses an AF_XDP zero-copy issue where buffers passed from
      userspace to the kernel were leaked when the hardware descriptor ring
      was torn down.  A zero-copy capable driver picks buffers off the fill
      ring and places them on the hardware receive ring to be completed at
      a later point when DMA is complete.  The transmit side is similar: the
      driver picks buffers off the transmit ring and places them on the
      transmit hardware ring.
      
      In the typical flow, the receive buffer will be placed onto a receive
      ring (completed to the user), and the transmit buffer will be placed on
      the completion ring to notify the user that the transfer is done.
      
      However, if the driver needs to tear down the hardware rings for some
      reason (interface goes down, reconfiguration and such), the userspace
      buffers cannot be leaked. They have to be reused or completed back to
      userspace.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      71f9b61c
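
      A conceptual sketch of the teardown rule described in the last
      paragraphs of the cover letter above. The ring/buffer structures and
      the reuse helper below are hypothetical; the point is only that
      buffers still sitting on a hardware ring are handed back rather than
      dropped.

        /* Hypothetical structures and helper -- illustration only. */
        #include <linux/types.h>

        struct zc_rx_buffer_sketch { u64 umem_addr; };

        struct zc_rx_ring_sketch {
                struct zc_rx_buffer_sketch *bufs;
                u16 count, next_to_clean, next_to_use;
                struct xdp_umem *umem;
        };

        /* Hypothetical: hand one fill-queue address back for later reuse. */
        static void reuse_fill_queue_entry_sketch(struct xdp_umem *umem, u64 addr)
        {
                /* in a real driver this would push 'addr' onto a reuse/fill
                 * queue owned by the umem */
        }

        static void zc_clean_rx_ring_sketch(struct zc_rx_ring_sketch *ring)
        {
                u16 i = ring->next_to_clean;

                /* Every buffer the driver took off the fill ring but never
                 * completed must be returned, otherwise userspace leaks it. */
                while (i != ring->next_to_use) {
                        reuse_fill_queue_entry_sketch(ring->umem,
                                                      ring->bufs[i].umem_addr);
                        i = (i + 1) % ring->count;
                }
        }
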
    • Merge branch 'Refactor-classifier-API-to-work-with-Qdisc-blocks-without-rtnl-lock' · 7a153655
      David S. Miller authored
      Vlad Buslov says:
      
      ====================
      Refactor classifier API to work with Qdisc/blocks without rtnl lock
      
      Currently, all netlink protocol handlers for updating rules, actions and
      qdiscs are protected with a single global rtnl lock, which removes any
      possibility for parallelism. This patch set is a third step toward
      removing the rtnl lock dependency from the TC rules update path.
      
      Recently, the new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was
      added. Handlers registered with this flag are called without RTNL taken.
      The end goal is to have the rule update handlers (RTM_NEWTFILTER,
      RTM_DELTFILTER, etc.) registered with the UNLOCKED flag to allow
      parallel execution. However, there is no intention to completely remove
      or split the rtnl lock itself. This patch set addresses specific
      problems in the implementation of the classifiers API that prevent its
      control path from being executed concurrently. Additional changes are
      required to refactor the classifiers API and individual classifiers for
      parallel execution. This patch set lays the groundwork to eventually
      register rule update handlers as rtnl-unlocked by modifying the code in
      the cls API that works with Qdiscs and blocks. A following patch set
      does the same for chains and classifiers.
      
      The goal of this change is to refactor tcf_block_find() and its
      dependencies to allow concurrent execution:
      - Extend Qdisc API with rcu to lookup and take reference to Qdisc
        without relying on rtnl lock.
      - Extend tcf_block with atomic reference counting and rcu.
      - Always take reference to tcf_block while working with it.
      - Implement tcf_block_release() to release resources obtained by
        tcf_block_find().
      - Create infrastructure to allow registering Qdiscs with class ops that
        do not require the caller to hold rtnl lock.
      
      All three netlink rule update handlers use tcf_block_find() to look up
      the Qdisc and block, and this patch set introduces additional means of
      synchronization to substitute for the rtnl lock in the cls API.
      
      Some functions in the cls and sch APIs have historic names that no
      longer clearly describe their intent. In order not to make this code
      even more confusing when introducing their concurrency-friendly
      versions, rename these functions to describe the actual implementation.
      
      Changes from V2 to V3:
      - Patch 1:
        - Explicitly include refcount.h in rtnetlink.h.
      - Patch 3:
        - Move rcu_head field to the end of struct Qdisc.
        - Rearrange local variable declarations in qdisc_lookup_rcu().
      - Patch 5:
        - Remove tcf_qdisc_put() and inline its content to callers.
      
      Changes from V1 to V2:
      - Rebase on latest net-next.
      - Patch 8 - remove.
      - Patch 9 - fold into patch 11.
      - Patch 11:
        - Rename tcf_block_{get|put}() to tcf_block_refcnt_{get|put}().
      - Patch 13 - remove.
      ====================
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7a153655
    • net: sched: use reference counting for tcf blocks on rules update · 787ce6d0
      Vlad Buslov authored
      In order to remove the dependency on the rtnl lock from the rules
      update path, always take a reference to the block while using it on
      that path. Change tcf_block_get() error handling to properly release
      the block with reference counting, instead of just destroying it, in
      order to accommodate potential concurrent users.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      787ce6d0
    • net: sched: implement tcf_block_refcnt_{get|put}() · 0607e439
      Vlad Buslov authored
      Implement get/put functions for blocks that only take/release the
      reference and perform deallocation. These functions are intended to be
      used by the unlocked rules update path to always hold a reference to
      the block while working with it. They rely on the new fine-grained
      locking mechanisms introduced in previous patches in this set, instead
      of relying on the global protection provided by the rtnl lock.
      
      Extract code that is common with tcf_block_detach_ext() into common
      function __tcf_block_put().
      
      Extend tcf_block with rcu to allow safe deallocation when it is accessed
      concurrently.
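
      A minimal sketch of what such a get/put pair amounts to, as if added to
      net/sched/cls_api.c; the rcu head field name is an assumption and this
      is not the upstream implementation.

        /* Sketch: refcount-based get/put with rcu-deferred deallocation. */
        static struct tcf_block *tcf_block_refcnt_get_sketch(struct tcf_block *block)
        {
                /* only succeed if the block has not already dropped to zero */
                if (!refcount_inc_not_zero(&block->refcnt))
                        return NULL;
                return block;
        }

        static void tcf_block_refcnt_put_sketch(struct tcf_block *block)
        {
                if (refcount_dec_and_test(&block->refcnt))
                        kfree_rcu(block, rcu);  /* 'rcu' head assumed added by this patch */
        }
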
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0607e439
    • net: sched: protect block idr with spinlock · ab281629
      Vlad Buslov authored
      Protect block idr access with a spinlock instead of relying on the rtnl
      lock. Take the tn->idr_lock spinlock during block insertion and removal.
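
      A small sketch of the locking pattern described above (removal side
      shown), as if added to net/sched/cls_api.c; illustrative only.

        /* Sketch: block idr updates now serialize on tn->idr_lock rather
         * than assuming the caller holds rtnl. */
        static void tcf_block_remove_sketch(struct tcf_net *tn,
                                            struct tcf_block *block)
        {
                spin_lock(&tn->idr_lock);
                idr_remove(&tn->idr, block->index);
                spin_unlock(&tn->idr_lock);
        }
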
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab281629
    • net: sched: implement functions to put and flush all chains · f0023436
      Vlad Buslov authored
      Extract the code that flushes and puts all chains on a tcf block into
      two standalone functions, to be shared with the functions that
      locklessly get/put a reference to the block.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f0023436
    • net: sched: change tcf block reference counter type to refcount_t · cfebd7e2
      Vlad Buslov authored
      As a preparation for removing the rtnl lock dependency from the rules
      update path, change the tcf block reference counter type to refcount_t
      to allow modification by concurrent users.
      
      In the block put function, perform the decrement and check of the
      reference counter only once to accommodate concurrent modification by
      unlocked users. After this change, tcf_chain_put() at the end of the
      block put function is called with block->refcnt == 0 and will
      deallocate the block after the last chain is released, so there is no
      need to manually deallocate the block in this case. However, if the
      block reference counter has reached 0 and there are no chains to
      release, the block must still be deallocated manually.
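
      A sketch of the "decrement and check once" put behaviour described
      above, including the manual free when there are no chains left; as if
      in net/sched/cls_api.c, and not the exact upstream code.

        /* Sketch: one atomic decrement decides everything. */
        static void tcf_block_put_sketch(struct tcf_block *block)
        {
                if (!refcount_dec_and_test(&block->refcnt))
                        return;                 /* other users remain */

                if (list_empty(&block->chain_list)) {
                        /* no chains left to do it for us: free the block here */
                        kfree(block);
                        return;
                }

                /* Otherwise flush and put all chains (using the helpers from
                 * the "put and flush all chains" patch above); releasing the
                 * last chain deallocates the block, since refcnt is already
                 * 0 at this point. */
        }
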
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cfebd7e2
    • net: sched: use Qdisc rcu API instead of relying on rtnl lock · e368fdb6
      Vlad Buslov authored
      As a preparation for removing the rtnl lock dependency from the rules
      update path, use the Qdisc rcu and reference counting capabilities
      instead of relying on the rtnl lock while working with Qdiscs. Create a
      new tcf_block_release() function and use it to free resources taken by
      tcf_block_find(). Currently, this function only releases the Qdisc; it
      is extended in the next patches in this series.
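
      A minimal sketch of tcf_block_release() at this point in the series,
      assuming it is handed the Qdisc looked up by tcf_block_find(); the
      signature is an assumption, and the block handling is added by later
      patches.

        /* Sketch only: at this stage the release helper merely drops the
         * Qdisc reference taken during lookup. */
        static void tcf_block_release_sketch(struct Qdisc *q,
                                             struct tcf_block *block)
        {
                if (q)
                        qdisc_put(q);
                (void)block;    /* extended by later patches in this series */
        }
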
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e368fdb6
    • net: sched: add helper function to take reference to Qdisc · 9d7e82ce
      Vlad Buslov authored
      Implement a function that takes a reference to a Qdisc relying on the
      rcu read lock instead of the rtnl mutex. The function only takes the
      reference if the reference counter isn't zero. It is intended to be
      used by the unlocked cls API.
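
      A sketch of this "take a reference only if still alive" helper; the
      exact upstream name may differ.

        /* Sketch: safe under rcu_read_lock() because the Qdisc memory stays
         * around for the rcu grace period (see the "extend Qdisc with rcu"
         * patch below), and refcount_inc_not_zero() refuses to resurrect a
         * dying Qdisc. */
        static inline struct Qdisc *qdisc_refcount_inc_nz_sketch(struct Qdisc *qdisc)
        {
                if (qdisc->flags & TCQ_F_BUILTIN)
                        return qdisc;           /* builtins are never freed */
                if (refcount_inc_not_zero(&qdisc->refcnt))
                        return qdisc;
                return NULL;                    /* already on its way out */
        }
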
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9d7e82ce
    • net: sched: extend Qdisc with rcu · 3a7d0d07
      Vlad Buslov authored
      Currently, Qdisc API functions assume that users have the rtnl lock
      taken. To implement an rtnl-unlocked classifiers update interface, the
      Qdisc API must be extended with functions that do not require the rtnl
      lock.
      
      Extend the Qdisc structure with rcu. Implement a special version of the
      put function, qdisc_put_unlocked(), that is called without the rtnl
      lock taken. This function only takes the rtnl lock if the Qdisc
      reference counter has reached zero and is intended to be used as an
      optimization.
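
      A sketch of the unlocked put described above, built on the
      "refcount dec and lock" helper from the last patch in this series (the
      helper name used here is an assumption); illustrative only.

        /* Sketch: take rtnl only when this put drops the last reference. */
        void qdisc_put_unlocked_sketch(struct Qdisc *qdisc)
        {
                if (qdisc->flags & TCQ_F_BUILTIN)
                        return;
                if (!refcount_dec_and_rtnl_lock(&qdisc->refcnt))
                        return;                 /* not the last reference */

                qdisc_destroy(qdisc);   /* private helper from the rename patch */
                rtnl_unlock();
        }
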
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a7d0d07
    • net: sched: rename qdisc_destroy() to qdisc_put() · 86bd446b
      Vlad Buslov authored
      The current implementation of qdisc_destroy() decrements the Qdisc
      reference counter and only actually destroys the Qdisc if the reference
      counter value has reached zero. Rename qdisc_destroy() to qdisc_put()
      in order for it to better describe the way in which this function is
      currently implemented and used.
      
      Extract the code that deallocates the Qdisc into a new private
      qdisc_destroy() function. It is intended to be shared between the
      regular qdisc_put() and its unlocked version, which is introduced in
      the next patch in this series.
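
      A sketch of the renamed put path described above, as if in
      net/sched/sch_generic.c; not the exact upstream code.

        /* Sketch: qdisc_put() drops a reference and only the last one
         * triggers the (now private) qdisc_destroy() that really frees
         * the Qdisc. */
        void qdisc_put_sketch(struct Qdisc *qdisc)
        {
                if (qdisc->flags & TCQ_F_BUILTIN)
                        return;
                if (!refcount_dec_and_test(&qdisc->refcnt))
                        return;
                qdisc_destroy(qdisc);   /* deallocation extracted by this patch */
        }
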
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      86bd446b
    • net: core: netlink: add helper refcount dec and lock function · 6f99528e
      Vlad Buslov authored
      The rtnl lock is encapsulated in netlink and cannot be accessed by
      other modules directly. This means that reference-counted objects that
      rely on the rtnl lock cannot use it with a refcount helper function
      that atomically decrements the reference and obtains the mutex.
      
      This patch implements a simple wrapper function around
      refcount_dec_and_lock that obtains the rtnl lock if the reference
      counter value reaches 0.
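
      A minimal sketch of such a wrapper, written against the exported
      rtnl_lock()/rtnl_unlock() API since the rtnl mutex itself is private to
      net/core/rtnetlink.c; the helper name and placement are assumptions.

        #include <linux/refcount.h>
        #include <linux/rtnetlink.h>

        /* Sketch: drop a reference; only when it is the last one, take rtnl
         * and confirm, so the caller holds rtnl exactly when it must free. */
        bool refcount_dec_and_rtnl_lock_sketch(refcount_t *r)
        {
                if (refcount_dec_not_one(r))
                        return false;   /* fast path: not the last reference */

                rtnl_lock();
                if (!refcount_dec_and_test(r)) {
                        rtnl_unlock();  /* a new reference appeared meanwhile */
                        return false;
                }
                return true;            /* last reference dropped, rtnl held */
        }
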
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f99528e
  2. 25 Sep, 2018 23 commits