1. 26 Sep, 2018 11 commits
    • Merge branch 'Refactor-classifier-API-to-work-with-Qdisc-blocks-without-rtnl-lock' · 7a153655
      David S. Miller authored
      Vlad Buslov says:
      
      ====================
      Refactor classifier API to work with Qdisc/blocks without rtnl lock
      
      Currently, all netlink protocol handlers for updating rules, actions and
      qdiscs are protected with single global rtnl lock which removes any
      possibility for parallelism. This patch set is a third step to remove
      rtnl lock dependency from TC rules update path.
      
      Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
      Handlers registered with this flag are called without RTNL taken. End
      goal is to have rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER,
      etc.) registered with the UNLOCKED flag to allow parallel execution.
      However, there is no intention to completely remove or split rtnl lock
      itself. This patch set addresses specific problems in implementation of
      classifiers API that prevent its control path from being executed
      concurrently. Additional changes are required to refactor classifiers
      API and individual classifiers for parallel execution. This patch set
      lays groundwork to eventually register rule update handlers as
      rtnl-unlocked by modifying code in cls API that works with Qdiscs and
      blocks. Following patch set does the same for chains and classifiers.
      
      The goal of this change is to refactor tcf_block_find() and its
      dependencies to allow concurrent execution:
      - Extend Qdisc API with rcu to lookup and take reference to Qdisc
        without relying on rtnl lock.
      - Extend tcf_block with atomic reference counting and rcu.
      - Always take reference to tcf_block while working with it.
      - Implement tcf_block_release() to release resources obtained by
        tcf_block_find()
      - Create infrastructure to allow registering Qdiscs with class ops that
        do not require the caller to hold rtnl lock.
      
      All three netlink rule update handlers use tcf_block_find() to lookup
      Qdisc and block, and this patch set introduces additional means of
      synchronization to substitute rtnl lock in cls API.
      
      Some functions in cls and sch APIs have historic names that no longer
      clearly describe their intent. In order not to make this code even more
      confusing when introducing their concurrency-friendly versions, rename
      these functions to describe actual implementation.
      
      Changes from V2 to V3:
      - Patch 1:
        - Explicitly include refcount.h in rtnetlink.h.
      - Patch 3:
        - Move rcu_head field to the end of struct Qdisc.
        - Rearrange local variable declarations in qdisc_lookup_rcu().
      - Patch 5:
        - Remove tcf_qdisc_put() and inline its content to callers.
      
      Changes from V1 to V2:
      - Rebase on latest net-next.
      - Patch 8 - remove.
      - Patch 9 - fold into patch 11.
      - Patch 11:
        - Rename tcf_block_{get|put}() to tcf_block_refcnt_{get|put}().
      - Patch 13 - remove.
      ====================
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: use reference counting for tcf blocks on rules update · 787ce6d0
      Vlad Buslov authored
      In order to remove dependency on rtnl lock on rules update path, always
      take reference to block while using it on rules update path. Change
      tcf_block_get() error handling to properly release block with reference
      counting, instead of just destroying it, in order to accommodate potential
      concurrent users.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: implement tcf_block_refcnt_{get|put}() · 0607e439
      Vlad Buslov authored
      Implement get/put functions for blocks that only take/release the reference
      and perform deallocation. These functions are intended to be used by
      unlocked rules update path to always hold reference to block while working
      with it. They rely on new fine-grained locking mechanisms introduced in
      previous patches in this set, instead of relying on global protection
      provided by rtnl lock.
      
      Extract code that is common with tcf_block_detach_ext() into common
      function __tcf_block_put().
      
      Extend tcf_block with rcu to allow safe deallocation when it is accessed
      concurrently.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: protect block idr with spinlock · ab281629
      Vlad Buslov authored
      Protect block idr access with spinlock, instead of relying on rtnl lock.
      Take tn->idr_lock spinlock during block insertion and removal.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: implement functions to put and flush all chains · f0023436
      Vlad Buslov authored
      Extract code that flushes and puts all chains on tcf block to two
      standalone functions to be shared with functions that locklessly get/put
      reference to block.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: change tcf block reference counter type to refcount_t · cfebd7e2
      Vlad Buslov authored
      As a preparation for removing rtnl lock dependency from rules update path,
      change tcf block reference counter type to refcount_t to allow modification
      by concurrent users.
      
      In block put function perform decrement and check reference counter once to
      accommodate concurrent modification by unlocked users. After this change
      tcf_chain_put at the end of block put function is called with
      block->refcnt==0 and will deallocate block after the last chain is
      released, so there is no need to manually deallocate block in this case.
      However, if block reference counter reached 0 and there are no chains to
      release, block must still be deallocated manually.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: use Qdisc rcu API instead of relying on rtnl lock · e368fdb6
      Vlad Buslov authored
      As a preparation for removing rtnl lock dependency from rules update path,
      use Qdisc rcu and reference counting capabilities instead of relying on
      rtnl lock while working with Qdiscs. Create new tcf_block_release()
      function, and use it to free resources taken by tcf_block_find().
      Currently, this function only releases Qdisc and it is extended in next
      patches in this series.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: add helper function to take reference to Qdisc · 9d7e82ce
      Vlad Buslov authored
      Implement function to take reference to Qdisc that relies on rcu read lock
      instead of rtnl mutex. Function only takes reference to Qdisc if reference
      counter isn't zero. Intended to be used by unlocked cls API.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
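The "take a reference only if the counter isn't zero" rule from the commit above can be illustrated outside the kernel. What follows is a minimal userspace sketch with C11 atomics, not the kernel implementation: `struct obj` and `obj_get_unless_zero()` are hypothetical names, and the sketch leaves out the rcu read-side protection that the commit relies on to keep the object's memory valid while the counter is inspected.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted object (stand-in for a Qdisc). */
struct obj {
    atomic_int refcnt;
};

/*
 * Take a reference only if the counter is currently non-zero.
 * Returns true if a reference was taken, false if the object is
 * already on its way to being destroyed.
 */
static bool obj_get_unless_zero(struct obj *o)
{
    assert(o != NULL);

    int old = atomic_load(&o->refcnt);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
            return true;
    }
    return false;
}
```

A plain increment would not be enough here: it could resurrect an object whose last reference has already been dropped, which is why the increment is made conditional on the observed value.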
    • net: sched: extend Qdisc with rcu · 3a7d0d07
      Vlad Buslov authored
      Currently, Qdisc API functions assume that users have rtnl lock taken. To
      implement rtnl unlocked classifiers update interface, Qdisc API must be
      extended with functions that do not require rtnl lock.
      
      Extend Qdisc structure with rcu. Implement special version of put function
      qdisc_put_unlocked() that is called without rtnl lock taken. This function
      only takes rtnl lock if Qdisc reference counter reached zero and is
      intended to be used as optimization.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: rename qdisc_destroy() to qdisc_put() · 86bd446b
      Vlad Buslov authored
      Current implementation of qdisc_destroy() decrements Qdisc reference
      counter and only actually destroys Qdisc if reference counter value reached
      zero. Rename qdisc_destroy() to qdisc_put() in order for it to better
      describe the way in which this function is currently implemented and used.
      
      Extract code that deallocates Qdisc into new private qdisc_destroy()
      function. It is intended to be shared between regular qdisc_put() and its
      unlocked version that is introduced in next patch in this series.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: netlink: add helper refcount dec and lock function · 6f99528e
      Vlad Buslov authored
      Rtnl lock is encapsulated in netlink and cannot be accessed by other
      modules directly. This means that reference counted objects that rely on
      rtnl lock cannot use it with refcounter helper function that atomically
      decrements reference and obtains mutex.
      
      This patch implements simple wrapper function around refcount_dec_and_lock
      that obtains rtnl lock if reference counter value reached 0.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
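The "decrement, and take the lock only when the counter reaches 0" pattern described in the commit above can be sketched in userspace. This is an illustration under stated assumptions, not the kernel helper: `big_lock` is a hypothetical stand-in for the rtnl mutex, `ref_dec_and_lock()` is a made-up name, and a pthread mutex with C11 atomics replaces the kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global rtnl mutex. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Drop one reference. Returns true with big_lock held if this was the
 * last reference; returns false without the lock otherwise.
 */
static bool ref_dec_and_lock(atomic_int *ref)
{
    int old = atomic_load(ref);

    assert(old > 0); /* caller must own a reference */

    /* Fast path: not the last reference, so the lock is never touched. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(ref, &old, old - 1))
            return false;
    }

    /* Slow path: this may be the last reference, so serialize on the lock. */
    pthread_mutex_lock(&big_lock);
    if (atomic_fetch_sub(ref, 1) == 1)
        return true; /* caller tears the object down, then unlocks */
    pthread_mutex_unlock(&big_lock);
    return false;
}
```

A caller that gets `true` back is expected to destroy the object and then release `big_lock`; every other caller returns without ever contending on the lock, which is the optimization the wrapper is meant to provide.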
  2. 25 Sep, 2018 14 commits
    • tls: Fixed a memory leak during socket close · c774973e
      Vakul Garg authored
      During socket close, if there is an open record with tx context, it needs
      to be freed apart from freeing up plaintext and encrypted scatter
      lists. This patch frees up the open record if present in tx context.
      
      Also tls_free_both_sg() has been renamed to tls_free_open_rec() to
      indicate that the open record in tx context is being freed inside the
      function.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: Fix socket mem accounting error under async encryption · b85135b5
      Vakul Garg authored
      Current async encryption implementation sometimes showed up socket
      memory accounting error during socket close. This results in kernel
      warning calltrace. The root cause of the problem is that socket var
      sk_forward_alloc gets corrupted due to access in sk_mem_charge()
      and sk_mem_uncharge() being invoked from multiple concurrent contexts
      in multicore processor. The apis sk_mem_charge() and sk_mem_uncharge()
      are called from functions alloc_plaintext_sg(), free_sg() etc. It is
      required that memory accounting apis are called under a socket lock.
      
      The plaintext sg data sent for encryption is freed using free_sg() in
      tls_encrypt_done(). It is wrong to call free_sg() from this function.
      This is because this function may run in irq context. We cannot acquire
      socket lock in this function.
      
      We remove calling of function free_sg() for plaintext data from
      tls_encrypt_done() and defer freeing up of plaintext data to the time
      when the record is picked up from tx_list and transmitted/freed. When
      tls_tx_records() gets called, socket is already locked and thus there is
      no concurrent access problem.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net · a06ee256
      David S. Miller authored
      Version bump conflict in batman-adv, take what's in net-next.
      
      iavf conflict, adjustment of netdev_ops in net-next conflicting
      with poll controller method removal in net.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: Clean 64b dma addresses if they are not detected · bd620720
      Michal Simek authored
      Clear ADDR64 dma bit in DMACFG register in case that HW_DMA_CAP_64B is
      not detected on 64bit system.
      The issue was observed when the bootloader (u-boot) does not check the macb
      feature at DCFG6 register (DAW64_OFFSET) and enables 64bit dma support
      by default. The macb driver then reads the DMACFG register back and only
      adds 64bit dma configuration but never clears it.
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
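The macb fix above boils down to a read-modify-write that both sets and clears the bit, instead of only ever setting it. A minimal sketch of that idea follows; the bit position in `DMACFG_ADDR64` and the function name `dmacfg_update_addr64()` are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, chosen for illustration only. */
#define DMACFG_ADDR64 (UINT32_C(1) << 30)

static_assert((DMACFG_ADDR64 & (DMACFG_ADDR64 - 1)) == 0,
              "ADDR64 must be a single bit");

/*
 * Compute the new DMACFG value from the one read back from hardware.
 * The bit is set when 64bit dma capability was detected and explicitly
 * cleared otherwise, so a value left behind by the bootloader does not
 * survive.
 */
static uint32_t dmacfg_update_addr64(uint32_t dmacfg, bool hw_dma_cap_64b)
{
    if (hw_dma_cap_64b)
        dmacfg |= DMACFG_ADDR64;
    else
        dmacfg &= ~DMACFG_ADDR64;
    return dmacfg;
}
```

All other bits of the register value pass through unchanged, which matches the read-back-and-adjust behaviour the commit message describes.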
    • Merge branch 'r8169-series-with-smaller-improvements' · 9da90297
      David S. Miller authored
      Heiner Kallweit says:
      
      ====================
      r8169: series with smaller improvements
      
      This series includes smaller improvements, nothing exciting.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: improve a check in rtl_init_one · a0456790
      Heiner Kallweit authored
      The check for pci_is_pcie() is redundant here because all
      chip versions >=18 are PCIe only anyway. In addition use
      dma_set_mask_and_coherent() instead of separate calls to
      pci_set_dma_mask() and pci_set_consistent_dma_mask().
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: improve rtl8169_irq_mask_and_ack · de20e12f
      Heiner Kallweit authored
      Code can be slightly simplified by acking even events we're not
      interested in. In addition add a comment making clear that the
      read has no functional purpose and is just a PCI commit.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: use default watchdog timeout · 4bee64b4
      Heiner Kallweit authored
      The networking core has a default watchdog timeout of 5s. I see no
      need to define an own timeout of 6s which is basically the same.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 846e8dd4
      Greg Kroah-Hartman authored
      James writes:
        "SCSI fixes on 20180925
      
         Nine obvious bug fixes mostly in individual drivers.  The target fix
         is of particular importance because it's CVE related."
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: sd: don't crash the host on invalid commands
        scsi: ipr: System hung while dlpar adding primary ipr adapter back
        scsi: target: iscsi: Use bin2hex instead of a re-implementation
        scsi: target: iscsi: Use hex2bin instead of a re-implementation
        scsi: lpfc: Synchronize access to remoteport via rport
        scsi: ufs: Disable blk-mq for now
        scsi: sd: Contribute to randomness when running rotational device
        scsi: ibmvscsis: Ensure partition name is properly NUL terminated
        scsi: ibmvscsis: Fix a stringop-overflow warning
    • Merge tag 'usb-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · bfb0e9b4
      Greg Kroah-Hartman authored
      I wrote:
        "USB fixes for 4.19-rc6
      
         Here are some small USB core and driver fixes for reported issues for
         4.19-rc6.
      
         The most visible is the oops fix for when the USB core is built into the
         kernel that is present in 4.18.  Turns out not many people actually do
         that so it went unnoticed for a while.  The rest is some tiny typec,
         musb, and other core fixes.
      
         All have been in linux-next with no reported issues."
      
      * tag 'usb-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: mux: Take care of driver module reference counting
        usb: core: safely deal with the dynamic quirk lists
        usb: roles: Take care of driver module reference counting
        USB: handle NULL config in usb_find_alt_setting()
        USB: fix error handling in usb_driver_claim_interface()
        USB: remove LPM management from usb_driver_claim_interface()
        USB: usbdevfs: restore warning for nonsensical flags
        USB: usbdevfs: sanitize flags more
        Revert "usb: cdc-wdm: Fix a sleep-in-atomic-context bug in service_outstanding_interrupt()"
        usb: musb: dsps: do not disable CPPI41 irq in driver teardown
    • Merge tag 'tty-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · ccf791e5
      Greg Kroah-Hartman authored
      I wrote:
        "TTY/Serial driver fixes for 4.19-rc6
      
         Here are a number of small tty and serial driver fixes for reported
         issues for 4.19-rc6.
      
         One should hopefully resolve a much-reported issue that syzbot has found
         in the tty layer.  Although there are still more issues there, getting
         this fixed is nice to see finally happen.
      
         All of these have been in linux-next for a while with no reported
         issues."
      
      * tag 'tty-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        serial: imx: restore handshaking irq for imx1
        tty: vt_ioctl: fix potential Spectre v1
        tty: Drop tty->count on tty_reopen() failure
        serial: cpm_uart: return immediately from console poll
        tty: serial: lpuart: avoid leaking struct tty_struct
        serial: mvebu-uart: Fix reporting of effective CSIZE to userspace
    • Merge tag 'char-misc-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · fc0c8146
      Greg Kroah-Hartman authored
      Greg (well I), wrote:
        "Char/Misc driver fixes for 4.19-rc6
      
         Here are some soundwire and intel_th (tracing) driver fixes for some
         reported issues.
      
         All of these have been in linux-next for a week with no reported issues."
      
      * tag 'char-misc-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        intel_th: pci: Add Ice Lake PCH support
        intel_th: Fix resource handling for ACPI glue layer
        intel_th: Fix device removal logic
        soundwire: Fix acquiring bus lock twice during master release
        soundwire: Fix incorrect exit after configuring stream
        soundwire: Fix duplicate stream state assignment
    • Revert "uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name" · 8c0f9f5b
      Lubomir Rintel authored
      This changes UAPI, breaking iwd and libell:
      
        ell/key.c: In function 'kernel_dh_compute':
        ell/key.c:205:38: error: 'struct keyctl_dh_params' has no member named 'private'; did you mean 'dh_private'?
          struct keyctl_dh_params params = { .private = private,
                                              ^~~~~~~
                                              dh_private
      
      This reverts commit 8a2336e5.
      
      Fixes: 8a2336e5 ("uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name")
      Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Randy Dunlap <rdunlap@infradead.org>
      cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
      cc: Stephan Mueller <smueller@chronox.de>
      cc: James Morris <jmorris@namei.org>
      cc: "Serge E. Hallyn" <serge@hallyn.com>
      cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
      cc: Andrew Morton <akpm@linux-foundation.org>
      cc: Linus Torvalds <torvalds@linux-foundation.org>
      cc: <stable@vger.kernel.org>
      Signed-off-by: James Morris <james.morris@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Merge gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net · 2dd68cc7
      Greg Kroah-Hartman authored
      Dave writes:
        "Networking fixes:
      
         1) Fix multiqueue handling of coalesce timer in stmmac, from Jose
            Abreu.
      
         2) Fix memory corruption in NFC, from Suren Baghdasaryan.
      
         3) Don't write reserved bits in ravb driver, from Kazuya Mizuguchi.
      
         4) SMC bug fixes from Karsten Graul, YueHaibing, and Ursula Braun.
      
         5) Fix TX done race in mvpp2, from Antoine Tenart.
      
         6) ipv6 metrics leak, from Wei Wang.
      
         7) Adjust firmware version requirements in mlxsw, from Petr Machata.
      
         8) Fix autonegotiation on resume in r8169, from Heiner Kallweit.
      
         9) Fixed missing entries when dumping /proc/net/if_inet6, from Jeff
            Barnhill.
      
         10) Fix double free in devlink, from Dan Carpenter.
      
         11) Fix ethtool regression from UFO feature removal, from Maciej
             Żenczykowski.
      
         12) Fix drivers that have a ndo_poll_controller() that captures the
             cpu entirely on loaded hosts by trying to drain all rx and tx
             queues, from Eric Dumazet.
      
         13) Fix memory corruption with jumbo frames in aquantia driver, from
             Friedemann Gerold."
      
      * gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (79 commits)
        net: mvneta: fix the remaining Rx descriptor unmapping issues
        ip_tunnel: be careful when accessing the inner header
        mpls: allow routes on ip6gre devices
        net: aquantia: memory corruption on jumbo frames
        tun: remove ndo_poll_controller
        nfp: remove ndo_poll_controller
        bnxt: remove ndo_poll_controller
        bnx2x: remove ndo_poll_controller
        mlx5: remove ndo_poll_controller
        mlx4: remove ndo_poll_controller
        i40evf: remove ndo_poll_controller
        ice: remove ndo_poll_controller
        igb: remove ndo_poll_controller
        ixgb: remove ndo_poll_controller
        fm10k: remove ndo_poll_controller
        ixgbevf: remove ndo_poll_controller
        ixgbe: remove ndo_poll_controller
        bonding: use netpoll_poll_dev() helper
        netpoll: make ndo_poll_controller() optional
        rds: Fix build regression.
        ...
  3. 24 Sep, 2018 15 commits
    • dpaa2-eth: Make Rx flow hash key configurable · edad8d26
      Ioana Ciocoi Radulescu authored
      Until now, the Rx flow hash key was a 5-tuple (IP src, IP dst,
      IP nextproto, L4 src port, L4 dst port) fixed value that we
      configured at probe.
      
      Add support for configuring this hash key at runtime.
      We support all standard header fields configurable through ethtool,
      but cannot differentiate between flow types, so the same hash key
      is applied regardless of protocol.
      
      We also don't support the discard option.
      Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: fix the remaining Rx descriptor unmapping issues · f4a51879
      Antoine Tenart authored
      With CONFIG_DMA_API_DEBUG enabled we get DMA unmapping warning in
      various places of the mvneta driver, for example when putting down an
      interface while traffic is passing through.
      
      The issue is when using s/w buffer management, the Rx buffers are mapped
      using dma_map_page but unmapped with dma_unmap_single. This patch fixes
      this by using the right unmapping function.
      
      Fixes: 562e2f46 ("net: mvneta: Improve the buffer allocation method for SWBM")
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip_tunnel: be careful when accessing the inner header · ccfec9e5
      Paolo Abeni authored
      Cong noted that we need the same checks introduced by commit 76c0ddd8
      ("ip6_tunnel: be careful when accessing the inner header")
      even for ipv4 tunnels.
      
      Fixes: c5441932 ("GRE: Refactor GRE tunneling code.")
      Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qca_spi: Introduce write register verification · 48c1699e
      Stefan Wahren authored
      The SPI protocol for the QCA7000 doesn't have any fault detection.
      In order to increase the driver's reliability in noisy environments,
      we could implement a write verification inspired by the enc28j60.
      This should avoid situations where the driver wrongly assumes the
      receive interrupt is enabled and misses all incoming packets.
      
      This function is disabled by default and can be controlled via module
      parameter wr_verify.
      Signed-off-by: Michael Heimpold <michael.heimpold@i2se.com>
      Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
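Write verification on a bus without fault detection usually means "write, read back, compare, retry". The sketch below illustrates that idea in plain C and is not the driver code: `struct reg_ops`, `reg_write_verified()` and the simulated `noisy_bus` are hypothetical, and treating `wr_verify` as the maximum number of write attempts (0 meaning verification is off) is an assumption about the parameter's meaning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register accessors standing in for SPI transfers. */
struct reg_ops {
    void (*write)(void *ctx, uint16_t reg, uint16_t val);
    uint16_t (*read)(void *ctx, uint16_t reg);
    void *ctx;
};

/*
 * Write 'val' to 'reg'. With wr_verify == 0 the write is fire-and-forget.
 * Otherwise the value is read back after each write and the write is
 * repeated up to wr_verify times until the read-back matches.
 */
static bool reg_write_verified(const struct reg_ops *ops, uint16_t reg,
                               uint16_t val, unsigned int wr_verify)
{
    assert(ops != NULL);

    if (wr_verify == 0) {
        ops->write(ops->ctx, reg, val);
        return true;
    }
    for (unsigned int i = 0; i < wr_verify; i++) {
        ops->write(ops->ctx, reg, val);
        if (ops->read(ops->ctx, reg) == val)
            return true;
    }
    return false;
}

/* Simulated single-register device that silently drops the first N writes. */
struct noisy_bus {
    uint16_t reg;
    unsigned int drops;
};

static void noisy_write(void *ctx, uint16_t reg, uint16_t val)
{
    struct noisy_bus *b = ctx;

    (void)reg;
    if (b->drops > 0) {
        b->drops--; /* write lost to noise */
        return;
    }
    b->reg = val;
}

static uint16_t noisy_read(void *ctx, uint16_t reg)
{
    struct noisy_bus *b = ctx;

    (void)reg;
    return b->reg;
}
```

With verification disabled, a dropped write goes unnoticed, which is the "driver wrongly assumes the receive interrupt is enabled" situation the commit wants to avoid; with it enabled, the caller at least learns that the register never took the value.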
    • tls: Fixed uninitialised vars warning · 4128c0cf
      Vakul Garg authored
      In tls_sw_sendmsg() and tls_sw_sendpage(), it is possible that the
      uninitialised variable 'ret' gets passed to sk_stream_error(). So
      initialise local variable 'ret' to '0'. The warnings were detected by
      'smatch' tool.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: Fixed race condition in async encryption · 9932a29a
      Vakul Garg authored
      On processors with multi-engine crypto accelerators, it is possible that
      multiple records get encrypted in parallel and their encryption
      completion is notified to different cpus in multicore processor. This
      leads to the situation where tls_encrypt_done() starts executing in
      parallel on different cores. In current implementation, encrypted
      records are queued to tx_ready_list in tls_encrypt_done(). This requires
      addition to linked list 'tx_ready_list' to be protected. As
      tls_encrypt_done() could be executing in irq context, it is not possible
      to protect linked list addition operation using a lock.
      
      To fix the problem, we remove linked list addition operation from the
      irq context. We do tx_ready_list addition/removal operation from
      application context only and get rid of possible multiple access to
      the linked list. Before starting encryption on the record, we add it to
      the tail of tx_ready_list. To prevent tls_tx_records() from transmitting
      it, we mark the record with a new flag 'tx_ready' in 'struct tls_rec'.
      When record encryption gets completed, tls_encrypt_done() has to only
      update the 'tx_ready' flag to true & linked list add operation is not
      required.
      
      The changed logic brings some other side benefits. Since the records
      are always submitted in tls sequence number order for encryption, the
      tx_ready_list always remains sorted and addition of new records to it
      does not have to traverse the linked list.
      
      Lastly, we renamed tx_ready_list in 'struct tls_sw_context_tx' to
      'tx_list'. This is because now some of the records at the tail are
      not ready to transmit.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
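The scheme from the commit above (queue the record first, let the completion callback only flip a flag, transmit from the head while records are ready) can be modelled in a few lines. This is a simplified userspace sketch, not the tls code: `struct rec`, `struct tx_list` and the three functions are hypothetical, and an atomic flag stands in for whatever memory ordering the kernel code actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct rec {
    struct rec *next;
    unsigned long seq;    /* records are queued in sequence order */
    atomic_bool tx_ready; /* set once encryption has completed */
};

struct tx_list {
    struct rec *head;
    struct rec *tail;
};

/* Application context: queue the record before submitting it for encryption. */
static void tx_list_add_tail(struct tx_list *l, struct rec *r, unsigned long seq)
{
    assert(l != NULL && r != NULL);
    r->next = NULL;
    r->seq = seq;
    atomic_init(&r->tx_ready, false);
    if (l->tail)
        l->tail->next = r;
    else
        l->head = r;
    l->tail = r;
}

/* Completion context: only flips the flag and never touches the list. */
static void encrypt_done(struct rec *r)
{
    atomic_store(&r->tx_ready, true);
}

/*
 * Application context: "transmit" (here: unlink) records from the head
 * for as long as the head is ready. Returns the number of records sent.
 */
static int tx_records(struct tx_list *l)
{
    int sent = 0;

    while (l->head && atomic_load(&l->head->tx_ready)) {
        struct rec *r = l->head;

        l->head = r->next;
        if (!l->head)
            l->tail = NULL;
        r->next = NULL;
        sent++;
    }
    return sent;
}
```

Because only the application context modifies the list, no lock is needed around it, and a record that finishes encryption out of order simply waits behind the not-yet-ready record at the head.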
    • Merge branch 'few-NTF_ROUTER-related-updates' · 094fe739
      David S. Miller authored
      Roopa Prabhu says:
      
      ====================
      few NTF_ROUTER related updates
      
      This series allows setting of NTF_ROUTER by an external
      entity (eg BGP E-VPN control plane). Also fixes missing
      netlink notification on neigh NTF_ROUTER flag changes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neighbour: send netlink notification if NTF_ROUTER changes · fc6e8073
      Roopa Prabhu authored
      Send netlink notification if neigh_update results in NTF_ROUTER
      change and if NEIGH_UPDATE_F_ISROUTER is on. Also move the
      NTF_ROUTER change function into a helper.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neighbour: allow admin to set NTF_ROUTER · f7aa74e4
      Roopa Prabhu authored
      This patch allows admin setting of NTF_ROUTER flag
      on a neighbour entry. This enables external control
      plane (like bgp evpn) to manage neigh entries with
      NTF_ROUTER flag.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Saif Hasan's avatar
      mpls: allow routes on ip6gre devices · d8e2262a
      Saif Hasan authored
      Summary:
      
      This appears to be the necessary and sufficient change to enable `MPLS` on
      `ip6gre` tunnels (RFC4023).
      
      This diff allows IP6GRE devices to be recognized by the MPLS kernel module,
      so the user can configure an interface to accept packets with MPLS
      headers as well as set up MPLS routes on it.
      
      Test Plan:
      
      The test setup consists of multiple containers connected via GRE-V6 tunnels.
      The testing steps are as follows.
      
      - Carry out necessary sysctl settings on all containers
      
      ```
      sysctl -w net.mpls.platform_labels=65536
      sysctl -w net.mpls.ip_ttl_propagate=1
      sysctl -w net.mpls.conf.lo.input=1
      ```
      
      - Establish IP6GRE tunnels
      
      ```
      ip -6 tunnel add name if_1_2_1 mode ip6gre \
        local 2401:db00:21:6048:feed:0::1 \
        remote 2401:db00:21:6048:feed:0::2 key 1
      ip link set dev if_1_2_1 up
      sysctl -w net.mpls.conf.if_1_2_1.input=1
      ip -4 addr add 169.254.0.2/31 dev if_1_2_1 scope link
      
      ip -6 tunnel add name if_1_3_1 mode ip6gre \
        local 2401:db00:21:6048:feed:0::1 \
        remote 2401:db00:21:6048:feed:0::3 key 1
      ip link set dev if_1_3_1 up
      sysctl -w net.mpls.conf.if_1_3_1.input=1
      ip -4 addr add 169.254.0.4/31 dev if_1_3_1 scope link
      ```
      
      - Install MPLS encap rules on node-1 towards node-2
      
      ```
      ip route add 192.168.0.11/32 nexthop encap mpls 32/64 \
        via inet 169.254.0.3 dev if_1_2_1
      ```
      
      - Install MPLS forwarding rules on node-2 and node-3
      ```
      # node2
      ip -f mpls route add 32 via inet 169.254.0.7 dev if_2_4_1
      
      # node3
      ip -f mpls route add 64 via inet 169.254.0.12 dev if_4_3_1
      ```
      
      - Ping 192.168.0.11 (node4) from 192.168.0.1 (node1) (where routing
        towards 192.168.0.1 is via IP route directly towards node1 from node4)
      ```
      ping 192.168.0.11
      ```
      
      - Run tcpdump on the interface to capture ping packets wrapped within an MPLS
        header, which is in turn wrapped within an IP6GRE header
      
      ```
      16:43:41.121073 IP6
        2401:db00:21:6048:feed::1 > 2401:db00:21:6048:feed::2:
        DSTOPT GREv0, key=0x1, length 100:
        MPLS (label 32, exp 0, ttl 255) (label 64, exp 0, [S], ttl 255)
        IP 192.168.0.1 > 192.168.0.11:
        ICMP echo request, id 1208, seq 45, length 64
      
      0x0000:  6000 2cdb 006c 3c3f 2401 db00 0021 6048  `.,..l<?$....!`H
      0x0010:  feed 0000 0000 0001 2401 db00 0021 6048  ........$....!`H
      0x0020:  feed 0000 0000 0002 2f00 0401 0401 0100  ......../.......
      0x0030:  2000 8847 0000 0001 0002 00ff 0004 01ff  ...G............
      0x0040:  4500 0054 3280 4000 ff01 c7cb c0a8 0001  E..T2.@.........
      0x0050:  c0a8 000b 0800 a8d7 04b8 002d 2d3c a05b  ...........--<.[
      0x0060:  0000 0000 bcd8 0100 0000 0000 1011 1213  ................
      0x0070:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
      0x0080:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
      0x0090:  3435 3637                                4567
      ```
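
The two MPLS label stack entries in the dump above (the bytes `0002 00ff 0004 01ff` that follow the GRE key) can be decoded by hand. A minimal Python sketch, assuming the standard 32-bit label stack entry layout (20-bit label, 3-bit exp/traffic class, bottom-of-stack bit, 8-bit TTL); `decode_mpls_stack` is a hypothetical helper name, not part of any tool used in the test plan:

```python
import struct


def decode_mpls_stack(data: bytes):
    """Decode consecutive 4-byte MPLS label stack entries from `data`."""
    entries = []
    for offset in range(0, len(data) - 3, 4):
        (word,) = struct.unpack_from("!I", data, offset)  # network byte order
        entry = {
            "label": word >> 12,          # top 20 bits
            "exp": (word >> 9) & 0x7,     # 3 bits
            "bottom": (word >> 8) & 0x1,  # bottom-of-stack flag
            "ttl": word & 0xFF,           # low 8 bits
        }
        entries.append(entry)
        if entry["bottom"]:
            break  # the payload starts after the bottom-of-stack entry
    return entries


stack = decode_mpls_stack(bytes.fromhex("000200ff000401ff"))
print([e["label"] for e in stack])  # prints [32, 64]
```

The decoded values agree with the tcpdump summary line: label 32 without the bottom-of-stack bit, then label 64 with it, both with TTL 255.
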
      Signed-off-by: Saif Hasan <has@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'net-sched-Add-hardware-specific-counters-to-TC-actions' · ea49c6f0
      David S. Miller authored
      Eelco Chaudron says:
      
      ====================
      net/sched: Add hardware specific counters to TC actions
      
      Add hardware specific counters to TC actions which will be exported
      through the netlink API. This makes troubleshooting TC flower offload
      easier, as it is possible to differentiate the packets being offloaded.
      
      v2 - Rebased on latest net-next
      ====================
      Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Eelco Chaudron's avatar
      net/sched: Add hardware specific counters to TC actions · 28169aba
      Eelco Chaudron authored
      Add additional counters that will store the bytes/packets processed by
      hardware. These will be exported through the netlink interface for
      display by the iproute2 tc tool.
      Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Eelco Chaudron's avatar
      net/core: Add new basic hardware counter · 5e111210
      Eelco Chaudron authored
      Add a new hardware specific basic counter, TCA_STATS_BASIC_HW. This can
      be used to count packets/bytes processed by hardware offload.
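
One way a consumer might use the new counter is to split traffic into hardware and software shares. The Python sketch below assumes that the hardware counter is a subset of the existing total counter, so the software share is the difference; `BasicStats` and `software_share` are hypothetical names for illustration and do not correspond to kernel or iproute2 identifiers.

```python
from dataclasses import dataclass


@dataclass
class BasicStats:
    byte_count: int = 0
    packet_count: int = 0


def software_share(total: BasicStats, hw: BasicStats) -> BasicStats:
    """Derive the software-processed share, assuming hw is a subset of total."""
    if hw.byte_count > total.byte_count or hw.packet_count > total.packet_count:
        raise ValueError("hardware counters exceed total counters")
    return BasicStats(
        total.byte_count - hw.byte_count,
        total.packet_count - hw.packet_count,
    )


total = BasicStats(byte_count=1500, packet_count=10)
hw = BasicStats(byte_count=1200, packet_count=8)
sw = software_share(total, hw)
```

A software share of zero would then suggest that all matching traffic was handled by the offload path.
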
      Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · d2f85c9e
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-09-24
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Several fixes for BPF sockmap to only allow sockets being attached in
         ESTABLISHED state, from John.
      
      2) Fix up the license to LGPL/BSD for the libc compat header which contains
         fallback helpers that libbpf and bpftool are using, from Jakub.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'mvpp2-Add-txq-to-CPU-mapping' · 7ff2ea0b
      David S. Miller authored
      Maxime Chevallier says:
      
      ====================
      net: mvpp2: Add txq to CPU mapping
      
      This short series adds XPS support to the mvpp2 driver, by mapping
      txqs and CPUs. This comes with a patch using round-robin scheduling
      for the HW to pick the next txq to transmit from, instead of the default
      fixed-priority scheduling.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>