1. 13 Feb, 2020 1 commit
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 89e960b5
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2020-02-12
      
      This series contains fixes to only the ice driver.
      
      Dave fixes logic flaws in the DCB rebuild function, which is used after a
      reset.  Also fixed a configuration issue when switching between firmware
      and software LLDP mode, where the number of TLVs configured was getting
      out of sync with what lldpad thinks is configured.
      
      Paul fixes how the driver displayed all the supported and advertised
      link modes by basing it on the PHY capabilities, and in the process
      cleaned up a lot of code.
      
      Brett fixes duplicate receive tail bumps by comparing the value we are
      writing to tail with the previously written tail value.  Also cleaned up
      workarounds that are no longer needed with the latest NVM images.
      
      Anirudh cleaned up unnecessary CONFIG_PCI_IOV wrappers.  Updated the
      driver to use ice_pf_to_dev() instead of &pf->pdev->dev or
      &vsi->back->pdev->dev.  Cleaned up the string format in print function
      calls to remove newlines where applicable.
      
      Akeem updates the link message logging to include "Full Duplex" and
      "Negotiated", to help distinguish from "Requested" for FEC.
      
      Bruce fixes and consolidates the logging of firmware/NVM information
      during driver load, since the information duplicates what is
      available via ethtool.  Fixed the checking of the Unit Load Status bits
      after reset to ensure they are 0x7FF before continuing, by updating the
      mask.  Cleaned up possible NULL dereferences that were created by a
      previous commit.
      
      Ben fixes the driver to use the correct netif_msg_tx/rx_error() to
      determine whether to print the MDD event type.
      
      Tony provides several trivial fixes, which include whitespace, typos,
      function header comments, and reverse Christmas tree issues.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 12 Feb, 2020 33 commits
    • ice: Trivial fixes · 4ee656bb
      Tony Nguyen authored
      This is a collection of trivial fixes including fixing whitespace, typos,
      function headers, reverse Christmas tree, etc.
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Use correct netif error function · 1d8bd992
      Ben Shelton authored
      Use the correct netif_msg_[tx,rx]_error() function to determine whether to
      print the MDD event type.
      Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Cleanup ice_vsi_alloc_q_vectors · 3306f79f
      Anirudh Venkataramanan authored
      1. Remove local variable num_q_vectors and use vsi->num_q_vectors instead
      2. Remove local variable pf and pass vsi->back to ice_pf_to_dev
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Make print statements more compact · 19cce2c6
      Anirudh Venkataramanan authored
      Formatting strings in print function calls (like dev_info, dev_err, etc.)
      can exceed 80 columns without making checkpatch unhappy. So remove
      newlines where applicable and make print statements more compact.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Use ice_pf_to_dev · 9a946843
      Anirudh Venkataramanan authored
      Use ice_pf_to_dev(pf) instead of &pf->pdev->dev
      Use ice_pf_to_dev(vsi->back) instead of &vsi->back->pdev->dev
      When a pointer to the pf instance is available, use ice_pf_to_dev
      instead of ice_hw_to_dev
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Remove possible null dereference · 0a6ea04e
      Tony Nguyen authored
      Commit 1f45ebe0 ("ice: add extra check for null Rx descriptor") moved
      the call to ice_construct_skb() under a null check as Coverity reported a
      possible use of null skb. However, the original call was not deleted; do so
      now.
      
      Fixes: 1f45ebe0 ("ice: add extra check for null Rx descriptor")
      Reported-by: Bruce Allan <bruce.w.allan@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: update Unit Load Status bitmask to check after reset · cf8fc2a0
      Bruce Allan authored
      After a reset the Unit Load Status bits in the GLNVM_ULD register to check
      for completion should be 0x7FF before continuing.  Update the mask to check
      (minus the three reserved bits that are always set).
      Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: fix and consolidate logging of NVM/firmware version information · fbf1e1f6
      Bruce Allan authored
      Logging the firmware/NVM information during driver load is redundant since
      that information is also available via ethtool.  Move the functionality
      found in ice_nvm_version_str() directly into ice_get_drvinfo() and remove
      calling the former and logging that info during driver probe.  This also
      gets rid of a bug in ice_nvm_version_str() where it returns a pointer to
      a buffer which is freed when that function exits.
      Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
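      The dangling-pointer bug mentioned in fbf1e1f6 goes away when the caller
      owns the buffer. The following is a minimal sketch of that pattern in
      plain C, not the driver's code: struct nvm_info, its fields, the output
      format, and nvm_version_str() as written here are hypothetical stand-ins.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical version fields; the real driver reads these from the NVM. */
struct nvm_info {
	uint8_t major;
	uint8_t minor;
	uint32_t eetrack;
};

/*
 * Format into storage owned by the caller, so the string stays valid
 * after this function returns. Returns the snprintf() result.
 */
static int nvm_version_str(const struct nvm_info *nvm, char *buf, size_t len)
{
	return snprintf(buf, len, "%x.%02x 0x%x",
			(unsigned int)nvm->major,
			(unsigned int)nvm->minor,
			(unsigned int)nvm->eetrack);
}
```

      Because buf belongs to the caller, nothing is returned that points into
      the callee's own storage.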
    • ice: Modify link message logging · b55e6032
      Akeem G Abodunrin authored
      This patch modifies link message logging to include "Full Duplex" and
      "Negotiated" for FEC, so as to distinguish it from "Requested" FEC.
      Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Remove CONFIG_PCI_IOV wrap in ice_set_pf_caps · a8b72ce0
      Anirudh Venkataramanan authored
      Remove unnecessary CONFIG_PCI_IOV wrapping in ice_set_pf_caps. None
      of the data structures accessed within the block are wrapped with
      this flag. When CONFIG_PCI_IOV is undefined, pf->num_vfs_supported
      will be 0 anyway.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Remove ice_dev_onetime_setup() · 3d9f9990
      Brett Creeley authored
      ice_dev_onetime_setup contains driver workarounds needed for
      firmware limitations. These issues have now been resolved in newer
      NVMs so remove the function.
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Don't allow same value for Rx tail to be written twice · 168983a8
      Brett Creeley authored
      Currently we compare the value we are about to write to the Rx tail
      register with the previous value of next_to_use. The problem with this
      is we only write tail on 8 descriptor boundaries, but next_to_use is
      updated whenever we clean Rx descriptors. Fix this by comparing the
      value we are about to write to tail with the previously written tail
      value. This will prevent duplicate Rx tail bumps.
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
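      The Rx tail comparison described in 168983a8 can be sketched roughly as
      follows. This is a user-space model under stated assumptions, not the
      driver source: struct rx_ring, the prev_tail field, and the tail_writes
      counter (standing in for the register write) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an Rx ring; not the driver's ring structure. */
struct rx_ring {
	uint16_t next_to_use;      /* advances whenever descriptors are refilled */
	uint16_t prev_tail;        /* hypothetical: last value written to tail */
	unsigned int tail_writes;  /* counts simulated tail register writes */
};

/* Returns true if the tail register was (notionally) written. */
static bool rx_ring_release(struct rx_ring *ring, uint16_t val)
{
	ring->next_to_use = val;

	/* Tail is only bumped on 8-descriptor boundaries. */
	val = (uint16_t)(val & ~7u);

	/* Compare against the last written tail, not next_to_use. */
	if (ring->prev_tail == val)
		return false;

	ring->prev_tail = val;
	ring->tail_writes++;
	return true;
}
```

      With this shape, refilling to descriptor 9 and then to 12 yields a single
      tail write of 8, because both values round down to the same boundary.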
    • ice: display supported and advertised link modes · ad9a87be
      Paul Greenwalt authored
      Display all of the supported and advertised link modes based on the PHY
      capability with media.
      
      Displaying all supported modes is more informative than only displaying
      the current link mode.
      Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Fix switch between FW and SW LLDP · 53977ee4
      Dave Ertman authored
      When switching between FW and SW LLDP mode, the
      number of configured TLV apps in the driver's
      DCB configuration is getting out of sync with
      what lldpad thinks is configured.  This is causing
      a problem when shutting down lldpad.  The cleanup
      is trying to delete TLV apps that are not defined
      in the kernel.
      
      Since the driver is keeping an accurate account
      of the apps defined, use the driver's number of
      apps to determine if there is an app to delete.
      If the number of apps is <= 1, then do not
      attempt to delete.
      Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ice: Fix DCB rebuild after reset · 242b5e06
      Dave Ertman authored
      The function ice_dcb_rebuild had some logic
      flaws in it, and also didn't differentiate
      between the needs of FW and SW modes.
      
      For FW flow, the willing setting was being
      forced to OFF and left that way.  Unwilling
      in DCB FW mode is not a supported model.
      
      Leave the config alone and use the return value
      from the set command to determine if setting the
      config was successful.
      
      The SW DCB flow does not need to register
      for MIB change events (as they are not used in
      SW mode).
      
      Use !is_sw_lldp checks to only perform FW-specific
      tasks while in FW mode.
      
      Also add a reapplication of the current DCB
      config after a link event.  Some NVMs are not
      maintaining their DCB configs across link events.
      Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • net: ethernet: ave: Add capability of rgmii-id mode · b9287f2a
      Kunihiko Hayashi authored
      This allows you to specify the type of rgmii-id that will enable phy
      internal delay in ethernet phy-mode.
      
      This adds all RGMII cases to all of get_pinmode() except LD11, because LD11
      SoC doesn't support RGMII due to the constraint of the hardware. When RGMII
      phy mode is specified in the devicetree for LD11, the driver will abort
      with an error.
      Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • enic: prevent waking up stopped tx queues over watchdog reset · 0f905225
      Firo Yang authored
      In recent months, our customer reported several kernel crashes, all
      preceded by the following message:
      NETDEV WATCHDOG: eth2 (enic): transmit queue 0 timed out
      Error message of one of those crashes:
      BUG: unable to handle kernel paging request at ffffffffa007e090
      
      After analyzing several vmcores, I found that most of the crashes are
      caused by memory corruption, and all the corrupted memory areas
      are overwritten by data from network packets. Moreover, I also found
      that the tx queues were enabled over watchdog reset.
      
      After going through the source code, I found that in enic_stop(),
      the tx queues stopped by netif_tx_disable() could be woken up over
      a small time window between netif_tx_disable() and the
      napi_disable() by the following code path:
      napi_poll->
        enic_poll_msix_wq->
           vnic_cq_service->
              enic_wq_service->
                 netif_wake_subqueue(enic->netdev, q_number)->
                    test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state)
      In turn, the upper network stack could queue skbs to the ENIC NIC through
      enic_hard_start_xmit(). And this might introduce some race condition.
      
      Our customer confirmed that this kind of kernel crash has not occurred for
      over 90 days since they applied this patch.
      Signed-off-by: Firo Yang <firo.yang@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Bug-fixes-for-ENA-Ethernet-driver' · b44beb8a
      David S. Miller authored
      Sameeh Jubran says:
      
      ====================
      Bug fixes for ENA Ethernet driver
      
      Difference from V1:
      * Started using netdev_rss_key_fill()
      * Dropped superfluous changes that are not related to bug fixes as
        requested by Jakub
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: ena-com.c: prevent NULL pointer dereference · c207979f
      Arthur Kiyanovski authored
      comp_ctx can be NULL in a very rare case when an admin command is executed
      during the execution of ena_remove().
      
      The bug scenario is as follows:
      
      * ena_destroy_device() sets the comp_ctx to be NULL
      * An admin command is executed before executing unregister_netdev(),
        this can still happen because our device can still receive callbacks
        from the netdev infrastructure such as ethtool commands.
      * When attempting to access the comp_ctx, the bug occurs since it's set
        to NULL
      
      Fix:
      Added a check that comp_ctx is not NULL
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: ethtool: use correct value for crc32 hash · 886d2089
      Sameeh Jubran authored
      Up till kernel 4.11 there was no enum defined for crc32 hash in ethtool,
      thus the xor enum was used for supporting crc32.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: make ena rxfh support ETH_RSS_HASH_NO_CHANGE · 470793a7
      Arthur Kiyanovski authored
      As the name suggests ETH_RSS_HASH_NO_CHANGE is received upon changing
      the key or indirection table using ethtool while keeping the same hash
      function.
      
      Also add a function for retrieving the current hash function from
      the ena-com layer.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Saeed Bshara <saeedb@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: fix corruption of dev_idx_to_host_tbl · e3f89f91
      Arthur Kiyanovski authored
      The function ena_com_ind_tbl_convert_from_device() has an overflow
      bug as explained below. Either way, this function is not needed at
      all since we don't retrieve the indirection table from the device
      at any point which means that this conversion is not needed.
      
      The bug:
      The for loop iterates over all io_sq_queues. Past the actual number of
      used queues, io_sq_queues[i].idx equals 0 because those entries are
      uninitialized, so the following code is executed until the end of
      the loop:
      
      dev_idx_to_host_tbl[0] = i;
      
      This results in dev_idx_to_host_tbl[0] being equal to
      ENA_TOTAL_NUM_QUEUES - 1.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
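      The overwrite described in e3f89f91 can be reproduced with a simplified
      model. The sketch below is an assumption-laden illustration, not the
      removed ena_com_ind_tbl_convert_from_device(): TOTAL_NUM_QUEUES, struct
      io_sq, and build_dev_to_host() are hypothetical names, and the queue
      count of 8 is arbitrary.

```c
#include <stdint.h>

/* Hypothetical size; stands in for ENA_TOTAL_NUM_QUEUES. */
#define TOTAL_NUM_QUEUES 8

/* Hypothetical queue record; unused entries keep idx == 0. */
struct io_sq {
	uint16_t idx;
};

/*
 * Faulty loop shape: it walks every slot, including queues that were
 * never initialized, so slot 0 of the table is rewritten on each of them.
 */
static void build_dev_to_host(const struct io_sq *sqs,
			      uint16_t *dev_idx_to_host_tbl)
{
	for (uint16_t i = 0; i < TOTAL_NUM_QUEUES; i++)
		dev_idx_to_host_tbl[sqs[i].idx] = i;
}
```

      With only two queues in use, every remaining iteration stores i into
      entry 0, so the entry ends up holding TOTAL_NUM_QUEUES - 1.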
    • net: ena: fix incorrectly saving queue numbers when setting RSS indirection table · 92569fd2
      Arthur Kiyanovski authored
      The indirection table has the indices of the Rx queues. When we store it
      during set indirection operation, we convert the indices to our internal
      representation of the indices.
      
      Our internal representation of the indices is: even indices for Tx and
      odd indices for Rx, where the Tx/Rx pairs are in consecutive order
      starting from 0. For example, if the driver has 3 queues (3 for Tx and 3
      for Rx) then the indices are as follows:
      0  1  2  3  4  5
      Tx Rx Tx Rx Tx Rx
      
      The BUG:
      The issue is that when we satisfy a get request for the indirection
      table, we don't convert the indices back to the original representation.
      
      The FIX:
      Simply apply the inverse function for the indices of the indirection
      table after we set it.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
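      The index layout described in 92569fd2 (Tx on even slots, Rx on odd
      slots) implies a simple mapping and its inverse. The helper names below
      are hypothetical and chosen only for illustration; the sketch assumes
      the internal index passed to the inverse is always an Rx (odd) slot.

```c
#include <stdint.h>

/* Rx queue number -> internal index: Rx queues occupy the odd slots. */
static uint16_t rxq_to_internal(uint16_t q)
{
	return (uint16_t)(2 * q + 1);
}

/* Inverse mapping, applied when answering a get request. */
static uint16_t internal_to_rxq(uint16_t idx)
{
	return (uint16_t)((idx - 1) / 2);
}
```

      For the three-queue example above, Rx queues 0, 1 and 2 map to internal
      indices 1, 3 and 5, and the inverse returns the original queue numbers.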
    • net: ena: rss: store hash function as values and not bits · 4844470d
      Arthur Kiyanovski authored
      The device receives, stores and retrieves the hash function value as bits
      and not as their enum value.
      
      The bug:
      * In ena_com_set_hash_function() we set
        cmd.u.flow_hash_func.selected_func to the bit value of rss->hash_func.
        (1 << rss->hash_func)
      * In ena_com_get_hash_function() we retrieve the hash function and store
        its bit value in rss->hash_func. (Now the bit value of rss->hash_func
        is stored in rss->hash_func instead of its enum value)
      
      The fix:
      This commit fixes the issue by converting the retrieved hash function
      values from the device to the matching enum value of the set bit using
      ffs(). ffs() finds the first set bit's index in a word. Since the function
      returns 1 for the LSB's index, we need to subtract 1 from the returned
      value (note that BIT(0) is 1).
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
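      The ffs()-based conversion described in 4844470d can be sketched in user
      space with the POSIX ffs() from <strings.h>. The enum and its values are
      hypothetical placeholders, not the ENA admin definitions, and the helper
      names are invented for this example.

```c
#include <strings.h>   /* POSIX ffs() */

/* Hypothetical enum values, for illustration only. */
enum hash_func {
	HASH_FUNC_A = 0,
	HASH_FUNC_B = 1,
};

/* What is sent to the device: one bit per function. */
static unsigned int hash_func_to_bit(enum hash_func f)
{
	return 1u << f;
}

/*
 * What should be stored after a read: ffs() numbers bits from 1,
 * so subtract 1 to get back the enum value.
 */
static enum hash_func bit_to_hash_func(unsigned int bits)
{
	return (enum hash_func)(ffs((int)bits) - 1);
}
```

      Storing the result of bit_to_hash_func() instead of the raw bit value
      keeps the saved hash function in the same representation that the set
      path expects.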
    • net: ena: rss: fix failure to get indirection table · 0c8923c0
      Sameeh Jubran authored
      On old hardware, getting / setting the hash function is not supported while
      getting / setting the indirection table is.
      
      This commit enables us to still show the indirection table on older
      hardware by setting the hash function and key to NULL.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: rss: do not allocate key when not supported · 6a4f7dc8
      Sameeh Jubran authored
      Currently we allocate the key whether the device supports setting the
      key or not. This commit adds a check to the allocation function and
      handles the error accordingly.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: fix incorrect default RSS key · 0d1c3de7
      Arthur Kiyanovski authored
      Bug description:
      When running "ethtool -x <if_name>" the key shows up as all zeros.
      
      When we use "ethtool -X <if_name> hfunc toeplitz hkey <some:random:key>" to
      set the key and then try to retrieve it using "ethtool -x <if_name>" then
      we return the correct key because we return the one we saved.
      
      Bug cause:
      We don't fetch the key from the device but instead return the key
      that we have saved internally which is by default set to zero upon
      allocation.
      
      Fix:
      This commit fixes the issue by initializing the key to a random value
      using netdev_rss_key_fill().
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: add missing ethtool TX timestamping indication · cf6d17fd
      Arthur Kiyanovski authored
      The current implementation of the driver calls skb_tx_timestamp() to add a
      software tx timestamp to the skb, however the software-transmit capability
      is not reported in ethtool -T.
      
      This commit updates the ethtool structure to report the software-transmit
      capability in ethtool -T using the standard ethtool_op_get_ts_info().
      This function reports all software timestamping capabilities (tx and rx),
      as well as setting phc_index = -1. phc_index is the index of the PTP
      hardware clock device that will be used for hardware timestamps. Since we
      don't have such a device in ENA, using the default -1 value is the correct
      setting.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Ezequiel Lara Gomez <ezegomez@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: fix uses of round_jiffies() · 2a6e5fa2
      Arthur Kiyanovski authored
      From the documentation of round_jiffies():
      "Rounds a time delta in the future (in jiffies) up or down to
      (approximately) full seconds. This is useful for timers for which
      the exact time they fire does not matter too much, as long as
      they fire approximately every X seconds.
      By rounding these timers to whole seconds, all such timers will fire
      at the same time, rather than at various times spread out. The goal
      of this is to have the CPU wake up less, which saves power."
      
      There are 2 parts to this patch:
      ================================
      Part 1:
      -------
      In our case we need timer_service to be called approximately every
      X=1 seconds, and the exact time does not matter, so using round_jiffies()
      is the right way to go.
      
      Therefore we add round_jiffies() to the mod_timer() in ena_timer_service().
      
      Part 2:
      -------
      round_jiffies() is used in check_for_missing_keep_alive() when
      getting the jiffies of the expiration of the keep_alive timeout. Here it
      is actually a mistake to use round_jiffies() because we want the exact
      time when keep_alive should expire and not an approximate rounded time,
      which can cause early, false positive, timeouts.
      
      Therefore we remove round_jiffies() in the calculation of
      keep_alive_expired() in check_for_missing_keep_alive().
      
      Fixes: 82ef30f1 ("net: ena: add hardware hints capability to the driver")
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
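      Part 2 of 2a6e5fa2 can be illustrated with a toy model of how rounding a
      deadline produces an early timeout. Everything here is an assumption
      made for the example: HZ is set to 100, round_to_second() is a simplified
      rounding rule rather than the kernel's round_jiffies(), and the
      comparison ignores jiffies wraparound.

```c
/* Hypothetical tick rate: 100 jiffies per second. */
#define HZ 100UL

/*
 * Simplified whole-second rounding: round down when less than a quarter
 * second past the boundary, otherwise round up. Not the kernel's code.
 */
static unsigned long round_to_second(unsigned long j)
{
	unsigned long rem = j % HZ;

	return rem < HZ / 4 ? j - rem : j - rem + HZ;
}

/* Non-zero if the keep-alive deadline has passed at time now. */
static int keep_alive_expired(unsigned long now, unsigned long last_keep_alive,
			      unsigned long timeout, int use_rounding)
{
	unsigned long deadline = last_keep_alive + timeout;

	if (use_rounding)
		deadline = round_to_second(deadline);
	return now > deadline;
}
```

      In this model a deadline of 610 rounds down to 600, so a check at
      time 605 reports expiry with rounding enabled even though the exact
      deadline has not yet been reached.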
    • net: ena: fix potential crash when rxfh key is NULL · 91a65b7d
      Arthur Kiyanovski authored
      When ethtool -X is called without an hkey, ena_com_fill_hash_function()
      is called with key=NULL, which is passed to memcpy causing a crash.
      
      This commit fixes this issue by checking key is not NULL.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
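      The NULL check described in 91a65b7d amounts to guarding the copy. A
      minimal sketch follows, assuming a hypothetical struct hash_ctrl, key
      length, and helper name; it is not ena_com_fill_hash_function() itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key size, for illustration only. */
#define HASH_KEY_LEN 40

/* Hypothetical container for the stored key. */
struct hash_ctrl {
	unsigned char key[HASH_KEY_LEN];
};

/* Copy a new key only when one was supplied; otherwise keep the old one. */
static void fill_hash_key(struct hash_ctrl *ctrl, const unsigned char *key,
			  size_t len)
{
	if (!key)
		return;                 /* no hkey given: nothing to copy */
	if (len > HASH_KEY_LEN)
		len = HASH_KEY_LEN;     /* never write past the stored key */
	memcpy(ctrl->key, key, len);
}
```

      Calling fill_hash_key() with key set to NULL leaves the stored key
      untouched instead of passing a NULL source to memcpy().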
    • net/smc: fix leak of kernel memory to user space · 457fed77
      Eric Dumazet authored
      As nlmsg_put() does not clear the memory that is reserved,
      it is the caller's responsibility to make sure all of this
      memory will be written, in order to not reveal prior content.
      
      While we are at it, we can provide the socket cookie even
      if clsock is not set.
      
      syzbot reported :
      
      BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
      BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
      BUG: KMSAN: uninit-value in __swab32p include/uapi/linux/swab.h:179 [inline]
      BUG: KMSAN: uninit-value in __be32_to_cpup include/uapi/linux/byteorder/little_endian.h:82 [inline]
      BUG: KMSAN: uninit-value in get_unaligned_be32 include/linux/unaligned/access_ok.h:30 [inline]
      BUG: KMSAN: uninit-value in ____bpf_skb_load_helper_32 net/core/filter.c:240 [inline]
      BUG: KMSAN: uninit-value in ____bpf_skb_load_helper_32_no_cache net/core/filter.c:255 [inline]
      BUG: KMSAN: uninit-value in bpf_skb_load_helper_32_no_cache+0x14a/0x390 net/core/filter.c:252
      CPU: 1 PID: 5262 Comm: syz-executor.5 Not tainted 5.5.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
       __fswab32 include/uapi/linux/swab.h:59 [inline]
       __swab32p include/uapi/linux/swab.h:179 [inline]
       __be32_to_cpup include/uapi/linux/byteorder/little_endian.h:82 [inline]
       get_unaligned_be32 include/linux/unaligned/access_ok.h:30 [inline]
       ____bpf_skb_load_helper_32 net/core/filter.c:240 [inline]
       ____bpf_skb_load_helper_32_no_cache net/core/filter.c:255 [inline]
       bpf_skb_load_helper_32_no_cache+0x14a/0x390 net/core/filter.c:252
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_kmalloc_large+0x73/0xc0 mm/kmsan/kmsan_hooks.c:128
       kmalloc_large_node_hook mm/slub.c:1406 [inline]
       kmalloc_large_node+0x282/0x2c0 mm/slub.c:3841
       __kmalloc_node_track_caller+0x44b/0x1200 mm/slub.c:4368
       __kmalloc_reserve net/core/skbuff.c:141 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209
       alloc_skb include/linux/skbuff.h:1049 [inline]
       netlink_dump+0x44b/0x1ab0 net/netlink/af_netlink.c:2224
       __netlink_dump_start+0xbb2/0xcf0 net/netlink/af_netlink.c:2352
       netlink_dump_start include/linux/netlink.h:233 [inline]
       smc_diag_handler_dump+0x2ba/0x300 net/smc/smc_diag.c:242
       sock_diag_rcv_msg+0x211/0x610 net/core/sock_diag.c:256
       netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
       sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       kernel_sendmsg+0x433/0x440 net/socket.c:679
       sock_no_sendpage+0x235/0x300 net/core/sock.c:2740
       kernel_sendpage net/socket.c:3776 [inline]
       sock_sendpage+0x1e1/0x2c0 net/socket.c:937
       pipe_to_sendpage+0x38c/0x4c0 fs/splice.c:458
       splice_from_pipe_feed fs/splice.c:512 [inline]
       __splice_from_pipe+0x539/0xed0 fs/splice.c:636
       splice_from_pipe fs/splice.c:671 [inline]
       generic_splice_sendpage+0x1d5/0x2d0 fs/splice.c:844
       do_splice_from fs/splice.c:863 [inline]
       do_splice fs/splice.c:1170 [inline]
       __do_sys_splice fs/splice.c:1447 [inline]
       __se_sys_splice+0x2380/0x3350 fs/splice.c:1427
       __x64_sys_splice+0x6e/0x90 fs/splice.c:1427
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: f16a7dd5 ("smc: netlink interface for SMC sockets")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      457fed77
    • Brett Creeley's avatar
      i40e: Fix the conditional for i40e_vc_validate_vqs_bitmaps · f27f37a0
      Brett Creeley authored
      Commit d9d6a9ae ("i40e: Fix virtchnl_queue_select bitmap
      validation") introduced a necessary change for verifying how queue
      bitmaps from the iavf driver get validated. Unfortunately, the
      conditional was reversed. Fix this.
      
      Fixes: d9d6a9ae ("i40e: Fix virtchnl_queue_select bitmap validation")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f27f37a0
    • Toke Høiland-Jørgensen's avatar
      core: Don't skip generic XDP program execution for cloned SKBs · ad1e03b2
      Toke Høiland-Jørgensen authored
      The current generic XDP handler skips execution of XDP programs entirely if
      an SKB is marked as cloned. This leads to some surprising behaviour, as
      packets can end up being cloned in various ways, which will make an XDP
      program not see all the traffic on an interface.
      
      This was discovered by a simple test case where an XDP program that always
      returns XDP_DROP is installed on a veth device. When combining this with
      the Scapy packet sniffer (which uses an AF_PACKET socket) on the sending
      side, SKBs reliably end up in the cloned state, causing them to be passed
      through to the receiving interface instead of being dropped. A minimal
      reproducer script for this is included below.
      
      This patch fixes the issue by simply triggering the existing linearisation
      code for cloned SKBs instead of skipping the XDP program execution. This
      behaviour is in line with the behaviour of the native XDP implementation
      for the veth driver, which will reallocate and copy the SKB data if the SKB
      is marked as shared.
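      
      The change in behaviour can be modelled in a few lines. This is only an
      illustrative sketch, not kernel code: the Skb class, the handler names
      and the string verdicts are all hypothetical stand-ins for the real
      structures.

```python
from dataclasses import dataclass

XDP_DROP, XDP_PASS = "XDP_DROP", "XDP_PASS"


@dataclass
class Skb:
    """Hypothetical stand-in for an SKB: payload plus a 'cloned' flag."""
    data: bytes
    cloned: bool = False


def always_drop(data: bytes) -> str:
    """Model of the dummy XDP program used by the reproducer."""
    return XDP_DROP


def generic_xdp_old(skb: Skb, prog) -> str:
    # Old behaviour: a cloned SKB bypasses the program entirely.
    if skb.cloned:
        return XDP_PASS
    return prog(skb.data)


def generic_xdp_new(skb: Skb, prog) -> str:
    # New behaviour: take a private copy of the data, then run the program.
    if skb.cloned:
        skb = Skb(data=bytes(skb.data), cloned=False)
    return prog(skb.data)


cloned = Skb(b"Test", cloned=True)
old_verdict = generic_xdp_old(cloned, always_drop)
new_verdict = generic_xdp_new(cloned, always_drop)
```

      With the old handler the cloned packet passes despite the drop-all
      program; with the new handler it is dropped like any other packet.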
      
      Reproducer Python script (requires BCC and Scapy):
      
      from scapy.all import TCP, IP, Ether, sendp, sniff, AsyncSniffer, Raw, UDP
      from bcc import BPF
      import time, sys, subprocess, shlex
      
      SKB_MODE = (1 << 1)
      DRV_MODE = (1 << 2)
      PYTHON=sys.executable
      
      def client():
          time.sleep(2)
          # Sniffing on the sender causes skb_cloned() to be set
          s = AsyncSniffer()
          s.start()
      
          for p in range(10):
              sendp(Ether(dst="aa:aa:aa:aa:aa:aa", src="cc:cc:cc:cc:cc:cc")/IP()/UDP()/Raw("Test"),
                    verbose=False)
              time.sleep(0.1)
      
          s.stop()
          return 0
      
      def server(mode):
          prog = BPF(text="int dummy_drop(struct xdp_md *ctx) {return XDP_DROP;}")
          func = prog.load_func("dummy_drop", BPF.XDP)
          prog.attach_xdp("a_to_b", func, mode)
      
          time.sleep(1)
      
          s = sniff(iface="a_to_b", count=10, timeout=15)
          if len(s):
              print(f"Got {len(s)} packets - should have gotten 0")
              return 1
          else:
              print("Got no packets - as expected")
              return 0
      
      if len(sys.argv) < 2:
          print(f"Usage: {sys.argv[0]} <skb|drv>")
          sys.exit(1)
      
      if sys.argv[1] == "client":
          sys.exit(client())
      elif sys.argv[1] == "server":
          mode = SKB_MODE if sys.argv[2] == 'skb' else DRV_MODE
          sys.exit(server(mode))
      else:
          try:
              mode = sys.argv[1]
              if mode not in ('skb', 'drv'):
                  print(f"Usage: {sys.argv[0]} <skb|drv>")
                  sys.exit(1)
              print(f"Running in {mode} mode")
      
              for cmd in [
                      'ip netns add netns_a',
                      'ip netns add netns_b',
                      'ip -n netns_a link add a_to_b type veth peer name b_to_a netns netns_b',
                      # Disable ipv6 to make sure there's no address autoconf traffic
                      'ip netns exec netns_a sysctl -qw net.ipv6.conf.a_to_b.disable_ipv6=1',
                      'ip netns exec netns_b sysctl -qw net.ipv6.conf.b_to_a.disable_ipv6=1',
                      'ip -n netns_a link set dev a_to_b address aa:aa:aa:aa:aa:aa',
                      'ip -n netns_b link set dev b_to_a address cc:cc:cc:cc:cc:cc',
                      'ip -n netns_a link set dev a_to_b up',
                      'ip -n netns_b link set dev b_to_a up']:
                  subprocess.check_call(shlex.split(cmd))
      
              server = subprocess.Popen(shlex.split(f"ip netns exec netns_a {PYTHON} {sys.argv[0]} server {mode}"))
              client = subprocess.Popen(shlex.split(f"ip netns exec netns_b {PYTHON} {sys.argv[0]} client"))
      
              client.wait()
              server.wait()
              sys.exit(server.returncode)
      
          finally:
              subprocess.run(shlex.split("ip netns delete netns_a"))
              subprocess.run(shlex.split("ip netns delete netns_b"))
      
      Fixes: d4455169 ("net: xdp: support xdp generic on virtual devices")
      Reported-by: Stepan Horacek <shoracek@redhat.com>
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ad1e03b2
  3. 10 Feb, 2020 6 commits
    • Bjørn Mork's avatar
      qmi_wwan: unconditionally reject 2 ep interfaces · 00516d13
      Bjørn Mork authored
      We have been using the fact that the QMI and DIAG functions
      usually are the only ones with class/subclass/protocol being
      ff/ff/ff on Quectel modems. This has allowed us to match the
      QMI function without knowing the exact interface number,
      which can vary depending on firmware configuration.
      
      The ability to silently reject the DIAG function, which is
      usually handled by the option driver, is important for this
      method to work.  This is done based on the knowledge that it
      has exactly 2 bulk endpoints.  QMI function control interfaces
      will have either 3 or 1 endpoint. This rule is universal so
      the quirk condition can be removed.
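      
      The matching rule described above can be sketched as follows. This is a
      simplified model under stated assumptions, not the driver's probe code:
      the function name accept_as_qmi and its parameters are hypothetical.

```python
VENDOR_SPECIFIC = (0xFF, 0xFF, 0xFF)  # class / subclass / protocol


def accept_as_qmi(cls: int, sub: int, proto: int, num_endpoints: int) -> bool:
    """Decide whether an ff/ff/ff interface should be treated as QMI.

    Hypothetical helper: only models the rule from the commit message.
    """
    if (cls, sub, proto) != VENDOR_SPECIFIC:
        return False
    # Exactly 2 (bulk) endpoints identifies the DIAG function, which is
    # left to the option driver.
    if num_endpoints == 2:
        return False
    return True


qmi_with_3_eps = accept_as_qmi(0xFF, 0xFF, 0xFF, 3)
qmi_with_1_ep = accept_as_qmi(0xFF, 0xFF, 0xFF, 1)
diag = accept_as_qmi(0xFF, 0xFF, 0xFF, 2)
```

      Because the rejection depends only on the endpoint count, it no longer
      needs to be tied to a vendor-specific quirk flag.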
      
      The fixed layouts known from the Gobi1k and Gobi2k modems
      have been gradually replaced by more dynamic layouts, and
      many vendors now use configurable layouts without changing
      device IDs.  Renaming the class/subclass/protocol matching
      macro makes it more obvious that this is no longer Quectel
      specific.
      
      Cc: Kristian Evensen <kristian.evensen@gmail.com>
      Cc: Aleksander Morgado <aleksander@aleksander.es>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00516d13
    • Andrew Lunn's avatar
      net: dsa: mv88e6xxx: Prevent truncation of longer interrupt names · 5d1fbdf2
      Andrew Lunn authored
      When adding support for unique interrupt names, after testing on a few
      devices, it was assumed 32 characters would be sufficient. This
      assumption turned out to be incorrect. The ZII RDU2, for example, uses a
      device base name of mv88e6xxx-30be0000.ethernet-1:0, leaving no space
      for postfixes such as -g1-atu-prob and -watchdog. The names then
      become identical, defeating the point of the patch.
      
      Increase the length of the string to 64 characters.
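      
      The truncation is easy to reproduce outside the kernel. The sketch below
      assumes snprintf()-style formatting into a fixed-size buffer (one byte
      reserved for the terminating NUL); the helper bounded_name is
      hypothetical and only mimics that behaviour.

```python
def bounded_name(base: str, postfix: str, size: int) -> str:
    """Mimic formatting into a `size`-byte C buffer (last byte is the NUL)."""
    return (base + postfix)[: size - 1]


BASE = "mv88e6xxx-30be0000.ethernet-1:0"   # base name quoted in the message
POSTFIXES = ["-g1-atu-prob", "-watchdog"]  # postfixes quoted in the message

# With a 32-byte buffer every postfix is cut off, so the names collide.
names_32 = {bounded_name(BASE, p, 32) for p in POSTFIXES}

# With a 64-byte buffer both names fit and stay distinct.
names_64 = {bounded_name(BASE, p, 64) for p in POSTFIXES}
```

      The base name alone is 31 characters long, so a 32-byte buffer holds it
      exactly and nothing more.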
      
      Reported-by: Chris Healy <Chris.Healy@zii.aero>
      Fixes: 3095383a ("net: dsa: mv88e6xxx: Unique IRQ name")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d1fbdf2
    • Bjørn Mork's avatar
      qmi_wwan: re-add DW5821e pre-production variant · 88bf5460
      Bjørn Mork authored
      Commit f25e1392 removed the support for the pre-production variant
      of the Dell DW5821e to avoid probing another USB interface unnecessarily.
      However, the pre-production samples are found in the wild, and this lack
      of support is causing problems for users of such samples.  It is therefore
      necessary to support both variants.
      
      Matching on both interfaces 0 and 1 is not expected to cause any problem
      with either variant, as only the QMI function will be probed successfully
      on either.  Interface 1 will be rejected based on the HID class for the
      production variant:
      
      T:  Bus=01 Lev=03 Prnt=04 Port=00 Cnt=01 Dev#= 16 Spd=480 MxCh= 0
      D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  2
      P:  Vendor=413c ProdID=81d7 Rev=03.18
      S:  Manufacturer=DELL
      S:  Product=DW5821e Snapdragon X20 LTE
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      I:  If#= 1 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      
      And interface 0 will be rejected based on too few endpoints for the
      pre-production variant:
      
      T: Bus=01 Lev=02 Prnt=02 Port=03 Cnt=03 Dev#= 7 Spd=480 MxCh= 0
      D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 2
      P: Vendor=413c ProdID=81d7 Rev= 3.18
      S: Manufacturer=DELL
      S: Product=DW5821e Snapdragon X20 LTE
      S: SerialNumber=0123456789ABCDEF
      C: #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
      I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=
      I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
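      
      The two rejection paths can be checked against the interface lines
      above. This is a rough sketch, not the driver logic: the parser, the
      function names and the exact acceptance test (vendor-specific class,
      and not exactly 2 endpoints) are assumptions made for illustration.

```python
import re

# Interfaces 0 and 1 of each variant, copied from the listings above.
PRODUCTION = [
    "I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan",
    "I:  If#= 1 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid",
]
PRE_PRODUCTION = [
    "I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=",
    "I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan",
]

INTF_RE = re.compile(
    r"If#=\s*(\d+).*?#EPs=\s*(\d+)\s+Cls=([0-9a-f]{2})\(.*?\)"
)


def parse_interface(line: str) -> tuple:
    """Return (interface number, endpoint count, class) for an 'I:' line."""
    match = INTF_RE.search(line)
    if match is None:
        raise ValueError(f"not an interface line: {line!r}")
    return int(match.group(1)), int(match.group(2)), int(match.group(3), 16)


def probed_interfaces(lines, candidates=(0, 1)) -> list:
    """List the candidate interfaces that would be accepted as QMI."""
    accepted = []
    for line in lines:
        number, endpoints, cls = parse_interface(line)
        if number not in candidates:
            continue
        if cls != 0xFF:      # e.g. HID class: not a vendor-specific function
            continue
        if endpoints == 2:   # too few endpoints: DIAG, not QMI
            continue
        accepted.append(number)
    return accepted


production_match = probed_interfaces(PRODUCTION)
pre_production_match = probed_interfaces(PRE_PRODUCTION)
```

      Under these assumptions only interface 0 survives on the production
      variant and only interface 1 on the pre-production variant, so listing
      both interface numbers does not lead to a double bind.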
      
      Fixes: f25e1392 ("qmi_wwan: fix interface number for DW5821e production firmware")
      Link: https://whrl.pl/Rf0vNk
      Reported-by: Lars Melin <larsm17@gmail.com>
      Cc: Aleksander Morgado <aleksander@aleksander.es>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88bf5460
    • Tuong Lien's avatar
      tipc: fix successful connect() but timed out · 5391a877
      Tuong Lien authored
      In commit 9546a0b7 ("tipc: fix wrong connect() return code"), we
      fixed the issue of 'connect()' returning zero even though the
      connection attempt had failed, by waiting until the connection is
      really 'ESTABLISHED'. However, that approach has one drawback in
      conjunction with our 'lightweight' connection setup mechanism: the
      following scenario can happen:
      
                (server)                        (client)
      
         +- accept()|                      |             wait_for_conn()
         |          |                      |connect() -------+
         |          |<-------[SYN]---------|                 > sleeping
         |          |                      *CONNECTING       |
         |--------->*ESTABLISHED           |                 |
                    |--------[ACK]-------->*ESTABLISHED      > wakeup()
              send()|--------[DATA]------->|\                > wakeup()
              send()|--------[DATA]------->| |               > wakeup()
                .   .          .           . |-> recvq       .
                .   .          .           . |               .
              send()|--------[DATA]------->|/                > wakeup()
             close()|--------[FIN]-------->*DISCONNECTING    |
                    *DISCONNECTING         |                 |
                    |                      ~~~~~~~~~~~~~~~~~~> schedule()
                                                             | wait again
                                                             .
                                                             .
                                                             | ETIMEDOUT
      
      Upon receipt of the server 'ACK', the client becomes 'ESTABLISHED'
      and the 'wait_for_conn()' process is woken up but not yet run. Meanwhile,
      the server sends a number of data messages followed shortly by a
      'close()', without waiting for any response from the client, which forces
      the client socket to be 'DISCONNECTING' immediately. When the wait
      process finally runs, it continues to wait until the timer
      expires because of the unexpected socket state. The client 'connect()'
      will finally get '-ETIMEDOUT' and be forced to release the socket, even
      though messages remain in its receive queue.
      
      Obviously the issue would not happen if the server had some delay prior
      to its 'close()' (or the number of 'DATA' messages is large enough),
      but any kind of delay would make the connection setup/shutdown "heavy".
      We solve this by simply allowing 'connect()' to return zero in this
      particular case. The socket is already 'DISCONNECTING', so any further
      write will get '-EPIPE' but the socket is still able to read the
      messages existing in its receive queue.
      
      Note: This solution doesn't break the previous one, as it deals with a
      different situation in which the socket state is 'DISCONNECTING' but
      there is no error (i.e. sk->sk_err = 0).
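      
      The resulting decision can be summarised in a small model. This is a
      hedged sketch of the outcome described above, not the TIPC socket code:
      the function connect_result, its string states and its None return
      value for "keep waiting" are all hypothetical.

```python
import errno


def connect_result(state: str, sk_err: int, timed_out: bool):
    """Model the outcome seen by a client blocked in connect().

    Returns 0 on success, a negative errno on failure, or None when the
    caller should keep waiting.
    """
    if sk_err:
        return -sk_err               # a real failure: report it
    if state == "ESTABLISHED":
        return 0
    if state == "DISCONNECTING":
        # The peer connected, sent data and closed before we got to run.
        # Report success so the queued messages can still be read.
        return 0
    if timed_out:
        return -errno.ETIMEDOUT
    return None                      # still CONNECTING: wait again


peer_closed_early = connect_result("DISCONNECTING", 0, False)
refused = connect_result("DISCONNECTING", errno.ECONNREFUSED, False)
still_waiting = connect_result("CONNECTING", 0, False)
```

      A 'DISCONNECTING' socket with an error set is still reported as a
      failure, which is the case the earlier fix was meant to cover.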
      
      Fixes: 9546a0b7 ("tipc: fix wrong connect() return code")
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5391a877
    • Chen Wandun's avatar
      mptcp: make the symbol 'mptcp_sk_clone_lock' static · 5609e2bb
      Chen Wandun authored
      Fix the following sparse warning:
      net/mptcp/protocol.c:646:13: warning: symbol 'mptcp_sk_clone_lock' was not declared. Should it be static?
      
      Fixes: b0519de8 ("mptcp: fix use-after-free for ipv6")
      Signed-off-by: Chen Wandun <chenwandun@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5609e2bb
    • Chen Wandun's avatar
      tipc: make three functions static · 2437fd7b
      Chen Wandun authored
      Fix the following sparse warning:
      
      net/tipc/node.c:281:6: warning: symbol 'tipc_node_free' was not declared. Should it be static?
      net/tipc/node.c:2801:5: warning: symbol '__tipc_nl_node_set_key' was not declared. Should it be static?
      net/tipc/node.c:2878:5: warning: symbol '__tipc_nl_node_flush_key' was not declared. Should it be static?
      
      Fixes: fc1b6d6d ("tipc: introduce TIPC encryption & authentication")
      Fixes: e1f32190 ("tipc: add support for AEAD key setting via netlink")
      Signed-off-by: Chen Wandun <chenwandun@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2437fd7b