1. 08 Apr, 2019 20 commits
    • ipv4: Refactor fib_check_nh · 448d7248
      David Ahern authored
      fib_check_nh is currently huge, covering multiple use cases: device only,
      device + gateway, and device + gateway with ONLINK. The next patch adds
      validation checks for IPv6, which only complicates it further. So, break
      fib_check_nh into 2 helpers: one for gateway validation and one for device
      only.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Add support to fib_config for IPv6 gateway · a4ea5d43
      David Ahern authored
      Add support for an IPv6 gateway to fib_config. Since a gateway is either
      IPv4 or IPv6, make it a union with fc_gw4 where fc_gw_family decides
      which address is in use. Update current checks on family and gw4 to
      handle ipv6 as well.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
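      The "family tag plus address union" layout described for fib_config can
      be sketched in user-space C. This is only an illustration: struct gw_cfg,
      gw_cfg_set() and gw_cfg_has_gw() are hypothetical names, not the kernel's
      definitions; only the idea that a family field selects which union member
      is valid comes from the commit message.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the gateway fields of a route config. */
struct gw_cfg {
	unsigned char gw_family;	/* 0 (no gateway), AF_INET or AF_INET6 */
	union {
		struct in_addr  gw4;	/* valid when gw_family == AF_INET  */
		struct in6_addr gw6;	/* valid when gw_family == AF_INET6 */
	};
};

/* Parse a textual gateway and record which union member is in use. */
static int gw_cfg_set(struct gw_cfg *cfg, const char *addr)
{
	memset(cfg, 0, sizeof(*cfg));
	if (inet_pton(AF_INET, addr, &cfg->gw4) == 1) {
		cfg->gw_family = AF_INET;
		return 0;
	}
	if (inet_pton(AF_INET6, addr, &cfg->gw6) == 1) {
		cfg->gw_family = AF_INET6;
		return 0;
	}
	return -1;	/* neither IPv4 nor IPv6: no gateway recorded */
}

/* "Is a gateway set" becomes a check on the family, not on the address. */
static int gw_cfg_has_gw(const struct gw_cfg *cfg)
{
	return cfg->gw_family != 0;
}
```

      Testing the family rather than the address means an all-zero address no
      longer doubles as "no gateway".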
    • ipv4: Add support to rtable for ipv6 gateway · 0f5f7d7b
      David Ahern authored
      Add support for an IPv6 gateway to rtable. Since a gateway is either
      IPv4 or IPv6, make it a union with rt_gw4 where rt_gw_family decides
      which address is in use.
      
      When dumping the route data, encode an ipv6 nexthop using RTA_VIA.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Prepare fib_config for IPv6 gateway · f35b794b
      David Ahern authored
      Similar to rtable, fib_config needs to allow the gateway to be either an
      IPv4 or an IPv6 address. To that end, rename fc_gw to fc_gw4 to mean an
      IPv4 address and add fc_gw_family. Checks on 'is a gateway set' are changed
      to see if fc_gw_family is set. In the process prepare the code for a
      fc_gw_family == AF_INET6.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Prepare rtable for IPv6 gateway · 1550c171
      David Ahern authored
      To allow the gateway to be either an IPv4 or IPv6 address, remove
      rt_uses_gateway from rtable and replace with rt_gw_family. If
      rt_gw_family is set it implies rt_uses_gateway. Rename rt_gateway
      to rt_gw4 to represent the IPv4 version.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Replace nhc_has_gw with nhc_gw_family · bdf00467
      David Ahern authored
      Allow the gateway in a fib_nh_common to be from a different address
      family than the outer fib{6}_nh. To that end, replace nhc_has_gw with
      nhc_gw_family and update users of nhc_has_gw to check nhc_gw_family.
      Now nhc_family is used to know if the nh_common is part of a fib_nh
      or fib6_nh (used for container_of to get to route family specific data),
      and nhc_gw_family represents the address family for the gateway.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
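      The split between "family of the enclosing nexthop" and "family of the
      gateway" can be illustrated with a small container_of sketch. The structs
      below (nh_common, nh4, nh6) and nh_private_value() are hypothetical and
      much smaller than the kernel's fib_nh_common, fib_nh and fib6_nh; the
      point is only that one field picks the outer type while the other
      describes the gateway address independently.

```c
#include <stddef.h>
#include <sys/socket.h>

/* User-space equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical shared part embedded in both per-family nexthops. */
struct nh_common {
	unsigned char family;		/* which outer struct embeds us   */
	unsigned char gw_family;	/* 0, AF_INET or AF_INET6 gateway */
};

struct nh4 { struct nh_common common; int v4_only; };
struct nh6 { int v6_only; struct nh_common common; };

/* Recover family-specific data starting from the shared part only. */
static int nh_private_value(struct nh_common *nhc)
{
	if (nhc->family == AF_INET)
		return container_of(nhc, struct nh4, common)->v4_only;
	return container_of(nhc, struct nh6, common)->v6_only;
}
```

      An IPv4 nexthop with an IPv6 gateway is then simply family == AF_INET
      with gw_family == AF_INET6.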
    • ipv6: Add neighbor helpers that use the ipv6 stub · 71df5777
      David Ahern authored
      Add ipv6 helpers to handle ndisc references via the stub. Update
      bpf_ipv6_fib_lookup to use __ipv6_neigh_lookup_noref_stub instead of
      the open-coded ___neigh_lookup_noref with the stub.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add fib6_nh_init and release to stubs · 1aefd3de
      David Ahern authored
      Add fib6_nh_init and fib6_nh_release to ipv6_stubs. If fib6_nh_init fails,
      callers should not invoke fib6_nh_release, so there is no reason to have
      a dummy stub for the case where IPv6 is not enabled.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: improve link partner capability detection · 3b8b11f9
      Heiner Kallweit authored
      genphy_read_status() so far checks phydev->supported, not the actual
      PHY capabilities. This can make a difference if the supported speeds
      have been limited by of_set_phy_supported() or phy_set_max_speed().
      
      It seems that this issue only affects the link partner advertisements
      as displayed by ethtool. Also this patch wouldn't apply to older
      kernels because linkmode bitmaps have been introduced recently.
      Therefore net-next.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-updates-2019-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 8bb309e6
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2019-04-02
      
      This series provides misc updates to the mlx5 driver
      
      1) Aya Levin (1): Handle event of power detection in the PCIE slot
      
      2) Eli Britstein (6):
        Some TC VLAN related updates and fixes to the previous VLAN modify action
        support patchset.
        Offload TC e-switch rules with egress/ingress VLAN devices
      
      3) Max Gurtovoy (1): Fix double mutex initialization in eswitch.c
      
      4) Tariq Toukan (3): Misc small updates
        A write memory barrier is sufficient in EQ ci update
        Obsolete param field holding a constant value
        Unify logic of MTU boundaries
      
      5) Tonghao Zhang (4): Misc updates to en_tc.c
        Make the log friendly when decapsulation offload not supported
        Remove 'parse_attr' argument in parse_tc_fdb_actions()
        Deletes unnecessary setting of esw_attr->parse_attr
        Return -EOPNOTSUPP when attempting to offload an unsupported action
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Don't return EAGAIN when TCAM is full. · ed514fc5
      Vishal Kulkarni authored
      During hash filter programming, the driver needs to return an ENOSPC
      error instead of EAGAIN when the TCAM is full.
      Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xilinx: emaclite: add minimal ndo_do_ioctl hook · fcf97825
      Alexandru Ardelean authored
      This hook only implements a minimal set of ioctl hooks to be able to access
      MII regs by using phytool.
      When using this simple MAC controller, it's pretty difficult to do
      debugging of the PHY chip without checking MII regs.
      Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xilinx: emaclite: add minimal ethtool ops · 9a80ba06
      Alexandru Ardelean authored
      This set adds a minimal set of ethtool hooks to the driver, which provide a
      decent amount of link information via ethtool.
      With this change, running `ethtool ethX` in user-space provides all the
      neatly-formatted information about the link (what was negotiated, what is
      advertised, etc).
      Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • datagram: remove rendundant 'peeked' argument · fd69c399
      Paolo Abeni authored
      After commit a297569f ("net/udp: do not touch skb->peeked unless
      really needed") the 'peeked' argument of __skb_try_recv_datagram()
      and friends is always equal to '!!(flags & MSG_PEEK)'.

      Since this argument is really just a boolean, and the callers already
      have 'flags & MSG_PEEK' handy, we can remove it and clean up the
      code a bit.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: insert filter to ht before offloading it to hw · 1f17f774
      Vlad Buslov authored
      John reports:
      
      Recent refactoring of fl_change aims to use the classifier spinlock to
      avoid the need for rtnl lock. In doing so, the fl_hw_replace_filter()
      function was moved to before the lock is taken. This can create problems
      for drivers if duplicate filters are created (common in ovs tc offload
      due to filters being triggered by user-space matches).
      
      Drivers registered for such filters will now receive multiple copies of
      the same rule, each with a different cookie value. This means that the
      drivers would need to do a full match field lookup to determine
      duplicates, repeating work that will happen in flower __fl_lookup().
      Currently, drivers do not expect to receive duplicate filters.
      
      To fix this, verify that a filter with the same key is not present in the
      flower classifier hash table and insert the new filter into the flower
      hash table before offloading it to hardware. Implement helper function
      fl_ht_insert_unique() to atomically verify/insert a filter.

      This change makes the filter visible to the fast path at the beginning of
      the fl_change() function, which means it can no longer be freed directly
      in case of error. Refactor fl_change() error handling code to deallocate
      the filter with rcu timeout.
      
      Fixes: 620da486 ("net: sched: flower: refactor fl_change")
      Reported-by: John Hurley <john.hurley@netronome.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
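      The ordering described in this commit (reject duplicates and insert into
      the table first, offload to hardware second, undo on failure) can be
      sketched as below. Everything here is hypothetical and single-threaded:
      the fixed-size array stands in for the flower hash table, hw_offload()
      for the driver callback, and the locking and RCU-deferred free that the
      real code needs are left out.

```c
#include <errno.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical filter: a match key plus a per-request cookie. */
struct filter { int key; unsigned long cookie; int in_use; };

static struct filter table[TABLE_SIZE];
static int hw_rules;	/* counts rules the pretend hardware has seen */

/* Verify that no entry has the same key, then claim a free slot. */
static int insert_unique(int key, unsigned long cookie, struct filter **out)
{
	struct filter *free_slot = NULL;

	for (size_t i = 0; i < TABLE_SIZE; i++) {
		if (table[i].in_use && table[i].key == key)
			return -EEXIST;		/* duplicate: never reaches hw */
		if (!table[i].in_use && !free_slot)
			free_slot = &table[i];
	}
	if (!free_slot)
		return -ENOSPC;
	*free_slot = (struct filter){ key, cookie, 1 };
	*out = free_slot;
	return 0;
}

/* Stand-in for the driver offload callback; always succeeds here. */
static int hw_offload(const struct filter *f)
{
	(void)f;
	hw_rules++;
	return 0;
}

/* Insert into the table first, offload second, roll back on error. */
static int add_filter(int key, unsigned long cookie)
{
	struct filter *f;
	int err = insert_unique(key, cookie, &f);

	if (err)
		return err;
	err = hw_offload(f);
	if (err)
		f->in_use = 0;	/* the kernel defers this free via RCU */
	return err;
}
```

      With this ordering a second request for the same key fails before any
      driver sees it, so drivers never receive duplicate rules with different
      cookies.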
    • Merge branch 'rhashtable-bitlocks' · 9186c90b
      David S. Miller authored
      NeilBrown says:
      
      ====================
      Convert rhashtable to use bitlocks
      
      This series converts rhashtable to use a per-bucket bitlock
      rather than a separate array of spinlocks.
      This:
        reduces memory usage
        results in slightly fewer memory accesses
        slightly improves parallelism
        makes a configuration option unnecessary
      
      The main change from the previous version is to use a distinct type for
      the pointer in the bucket which has a bit-lock in it.  This
      helped find two places where rht_ptr() was missed, one
      in rhashtable_free_and_destroy() and one in print_ht in the test code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: add lockdep tracking to bucket bit-spin-locks. · 149212f0
      NeilBrown authored
      Native bit_spin_locks are not tracked by lockdep.
      
      The bit_spin_locks used for rhashtable buckets are local
      to the rhashtable implementation, so there is little opportunity
      for the sort of misuse that lockdep might detect.
      However locks are held while a hash function or compare
      function is called, and if one of these took a lock,
      a misbehaviour is possible.
      
      As it is quite easy to add lockdep support this unlikely
      possibility seems to be enough justification.
      
      So create a lockdep class for bucket bit_spin_lock and attach
      through a lockdep_map in each bucket_table.
      
      Without the 'nested' annotation in rhashtable_rehash_one(), lockdep
      correctly reports a possible problem as this lock is taken
      while another bucket lock (in another table) is held.  This
      confirms that the added support works.
      With the correct nested annotation in place, lockdep reports
      no problems.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: use bit_spin_locks to protect hash bucket. · 8f0db018
      NeilBrown authored
      This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the
      bucket pointer to lock the hash chain for that bucket.
      
      The benefits of a bit spin_lock are:
       - no need to allocate a separate array of locks.
       - no need to have a configuration option to guide the
         choice of the size of this array
       - locking cost is often a single test-and-set in a cache line
         that will have to be loaded anyway.  When inserting at, or removing
         from, the head of the chain, the unlock is free - writing the new
         address in the bucket head implicitly clears the lock bit.
         For __rhashtable_insert_fast() we ensure this always happens
         when adding a new key.
       - even when locking costs 2 updates (lock and unlock), they are
         in a cacheline that needs to be read anyway.
      
      The cost of using a bit spin_lock is a little bit of code complexity,
      which I think is quite manageable.
      
      Bit spin_locks are sometimes inappropriate because they are not fair -
      if multiple CPUs repeatedly contend for the same lock, one CPU can
      easily be starved.  This is not a credible situation with rhashtable.
      Multiple CPUs may want to repeatedly add or remove objects, but they
      will typically do so at different buckets, so they will attempt to
      acquire different locks.
      
      As we have more bit-locks than we previously had spinlocks (by at
      least a factor of two) we can expect slightly less contention to
      go with the slightly better cache behavior and reduced memory
      consumption.
      
      To enhance type checking, a new struct is introduced to represent the
      pointer plus lock-bit that is stored in the bucket-table.  This is
      "struct rhash_lock_head" and is empty.  A pointer to this needs to be
      cast to either an unsigned long, or a "struct rhash_head *" to be useful.
      Variables of this type are most often called "bkt".
      
      Previously "pprev" would sometimes point to a bucket, and sometimes a
      ->next pointer in an rhash_head.  As these are now different types,
      pprev is NULL when it would have pointed to the bucket. In that case,
      'bkt' is used, together with the correct locking protocol.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
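      The idea of keeping the lock in a spare bit of the bucket pointer, and of
      getting the unlock "for free" when a new head pointer is stored, can be
      sketched with C11 atomics. This is a user-space illustration only, not
      the rhashtable code: bucket_t, struct node and the bucket_*() helpers are
      hypothetical, and the sketch assumes nodes are at least 4-byte aligned so
      that bit 1 of a node address is always zero.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 1 of the bucket word is the lock, as in the commit message. */
#define LOCK_BIT ((uintptr_t)1 << 1)

struct node { struct node *next; int key; };

/* A bucket is one word: head pointer with the lock bit folded in. */
typedef _Atomic uintptr_t bucket_t;

/* Spin until we are the one who flipped the lock bit from 0 to 1. */
static void bucket_lock(bucket_t *bkt)
{
	while (atomic_fetch_or_explicit(bkt, LOCK_BIT,
					memory_order_acquire) & LOCK_BIT)
		;	/* someone else holds the lock: retry */
}

/* Explicit unlock: clear the lock bit, keep the pointer bits. */
static void bucket_unlock(bucket_t *bkt)
{
	atomic_fetch_and_explicit(bkt, ~LOCK_BIT, memory_order_release);
}

/* Read the head pointer with the lock bit masked off. */
static struct node *bucket_head(bucket_t *bkt)
{
	uintptr_t v = atomic_load_explicit(bkt, memory_order_acquire);

	return (struct node *)(v & ~LOCK_BIT);
}

/*
 * Insert at the head of the chain. The final store writes a clean
 * pointer (lock bit clear), so it publishes the node and releases
 * the lock in a single write.
 */
static void bucket_insert_head(bucket_t *bkt, struct node *n)
{
	bucket_lock(bkt);
	n->next = bucket_head(bkt);
	atomic_store_explicit(bkt, (uintptr_t)n, memory_order_release);
}
```

      Operations that do not change the head (for example removing an entry
      from the middle of the chain) would pair bucket_lock() with
      bucket_unlock(), which is the two-update case the list above mentions.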
    • rhashtable: allow rht_bucket_var to return NULL. · ff302db9
      NeilBrown authored
      Rather than returning a pointer to a static nulls, rht_bucket_var()
      now returns NULL if the bucket doesn't exist.
      This will make the next patch, which stores a bitlock in the
      bucket pointer, somewhat cleaner.
      
      This change involves introducing __rht_bucket_nested() which is
      like rht_bucket_nested(), but doesn't provide the static nulls,
      and changing rht_bucket_nested() to call this and possibly
      provide a static nulls - as is still needed for the non-var case.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: use cmpxchg() in nested_table_alloc() · 7a41c294
      NeilBrown authored
      nested_table_alloc() relies on the fact that there is
      at most one spinlock allocated for every slot in the top
      level nested table, so it is not possible for two threads
      to try to allocate the same table at the same time.
      
      This assumption is a little fragile (it is not explicit) and is
      unnecessary as cmpxchg() can be used instead.
      
      A future patch will replace the spinlocks by per-bucket bitlocks,
      and then we won't be able to protect the slot pointer with a spinlock.
      
      So replace rcu_assign_pointer() with cmpxchg() - which has equivalent
      barrier properties.
      If the cmpxchg() fails, free the table that was just allocated.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
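      The "allocate, try to install with compare-and-swap, free on failure"
      pattern described here can be sketched in user space with C11 atomics.
      This is an assumption-laden illustration: struct table and
      slot_get_or_alloc() are hypothetical, calloc()/free() stand in for the
      kernel allocator, and atomic_compare_exchange_strong() stands in for
      cmpxchg().

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical nested table; contents do not matter for the sketch. */
struct table { int dummy; };

/*
 * Return the table stored in *slot, allocating one if the slot is
 * empty. No lock is needed: if two threads race, exactly one
 * compare-and-swap succeeds and the loser frees its allocation.
 */
static struct table *slot_get_or_alloc(struct table *_Atomic *slot)
{
	struct table *cur = atomic_load(slot);
	struct table *expected = NULL;
	struct table *fresh;

	if (cur)
		return cur;

	fresh = calloc(1, sizeof(*fresh));
	if (!fresh)
		return NULL;

	/* Install only if the slot is still NULL. */
	if (atomic_compare_exchange_strong(slot, &expected, fresh))
		return fresh;

	/* Lost the race: 'expected' now holds the winner's table. */
	free(fresh);
	return expected;
}
```

      The slot no longer depends on any particular lock covering it, which is
      what lets the later patches drop the spinlock array.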
  2. 07 Apr, 2019 20 commits