1. 17 Jan, 2020 9 commits
    • net: dsa: felix: Don't restart PCS SGMII AN if not needed · 8c6123e1
      Alex Marginean authored
      Some PHYs like VSC8234 don't like it when AN restarts on their system side
      and they restart line side AN too, going into an endless link up/down loop.
      Don't restart PCS AN if link is up already.
      
      Although in theory this feedback loop should be possible with the other
      in-band AN modes too, for some reason it was not seen with the VSC8514
      QSGMII and AQR412 USXGMII PHYs. So keep this logic only for SGMII where
      the problem was found.
      
      Fixes: bdeced75 ("net: dsa: felix: Add PCS operations for PHYLINK")
      Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: felix: Set USXGMII link based on BMSR, not LPA · 062a33b1
      Alex Marginean authored
      At least some PHYs (AQR412) don't advertise copper-side link status
      during system side AN.
      
      So remove this duplicate assignment to pcs->link and rely on the
      previous one for link state: the local indication from the MAC PCS.
      
      Fixes: bdeced75 ("net: dsa: felix: Add PCS operations for PHYLINK")
      Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: Fix typo in devlink documentation · 1d0ee02b
      Ido Schimmel authored
      The driver is named "mlxsw", not "mlx5".
      
      Fixes: d4255d75 ("devlink: document info versions for each driver")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • enetc: Don't print from enetc_sched_speed_set when link goes down · 90f29f0e
      Vladimir Oltean authored
      It is not an error to unplug a cable from the ENETC port even with TSN
      offloads, so don't spam the log with link-related messages from the
      tc-taprio offload subsystem. A single notification is sufficient:
      
      [10972.351859] fsl_enetc 0000:00:00.0 eno0: Qbv PSPEED set speed link down.
      [10972.360241] fsl_enetc 0000:00:00.0 eno0: Link is Down
      
      Fixes: 2e47cb41 ("enetc: update TSN Qbv PSPEED set according to adjust link speed")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: adin: const-ify static data · aa63b947
      Alexandru Ardelean authored
      Some bits of static data should have been made const from the start.
      This change adds the const qualifier where appropriate.
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers/net: netdevsim depends on INET · 1f399fc7
      Hongbo Yao authored
      If CONFIG_INET is not set and CONFIG_NETDEVSIM=y, building
      drivers/net/netdevsim/fib.o fails with the following errors:
      
      drivers/net/netdevsim/fib.o: In function `nsim_fib4_rt_hw_flags_set':
      fib.c:(.text+0x12b): undefined reference to `fib_alias_hw_flags_set'
      drivers/net/netdevsim/fib.o: In function `nsim_fib4_rt_destroy':
      fib.c:(.text+0xb11): undefined reference to `free_fib_info'
      
      Correct the Kconfig for netdevsim.
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: 48bb9eb4 ("netdevsim: fib: Add dummy implementation for FIB offload")
      Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Maintain MDIO device and bus statistics · 080bb352
      Florian Fainelli authored
      We maintain global statistics for an entire MDIO bus, as well as broken
      down, per MDIO bus address statistics. Given that it is possible for
      MDIO devices such as switches to access MDIO bus addresses for which
      there is not a mdio_device instance created (therefore not a
      corresponding device directory in sysfs either), we also maintain
      per-address statistics under the statistics folder. The layout looks
      like this:
      
      /sys/class/mdio_bus/../statistics/
      	transfers
      	errors
      	writes
      	reads
      	transfers_<addr>
      	errors_<addr>
      	writes_<addr>
      	reads_<addr>
      
      When a mdio_device instance is registered, a statistics/ folder is
      created with the transfers, errors, writes and reads attributes which
      point to the appropriate MDIO bus statistics structure.
      
      Statistics are 64-bit unsigned quantities and maintained through the
      u64_stats_sync.h helper functions.
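
To illustrate the layout above, here is a minimal user-space sketch that reads the counters from a statistics/ directory. The helper name `read_mdio_stats` is hypothetical, the caller supplies the directory path, and the sketch assumes each attribute file holds a single decimal integer, which the commit message does not state explicitly.

```python
from pathlib import Path


def read_mdio_stats(stats_dir):
    """Read every counter file in a statistics/ directory.

    Assumption: each attribute file contains one decimal integer.
    Returns (bus_totals, per_address), where per_address maps the
    <addr> suffix of names such as transfers_<addr> to its counters.
    """
    bus_totals = {}
    per_address = {}
    for entry in sorted(Path(stats_dir).iterdir()):
        if not entry.is_file():
            continue
        value = int(entry.read_text().strip())
        name, sep, addr = entry.name.partition("_")
        if sep:  # per-address attribute, e.g. reads_3
            per_address.setdefault(addr, {})[name] = value
        else:    # bus-wide attribute, e.g. reads
            bus_totals[entry.name] = value
    return bus_totals, per_address
```

Keeping the address as a string avoids guessing how the kernel formats `<addr>`; a caller that knows the format can convert it.
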
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: fix nsim_fib6_rt_create() error path · 41cdc741
      Eric Dumazet authored
      It seems the intent of nsim_fib6_rt_create() was to return
      either a valid pointer or an embedded error code.
      
      BUG: unable to handle page fault for address: fffffffffffffff4
      PGD 9870067 P4D 9870067 PUD 9872067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 22851 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:jhash2 include/linux/jhash.h:125 [inline]
      RIP: 0010:rhashtable_jhash2+0x76/0x2c0 lib/rhashtable.c:963
      Code: b9 00 00 00 00 00 fc ff df 48 c1 e8 03 0f b6 14 08 4c 89 f0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 30 02 00 00 49 8d 7e 04 <41> 8b 06 48 be 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f b6
      RSP: 0018:ffffc90016127190 EFLAGS: 00010246
      RAX: 0000000000000007 RBX: 00000000dfb3ab49 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: ffffffff839ba7c8 RDI: fffffffffffffff8
      RBP: ffffc900161271c0 R08: ffff8880951f8640 R09: ffffed1015d0703d
      R10: ffffed1015d0703c R11: ffff8880ae8381e3 R12: 00000000dfb3ab49
      R13: 00000000dfb3ab49 R14: fffffffffffffff4 R15: 0000000000000007
      FS:  00007f40bfbc6700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: fffffffffffffff4 CR3: 0000000093660000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       rht_key_get_hash include/linux/rhashtable.h:133 [inline]
       rht_key_hashfn include/linux/rhashtable.h:159 [inline]
       rht_head_hashfn include/linux/rhashtable.h:174 [inline]
       __rhashtable_insert_fast.constprop.0+0xe15/0x1180 include/linux/rhashtable.h:723
       rhashtable_insert_fast include/linux/rhashtable.h:832 [inline]
       nsim_fib6_rt_add drivers/net/netdevsim/fib.c:603 [inline]
       nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:658 [inline]
       nsim_fib6_event drivers/net/netdevsim/fib.c:719 [inline]
       nsim_fib_event drivers/net/netdevsim/fib.c:744 [inline]
       nsim_fib_event_nb+0x1b16/0x2600 drivers/net/netdevsim/fib.c:772
       notifier_call_chain+0xc2/0x230 kernel/notifier.c:83
       __atomic_notifier_call_chain+0xa6/0x1a0 kernel/notifier.c:173
       atomic_notifier_call_chain+0x2e/0x40 kernel/notifier.c:183
       call_fib_notifiers+0x173/0x2a0 net/core/fib_notifier.c:35
       call_fib6_notifiers+0x4b/0x60 net/ipv6/fib6_notifier.c:22
       call_fib6_entry_notifiers+0xfb/0x150 net/ipv6/ip6_fib.c:399
       fib6_add_rt2node net/ipv6/ip6_fib.c:1216 [inline]
       fib6_add+0x20cd/0x3ec0 net/ipv6/ip6_fib.c:1471
       __ip6_ins_rt+0x54/0x80 net/ipv6/route.c:1315
       ip6_ins_rt+0x96/0xd0 net/ipv6/route.c:1325
       __ipv6_dev_ac_inc+0x76f/0xb20 net/ipv6/anycast.c:324
       ipv6_sock_ac_join+0x4c1/0x790 net/ipv6/anycast.c:139
       do_ipv6_setsockopt.isra.0+0x3908/0x4290 net/ipv6/ipv6_sockglue.c:670
       ipv6_setsockopt+0xff/0x180 net/ipv6/ipv6_sockglue.c:944
       udpv6_setsockopt+0x68/0xb0 net/ipv6/udp.c:1564
       sock_common_setsockopt+0x94/0xd0 net/core/sock.c:3149
       __sys_setsockopt+0x261/0x4c0 net/socket.c:2130
       __do_sys_setsockopt net/socket.c:2146 [inline]
       __se_sys_setsockopt net/socket.c:2143 [inline]
       __x64_sys_setsockopt+0xbe/0x150 net/socket.c:2143
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x45aff9
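
The faulting address in the trace above (CR2 and R14, fffffffffffffff4) falls in the range the kernel reserves for error codes stored in pointers. The sketch below decodes such a value; it assumes 64-bit pointers and uses the 4095 bound of the kernel's IS_ERR_VALUE() check. The function name `decode_err_ptr` is hypothetical.

```python
import errno

MAX_ERRNO = 4095     # same bound the kernel's IS_ERR_VALUE() uses
POINTER_BITS = 64    # assumption: x86-64, as in the trace


def decode_err_ptr(address):
    """Return the negative errno encoded in a pointer value, or None.

    Error pointers occupy the last MAX_ERRNO values of the address
    space, so such a "pointer" equals -errno when read as a signed integer.
    """
    top = 1 << POINTER_BITS
    if not (top - MAX_ERRNO <= address < top):
        return None
    return address - top


code = decode_err_ptr(0xFFFFFFFFFFFFFFF4)  # -12
print(code, errno.errorcode.get(-code))
```

On Linux errno 12 is ENOMEM, so the trace is consistent with an error code being used as if it were a valid route pointer.
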
      
      Fixes: 48bb9eb4 ("netdevsim: fib: Add dummy implementation for FIB offload")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xen-netback: hash.c: Use built-in RCU list checking · f3265971
      Madhuparna Bhowmik authored
      list_for_each_entry_rcu has built-in RCU and lock checking.
      Pass cond argument to list_for_each_entry_rcu.
      Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com>
      Acked-by: Wei Liu <wei.liu@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 16 Jan, 2020 1 commit
  3. 15 Jan, 2020 30 commits
    • devlink: fix typos in qed documentation · 1ccf6c13
      Jacob Keller authored
      Review of the recently added documentation file for the qed driver
      noticed a couple of typos. Fix them now.
      Noticed-by: Michal Kalderon <mkalderon@marvell.com>
      Fixes: 0f261c3c ("devlink: add a driver-specific file for the qed driver")
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pptp: support sockets bound to an interface · 43d28c61
      Ulrich Weber authored
      Use sk_bound_dev_if for route lookup as already done
      in most of the other ip_route_output_ports() calls.
      
      Since most PPPoA providers use 10.0.0.138 as default gateway IP
      this will allow connections to multiple PPTP providers with the
      same IP address over different interfaces.
      Signed-off-by: Ulrich Weber <ulrich.weber@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'batadv-next-for-davem-20200114' of git://git.open-mesh.org/linux-merge · 8fec380a
      David S. Miller authored
      Simon Wunderlich says:
      
      ====================
      This feature/cleanup patchset includes the following patches:
      
       - bump version strings, by Simon Wunderlich
      
       - fix typo and kerneldocs, by Sven Eckelmann
      
       - use WiFi txbitrate for B.A.T.M.A.N. V as fallback, by René Treffer
      
       - silence some endian sparse warnings by adding annotations,
         by Sven Eckelmann
      
       - Update copyright years to 2020, by Sven Eckelmann
      
       - Disable deprecated sysfs configuration by default, by Sven Eckelmann
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bridge-add-vlan-notifications-and-rtm-support' · 4e2fa6b9
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: add vlan notifications and rtm support
      
      This patch-set is a prerequisite for adding per-vlan options support
      because we need to be able to send vlan-only notifications and do larger
      vlan netlink dumps. Per-vlan options are needed as we move the control
      more to vlans and would like to add per-vlan state (needed for per-vlan
      STP and EVPN), per-vlan multicast options and control, and I'm sure
      there would be many more per-vlan options coming.
      Now we create/delete/dump vlans with the device AF_SPEC attribute which is
      fine since we support vlan ranges or use a compact bridge_vlan_info
      structure, but that cannot really be extended to support per-vlan options
      well. The biggest issue is dumping them: we tried using the af_spec with
      a new vlan option attribute, but that quickly led to insufficient message
      size. A second, minor problem is that we always have to dump all vlans
      when notifying, which can be huge when they have different options set.
      So we decided to add new rtm message types specifically for vlans,
      register handlers for them, and add a new bridge vlan notification nl
      group for vlan-only notifications.
      The new RTM NEW/DEL/GETVLAN types introduced match the current af spec
      bridge functionality and in fact use the same helpers.
      The new nl format is:
       [BRIDGE_VLANDB_ENTRY]
          [BRIDGE_VLANDB_ENTRY_INFO] - bridge_vlan_info (either 1 vlan or
                                                         range start)
          [BRIDGE_VLANDB_ENTRY_RANGE] - range end
      
      This allows us to encapsulate a range in a single attribute and also to
      create vlans and immediately set options on all of them with a single
      attribute. The GETVLAN dump can span multiple messages and dump all the
      necessary information. The vlan-only notifications are sent on
      NEW/DELVLAN events or when vlan options change (currently only flags),
      we try hard to compress the vlans into ranges in the notifications as
      well. When the per-vlan options are added we'll add helpers to check for
      option equality between neighbor vlans and will keep compressing them
      when possible.
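
The range compression described above can be sketched in a few lines. This is only an illustration of the idea, not the bridge code: the function name `compress_vlans` and the plain `(vid, flags)` tuples are assumptions made for the example.

```python
def compress_vlans(vlans):
    """Group (vid, flags) pairs into (start, end, flags) ranges.

    Two vlans share a range only when their ids are consecutive and
    their flags are equal, mirroring the "compress when possible" idea.
    """
    ranges = []
    for vid, flags in sorted(vlans):
        if ranges:
            start, end, range_flags = ranges[-1]
            if vid == end + 1 and flags == range_flags:
                ranges[-1] = (start, vid, range_flags)
                continue
        ranges.append((vid, vid, flags))
    return ranges
```

Each resulting tuple corresponds to one BRIDGE_VLANDB_ENTRY: the start goes in the entry info and, when the end differs from the start, the end goes in the range attribute.
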
      
      Note patch 02 is not really required, it's just a nice addition to have
      human-readable error messages from the different vlan checks.
      
      iproute2 changes and selftests will be sent with the next set which adds
      the first per-vlan option - per-vlan state similar to the port state.
      
      v2: changed patch 03 and patch 04 to use nlmsg_parse() in order to
          strictly validate the msg and make sure there are no remaining bytes
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: vlan: notify on vlan add/delete/change flags · f545923b
      Nikolay Aleksandrov authored
      Now that we can notify, send a notification on add/del or change of flags.
      Notifications are also compressed when possible to reduce their number
      and relieve user-space of extra processing. Because of that, we have to
      manually notify after each add/del in order to avoid double
      notifications. We try hard to notify only about the vlans which actually
      changed, thus a single command can result in multiple notifications
      about disjoint ranges if there were vlans which didn't change inside.
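
A simplified sketch of why one command can yield several notifications: only the vlans whose state changed are reported, and unchanged vlans in between split the result into disjoint ranges. The function name `changed_ranges` and the dictionary representation are hypothetical, and the sketch ignores the additional rule that vlans in one notification must share the same flags.

```python
def changed_ranges(before, after):
    """Return contiguous (start, end) ranges of vlan ids whose state changed.

    `before` and `after` map vlan id -> flags; a missing id means the
    vlan does not exist. Unchanged vlans split the result into
    disjoint ranges.
    """
    changed = sorted(
        vid for vid in set(before) | set(after)
        if before.get(vid) != after.get(vid)
    )
    ranges = []
    for vid in changed:
        if ranges and vid == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], vid)
        else:
            ranges.append((vid, vid))
    return ranges
```

For example, adding vlans 10 to 15 when 12 and 13 already exist with the same flags reports two ranges, 10 to 11 and 14 to 15.
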
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: vlan: add rtnetlink group and notify support · cf5bddb9
      Nikolay Aleksandrov authored
      Add a new rtnetlink group for bridge vlan notifications - RTNLGRP_BRVLAN
      and add support for sending vlan notifications (both single and ranges).
      No functional changes intended, the notification support will be used by
      later patches.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: vlan: add rtm range support · 0ab55879
      Nikolay Aleksandrov authored
      Add a new vlandb nl attribute - BRIDGE_VLANDB_ENTRY_RANGE which causes
      RTM_NEWVLAN/DELVLAN to act on a range. Dumps now automatically compress
      similar vlans into ranges. This will be also used when per-vlan options
      are introduced and vlans' options match, they will be put into a single
      range which is encapsulated in one netlink attribute. We need to run
      similar checks as br_process_vlan_info() does because these ranges will
      be used for options setting and they'll be able to skip
      br_process_vlan_info().
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: vlan: add del rtm message support · adb3ce9b
      Nikolay Aleksandrov authored
      Adding RTM_DELVLAN support similar to RTM_NEWVLAN is simple: we just need
      to map DELVLAN to DELLINK and register the handler.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: vlan: add new rtm message support · f26b2965
      Nikolay Aleksandrov authored
      Add initial RTM_NEWVLAN support which can only create vlans, operating
      similar to the current br_afspec(). We will use it later to also change
      per-vlan options. Old-style (flag-based) vlan ranges are not allowed
      when using RTM messages, we will introduce vlan ranges later via a new
      nested attribute which would allow us to have all the information about a
      range encapsulated into a single nl attribute.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: vlan: add rtm definitions and dump support · 8dcea187
      Nikolay Aleksandrov authored
      This patch adds vlan rtm definitions:
       - NEWVLAN: to be used for creating vlans, setting options and
         notifications
       - DELVLAN: to be used for deleting vlans
       - GETVLAN: used for dumping vlan information
      
      Dumping vlans which can span multiple messages is added now with basic
      information (vid and flags). We use nlmsg_parse() to validate the header
      length in order to be able to extend the message with filtering
      attributes later.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: netlink: add extack error messages when processing vlans · 8f4cc940
      Nikolay Aleksandrov authored
      Add extack messages on vlan processing errors. We need to move the flags
      missing check after the "last" check since we may have "last" set but
      lack a range end flag in the next entry.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: vlan: add helpers to check for vlan id/range validity · 5a46facb
      Nikolay Aleksandrov authored
      Add helpers to check if a vlan id or range is valid. The range helper
      must be called when range start or end are detected.
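
A rough sketch of the numeric part of such checks. It assumes the usual 802.1Q convention that ids 0 and 4095 are reserved, and that a range end must lie strictly after its start. The helper names are hypothetical, and the real helpers also inspect the bridge_vlan_info flags, which this sketch leaves out.

```python
VLAN_VID_MIN = 1      # vid 0 is reserved (priority-tagged frames)
VLAN_VID_MAX = 4094   # vid 4095 is reserved by 802.1Q


def vlan_id_is_valid(vid):
    """True when vid is a usable vlan id."""
    return VLAN_VID_MIN <= vid <= VLAN_VID_MAX


def vlan_range_is_valid(start, end):
    """A range needs two valid ids and an end strictly after its start."""
    return vlan_id_is_valid(start) and vlan_id_is_valid(end) and start < end
```
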
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-Add-route-offload-indication' · f6310b61
      David S. Miller authored
      Ido Schimmel says:
      
      ====================
      net: Add route offload indication
      
      This patch set adds offload indication to IPv4 and IPv6 routes. So far
      offload indication was only available for the nexthop via
      'RTNH_F_OFFLOAD', which is problematic as a nexthop is usually shared
      between multiple routes.
      
      Based on feedback from Roopa and David on the RFC [1], the indication is
      split to 'offload' and 'trap'. This is done because not all the routes
      present in hardware actually offload traffic from the kernel. For
      example, host routes merely trap packets to the kernel. The two flags
      are dumped to user space via the 'rtm_flags' field in the ancillary
      header of the rtnetlink message.
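
From user space, the two indications can be read by testing bits in 'rtm_flags'. The sketch below uses the flag names from this series ('RTM_F_OFFLOAD' and 'RTM_F_TRAP'); the numeric values are my recollection of the uapi rtnetlink header and should be checked against it, and the function name is hypothetical.

```python
RTM_F_OFFLOAD = 0x4000  # value assumed from the uapi header; verify before use
RTM_F_TRAP = 0x8000     # same caveat


def describe_route_hw_state(rtm_flags):
    """Return the hardware-related indications set in an rtm_flags word."""
    names = []
    if rtm_flags & RTM_F_OFFLOAD:
        names.append("offload")
    if rtm_flags & RTM_F_TRAP:
        names.append("trap")
    return names
```

An empty list means the route carries neither indication, for instance because no driver programmed it into hardware.
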
      
      In addition, the patch set uses the new flags in order to test the FIB
      offload API by adding a dummy FIB offload implementation to netdevsim.
      The new tests are added to a shared library and can be therefore shared
      between different drivers.
      
      Patches #1-#3 add offload indication to IPv4 routes.
      Patch #4 adds offload indication to IPv6 routes.
      Patches #5-#6 add support for the offload indication in mlxsw.
      Patch #7 adds dummy FIB offload implementation in netdevsim.
      Patches #8-#10 add selftests.
      
      v2 (feedback from David Ahern):
      * Patch #2: Name last argument of fib_dump_info()
      * Patch #2: Move 'struct fib_rt_info' to include/net/ip_fib.h so that it
        could later be passed to fib_alias_hw_flags_set()
      * Patch #3: Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set()
      * Patch #6: Convert to new fib_alias_hw_flags_set() interface
      * Patch #7: Convert to new fib_alias_hw_flags_set() interface
      
      [1] https://patchwork.ozlabs.org/cover/1170530/
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: mlxsw: Add test for FIB offload API · 212a37c2
      Ido Schimmel authored
      The test reuses the common FIB offload tests in order to make sure that
      mlxsw correctly implements FIB offload.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: netdevsim: Add test for FIB offload API · ffdc5149
      Ido Schimmel authored
      Test various aspects of the FIB offload API on top of the netdevsim
      implementation. Both good and bad flows are tested.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: forwarding: Add helpers and tests for FIB offload · c662455b
      Ido Schimmel authored
      Implement a set of common helpers and tests for FIB offload that can be
      used by multiple drivers to check their FIB offload implementations.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: fib: Add dummy implementation for FIB offload · 48bb9eb4
      Ido Schimmel authored
      Implement dummy IPv4 and IPv6 FIB "offload" in the driver by storing
      currently "programmed" routes in a hash table. Each route in the hash
      table is marked with "trap" indication. The indication is cleared when
      the route is replaced or when the netdevsim instance is deleted.
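
A toy model of the behaviour described above, not the driver code: routes live in a hash table, each stored route carries a "trap" mark, and the mark is cleared on replace or teardown. The class name, the `(prefix, prefix_len)` key and the dictionary-based route objects are all assumptions made for the example.

```python
class DummyFibOffload:
    """Toy model of the "programmed routes" table."""

    def __init__(self):
        self.routes = {}  # key: (prefix, prefix_len) -> route dict

    def insert(self, prefix, prefix_len, route):
        """Program a route; a replaced route loses its trap indication."""
        key = (prefix, prefix_len)
        old = self.routes.get(key)
        if old is not None:
            old["trap"] = False
        route["trap"] = True
        self.routes[key] = route

    def destroy(self):
        """Instance teardown: clear the indication on every route."""
        for route in self.routes.values():
            route["trap"] = False
        self.routes.clear()
```
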
      
      This will later allow us to test the route offload API on top of
      netdevsim.
      
      v2:
      * Convert to new fib_alias_hw_flags_set() interface
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Set hardware flags for routes · ee5a0448
      Ido Schimmel authored
      Previous patches added support for two hardware flags for IPv4 and IPv6
      routes: 'RTM_F_OFFLOAD' and 'RTM_F_TRAP'. Both indicate the presence of
      the route in hardware. The first indicates that traffic is actually
      offloaded from the kernel, whereas the second indicates that packets
      hitting such routes are trapped to the kernel for processing (e.g., host
      routes).
      
      Use these two flags in mlxsw. The flags are modified in two places.
      Firstly, whenever a route is updated in the device's table. This
      includes the addition, deletion or update of a route. For example, when
      a host route is promoted to perform NVE decapsulation, its action in the
      device is updated, the 'RTM_F_OFFLOAD' flag set and the 'RTM_F_TRAP'
      flag cleared.
      
      Secondly, when a route is replaced and overwritten by another route, its
      flags are cleared.
      
      v2:
      * Convert to new fib_alias_hw_flags_set() interface
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Separate nexthop offload indication from route · 8c5a5b9b
      Ido Schimmel authored
      The driver currently uses the 'RTNH_F_OFFLOAD' flag for both routes and
      nexthops, which is cumbersome and unnecessary now that we have a separate
      flag for the route itself.
      
      Separate the offload indication for nexthops from routes and call it
      whenever the offload state within the nexthop group changes.
      
      Note that IPv6 (unlike IPv4) does not share the same nexthop group
      between different routes, whereas mlxsw does. Therefore, whenever the
      offload indication within an IPv6 nexthop group changes, all the linked
      routes need to be updated.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add "offload" and "trap" indications to routes · bb3c4ab9
      Ido Schimmel authored
      In a similar fashion to the previous patch, add "offload" and "trap"
      indications to IPv6 routes.
      
      This is done by using two unused bits in 'struct fib6_info' to hold
      these indications. Capable drivers are expected to set these when
      processing the various in-kernel route notifications.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Add "offload" and "trap" indications to routes · 90b93f1b
      Ido Schimmel authored
      When performing L3 offload, routes and nexthops are usually programmed
      into two different tables in the underlying device. Therefore, the fact
      that a nexthop resides in hardware does not necessarily mean that all
      the associated routes also reside in hardware and vice-versa.
      
      While the kernel can signal to user space the presence of a nexthop in
      hardware (via 'RTNH_F_OFFLOAD'), it does not have a corresponding flag
      for routes. In addition, the fact that a route resides in hardware does
      not necessarily mean that the traffic is offloaded. For example,
      unreachable routes (i.e., 'RTN_UNREACHABLE') are programmed to trap
      packets to the CPU so that the kernel will be able to generate the
      appropriate ICMP error packet.
      
      This patch adds "offload" and "trap" indications to IPv4 routes, so
      that users will have better visibility into the offload process.
      
      'struct fib_alias' is extended with two new fields that indicate if the
      route resides in hardware or not and if it is offloading traffic from
      the kernel or trapping packets to it. Note that the new fields are added
      in the existing 6-byte hole (leaving a 5-byte hole) and therefore the
      struct still fits in a single cache line [1].
      
      Capable drivers are expected to invoke fib_alias_hw_flags_set() with the
      route's key in order to set the flags.
      
      The indications are dumped to user space via new flags (i.e.,
      'RTM_F_OFFLOAD' and 'RTM_F_TRAP') in the 'rtm_flags' field in the
      ancillary header.
      
      v2:
      * Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set()
      
      [1]
      struct fib_alias {
              struct hlist_node  fa_list;                      /*     0    16 */
              struct fib_info *          fa_info;              /*    16     8 */
              u8                         fa_tos;               /*    24     1 */
              u8                         fa_type;              /*    25     1 */
              u8                         fa_state;             /*    26     1 */
              u8                         fa_slen;              /*    27     1 */
              u32                        tb_id;                /*    28     4 */
              s16                        fa_default;           /*    32     2 */
              u8                         offload:1;            /*    34: 0  1 */
              u8                         trap:1;               /*    34: 1  1 */
              u8                         unused:6;             /*    34: 2  1 */
      
              /* XXX 5 bytes hole, try to pack */
      
              struct callback_head rcu __attribute__((__aligned__(8))); /*    40    16 */
      
              /* size: 56, cachelines: 1, members: 12 */
              /* sum members: 50, holes: 1, sum holes: 5 */
              /* sum bitfield members: 8 bits (1 bytes) */
              /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */
              /* last cacheline: 56 bytes */
      } __attribute__((__aligned__(8)));
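
The offsets in the pahole output above can be reproduced with a small layout calculation. The member sizes and alignments below are taken from that output and assume an LP64 target such as x86-64; the helper name `layout` is hypothetical, and the three bit-fields are modelled as the single byte they share.

```python
# (name, size, alignment) in bytes, assuming an LP64 target such as x86-64.
FIB_ALIAS_MEMBERS = [
    ("fa_list", 16, 8),             # struct hlist_node: two pointers
    ("fa_info", 8, 8),
    ("fa_tos", 1, 1),
    ("fa_type", 1, 1),
    ("fa_state", 1, 1),
    ("fa_slen", 1, 1),
    ("tb_id", 4, 4),
    ("fa_default", 2, 2),
    ("offload/trap/unused", 1, 1),  # three bit-fields sharing one u8
    ("rcu", 16, 8),                 # struct callback_head, aligned to 8
]


def layout(members, struct_align=8):
    """Return ({name: offset}, total_size, total_padding)."""
    offsets, offset, padding = {}, 0, 0
    for name, size, align in members:
        aligned = -(-offset // align) * align  # round up to the alignment
        padding += aligned - offset
        offsets[name] = aligned
        offset = aligned + size
    total = -(-offset // struct_align) * struct_align
    padding += total - offset
    return offsets, total, padding


offsets, size, padding = layout(FIB_ALIAS_MEMBERS)
print(offsets["rcu"], size, padding)  # 40 56 5
```

Dropping the bit-field entry from the list gives the layout before the patch: the size stays at 56 bytes and the hole grows to 6 bytes, which is why the new fields cost no extra space.
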
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Encapsulate function arguments in a struct · 1e301fd0
      Ido Schimmel authored
      fib_dump_info() is used to prepare RTM_{NEW,DEL}ROUTE netlink messages
      using the passed arguments. Currently, the function takes 11 arguments,
      6 of which are attributes of the route being dumped (e.g., prefix, TOS).
      
      The next patch will need the function to also dump to user space an
      indication if the route is present in hardware or not. Instead of
      passing yet another argument, change the function to take a struct
      containing the different route attributes.
      
      v2:
      * Name last argument of fib_dump_info()
      * Move 'struct fib_rt_info' to include/net/ip_fib.h so that it could
        later be passed to fib_alias_hw_flags_set()
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1e301fd0
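The refactoring pattern described in the commit above (passing one struct instead of a long argument list) can be sketched as follows. This is only an illustration: `struct route_info` and `dump_route()` are hypothetical stand-ins, not the kernel's 'struct fib_rt_info' or fib_dump_info().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bundle of per-route attributes. New attributes (such as
 * the two one-bit flags) can be added without changing the signature of
 * any function that takes the struct. */
struct route_info {
	uint32_t table_id;
	uint32_t dst;        /* prefix in host byte order */
	uint8_t dst_len;     /* prefix length */
	uint8_t tos;
	uint8_t type;
	unsigned int offload:1;
	unsigned int trap:1;
};

/* Format a route into 'buf'; returns what snprintf() returns. */
static int dump_route(char *buf, size_t len, const struct route_info *ri)
{
	return snprintf(buf, len, "%u.%u.%u.%u/%u table %u%s%s",
			(unsigned int)((ri->dst >> 24) & 0xff),
			(unsigned int)((ri->dst >> 16) & 0xff),
			(unsigned int)((ri->dst >> 8) & 0xff),
			(unsigned int)(ri->dst & 0xff),
			(unsigned int)ri->dst_len,
			(unsigned int)ri->table_id,
			ri->offload ? " offload" : "",
			ri->trap ? " trap" : "");
}
```

Callers fill in only the fields they care about with designated initializers, so adding a field later does not require touching every call site.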
    • Ido Schimmel's avatar
      ipv4: Replace route in list before notifying · 6324d0fa
      Ido Schimmel authored
      Subsequent patches will add an offload / trap indication to routes which
      will signal if the route is present in hardware or not.
      
      After programming the route to the hardware, drivers will have to ask
      the IPv4 code to set the flags by passing the route's key.
      
      In the case of route replace, the new route is notified before it is
      actually inserted into the FIB alias list. This can prevent simple
      drivers (e.g., netdevsim) that program the route to the hardware in the
      same context it is notified in from being able to set the flag.
      
      Solve this by first inserting the new route into the list and rolling
      back the operation in case the route is vetoed.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6324d0fa
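The ordering described in the commit above (insert first, notify second, roll back on veto) can be sketched with a plain singly linked list. Everything here is hypothetical and greatly simplified: `struct route`, `route_replace()`, and the two example notifiers are illustrative names, not the kernel's FIB alias code.

```c
#include <assert.h>
#include <stddef.h>

struct route {
	int key;
	int offload;          /* set by the "driver" once programmed */
	struct route *next;
};

/* A notifier looks the route up by key, as a driver would. */
typedef int (*route_notifier)(struct route *head, int key);

static struct route *route_find(struct route *head, int key)
{
	for (; head; head = head->next)
		if (head->key == key)
			return head;
	return NULL;
}

/* Replace the route with the same key as 'new_rt'. The new route is
 * linked in before the notifier runs, so a notifier that resolves the
 * key in the same context finds the new route. On veto the old route
 * is restored. */
static int route_replace(struct route **head, struct route *new_rt,
			 route_notifier notify)
{
	struct route **slot = head;
	struct route *old;
	int err;

	while (*slot && (*slot)->key != new_rt->key)
		slot = &(*slot)->next;
	old = *slot;
	if (!old)
		return -1;

	new_rt->next = old->next;
	*slot = new_rt;               /* insert before notifying */

	err = notify(*head, new_rt->key);
	if (err) {
		*slot = old;          /* vetoed: roll back */
		return err;
	}
	return 0;
}

/* Example notifier: "programs" the route and sets its flag by key. */
static int driver_accept(struct route *head, int key)
{
	struct route *rt = route_find(head, key);

	if (!rt)
		return -1;
	rt->offload = 1;
	return 0;
}

/* Example notifier: always vetoes the replacement. */
static int driver_veto(struct route *head, int key)
{
	(void)head;
	(void)key;
	return -1;
}
```

Had the notifier run before the insertion, `driver_accept()` would have set the flag on the old route instead of the new one.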
    • Lorenzo Bianconi's avatar
      net: socionext: get rid of huge dma sync in netsec_alloc_rx_data · 0fadc0a2
      Lorenzo Bianconi authored
      The Socionext driver can run on DMA-coherent and non-coherent devices.
      Get rid of the huge dma_sync_single_for_device in netsec_alloc_rx_data,
      since the driver can now let the page_pool API manage the needed DMA sync.
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0fadc0a2
    • David S. Miller's avatar
      Merge branch 'QRTR-flow-control-improvements' · 0c73ffc7
      David S. Miller authored
      Bjorn Andersson says:
      
      ====================
      QRTR flow control improvements
      
      In order to prevent overconsumption of resources on the remote side QRTR
      implements a flow control mechanism.
      
      Move the handling of the incoming confirm_rx to the receiving process to
      ensure incoming flow is controlled. Then implement outgoing flow
      control, using the recommended algorithm of counting outstanding
      non-confirmed messages and blocking when hitting a limit. The last three
      patches refactor the node assignment and port lookup, in order to
      remove the worker in the receive path.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c73ffc7
    • Bjorn Andersson's avatar
      net: qrtr: Remove receive worker · e04df98a
      Bjorn Andersson authored
      Rather than enqueuing messages and scheduling a worker to deliver them
      to the individual sockets we can now, thanks to the previous work, move
      this directly into the endpoint callback.
      
      This saves us a context switch per incoming message and removes the
      possibility of an opportunistic suspend happening between the time the
      message arrives from the endpoint and the time it ends up in the
      socket's receive buffer.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e04df98a
    • Bjorn Andersson's avatar
      net: qrtr: Make qrtr_port_lookup() use RCU · f16a4b26
      Bjorn Andersson authored
      The important part of qrtr_port_lookup() wrt synchronization is that the
      function returns a reference-counted struct qrtr_sock, or fails.
      
      As such we need only ensure that no decrement of the object's
      refcount happens between the finding of the object in the idr and
      qrtr_port_lookup()'s own increment of the object's refcount.
      
      By using RCU and putting a synchronization point after we remove the
      mapping from the idr, but before it can be released we achieve this -
      with the benefit of not having to hold the mutex in qrtr_port_lookup().
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f16a4b26
    • Bjorn Andersson's avatar
      net: qrtr: Migrate node lookup tree to spinlock · 0a7e0d0e
      Bjorn Andersson authored
      Move operations on the qrtr_nodes radix tree under a separate spinlock
      and make the qrtr_nodes tree GFP_ATOMIC, to allow operation from atomic
      context in a subsequent patch.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a7e0d0e
    • Bjorn Andersson's avatar
      net: qrtr: Implement outgoing flow control · 5fdeb0d3
      Bjorn Andersson authored
      In order to prevent overconsumption of resources on the remote side QRTR
      implements a flow control mechanism.
      
      The mechanism works by the sender keeping track of the number of
      outstanding unconfirmed messages that have been transmitted to a
      particular node/port pair.
      
      When the count reaches a low watermark (L), the confirm_rx bit is set in
      the outgoing message, and when the count reaches a high watermark (H),
      transmission is blocked until a resume_tx message is received from the
      remote, which resets the counter to 0.
      
      This guarantees that there will be at most 2H - L messages in flight.
      Values chosen for L and H are 5 and 10 respectively.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5fdeb0d3
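The sender-side counter described in the commit above can be sketched as below, assuming L = 5 and H = 10 as stated. This is a simplified, non-blocking model: the names `tx_flow`, `tx_flow_send()`, and `tx_flow_resume()` are hypothetical, and where the real sender would sleep the sketch returns TX_WOULD_BLOCK instead.

```c
#include <assert.h>

#define TX_FLOW_LOW  5   /* L: request a confirmation at this count */
#define TX_FLOW_HIGH 10  /* H: stop transmitting at this count */

/* Per node/port pair: number of outstanding unconfirmed messages. */
struct tx_flow {
	int pending;
};

enum tx_result {
	TX_SENT,             /* message may be sent as-is */
	TX_SENT_CONFIRM_RX,  /* message may be sent with confirm_rx set */
	TX_WOULD_BLOCK       /* high watermark hit; wait for resume_tx */
};

/* Account for one outgoing message and decide how to send it. */
static enum tx_result tx_flow_send(struct tx_flow *flow)
{
	if (flow->pending >= TX_FLOW_HIGH)
		return TX_WOULD_BLOCK;

	flow->pending++;
	return flow->pending == TX_FLOW_LOW ? TX_SENT_CONFIRM_RX : TX_SENT;
}

/* Called when a resume_tx message arrives from the remote. */
static void tx_flow_resume(struct tx_flow *flow)
{
	flow->pending = 0;
}
```

In this model the fifth message carries the confirmation request, and an eleventh message cannot be sent until `tx_flow_resume()` has been called.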
    • Bjorn Andersson's avatar
      net: qrtr: Move resume-tx transmission to recvmsg · cb6530b9
      Bjorn Andersson authored
      The confirm-rx bit is used to implement per-port flow control, in
      order to make sure that no messages are dropped due to resource
      exhaustion. Move the resume-tx transmission to recvmsg to only confirm
      messages as they are consumed by the application.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb6530b9