1. 08 Jan, 2019 13 commits
    • Ido Schimmel's avatar
      net: bridge: Fix VLANs memory leak · 27973793
      Ido Schimmel authored
      When adding / deleting VLANs to / from a bridge port, the bridge driver
      first tries to propagate the information via switchdev and falls back to
      the 8021q driver in case the underlying driver does not support
      switchdev. This can result in a memory leak [1] when VXLAN and mlxsw
      ports are enslaved to the bridge:
      
      $ ip link set dev vxlan0 master br0
      # No mlxsw ports are enslaved to 'br0', so mlxsw ignores the switchdev
      # notification and the bridge driver adds the VLAN on 'vxlan0' via the
      # 8021q driver
      $ bridge vlan add vid 10 dev vxlan0 pvid untagged
      # mlxsw port is enslaved to the bridge
      $ ip link set dev swp1 master br0
      # mlxsw processes the switchdev notification and the 8021q driver is
      # skipped
      $ bridge vlan del vid 10 dev vxlan0
      
      This results in 'struct vlan_info' and 'struct vlan_vid_info' being
      leaked, as they were allocated by the 8021q driver during VLAN addition,
      but never freed as the 8021q driver was skipped during deletion.
      
      Fix this by introducing a new VLAN private flag that indicates whether
      the VLAN was added on the port by switchdev or the 8021q driver. If the
      VLAN was added by the 8021q driver, then we make sure to delete it via
      the 8021q driver as well.
      
      [1]
      unreferenced object 0xffff88822d20b1e8 (size 256):
        comm "bridge", pid 2532, jiffies 4295216998 (age 1188.830s)
        hex dump (first 32 bytes):
          e0 42 97 ce 81 88 ff ff 00 00 00 00 00 00 00 00  .B..............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000f82d851d>] kmem_cache_alloc_trace+0x1be/0x330
          [<00000000e0178b02>] vlan_vid_add+0x661/0x920
          [<00000000218ebd5f>] __vlan_add+0x1be9/0x3a00
          [<000000006eafa1ca>] nbp_vlan_add+0x8b3/0xd90
          [<000000003535392c>] br_vlan_info+0x132/0x410
          [<00000000aedaa9dc>] br_afspec+0x75c/0x870
          [<00000000f5716133>] br_setlink+0x3dc/0x6d0
          [<00000000aceca5e2>] rtnl_bridge_setlink+0x615/0xb30
          [<00000000a2f2d23e>] rtnetlink_rcv_msg+0x3a3/0xa80
          [<0000000064097e69>] netlink_rcv_skb+0x152/0x3c0
          [<000000008be8d614>] rtnetlink_rcv+0x21/0x30
          [<000000009ab2ca25>] netlink_unicast+0x52f/0x740
          [<00000000e7d9ac96>] netlink_sendmsg+0x9c7/0xf50
          [<000000005d1e2050>] sock_sendmsg+0xbe/0x120
          [<00000000d51426bc>] ___sys_sendmsg+0x778/0x8f0
          [<00000000b9d7b2cc>] __sys_sendmsg+0x112/0x270
      unreferenced object 0xffff888227454308 (size 32):
        comm "bridge", pid 2532, jiffies 4295216998 (age 1188.882s)
        hex dump (first 32 bytes):
          88 b2 20 2d 82 88 ff ff 88 b2 20 2d 82 88 ff ff  .. -...... -....
          81 00 0a 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000f82d851d>] kmem_cache_alloc_trace+0x1be/0x330
          [<0000000018050631>] vlan_vid_add+0x3e6/0x920
          [<00000000218ebd5f>] __vlan_add+0x1be9/0x3a00
          [<000000006eafa1ca>] nbp_vlan_add+0x8b3/0xd90
          [<000000003535392c>] br_vlan_info+0x132/0x410
          [<00000000aedaa9dc>] br_afspec+0x75c/0x870
          [<00000000f5716133>] br_setlink+0x3dc/0x6d0
          [<00000000aceca5e2>] rtnl_bridge_setlink+0x615/0xb30
          [<00000000a2f2d23e>] rtnetlink_rcv_msg+0x3a3/0xa80
          [<0000000064097e69>] netlink_rcv_skb+0x152/0x3c0
          [<000000008be8d614>] rtnetlink_rcv+0x21/0x30
          [<000000009ab2ca25>] netlink_unicast+0x52f/0x740
          [<00000000e7d9ac96>] netlink_sendmsg+0x9c7/0xf50
          [<000000005d1e2050>] sock_sendmsg+0xbe/0x120
          [<00000000d51426bc>] ___sys_sendmsg+0x778/0x8f0
          [<00000000b9d7b2cc>] __sys_sendmsg+0x112/0x270
      
      Fixes: d70e42b2 ("mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: default avatarPetr Machata <petrm@mellanox.com>
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Cc: bridge@lists.linux-foundation.org
      Acked-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      27973793
    • Ido Schimmel's avatar
      selftests: mlxsw: Add a test case for VLAN addition error flow · 16dc42e4
      Ido Schimmel authored
      Add a test case for the issue fixed by previous commit. In case the
      offloading of an unsupported VxLAN tunnel was triggered by adding the
      mapped VLAN to a local port, then error should be returned to the user.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: default avatarPetr Machata <petrm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      16dc42e4
    • Ido Schimmel's avatar
      mlxsw: spectrum_nve: Replace error code with EINVAL · 412283ee
      Ido Schimmel authored
      Adding a VLAN on a port can trigger the offload of a VXLAN tunnel which
      is already a member in the VLAN. In case the configuration of the VXLAN
      is not supported, the driver would return -EOPNOTSUPP.
      
      This is problematic since bridge code does not interpret this as error,
      but rather that it should try to setup the VLAN using the 8021q driver
      instead of switchdev.
      
      Fixes: d70e42b2 ("mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: default avatarPetr Machata <petrm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      412283ee
    • Ido Schimmel's avatar
      mlxsw: spectrum_switchdev: Avoid returning errors in commit phase · 457e20d6
      Ido Schimmel authored
      Drivers are not supposed to return errors in switchdev commit phase if
      they returned OK in prepare phase. Otherwise, a WARNING is emitted.
      However, when the offloading of a VXLAN tunnel is triggered by the
      addition of a VLAN on a local port, it is not possible to guarantee that
      the commit phase will succeed without doing a lot of work.
      
      In these cases, the artificial division between prepare and commit phase
      does not make sense, so simply do the work in the prepare phase.
      
      Fixes: d70e42b2 ("mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: default avatarPetr Machata <petrm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      457e20d6
    • Ido Schimmel's avatar
      mlxsw: spectrum: Add VXLAN dependency for spectrum · 143a8e03
      Ido Schimmel authored
      When VXLAN is a loadable module, MLXSW_SPECTRUM must not be built-in:
      
      drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:2547: undefined
      reference to `vxlan_fdb_find_uc'
      
      Add Kconfig dependency to enforce usable configurations.
      
      Fixes: 1231e04f ("mlxsw: spectrum_switchdev: Add support for VxLAN encapsulation")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Reviewed-by: default avatarPetr Machata <petrm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      143a8e03
    • Jiri Pirko's avatar
      mlxsw: spectrum: Disable lag port TX before removing it · 8adbe212
      Jiri Pirko authored
      Make sure that lag port TX is disabled before mlxsw_sp_port_lag_leave()
      is called and prevent from possible EMAD error.
      
      Fixes: 0d65fc13 ("mlxsw: spectrum: Implement LAG port join/leave")
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8adbe212
    • Nir Dotan's avatar
      mlxsw: spectrum_acl: Remove ASSERT_RTNL()s in module removal flow · 04d075b7
      Nir Dotan authored
      Removal of the mlxsw driver on Spectrum-2 platforms hits an ASSERT_RTNL()
      in Spectrum-2 ACL Bloom filter and in ERP removal paths. This happens
      because the multicast router implementation in Spectrum-2 relies on ACLs.
      Taking the RTNL lock upon driver removal is useless since the driver first
      removes its ports and unregisters from notifiers so concurrent writes
      cannot happen at that time. The assertions were originally put as a
      reminder for future work involving ERP background optimization, but having
      these assertions only during addition serves this purpose as well.
      
      Therefore remove the ASSERT_RTNL() in both places related to ERP and Bloom
      filter removal.
      
      Fixes: cf7221a4 ("mlxsw: spectrum_router: Add Multicast routing support for Spectrum-2")
      Signed-off-by: default avatarNir Dotan <nird@mellanox.com>
      Reviewed-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      04d075b7
    • Nir Dotan's avatar
      mlxsw: spectrum_acl: Add cleanup after C-TCAM update error condition · ff0db43c
      Nir Dotan authored
      When writing to C-TCAM, mlxsw driver uses cregion->ops->entry_insert().
      In case of C-TCAM HW insertion error, the opposite action should take
      place.
      Add error handling case in which the C-TCAM region entry is removed, by
      calling cregion->ops->entry_remove().
      
      Fixes: a0a777b9 ("mlxsw: spectrum_acl: Start using A-TCAM")
      Signed-off-by: default avatarNir Dotan <nird@mellanox.com>
      Reviewed-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ff0db43c
    • Heiner Kallweit's avatar
      r8169: load Realtek PHY driver module before r8169 · 11287b69
      Heiner Kallweit authored
      This soft dependency works around an issue where sometimes the genphy
      driver is used instead of the dedicated PHY driver. The root cause of
      the issue isn't clear yet. People reported the unloading/re-loading
      module r8169 helps, and also configuring this soft dependency in
      the modprobe config files. Important just seems to be that the
      realtek module is loaded before r8169.
      
      Once this has been applied preliminary fix 38af4b90 ("net: phy:
      add workaround for issue where PHY driver doesn't bind to the device")
      will be removed.
      
      Fixes: f1e911d5 ("r8169: add basic phylib support")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      11287b69
    • Bryan Whitehead's avatar
      lan743x: Remove phy_read from link status change function · a0071840
      Bryan Whitehead authored
      It has been noticed that some phys do not have the registers
      required by the previous implementation.
      
      To fix this, instead of using phy_read, the required information
      is extracted from the phy_device structure.
      
      fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Signed-off-by: default avatarBryan Whitehead <Bryan.Whitehead@microchip.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a0071840
    • Eugene Syromiatnikov's avatar
      ptp: uapi: change _IOW to IOWR in PTP_SYS_OFFSET_EXTENDED definition · b7ea4894
      Eugene Syromiatnikov authored
      The ioctl command is read/write (or just read, if the fact that user space
      writes n_samples field is ignored).
      Signed-off-by: default avatarEugene Syromiatnikov <esyr@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b7ea4894
    • Eugene Syromiatnikov's avatar
      ptp: check that rsv field is zero in struct ptp_sys_offset_extended · 895ac137
      Eugene Syromiatnikov authored
      Otherwise it is impossible to use it for something else, as it will break
      userspace that puts garbage there.
      
      The same check should be done in other structures, but the fact that
      data in reserved fields is ignored is already part of the kernel ABI.
      Signed-off-by: default avatarEugene Syromiatnikov <esyr@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      895ac137
    • David S. Miller's avatar
      Merge ra.kernel.org:/pub/scm/linux/kernel/git/bpf/bpf · 977e4899
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2019-01-08
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix BSD'ism in sendmsg(2) to rewrite unspecified IPv6 dst for
         unconnected UDP sockets with [::1] _after_ cgroup BPF invocation,
         from Andrey.
      
      2) Follow-up fix to the speculation fix where we need to reject a
         corner case for sanitation when ptr and scalars are mixed in the
         same alu op. Also, some unrelated minor doc fixes, from Daniel.
      
      3) Fix BPF kselftest's incorrect uses of create_and_get_cgroup()
         by not assuming fd of zero value to be the result of an error
         case, from Stanislav.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      977e4899
  2. 07 Jan, 2019 14 commits
  3. 06 Jan, 2019 3 commits
  4. 05 Jan, 2019 5 commits
    • David Ahern's avatar
      ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses · d4a7e9bb
      David Ahern authored
      I realized the last patch calls dev_get_by_index_rcu in a branch not
      holding the rcu lock. Add the calls to rcu_read_lock and rcu_read_unlock.
      
      Fixes: ec90ad33 ("ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d4a7e9bb
    • Alexei Starovoitov's avatar
      Merge branch 'udpv6_sendmsg-addr_any-fix' · 466f89e9
      Alexei Starovoitov authored
      Andrey Ignatov says:
      
      ====================
      The patch set fixes BSD'ism in sys_sendmsg to rewrite unspecified
      destination IPv6 for unconnected UDP sockets in sys_sendmsg with [::1] in
      case when either CONFIG_CGROUP_BPF is enabled or when sys_sendmsg BPF hook
      sets destination IPv6 to [::].
      
      Patch 1 is the fix and provides more details.
      Patch 2 adds two test cases to verify the fix.
      
      v1->v2:
      * Fix compile error in patch 1.
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      466f89e9
    • Andrey Ignatov's avatar
      selftests/bpf: Test [::] -> [::1] rewrite in sys_sendmsg in test_sock_addr · 976b4f3a
      Andrey Ignatov authored
      Test that sys_sendmsg BPF hook doesn't break sys_sendmsg behaviour to
      rewrite destination IPv6 = [::] with [::1] (BSD'ism).
      
      Two test cases are added:
      
      1) User passes dst IPv6 = [::] and BPF_CGROUP_UDP6_SENDMSG program
         doesn't touch it.
      
      2) User passes dst IPv6 != [::], but BPF_CGROUP_UDP6_SENDMSG program
         rewrites it with [::].
      
      In both cases [::1] is used by sys_sendmsg code eventually and datagram
      is sent successfully for unconnected UDP socket.
      
      Example of relevant output:
        Test case: sendmsg6: set dst IP = [::] (BSD'ism) .. [PASS]
        Test case: sendmsg6: preserve dst IP = [::] (BSD'ism) .. [PASS]
      Signed-off-by: default avatarAndrey Ignatov <rdna@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      976b4f3a
    • Andrey Ignatov's avatar
      bpf: Fix [::] -> [::1] rewrite in sys_sendmsg · e8e36984
      Andrey Ignatov authored
      sys_sendmsg has supported unspecified destination IPv6 (wildcard) for
      unconnected UDP sockets since 876c7f41. When [::] is passed by user as
      destination, sys_sendmsg rewrites it with [::1] to be consistent with
      BSD (see "BSD'ism" comment in the code).
      
      This didn't work when cgroup-bpf was enabled though since the rewrite
      [::] -> [::1] happened before passing control to cgroup-bpf block where
      fl6.daddr was updated with passed by user sockaddr_in6.sin6_addr (that
      might or might not be changed by BPF program). That way if user passed
      [::] as dst IPv6 it was first rewritten with [::1] by original code from
      876c7f41, but then rewritten back with [::] by cgroup-bpf block.
      
      It happened even when BPF_CGROUP_UDP6_SENDMSG program was not present
      (CONFIG_CGROUP_BPF=y was enough).
      
      The fix is to apply BSD'ism after cgroup-bpf block so that [::] is
      replaced with [::1] no matter where it came from: passed by user to
      sys_sendmsg or set by BPF_CGROUP_UDP6_SENDMSG program.
      
      Fixes: 1cedee13 ("bpf: Hooks for sys_sendmsg")
      Reported-by: default avatarNitin Rawat <nitin.rawat@intel.com>
      Signed-off-by: default avatarAndrey Ignatov <rdna@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      e8e36984
    • David Ahern's avatar
      ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address · ec90ad33
      David Ahern authored
      Similar to c5ee0663 ("ipv6: Consider sk_bound_dev_if when binding a
      socket to an address"), binding a socket to v4 mapped addresses needs to
      consider if the socket is bound to a device.
      
      This problem also exists from the beginning of git history.
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ec90ad33
  5. 04 Jan, 2019 5 commits
    • Jeff Kirsher's avatar
      ixgbe: fix Kconfig when driver is not a module · ae84e4a8
      Jeff Kirsher authored
      The new ability added to the driver to use mii_bus to handle MII related
      ioctls is causing compile issues when the driver is compiled into the
      kernel (i.e. not a module).
      
      The problem was in selecting MDIO_DEVICE instead of the preferred PHYLIB
      Kconfig option.  The reason being that MDIO_DEVICE had a dependency on
      PHYLIB and would be compiled as a module when PHYLIB was a module, no
      matter whether ixgbe was compiled into the kernel.
      
      CC: Dave Jones <davej@codemonkey.org.uk>
      CC: Steve Douthit <stephend@silicom-usa.com>
      CC: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Reviewed-by: default avatarStephen Douthit <stephend@silicom-usa.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ae84e4a8
    • Eric Dumazet's avatar
      ipv6: make icmp6_send() robust against null skb->dev · 8d933670
      Eric Dumazet authored
      syzbot was able to crash one host with the following stack trace :
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 8625 Comm: syz-executor4 Not tainted 4.20.0+ #8
      RIP: 0010:dev_net include/linux/netdevice.h:2169 [inline]
      RIP: 0010:icmp6_send+0x116/0x2d30 net/ipv6/icmp.c:426
       icmpv6_send
       smack_socket_sock_rcv_skb
       security_sock_rcv_skb
       sk_filter_trim_cap
       __sk_receive_skb
       dccp_v6_do_rcv
       release_sock
      
      This is because a RX packet found socket owned by user and
      was stored into socket backlog. Before leaving RCU protected section,
      skb->dev was cleared in __sk_receive_skb(). When socket backlog
      was finally handled at release_sock() time, skb was fed to
      smack_socket_sock_rcv_skb() then icmp6_send()
      
      We could fix the bug in smack_socket_sock_rcv_skb(), or simply
      make icmp6_send() more robust against such possibility.
      
      In the future we might provide to icmp6_send() the net pointer
      instead of infering it.
      
      Fixes: d66a8acb ("Smack: Inform peer that IPv6 traffic has been blocked")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Piotr Sawicki <p.sawicki2@partner.samsung.com>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Acked-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8d933670
    • Peter Oskolkov's avatar
      selftests: net: fix/improve ip_defrag selftest · 3271a482
      Peter Oskolkov authored
      Commit ade44640 ("net: ipv4: do not handle duplicate fragments as
      overlapping") changed IPv4 defragmentation so that duplicate fragments,
      as well as _some_ fragments completely covered by previously delivered
      fragments, do not lead to the whole frag queue being discarded. This
      makes the existing ip_defrag selftest flaky.
      
      This patch
      * makes sure that negative IPv4 defrag tests generate truly overlapping
        fragments that trigger defrag queue drops;
      * tests that duplicate IPv4 fragments do not trigger defrag queue drops;
      * makes a couple of minor tweaks to the test aimed at increasing its code
        coverage and reduce flakiness.
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3271a482
    • Daniele Palmas's avatar
      qmi_wwan: add MTU default to qmap network interface · f87118d5
      Daniele Palmas authored
      This patch adds MTU default value to qmap network interface in
      order to avoid "RTNETLINK answers: No buffer space available"
      error when setting an ipv6 address.
      Signed-off-by: default avatarDaniele Palmas <dnlplm@gmail.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f87118d5
    • David S. Miller's avatar
      Merge branch 'hns-fixes' · 75e7fb0a
      David S. Miller authored
      Huazhong Tan says:
      
      ====================
      net: hns: Bugfixes for HNS driver
      
      This patchset includes bugfixes for the HNS ethernet controller driver.
      
      Every patch is independent.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      75e7fb0a