1. 28 Nov, 2017 20 commits
    • net/packet: fix a race in packet_bind() and packet_notifier() · 15fe076e
      Eric Dumazet authored
      syzbot reported crashes [1] and provided a C repro easing bug hunting.
      
      When/if packet_do_bind() calls __unregister_prot_hook() and releases
      po->bind_lock, another thread can run packet_notifier() and process an
      NETDEV_UP event.
      
      This calls register_prot_hook() and hooks the socket again right before
      the first thread is able to grab po->bind_lock again.
      
      Fix this issue by temporarily setting po->num to 0, as suggested by
      David Miller.
      
      [1]
      dev_remove_pack: ffff8801bf16fa80 not found
      ------------[ cut here ]------------
      kernel BUG at net/core/dev.c:7945!  ( BUG_ON(!list_empty(&dev->ptype_all)); )
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      device syz0 entered promiscuous mode
      CPU: 0 PID: 3161 Comm: syzkaller404108 Not tainted 4.14.0+ #190
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      task: ffff8801cc57a500 task.stack: ffff8801cc588000
      RIP: 0010:netdev_run_todo+0x772/0xae0 net/core/dev.c:7945
      RSP: 0018:ffff8801cc58f598 EFLAGS: 00010293
      RAX: ffff8801cc57a500 RBX: dffffc0000000000 RCX: ffffffff841f75b2
      RDX: 0000000000000000 RSI: 1ffff100398b1ede RDI: ffff8801bf1f8810
      device syz0 entered promiscuous mode
      RBP: ffff8801cc58f898 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801bf1f8cd8
      R13: ffff8801cc58f870 R14: ffff8801bf1f8780 R15: ffff8801cc58f7f0
      FS:  0000000001716880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020b13000 CR3: 0000000005e25000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:106
       tun_detach drivers/net/tun.c:670 [inline]
       tun_chr_close+0x49/0x60 drivers/net/tun.c:2845
       __fput+0x333/0x7f0 fs/file_table.c:210
       ____fput+0x15/0x20 fs/file_table.c:244
       task_work_run+0x199/0x270 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x9bb/0x1ae0 kernel/exit.c:865
       do_group_exit+0x149/0x400 kernel/exit.c:968
       SYSC_exit_group kernel/exit.c:979 [inline]
       SyS_exit_group+0x1d/0x20 kernel/exit.c:977
       entry_SYSCALL_64_fastpath+0x1f/0x96
      RIP: 0033:0x44ad19
      
      Fixes: 30f7ea1c ("packet: race condition in packet_bind")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Francesco Ruggeri <fruggeri@aristanetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
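      The interleaving described in this commit can be modelled in user space.
      The sketch below is a single-threaded simulation, not the kernel code:
      struct demo_sock and every demo_* function are hypothetical names, and
      the notifier call stands in for "another thread ran while po->bind_lock
      was released". It only illustrates why clearing num first keeps the
      notifier from hooking the socket again.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the packet socket state. */
struct demo_sock {
    unsigned short num; /* bound protocol; 0 means "nothing to hook" */
    bool running;       /* true while the protocol hook is registered */
};

static void demo_register_hook(struct demo_sock *po)   { po->running = true; }
static void demo_unregister_hook(struct demo_sock *po) { po->running = false; }

/* Models the NETDEV_UP handling: re-hook any socket that has a protocol. */
static void demo_notifier_netdev_up(struct demo_sock *po)
{
    if (po->num)
        demo_register_hook(po);
}

/*
 * Models a rebind to a new protocol. Returns true if the socket was found
 * hooked again at the point where the lock would be retaken.
 */
static bool demo_rebind(struct demo_sock *po, unsigned short proto, bool apply_fix)
{
    bool rehooked_behind_our_back;

    if (apply_fix)
        po->num = 0;              /* the notifier will now skip this socket */
    demo_unregister_hook(po);
    demo_notifier_netdev_up(po);  /* window in which the lock is not held */
    rehooked_behind_our_back = po->running;

    demo_unregister_hook(po);     /* keep the model consistent either way */
    po->num = proto;
    demo_register_hook(po);
    return rehooked_behind_our_back;
}
```

      With apply_fix set to false the model reports the unexpected re-hook;
      with it set to true the notifier finds num == 0 and leaves the socket
      alone.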
    • packet: fix crash in fanout_demux_rollover() · 57f015f5
      Mike Maloney authored
      syzkaller found a race condition in fanout_demux_rollover() while
      removing a packet socket from a fanout group.
      
      po->rollover is read and operated on during packet_rcv_fanout(), via
      fanout_demux_rollover(), but the pointer is currently cleared before the
      synchronization in packet_release().   It is safer to delay the cleanup
      until after synchronize_net() has been called, ensuring all calls to
      packet_rcv_fanout() for this socket have finished.
      
      To further simplify synchronization around the rollover structure, set
      po->rollover in fanout_add() only if there are no errors.  This removes
      the need for rcu in the struct and in the call to
      packet_getsockopt(..., PACKET_ROLLOVER_STATS, ...).
      
      Crashing stack trace:
       fanout_demux_rollover+0xb6/0x4d0 net/packet/af_packet.c:1392
       packet_rcv_fanout+0x649/0x7c8 net/packet/af_packet.c:1487
       dev_queue_xmit_nit+0x835/0xc10 net/core/dev.c:1953
       xmit_one net/core/dev.c:2975 [inline]
       dev_hard_start_xmit+0x16b/0xac0 net/core/dev.c:2995
       __dev_queue_xmit+0x17a4/0x2050 net/core/dev.c:3476
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3509
       neigh_connected_output+0x489/0x720 net/core/neighbour.c:1379
       neigh_output include/net/neighbour.h:482 [inline]
       ip6_finish_output2+0xad1/0x22a0 net/ipv6/ip6_output.c:120
       ip6_finish_output+0x2f9/0x920 net/ipv6/ip6_output.c:146
       NF_HOOK_COND include/linux/netfilter.h:239 [inline]
       ip6_output+0x1f4/0x850 net/ipv6/ip6_output.c:163
       dst_output include/net/dst.h:459 [inline]
       NF_HOOK.constprop.35+0xff/0x630 include/linux/netfilter.h:250
       mld_sendpack+0x6a8/0xcc0 net/ipv6/mcast.c:1660
       mld_send_initial_cr.part.24+0x103/0x150 net/ipv6/mcast.c:2072
       mld_send_initial_cr net/ipv6/mcast.c:2056 [inline]
       ipv6_mc_dad_complete+0x99/0x130 net/ipv6/mcast.c:2079
       addrconf_dad_completed+0x595/0x970 net/ipv6/addrconf.c:4039
       addrconf_dad_work+0xac9/0x1160 net/ipv6/addrconf.c:3971
       process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2113
       worker_thread+0x223/0x1990 kernel/workqueue.c:2247
       kthread+0x35e/0x430 kernel/kthread.c:231
       ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:432
      
      Fixes: 0648ab70 ("packet: rollover prepare: per-socket state")
      Fixes: 509c7a1e ("packet: avoid panic in packet_getsockopt()")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Mike Maloney <maloney@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sctp-fix-sparse-errors' · a51a40b7
      David S. Miller authored
      Xin Long says:
      
      ====================
      sctp: fix some other sparse errors
      
      After the last fixes for sparse errors, three sparse errors remain in
      the sctp code: two of them are type casts, and the other one comes from
      using extern.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: remove extern from stream sched · 1ba896f6
      Xin Long authored
      Each stream sched ops is now defined in its own .c file and added to
      the global ops in another .c file, which uses extern to make this
      work.
      
      However, using extern to get them in is not good coding style, and
      make C=2 even reports errors for it.
      
      This patch adds sctp_sched_ops_xxx_init for each stream sched ops in
      its own .c file, then gets them into the global ops by calling these
      functions when the sctp module is initialized.
      
      Fixes: 637784ad ("sctp: introduce priority based stream scheduler")
      Fixes: ac1ed8b8 ("sctp: introduce round robin stream scheduler")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
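      The registration pattern described in this commit can be sketched as
      follows. This is an illustration only: all demo_* names, the enum and
      the ops struct are hypothetical, and everything is shown in one file
      for brevity, whereas the commit talks about one .c file per scheduler.

```c
#include <stddef.h>

/* Hypothetical scheduler ids and ops type. */
enum demo_sched_type { DEMO_SCHED_FCFS, DEMO_SCHED_PRIO, DEMO_SCHED_RR, DEMO_SCHED_MAX };

struct demo_sched_ops {
    const char *name;
};

/* The global table stays private to the "core" part. */
static struct demo_sched_ops *demo_sched_table[DEMO_SCHED_MAX];

static void demo_sched_ops_register(enum demo_sched_type type,
                                    struct demo_sched_ops *ops)
{
    demo_sched_table[type] = ops;
}

/* Would live in the priority scheduler's own .c file; the ops stay static. */
static struct demo_sched_ops demo_sched_prio = { .name = "prio" };
static void demo_sched_ops_prio_init(void)
{
    demo_sched_ops_register(DEMO_SCHED_PRIO, &demo_sched_prio);
}

/* Would live in the round-robin scheduler's own .c file. */
static struct demo_sched_ops demo_sched_rr = { .name = "rr" };
static void demo_sched_ops_rr_init(void)
{
    demo_sched_ops_register(DEMO_SCHED_RR, &demo_sched_rr);
}

/* Called once when the module is initialized. */
static void demo_sched_ops_init(void)
{
    demo_sched_ops_prio_init();
    demo_sched_ops_rr_init();
}

/* Lookup helper: returns NULL for a scheduler that never registered. */
static const char *demo_sched_name(enum demo_sched_type type)
{
    return demo_sched_table[type] ? demo_sched_table[type]->name : NULL;
}
```

      Only the init functions need to be visible across files in this
      layout; the ops objects themselves no longer need extern declarations.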
    • sctp: force the params with right types for sctp csum apis · af2697a0
      Xin Long authored
      sctp_csum_xxx currently doesn't match the param types of the common
      csum apis. As sctp_csum_xxx is defined in sctp/checksum.h, many sparse
      errors occur with make C=2, not only with M=net/sctp but also with
      other modules that include this header file.
      
      This patch forces the params to the right types so that they fit the
      csum apis.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail · 08f46070
      Xin Long authored
      This patch forces SCTP_ERROR_INV_STRM to the right type so that it
      fits sctp_chunk_fail, avoiding the sparse error.
      
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lmc: Use memdup_user() as a cleanup · f95d5bf0
      Vasyl Gomonovych authored
      Fix a coccicheck warning that recommends using memdup_user():
      drivers/net/wan/lmc/lmc_main.c:497:27-34: WARNING opportunity for memdup_user
      Generated by: scripts/coccinelle/memdup_user/memdup_user.cocci
      
      Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
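      memdup_user() folds an allocation and a copy from user space into one
      call. The kernel helper cannot run outside the kernel, so the sketch
      below is only a user-space analogue of the "allocate, then copy"
      pattern being folded; demo_memdup is a hypothetical name, and malloc
      and memcpy stand in for the kernel allocation and user copy.

```c
#include <stdlib.h>
#include <string.h>

/*
 * User-space stand-in: allocate len bytes and copy src into them.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static void *demo_memdup(const void *src, size_t len)
{
    void *copy = malloc(len);

    if (copy)
        memcpy(copy, src, len);
    return copy;
}
```

      One difference worth keeping in mind: the kernel helper reports
      failure through an error pointer rather than NULL, so callers check it
      with IS_ERR().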
    • bnxt_en: Fix an error handling path in 'bnxt_get_module_eeprom()' · dea521a2
      Christophe JAILLET authored
      Error code returned by 'bnxt_read_sfp_module_eeprom_info()' is handled a
      few lines above when reading the A0 portion of the EEPROM.
      The same should be done when reading the A2 portion of the EEPROM.
      
      In order to correctly propagate an error, update 'rc' in this 2nd call as
      well, otherwise 0 (success) is returned.
      
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
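      The bug shape described in this commit, where the second call's
      return value is dropped so that 0 is returned, can be sketched with
      hypothetical helpers. demo_read_page and both demo_get_eeprom_*
      functions are illustrative names, and the error value -5 is arbitrary.

```c
/* Hypothetical reader: returns 0 on success or a negative error code. */
static int demo_read_page(int page, int fail_page)
{
    return page == fail_page ? -5 : 0;
}

/* Buggy shape: the second result is ignored, so success is reported. */
static int demo_get_eeprom_buggy(int fail_page)
{
    int rc = demo_read_page(0, fail_page);   /* first portion */

    if (rc)
        return rc;
    demo_read_page(2, fail_page);            /* second portion, result lost */
    return rc;
}

/* Fixed shape: rc is updated by the second call as well. */
static int demo_get_eeprom_fixed(int fail_page)
{
    int rc = demo_read_page(0, fail_page);

    if (rc)
        return rc;
    rc = demo_read_page(2, fail_page);
    return rc;
}
```

      When only the second read fails, the buggy variant returns 0 while the
      fixed variant propagates the error.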
    • net: phy: marvell10g: fix the PHY id mask · 952b6b3b
      Antoine Tenart authored
      The Marvell 10G PHY driver supports different hardware revisions, which
      have their bits 3..0 differing. To get the correct revision number these
      bits should be ignored. This patch fixes this by using the already
      defined MARVELL_PHY_ID_MASK (0xfffffff0) instead of the custom
      0xffffffff mask.
      
      Fixes: 20b2af32 ("net: phy: add Marvell Alaska X 88X3310 10Gigabit PHY support")
      Suggested-by: Yan Markman <ymarkman@marvell.com>
      Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
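      The effect of the two masks can be shown with a small matching helper.
      This is a sketch under assumptions: demo_phy_id_matches is a
      hypothetical function, and the ids used with it are made-up values,
      not real Marvell ids. Only the two mask values come from the commit
      message.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PHY_ID_MASK_EXACT 0xffffffffu /* the custom mask being replaced */
#define DEMO_PHY_ID_MASK_REV   0xfffffff0u /* same value as MARVELL_PHY_ID_MASK */

/* A driver matches a device when the masked ids are equal. */
static bool demo_phy_id_matches(uint32_t hw_id, uint32_t drv_id, uint32_t mask)
{
    return (hw_id & mask) == (drv_id & mask);
}
```

      With the exact mask, a device whose low four bits differ from the
      driver's id does not match; with the revision mask it does.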
    • Merge branch 'mvpp2-fixes' · f40b55ab
      David S. Miller authored
      Antoine Tenart says:
      
      ====================
      net: mvpp2: set of fixes
      
      This series fixes various issues with the Marvell PPv2 driver. The
      patches are sent together to avoid any possible conflict. The series is
      based on today's net tree.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvpp2: check ethtool sets the Tx ring size is to a valid min value · 76e583c5
      Antoine Tenart authored
      This patch fixes the Tx ring size checks when using ethtool, by adding
      an extra check in the PPv2 check_ringparam_valid helper. The Tx ring
      size cannot be set to a value smaller than the minimum number of
      descriptors needed for TSO.
      
      Fixes: 1d17db08 ("net: mvpp2: limit TSO segments and use stop/wake thresholds")
      Suggested-by: Yan Markman <ymarkman@marvell.com>
      Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
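      A lower-bound check of this kind might look like the sketch below. It
      assumes the helper adjusts an out-of-range request instead of
      rejecting it, which the commit message does not state; the constants
      and all demo_* names are hypothetical.

```c
#include <stdint.h>

#define DEMO_TX_RING_MAX     2048u /* hypothetical upper bound */
#define DEMO_TX_RING_MIN_TSO 300u  /* hypothetical descriptors needed for TSO */
#define DEMO_RING_ALIGN      32u   /* hypothetical ring size granularity */

/* Round v up to the next multiple of a. */
static uint32_t demo_align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) / a * a;
}

/* Returns the Tx ring size that would actually be applied. */
static uint32_t demo_check_tx_ring(uint32_t requested)
{
    if (requested > DEMO_TX_RING_MAX)
        return DEMO_TX_RING_MAX;
    if (requested < DEMO_TX_RING_MIN_TSO)
        return demo_align_up(DEMO_TX_RING_MIN_TSO, DEMO_RING_ALIGN);
    return demo_align_up(requested, DEMO_RING_ALIGN);
}
```

      The second branch is the extra check: without it, a request smaller
      than the TSO minimum would pass through unchanged.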
    • net: mvpp2: do not disable GMAC padding · e749aca8
      Yan Markman authored
      Short fragmented packets may never be sent by the hardware when padding
      is disabled. This patch stops modifying the GMAC padding bits, leaving
      them at their reset value (disabled).
      
      Fixes: 3919357f ("net: mvpp2: initialize the GMAC when using a port")
      Signed-off-by: Yan Markman <ymarkman@marvell.com>
      [Antoine: commit message]
      Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvpp2: cleanup probed ports in the probe error path · 26146b0e
      Antoine Tenart authored
      This patch fixes the probe error path by cleaning up probed ports, to
      avoid leaving registered net devices behind when the driver fails to
      probe.
      
      Fixes: 3f518509 ("ethernet: Add new driver for Marvell Armada 375 network unit")
      Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvpp2: fix the txq_init error path · ba2d8d88
      Antoine Tenart authored
      When an allocation in the txq_init path fails, the allocated buffers
      end up being freed twice: in the txq_init error path, and in txq_deinit.
      This leads to issues, as txq_deinit works on already freed memory
      regions:
      
          kernel BUG at mm/slub.c:3915!
          Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      
      This patch fixes this by removing txq_init's own error path, as the
      txq_deinit function is always called on errors. This was introduced by
      TSO, as way more buffers are allocated.
      
      Fixes: 186cd4d4 ("net: mvpp2: software tso support")
      Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
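      The "single cleanup path" idea in this commit can be sketched in user
      space. The struct, the buffer sizes and all demo_* names are
      hypothetical, and malloc/free stand in for the driver's allocators;
      the point is only that init never frees and deinit is the one place
      that does.

```c
#include <stdlib.h>

/* Hypothetical queue with two separately allocated buffers. */
struct demo_txq {
    void *descs;
    void *tso_hdrs;
};

/* Frees whatever is allocated and clears the pointers; safe to call twice. */
static void demo_txq_deinit(struct demo_txq *txq)
{
    free(txq->descs);
    txq->descs = NULL;
    free(txq->tso_hdrs);
    txq->tso_hdrs = NULL;
}

/*
 * No local unwinding: on failure, return an error and leave the partially
 * initialized queue for the caller to tear down. fail_second simulates a
 * failure of the second allocation.
 */
static int demo_txq_init(struct demo_txq *txq, int fail_second)
{
    txq->descs = NULL;
    txq->tso_hdrs = NULL;

    txq->descs = malloc(64);
    if (!txq->descs)
        return -1;
    txq->tso_hdrs = fail_second ? NULL : malloc(64);
    if (!txq->tso_hdrs)
        return -1; /* descs is still owned by txq; deinit frees it once */
    return 0;
}

/* Caller: deinit is always invoked on error, so it is the only free site. */
static int demo_txq_setup(struct demo_txq *txq, int fail_second)
{
    int err = demo_txq_init(txq, fail_second);

    if (err)
        demo_txq_deinit(txq);
    return err;
}
```

      Because deinit also resets the pointers, calling it a second time is
      harmless in this model.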
    • Merge branch 'mlxsw-GRE-offloading-fixes' · e2549970
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      mlxsw: GRE offloading fixes
      
      Petr says:
      
      This patchset fixes a couple of bugs in offloading GRE tunnels in the
      mlxsw driver.
      
      Patch #1 fixes a problem that local routes pointing at a GRE tunnel
      device are offloaded even if that netdevice is down.
      
      Patch #2 detects that as a result of moving a GRE netdevice to a
      different VRF, two tunnels now have a conflict of local addresses,
      something that the mlxsw driver can't offload.
      
      Patch #3 fixes a FIB abort caused by forming a route pointing at a
      GRE tunnel that is eligible for offloading but already onloaded.
      
      Patch #4 fixes a problem that next hops migrated to a new RIF kept the
      old RIF reference, which went dangling shortly afterwards.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Update nexthop RIF on update · 09dbf629
      Petr Machata authored
      The function mlxsw_sp_nexthop_rif_update() walks the list of nexthops
      associated with a RIF, and updates the corresponding entries in the
      switch. It is used in particular when a tunnel underlay netdevice moves
      to a different VRF, and all the nexthops are migrated over to a new RIF.
      The problem is that each nexthop holds a reference to its RIF, and that
      is not updated. So after the old RIF is gone, further activity on these
      nexthops (such as downing the underlay netdevice) dereferences a
      dangling pointer.
      
      Fix the issue by updating rif of impacted nexthops before calling
      mlxsw_sp_nexthop_rif_update().
      
      Fixes: 0c5f1cd5 ("mlxsw: spectrum_router: Generalize __mlxsw_sp_ipip_entry_update_tunnel()")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Handle encap to demoted tunnels · d97cda5f
      Petr Machata authored
      Some tunnels that are offloadable on their own can nonetheless be
      demoted to slow path if their local address is in conflict with that of
      another tunnel. When a route is formed for such a tunnel,
      mlxsw_sp_nexthop_ipip_init() fails to find the corresponding IPIP entry,
      and that triggers a FIB abort.
      
      Resolve the problem by not assuming that a tunnel for which
      mlxsw_sp_ipip_ops.can_offload() holds also automatically has an IPIP
      entry.
      
      Fixes: af641713 ("mlxsw: spectrum_router: Onload conflicting tunnels")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Demote tunnels on VRF migration · cab43d9c
      Petr Machata authored
      The mlxsw driver currently doesn't offload GRE tunnels if they have the
      same local address and use the same underlay VRF. When such a situation
      arises, the tunnels in conflict are demoted to slow path.
      
      However, the current code only verifies this condition on tunnel
      creation and tunnel change, not when a tunnel is moved to a different
      VRF. When the tunnel has no bound device, underlay and overlay are the
      same. Thus moving a tunnel moves the underlay as well, and that can
      cause local address conflict.
      
      So modify mlxsw_sp_netdevice_ipip_ol_vrf_event() to check if there are
      any conflicting tunnels, and demote them if yes.
      
      Fixes: af641713 ("mlxsw: spectrum_router: Onload conflicting tunnels")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Offload decap only for up tunnels · 57c77ce4
      Petr Machata authored
      When a new local route is added, an IPIP entry is looked up to determine
      whether the route should be offloaded as a tunnel decap or as a trap.
      That decision should take into account whether the tunnel netdevice in
      question is actually IFF_UP, and only install a decap offload if it is.
      
      Fixes: 0063587d ("mlxsw: spectrum: Support decap-only IP-in-IP tunnels")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
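      The decision described in this commit reduces to a small predicate. In
      the sketch below, the enum, the function and DEMO_IFF_UP are
      hypothetical stand-ins (the real flag is IFF_UP), and has_ipip_entry
      represents the outcome of the IPIP entry lookup.

```c
#include <stdbool.h>

#define DEMO_IFF_UP 0x1u /* stand-in for IFF_UP */

enum demo_local_route_kind { DEMO_ROUTE_TRAP, DEMO_ROUTE_IPIP_DECAP };

/* Install a decap offload only if an IPIP entry exists and the tunnel is up. */
static enum demo_local_route_kind
demo_classify_local_route(bool has_ipip_entry, unsigned int tunnel_flags)
{
    if (has_ipip_entry && (tunnel_flags & DEMO_IFF_UP))
        return DEMO_ROUTE_IPIP_DECAP;
    return DEMO_ROUTE_TRAP;
}
```

      A tunnel that has an IPIP entry but is down therefore falls back to a
      trap in this model.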
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 32f0160c
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2017-11-27
      
      This series contains updates to e1000, e1000e and i40e.
      
      Gustavo A. R. Silva fixes a sizeof() issue where we were taking the size of
      the pointer (which is always the size of the pointer).
      
      Sasha does a follow up fix to a previous fix for buffer overrun, to resolve
      community feedback from David Laight and the use of magic numbers.
      
      Amritha fixes the reporting of error codes when adding a cloud filter
      fails.
      
      Ahmad Fatoum brushes the dust off the e1000 driver to fix a code comment
      and debug message which were incorrect about what the code was really doing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
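      The sizeof() issue mentioned in this series is a common C pitfall:
      sizeof applied to a pointer yields the pointer's size, not the size of
      the object it points to. The sketch below uses a hypothetical struct
      and helper names; it does not reproduce the driver code in question.

```c
#include <stddef.h>

/* Hypothetical object that is much larger than a pointer. */
struct demo_stats {
    unsigned long counters[16];
};

/* Pitfall: this is the size of the pointer itself. */
static size_t demo_size_of_pointer(const struct demo_stats *s)
{
    return sizeof(s);
}

/* Intended: the size of the object being pointed to. */
static size_t demo_size_of_object(const struct demo_stats *s)
{
    return sizeof(*s);
}
```

      Passing the first value to a copy or clear routine would touch only
      the first few bytes of the structure.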
  2. 27 Nov, 2017 16 commits
  3. 26 Nov, 2017 2 commits
    • openvswitch: fix the incorrect flow action alloc size · 67c8d22a
      zhangliping authored
      If we want to add a datapath flow that has more than 500 vxlan output
      actions, we get the following error reports:
        openvswitch: netlink: Flow action size 32832 bytes exceeds max
        openvswitch: netlink: Flow action size 32832 bytes exceeds max
        openvswitch: netlink: Actions may not be safe on all matching packets
        ... ...
      
      It seems that we can simply enlarge MAX_ACTIONS_BUFSIZE to fix it, but
      this is not the root cause. For example, for a vxlan output action, we need
      about 60 bytes for the nlattr, but after it is converted to the flow
      action, it only occupies 24 bytes. This means that we can still support
      more than 1000 vxlan output actions for a single datapath flow under
      the current 32k max limitation.
      
      So even if nla_len(attr) is larger than MAX_ACTIONS_BUFSIZE, we
      shouldn't report EINVAL but should let processing continue, as the
      check can be done by reserve_sfa_size.
      
      Signed-off-by: zhangliping <zhangliping02@baidu.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
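      The size argument in this commit can be checked with a few lines of
      arithmetic. The sketch below takes the figures from the message (about
      60 bytes per nlattr, 24 bytes per converted action, a 32k limit) and
      treats "32k" as 32 * 1024 bytes; the macro and function names are
      hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_ACTIONS_BUFSIZE (32u * 1024u) /* the "32k" limit */
#define DEMO_NLATTR_BYTES        60u           /* approx. per vxlan output */
#define DEMO_ACTION_BYTES        24u           /* per converted flow action */

/* Size of the netlink attributes for a given number of outputs. */
static size_t demo_nlattr_size(size_t outputs)
{
    return outputs * DEMO_NLATTR_BYTES;
}

/* Size of the same actions once converted to flow actions. */
static size_t demo_converted_size(size_t outputs)
{
    return outputs * DEMO_ACTION_BYTES;
}

/* How many converted outputs fit under the limit. */
static size_t demo_max_outputs(void)
{
    return DEMO_MAX_ACTIONS_BUFSIZE / DEMO_ACTION_BYTES;
}
```

      For 1000 outputs the attributes take roughly 60000 bytes, well over
      the limit, while the converted actions take 24000 bytes and fit;
      under these figures the limit allows 1365 converted outputs.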
    • net: openvswitch: datapath: fix data type in queue_gso_packets · 2734166e
      Gustavo A. R. Silva authored
      gso_type is used in binary AND operations together with SKB_GSO_UDP.
      The issue is that the variable gso_type is of type unsigned short and
      SKB_GSO_UDP expands to more than 16 bits:
      
      SKB_GSO_UDP = 1 << 16
      
      This makes any binary AND operation between gso_type and SKB_GSO_UDP
      always zero, hence making some code unreachable and likely causing
      undesired behavior.
      
      Fix this by changing the data type of variable gso_type to unsigned int.
      
      Addresses-Coverity-ID: 1462223
      Fixes: 0c19f846 ("net: accept UFO datagrams from tuntap and packet")
      Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
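      The truncation described in this commit is easy to reproduce in
      isolation. The sketch below assumes a platform where unsigned short is
      16 bits wide; the function names are hypothetical, and
      DEMO_SKB_GSO_UDP simply mirrors the value quoted in the message.

```c
#include <stdbool.h>

#define DEMO_SKB_GSO_UDP (1u << 16) /* value quoted in the commit message */

/* Storing the type in an unsigned short drops bit 16 before the test. */
static bool demo_is_udp_gso_short(unsigned int real_gso_type)
{
    unsigned short gso_type = (unsigned short)real_gso_type;

    return (gso_type & DEMO_SKB_GSO_UDP) != 0;
}

/* Storing it in an unsigned int keeps the bit, so the test can succeed. */
static bool demo_is_udp_gso_int(unsigned int real_gso_type)
{
    unsigned int gso_type = real_gso_type;

    return (gso_type & DEMO_SKB_GSO_UDP) != 0;
}
```

      Under that assumption the first helper returns false for every input,
      which is the unreachable-code situation the message describes.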
  4. 25 Nov, 2017 2 commits