1. 08 Mar, 2018 1 commit
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · cfda06d7
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-03-08
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix various BPF helpers which adjust the skb and its GSO information
         with regards to SCTP GSO. The latter is a special case where gso_size
         is of value GSO_BY_FRAGS, so mangling that will end up corrupting
         the skb, thus bail out when seeing SCTP GSO packets, from Daniel(s).
      
      2) Fix a compilation error in bpftool where BPF_FS_MAGIC is not defined
         due to too old kernel headers in the system, from Jiri.
      
      3) Increase the number of x64 JIT passes in order to allow larger images
         to converge instead of punting them to interpreter or having them
         rejected when the interpreter is not built into the kernel, from Daniel.
      ====================
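Item 2 above concerns system headers that are too old to define BPF_FS_MAGIC. A minimal sketch of the usual fallback pattern follows; it assumes the value 0xcafe4a11 from the upstream linux/magic.h, and the helper name is_bpffs_magic() is hypothetical, not taken from bpftool.

```c
#include <assert.h>
#include <stdbool.h>

/* Older kernel headers may not define BPF_FS_MAGIC; provide a fallback. */
#ifndef BPF_FS_MAGIC
#define BPF_FS_MAGIC 0xcafe4a11
#endif

/*
 * Hypothetical helper: returns true if the f_type value reported by
 * statfs() identifies a BPF filesystem.
 */
static bool is_bpffs_magic(unsigned long f_type)
{
	return f_type == BPF_FS_MAGIC;
}
```

The guard keeps the build working with both old and new headers, since the fallback only takes effect when the system headers lack the definition.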
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 07 Mar, 2018 21 commits
    • bpf, x64: increase number of passes · 6007b080
      Daniel Borkmann authored
      In Cilium some of the main programs we run today are hitting 9 passes
      on x64's JIT compiler, and we've had cases already where we surpassed
      the limit where the JIT then punts the program to the interpreter
      instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON
      or insertion failures due to the prog array owner being JITed but the
      program to insert not (both must have the same JITed/non-JITed property).
      
      In one concrete case, the program image shrank from 12,767 bytes down
      to 10,288 bytes, and the image converged after 16 steps. I've measured
      that this took 340us in the JIT until it converged on my i7-6600U. Thus,
      increase the original limit, which dates from day one when the JIT
      covered cBPF only, before we run into the case (similar to the
      complexity limit) where we trip over this and hit program rejections.
      Also add a cond_resched() into the compilation loop; the JIT process
      runs without any locks and may sleep anyway.
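The convergence behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: jit_converge(), jit_pass_fn and toy_pass() are hypothetical names, and the real x64 JIT loop is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pass callback: returns the image size given the previous size. */
typedef size_t (*jit_pass_fn)(size_t prev_len);

/*
 * Run passes until the image size stops changing or max_passes is reached.
 * Returns the number of passes used, or -1 if the image did not converge.
 */
static int jit_converge(jit_pass_fn do_pass, size_t initial_len,
			int max_passes, size_t *final_len)
{
	size_t old_len = initial_len;
	int pass;

	for (pass = 1; pass <= max_passes; pass++) {
		size_t new_len = do_pass(old_len);

		if (new_len == old_len) {
			*final_len = new_len;
			return pass;
		}
		old_len = new_len;
		/* The kernel loop would call cond_resched() at this point. */
	}
	return -1;
}

/* Toy model: each pass shrinks the image by 100 bytes until it reaches 1000. */
static size_t toy_pass(size_t prev_len)
{
	return prev_len > 1000 ? prev_len - 100 : prev_len;
}
```

With too small a pass limit the function reports failure, which corresponds to the program being punted to the interpreter or rejected; raising the limit lets larger images converge.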
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • cxgb4: do not set needs_free_netdev for mgmt dev's · b06ef18a
      Ganesh Goudar authored
      Do not set 'needs_free_netdev', as we already call free_netdev
      for mgmt net devices; doing both hits BUG_ON.
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: copy adap index to PF0-3 adapter instances · 016764de
      Ganesh Goudar authored
      Instantiation of VFs on different adapters fails. Copy the
      adapter index and chip type to PF0-3 adapter instances
      to fix the issue.
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: don't unnecessarily load kernel modules in dev_ioctl() · b51f26b1
      Paul Moore authored
      Starting with v4.16-rc1 we've been seeing a higher than usual number
      of requests for the kernel to load networking modules, even on events
      which shouldn't trigger a module load (e.g. ioctl(TCGETS)).  Stephen
      Smalley suggested the problem may lie in commit 44c02a2c
      ("dev_ioctl(): move copyin/copyout to callers"), which changes
      the network dev_ioctl() function to always call dev_load(),
      regardless of the requested ioctl.
      
      This patch moves the dev_load() calls back into the individual ioctls
      while preserving the rest of the original patch.
      Reported-by: Dominick Grift <dac.override@gmail.com>
      Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: purge write queue upon aborting the connection · e05836ac
      Soheil Hassas Yeganeh authored
      When the connection is aborted, there is no point in
      keeping the packets on the write queue until the connection
      is closed.
      
      Similar to a27fd7a8 ('tcp: purge write queue upon RST'),
      this is essential for a correct MSG_ZEROCOPY implementation,
      because userspace cannot call close(fd) before receiving
      zerocopy signals even when the connection is aborted.
      
      Fixes: f214f915 ("tcp: enable MSG_ZEROCOPY")
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: check sk for closed state in dccp_sendmsg() · 67f93df7
      Alexey Kodanev authored
      dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
      therefore if DCCP socket is disconnected and dccp_sendmsg() is
      called after it, it will cause a NULL pointer dereference in
      dccp_write_xmit().
      
      This crash and the reproducer were reported by syzbot. It looks
      like it is reproducible if commit 69c64866 ("dccp: CVE-2017-8824:
      use-after-free in DCCP code") is applied.
      
      Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • l2tp: do not accept arbitrary sockets · 17cfe79a
      Eric Dumazet authored
      syzkaller found an issue caused by lack of sufficient checks
      in l2tp_tunnel_create().

      For instance, RAW sockets cannot be treated as UDP ones.
      
      In another patch, we shall replace all pr_err() by less intrusive
      pr_debug() so that syzkaller can find other bugs faster.
      Acked-by: Guillaume Nault <g.nault@alphalink.fr>
      Acked-by: James Chapman <jchapman@katalix.com>
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
      dst_release: dst:00000000d53d0d0f refcnt:-1
      Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242
      
      CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x24d lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report+0x23b/0x360 mm/kasan/report.c:412
       __asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435
       setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
       l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596
       pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707
       SYSC_connect+0x213/0x4a0 net/socket.c:1640
       SyS_connect+0x24/0x30 net/socket.c:1621
       do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix hlist corruptions in inet_evict_bucket() · a5600024
      Kirill Tkhai authored
      inet_evict_bucket() iterates a global list, and
      several tasks may call it in parallel. All of
      them hash the same fq->list_evictor to different
      lists, which leads to list corruption.

      This patch makes fq be hashed to the expired list
      only if this has not already been done by another
      task. Since inet_frag_alloc() allocates fq
      using kmem_cache_zalloc(), we may rely on
      list_evictor being initially unhashed.

      The problem seems to have existed before async
      pernet_operations, as it was possible for an
      exit method to be executed in parallel with
      inet_frags::frags_work, so I add two Fixes tags.
      This may also go to stable.
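The "hash only if still unhashed" rule from this commit message can be sketched in userspace C. The types below are minimal stand-ins for the kernel's hlist, and evict_queue_once() is a hypothetical name; the sketch is single-threaded and leaves out the locking the kernel needs around this check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-ins for the kernel's hlist types. */
struct hnode {
	struct hnode *next;
	struct hnode **pprev;	/* NULL while the node is on no list */
};

struct hhead {
	struct hnode *first;
};

/* A zero-initialised node is unhashed, as with kmem_cache_zalloc(). */
static bool hnode_unhashed(const struct hnode *n)
{
	return n->pprev == NULL;
}

/* Insert the node at the front of the list. */
static void hhead_add(struct hnode *n, struct hhead *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Hypothetical evictor step: queue the node on the caller's expired list
 * only if no other evictor has queued it already. Returns true if queued.
 */
static bool evict_queue_once(struct hnode *n, struct hhead *expired)
{
	if (!hnode_unhashed(n))
		return false;
	hhead_add(n, expired);
	return true;
}
```

A second evictor that finds the node already hashed skips it, so the node never ends up linked into two expired lists at once.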
      
      Fixes: d1fe1944 "inet: frag: don't re-use chainlist for evictor"
      Fixes: f84c6821 "net: Convert pernet_subsys, registered from inet_init()"
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: smsc911x: Fix unload crash when link is up · e06513d7
      Jeremy Linton authored
      The smsc911x driver will crash if it is rmmod'ed while the netdev
      is up like:
      
      Call trace:
        phy_detach+0x94/0x150
        phy_disconnect+0x40/0x50
        smsc911x_stop+0x104/0x128 [smsc911x]
        __dev_close_many+0xb4/0x138
        dev_close_many+0xbc/0x190
        rollback_registered_many+0x140/0x460
        rollback_registered+0x68/0xb0
        unregister_netdevice_queue+0x100/0x118
        unregister_netdev+0x28/0x38
        smsc911x_drv_remove+0x58/0x130 [smsc911x]
        platform_drv_remove+0x30/0x50
        device_release_driver_internal+0x15c/0x1f8
        driver_detach+0x54/0x98
        bus_remove_driver+0x64/0xe8
        driver_unregister+0x34/0x60
        platform_driver_unregister+0x20/0x30
        smsc911x_cleanup_module+0x14/0xbca8 [smsc911x]
        SyS_delete_module+0x1e8/0x238
        __sys_trace_return+0x0/0x4
      
      This is caused by the mdiobus being unregistered/freed
      and the code in phy_detach() attempting to manipulate mdio
      related structures when unregister_netdev() calls close().
      
      To fix this, we delay the mdiobus teardown until after
      the netdev is deregistered.
      Reported-by: Matt Sealey <matt.sealey@arm.com>
      Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes · e9fa1495
      Stefano Brivio authored
      Currently, administrative MTU changes on a given netdevice are
      not reflected on route exceptions for MTU-less routes, with a
      set PMTU value, for that device:
      
       # ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a proto kernel src 2001:db8::a metric 256 pref medium
       # ping6 -c 1 -q -s10000 2001:db8::b > /dev/null
       # ip netns exec a ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
           cache expires 571sec mtu 4926 pref medium
       # ip link set dev vti_a mtu 3000
       # ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
           cache expires 571sec mtu 4926 pref medium
       # ip link set dev vti_a mtu 9000
       # ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
           cache expires 571sec mtu 4926 pref medium
      
      The first issue is that since commit fb56be83 ("net-ipv6: on
      device mtu change do not add mtu to mtu-less routes") we don't
      call rt6_exceptions_update_pmtu() from rt6_mtu_change_route(),
      which handles administrative MTU changes, if the regular route
      is MTU-less.
      
      However, PMTU exceptions should always be updated, as long as
      RTAX_MTU is not locked. Keep the check for MTU-less main route,
      as introduced by that commit, but, for exceptions,
      call rt6_exceptions_update_pmtu() regardless of that check.
      
      Once that is fixed, one problem remains: MTU changes are not
      reflected if the new MTU is higher than the previous one,
      because rt6_exceptions_update_pmtu() doesn't allow that. We
      should instead allow PMTU increase if the old PMTU matches the
      local MTU, as that implies that the old MTU was the lowest in the
      path, and PMTU discovery might lead to different results.
      
      The existing check in rt6_mtu_change_route() correctly took that
      case into account (for regular routes only), so factor it out
      and re-use it also in rt6_exceptions_update_pmtu().
      
      While at it, fix comments style and grammar, and try to be a bit
      more descriptive.
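The update rule described above (accept a lower MTU unconditionally, accept a higher one only if the old PMTU matched the local MTU) can be written as a small predicate. This is a sketch, not the kernel function: pmtu_change_allowed() and its parameters are hypothetical, and the RTAX_MTU lock check is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the rule in the commit message:
 *  - a new MTU at or below the current PMTU always applies, because it
 *    becomes the lowest MTU in the path;
 *  - a higher MTU applies only if the current PMTU equals the old local
 *    MTU, which implies the local link was the bottleneck.
 */
static bool pmtu_change_allowed(unsigned int cur_pmtu,
				unsigned int old_local_mtu,
				unsigned int new_mtu)
{
	if (new_mtu <= cur_pmtu)
		return true;
	return cur_pmtu == old_local_mtu;
}
```

If the PMTU was learned from elsewhere in the path and is lower than the local MTU, raising the local MTU leaves it unchanged, since the bottleneck is not the local link.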
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Fixes: fb56be83 ("net-ipv6: on device mtu change do not add mtu to mtu-less routes")
      Fixes: f5bbe7ee ("ipv6: prepare rt6_mtu_change() for exception table")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qcom/emac: Use proper free methods during TX · cc5db315
      Hemanth Puranik authored
      This patch fixes the warning messages/call traces seen if DMA debug is
      enabled. In the case of fragmented skbs, memory was mapped using
      dma_map_page but unmapped using dma_unmap_single. This patch modifies
      buffer mapping in the TX path to use dma_map_page in all places and
      dma_unmap_page when freeing the buffers.
      Signed-off-by: Hemanth Puranik <hpuranik@codeaurora.org>
      Acked-by: Timur Tabi <timur@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Free RoCE ILT Memory on rmmod qedr · 9de506a5
      Michal Kalderon authored
      RDMA requires ILT memory to be allocated for its QPs.
      Each ILT entry points to a page used by several RDMA QPs.
      To avoid allocating all the memory in advance, the RDMA
      implementation dynamically allocates memory as more QPs are
      added; however, it does not dynamically free the memory.
      The memory should have been freed on rmmod qedr, but isn't.
      This patch adds the memory freeing on rmmod qedr (currently
      it is freed only when qed is removed).

      An outcome of this bug is that if qedr is unloaded and loaded
      without unloading qed, there will be no more RoCE traffic.

      The reason these are related is that the logic for detecting the
      first QP ever opened is to ask whether ILT memory for RoCE has
      been allocated.

      In addition, this patch modifies freeing of the task context to
      always use PROTOCOLID_ROCE and not the protocol passed. This is
      because the task context for iWARP and RoCE both use the RoCE
      protocol id, as opposed to the connection context.
      
      Fixes: dbb799c3 ("qed: Initialize hardware for new protocols")
      Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 87772fe6
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2018-03-05
      
      This series contains fixes to e1000e only.
      
      Benjamin Poirier provides all but one fix in this series, starting with
      a workaround for a VMware e1000e emulation issue where ICR reads 0x0 on
      the emulated device.  Partially reverted a previous commit dealing with
      the "Other" interrupt throttling to avoid unforeseen fallout from these
      changes that are not strictly necessary.  Restored the ICS write for
      receive and transmit queue interrupts in the case that txq or rxq bits
      were set in ICR and the Other interrupt handler read and cleared ICR
      before the queue interrupt was raised.  Fixed a bug where interrupts
      may be missed if ICR is read while INT_ASSERTED is not set, so avoid the
      problem by setting all bits related to events that can trigger the Other
      interrupt in IMS.  Fixed the return value for check_for_link() when
      auto-negotiation is off.
      
      Pierre-Yves Kerbrat fixes e1000e to use dma_zalloc_coherent() to make
      sure the ring is memset to 0 to prevent the area from containing
      garbage.
      
      v2: added an additional e1000e fix to the series
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: usbnet: fix potential deadlock on 32bit hosts · 2695578b
      Eric Dumazet authored
      Marek reported a LOCKDEP issue occurring on a 32bit host,
      which we tracked down to the fact that usbnet could run
      from either soft or hard irqs.
      
      This patch adds u64_stats_update_begin_irqsave() and
      u64_stats_update_end_irqrestore() helpers to solve this case.
      
      [   17.768040] ================================
      [   17.772239] WARNING: inconsistent lock state
      [   17.776511] 4.16.0-rc3-next-20180227-00007-g876c53a7493c #453 Not tainted
      [   17.783329] --------------------------------
      [   17.787580] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
      [   17.793607] swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      [   17.798751]  (&syncp->seq#5){?.-.}, at: [<9b22e5f0>]
      asix_rx_fixup_internal+0x188/0x288
      [   17.806790] {IN-HARDIRQ-W} state was registered at:
      [   17.811677]   tx_complete+0x100/0x208
      [   17.815319]   __usb_hcd_giveback_urb+0x60/0xf0
      [   17.819770]   xhci_giveback_urb_in_irq+0xa8/0x240
      [   17.824469]   xhci_td_cleanup+0xf4/0x16c
      [   17.828367]   xhci_irq+0xe74/0x2240
      [   17.831827]   usb_hcd_irq+0x24/0x38
      [   17.835343]   __handle_irq_event_percpu+0x98/0x510
      [   17.840111]   handle_irq_event_percpu+0x1c/0x58
      [   17.844623]   handle_irq_event+0x38/0x5c
      [   17.848519]   handle_fasteoi_irq+0xa4/0x138
      [   17.852681]   generic_handle_irq+0x18/0x28
      [   17.856760]   __handle_domain_irq+0x6c/0xe4
      [   17.860941]   gic_handle_irq+0x54/0xa0
      [   17.864666]   __irq_svc+0x70/0xb0
      [   17.867964]   arch_cpu_idle+0x20/0x3c
      [   17.871578]   arch_cpu_idle+0x20/0x3c
      [   17.875190]   do_idle+0x144/0x218
      [   17.878468]   cpu_startup_entry+0x18/0x1c
      [   17.882454]   start_kernel+0x394/0x400
      [   17.886177] irq event stamp: 161912
      [   17.889616] hardirqs last  enabled at (161912): [<7bedfacf>]
      __netdev_alloc_skb+0xcc/0x140
      [   17.897893] hardirqs last disabled at (161911): [<d58261d0>]
      __netdev_alloc_skb+0x94/0x140
      [   17.904903] exynos5-hsi2c 12ca0000.i2c: tx timeout
      [   17.906116] softirqs last  enabled at (161904): [<387102ff>]
      irq_enter+0x78/0x80
      [   17.906123] softirqs last disabled at (161905): [<cf4c628e>]
      irq_exit+0x134/0x158
      [   17.925722].
      [   17.925722] other info that might help us debug this:
      [   17.933435]  Possible unsafe locking scenario:
      [   17.933435].
      [   17.940331]        CPU0
      [   17.942488]        ----
      [   17.944894]   lock(&syncp->seq#5);
      [   17.948274]   <Interrupt>
      [   17.950847]     lock(&syncp->seq#5);
      [   17.954386].
      [   17.954386]  *** DEADLOCK ***
      [   17.954386].
      [   17.962422] no locks held by swapper/0/0.
      
      Fixes: c8b5d129 ("net: usbnet: support 64bit stats")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sch_netem: fix skb leak in netem_enqueue() · 35d889d1
      Alexey Kodanev authored
      When we exceed the current packet limit and we have more than one
      segment in the list returned by skb_gso_segment(), netem drops
      only the first one, skipping the rest, hence kmemleak reports:
      
      unreferenced object 0xffff880b5d23b600 (size 1024):
        comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s)
        hex dump (first 32 bytes):
          00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00  ..#]............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000d8a19b9d>] __alloc_skb+0xc9/0x520
          [<000000001709b32f>] skb_segment+0x8c8/0x3710
          [<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830
          [<00000000c921cba1>] inet_gso_segment+0x476/0x1370
          [<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510
          [<000000002182660a>] __skb_gso_segment+0x1dd/0x620
          [<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem]
          [<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120
          [<00000000fc5f7327>] ip_finish_output2+0x998/0xf00
          [<00000000d309e9d3>] ip_output+0x1aa/0x2c0
          [<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670
          [<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0
          [<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540
          [<0000000013d06d02>] tasklet_action+0x1ca/0x250
          [<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3
          [<00000000e7ed027c>] irq_exit+0x1e2/0x210
      
      Fix it by adding the rest of the segments, if any, to the skb 'to_free'
      list. Add new __qdisc_drop_all() and qdisc_drop_all() functions
      because they can be useful in the future if we need to drop segmented
      GSO packets in other places.
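The idea of queueing every segment for freeing, rather than only the first, can be sketched with a plain singly linked list. The struct and the drop_all() helper below are hypothetical stand-ins, not the kernel's sk_buff or qdisc_drop_all().

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an skb: only the list linkage matters here. */
struct seg {
	struct seg *next;
};

/*
 * Hypothetical analogue of a "drop all" helper: splice the whole segment
 * chain starting at head onto the front of the to_free list, so that no
 * segment after the first is leaked. Returns the number of segments queued.
 */
static size_t drop_all(struct seg *head, struct seg **to_free)
{
	struct seg *tail = head;
	size_t n = 1;

	if (!head)
		return 0;

	/* Walk to the last segment of the chain. */
	while (tail->next) {
		tail = tail->next;
		n++;
	}

	/* Link the chain in front of whatever was already queued. */
	tail->next = *to_free;
	*to_free = head;
	return n;
}
```

Dropping only head would leave the remaining segments unreachable, which is the leak kmemleak reported.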
      
      Fixes: 6071bd1a ("netem: Segment GSO packets on enqueue")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 89036a2a
      David S. Miller authored
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth 2018-03-05
      
      Here are a few more Bluetooth fixes for the 4.16 kernel:
      
       - btusb: reset/resume fixes for Yoga 920 and Dell OptiPlex 3060
       - Fix for missing encryption refresh with the Security Manager protocol
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fsl/fman: avoid sleeping in atomic context while adding an address · 803fafbe
      Denis Kirjanov authored
      __dev_mc_add grabs an address spinlock, so use
      atomic context in kmalloc.
      
      / # ifconfig eth0 inet 192.168.0.111
      [   89.331622] BUG: sleeping function called from invalid context at mm/slab.h:420
      [   89.339002] in_atomic(): 1, irqs_disabled(): 0, pid: 1035, name: ifconfig
      [   89.345799] 2 locks held by ifconfig/1035:
      [   89.349908]  #0:  (rtnl_mutex){+.+.}, at: [<(ptrval)>] devinet_ioctl+0xc0/0x8a0
      [   89.357258]  #1:  (_xmit_ETHER){+...}, at: [<(ptrval)>] __dev_mc_add+0x28/0x80
      [   89.364520] CPU: 1 PID: 1035 Comm: ifconfig Not tainted 4.16.0-rc3-dirty #8
      [   89.371464] Call Trace:
      [   89.373908] [e959db60] [c066f948] dump_stack+0xa4/0xfc (unreliable)
      [   89.380177] [e959db80] [c00671d8] ___might_sleep+0x248/0x280
      [   89.385833] [e959dba0] [c01aec34] kmem_cache_alloc_trace+0x174/0x320
      [   89.392179] [e959dbd0] [c04ab920] dtsec_add_hash_mac_address+0x130/0x240
      [   89.398874] [e959dc00] [c04a9d74] set_multi+0x174/0x1b0
      [   89.404093] [e959dc30] [c04afb68] dpaa_set_rx_mode+0x68/0xe0
      [   89.409745] [e959dc40] [c057baf8] __dev_mc_add+0x58/0x80
      [   89.415052] [e959dc60] [c060fd64] igmp_group_added+0x164/0x190
      [   89.420878] [e959dca0] [c060ffa8] ip_mc_inc_group+0x218/0x460
      [   89.426617] [e959dce0] [c06120fc] ip_mc_up+0x3c/0x190
      [   89.431662] [e959dd10] [c0607270] inetdev_event+0x250/0x620
      [   89.437227] [e959dd50] [c005f190] notifier_call_chain+0x80/0xf0
      [   89.443138] [e959dd80] [c0573a74] __dev_notify_flags+0x54/0xf0
      [   89.448964] [e959dda0] [c05743f8] dev_change_flags+0x48/0x60
      [   89.454615] [e959ddc0] [c0606744] devinet_ioctl+0x544/0x8a0
      [   89.460180] [e959de10] [c060987c] inet_ioctl+0x9c/0x1f0
      [   89.465400] [e959de80] [c05479a8] sock_ioctl+0x168/0x460
      [   89.470708] [e959ded0] [c01cf3ec] do_vfs_ioctl+0xac/0x8c0
      [   89.476099] [e959df20] [c01cfc40] SyS_ioctl+0x40/0xc0
      [   89.481147] [e959df40] [c0011318] ret_from_syscall+0x0/0x3c
      [   89.486715] --- interrupt: c01 at 0x1006943c
      [   89.486715]     LR = 0x100c45ec
      Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
      Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rhltable-dups' · 6f22c07f
      David S. Miller authored
      Paul Blakey says:
      
      ====================
      rhashtable: Fix rhltable duplicates insertion
      
      In our mlx5 driver's fs_core.c, we use the rhltable interface to store
      flow groups. We noticed that sometimes we get a warning that a flow group
      isn't found at removal. This rare case occurred in one specific scenario:
      insertion of a flow group with similar match criteria (a duplicate),
      but only where the flow group rhash_head was second (or not first)
      on the relevant rhashtable bucket list.

      The first patch fixes it, and the second one adds a test that shows
      it is now working.
      
      Paul.
      
      v3 --> v2 changes:
          * added missing fix in rhashtable_lookup_one code path as well.
      
      v1 --> v2 changes:
          * Changed commit messages to better reflect the change
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • test_rhashtable: add test case for rhltable with duplicate objects · 499ac3b6
      Paul Blakey authored
      Tries to insert duplicates in the middle of bucket's chain:
      bucket 1:  [[val 21 (tid=1)]] -> [[ val 1 (tid=2),  val 1 (tid=0) ]]
      
      Reuses tid to distinguish the elements insertion order.
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Fix rhlist duplicates insertion · d3dcf8eb
      Paul Blakey authored
      When inserting duplicate objects (those with the same key),
      the current rhlist implementation messes up the chain pointers by
      updating the bucket pointer instead of the previous node's next
      pointer to point to the newly inserted node. This causes missing
      elements on removal and traversal.

      Fix that by properly updating the pprev pointer to point to
      the correct rhash_head next pointer.
      
      Issue: 1241076
      Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
      Fixes: ca26893f ('rhashtable: Add rhlist interface')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: renesas-ravb: Make stream buffer optional · 25b5cdfc
      Geert Uytterhoeven authored
      The Stream Buffer for EtherAVB-IF (STBE) is an optional component, and
      is not present on all SoCs.
      
      Document this in the DT bindings, including a list of SoCs that do have
      it.
      
      Fixes: 785ec874 ("ravb: document R8A77970 bindings")
      Fixes: f231c417 ("dt-bindings: net: renesas-ravb: Add support for R8A77995 RAVB")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 06 Mar, 2018 9 commits
  4. 05 Mar, 2018 9 commits
    • ia64/err-inject: fix spelling mistake: "capapbilities" -> "capabilities" · 48e362dd
      Colin Ian King authored
      Trivial fix to spelling mistake in debug message text.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • ia64/err-inject: Use get_user_pages_fast() · 69c90702
      Davidlohr Bueso authored
      At the point of sysfs callback, the call to gup is
      done without mmap_sem (or any lock for that matter).
      This is racy. As such, use the get_user_pages_fast()
      alternative and safely avoid taking the lock, if possible.
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • ia64: doc: tweak whitespace for 'console=' parameter · 339d541a
      Sergei Trofimovich authored
      CC: Tony Luck <tony.luck@intel.com>
      CC: Fenghua Yu <fenghua.yu@intel.com>
      CC: linux-ia64@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • ia64: Convert remaining atomic operations · 2879b65f
      Matthew Wilcox authored
      While we've only seen inlining problems with atomic_sub_return(),
      the other atomic operations could have the same problem.  Convert all
      remaining operations to use the same solution as atomic_sub_return().
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • ia64: convert unwcheck.py to python3 · bd5edbe6
      Corentin Labbe authored
      Since my system uses python3 as default, arch/ia64/scripts/unwcheck.py no
      longer runs.

      This patch converts it to python3 syntax.
      I have run it with python2/python3 while printing the values of
      start/end/rlen_sum, which could be impacted by this change, and I see no difference.
      
      Fixes: 94a47083 ("scripts: change scripts to use system python instead of env")
      Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      bd5edbe6
    • Pierre-Yves Kerbrat's avatar
      e1000e: allocate ring descriptors with dma_zalloc_coherent · aea3fca0
      Pierre-Yves Kerbrat authored
      Descriptor rings were not initialized to zero when allocated.
      When the area contained garbage data, it caused skb_over_panic in
      e1000_clean_rx_irq (if the data had the E1000_RXD_STAT_DD bit set).
      
      This patch makes use of dma_zalloc_coherent to make sure the
      ring is zeroed, preventing the area from containing garbage.
      
      Following is the signature of the panic:
      IODDR0@0.0: skbuff: skb_over_panic: text:80407b20 len:64010 put:64010 head:ab46d800 data:ab46d842 tail:0xab47d24c end:0xab46df40 dev:eth0
      IODDR0@0.0: BUG: failure at net/core/skbuff.c:105/skb_panic()!
      IODDR0@0.0: Kernel panic - not syncing: BUG!
      IODDR0@0.0:
      IODDR0@0.0: Process swapper/0 (pid: 0, threadinfo=81728000, task=8173cc00 ,cpu: 0)
      IODDR0@0.0: SP = <815a1c0c>
      IODDR0@0.0: Stack:      00000001
      IODDR0@0.0: b2d89800 815e33ac
      IODDR0@0.0: ea73c040 00000001
      IODDR0@0.0: 60040003 0000fa0a
      IODDR0@0.0: 00000002
      IODDR0@0.0:
      IODDR0@0.0: 804540c0 815a1c70
      IODDR0@0.0: b2744000 602ac070
      IODDR0@0.0: 815a1c44 b2d89800
      IODDR0@0.0: 8173cc00 815a1c08
      IODDR0@0.0:
      IODDR0@0.0:     00000006
      IODDR0@0.0: 815a1b50 00000000
      IODDR0@0.0: 80079434 00000001
      IODDR0@0.0: ab46df40 b2744000
      IODDR0@0.0: b2d89800
      IODDR0@0.0:
      IODDR0@0.0: 0000fa0a 8045745c
      IODDR0@0.0: 815a1c88 0000fa0a
      IODDR0@0.0: 80407b20 b2789f80
      IODDR0@0.0: 00000005 80407b20
      IODDR0@0.0:
      IODDR0@0.0:
      IODDR0@0.0: Call Trace:
      IODDR0@0.0: [<804540bc>] skb_panic+0xa4/0xa8
      IODDR0@0.0: [<80079430>] console_unlock+0x2f8/0x6d0
      IODDR0@0.0: [<80457458>] skb_put+0xa0/0xc0
      IODDR0@0.0: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
      IODDR0@0.0: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
      IODDR0@0.0: [<804079c8>] e1000_clean_rx_irq+0x188/0x3e8
      IODDR0@0.0: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
      IODDR0@0.0: [<80468b48>] __dev_kfree_skb_any+0x88/0xa8
      IODDR0@0.0: [<804101ac>] e1000e_poll+0x94/0x288
      IODDR0@0.0: [<8046e9d4>] net_rx_action+0x19c/0x4e8
      IODDR0@0.0:   ...
      IODDR0@0.0: Maximum depth to print reached. Use kstack=<maximum_depth_to_print> To specify a custom value (where 0 means to display the full backtrace)
      IODDR0@0.0: ---[ end Kernel panic - not syncing: BUG!
      Signed-off-by: Pierre-Yves Kerbrat <pkerbrat@kalray.eu>
      Signed-off-by: Marius Gligor <mgligor@kalray.eu>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      aea3fca0
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-4.16-rc5' of... · 094b58e1
      Linus Torvalds authored
      Merge tag 'linux-kselftest-4.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fixes from Shuah Khan:
       "A fix for regression in memory-hotplug install script that prevents
        the test from running on the target"
      
      * tag 'linux-kselftest-4.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests: memory-hotplug: fix emit_tests regression
      094b58e1
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 54704614
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Use an appropriate TSQ pacing shift in mac80211, from Toke
          Høiland-Jørgensen.
      
       2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk
          in ip6_route_me_harder, from Eric Dumazet.
      
       3) Fix several shutdown races and similar other problems in l2tp, from
          James Chapman.
      
       4) Handle missing XDP flush properly in tuntap, for real this time.
          From Jason Wang.
      
       5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel
          Borkmann.
      
       6) Fix phy_resume() locking, from Andrew Lunn.
      
       7) IFLA_MTU values are ignored on newlink for some tunnel types, fix
          from Xin Long.
      
       8) Revert F-RTO middle box workarounds, they only handle one dimension
          of the problem. From Yuchung Cheng.
      
       9) Fix socket refcounting in RDS, from Ka-Cheong Poon.
      
      10) Don't allow ppp unit registration to an unregistered channel, from
          Guillaume Nault.
      
      11) Various hv_netvsc fixes from Stephen Hemminger.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits)
        hv_netvsc: propagate rx filters to VF
        hv_netvsc: filter multicast/broadcast
        hv_netvsc: defer queue selection to VF
        hv_netvsc: use napi_schedule_irqoff
        hv_netvsc: fix race in napi poll when rescheduling
        hv_netvsc: cancel subchannel setup before halting device
        hv_netvsc: fix error unwind handling if vmbus_open fails
        hv_netvsc: only wake transmit queue if link is up
        hv_netvsc: avoid retry on send during shutdown
        virtio-net: re enable XDP_REDIRECT for mergeable buffer
        ppp: prevent unregistered channels from connecting to PPP units
        tc-testing: skbmod: fix match value of ethertype
        mlxsw: spectrum_switchdev: Check success of FDB add operation
        net: make skb_gso_*_seglen functions private
        net: xfrm: use skb_gso_validate_network_len() to check gso sizes
        net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
        net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len
        rds: Incorrect reference counting in TCP socket creation
        net: ethtool: don't ignore return from driver get_fecparam method
        vrf: check forwarding on the original netdevice when generating ICMP dest unreachable
        ...
      54704614
    • Benjamin Poirier's avatar
      e1000e: Fix check_for_link return value with autoneg off · 4e7dc08e
      Benjamin Poirier authored
      When autoneg is off, the .check_for_link callback functions clear the
      get_link_status flag and systematically return a "pseudo-error". This means
      that the link is not detected as up until the next execution of the
      e1000_watchdog_task() 2 seconds later.
      
      Fixes: 19110cfb ("e1000e: Separate signaling for link check/link up")
      Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
      Acked-by: Sasha Neftin <sasha.neftin@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      4e7dc08e