1. 17 Apr, 2017 12 commits
    • bpf: fix checking xdp_adjust_head on tail calls · c2002f98
      Daniel Borkmann authored
      Commit 17bedab2 ("bpf: xdp: Allow head adjustment in XDP prog")
      added the xdp_adjust_head bit to the BPF prog in order to tell drivers
      that the program that is to be attached requires support for the XDP
      bpf_xdp_adjust_head() helper such that drivers not supporting this
      helper can reject the program. There are also drivers that do support
      the helper, but need to check for xdp_adjust_head bit in order to move
      packet metadata prepended by the firmware away for making headroom.
      
      For these cases, the current check for xdp_adjust_head bit is insufficient
      since there can be cases where the program itself does not use the
      bpf_xdp_adjust_head() helper, but tail calls into another program that
      uses bpf_xdp_adjust_head(). As such, the xdp_adjust_head bit is still
      set to 0. Since the first program has no control over which program it
      calls into, we need to assume that bpf_xdp_adjust_head() helper is used
      upon tail calls. Thus, for the very same reasons in cb_access, set the
      xdp_adjust_head bit to 1 when the main program uses tail calls.
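The rule described above can be sketched in user-space C. This is a simplified model, not the kernel's verifier code: `enum helper_id`, `struct prog_flags`, and `derive_flags()` are hypothetical names introduced only for illustration. It shows that a tail call forces both bits on, because the callee is unknown when the first program is loaded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper IDs for this sketch (not the kernel's enum). */
enum helper_id {
    HELPER_OTHER,
    HELPER_TAIL_CALL,
    HELPER_XDP_ADJUST_HEAD,
};

/* Simplified stand-in for the two per-program bits discussed above. */
struct prog_flags {
    bool cb_access;
    bool xdp_adjust_head;
};

/* Walk the helpers a program calls and derive its flags. */
static struct prog_flags derive_flags(const enum helper_id *calls, size_t n)
{
    struct prog_flags f = { false, false };

    for (size_t i = 0; i < n; i++) {
        if (calls[i] == HELPER_XDP_ADJUST_HEAD)
            f.xdp_adjust_head = true;
        if (calls[i] == HELPER_TAIL_CALL) {
            /* The callee is unknown at load time: assume the worst. */
            f.cb_access = true;
            f.xdp_adjust_head = true;
        }
    }
    return f;
}
```

A program that only issues a tail call ends up with both bits set, while a program using neither helper keeps both bits clear.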
      
      Fixes: 17bedab2 ("bpf: xdp: Allow head adjustment in XDP prog")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: fix cb access in socket filter programs on tail calls · 6b1bb01b
      Daniel Borkmann authored
      Commit ff936a04 ("bpf: fix cb access in socket filter programs")
      added a fix for socket filter programs such that in i) AF_PACKET the
      20 bytes of skb->cb[] area gets zeroed before use in order to not leak
      data, and ii) socket filter programs attached to TCP/UDP sockets need
      to save/restore these 20 bytes since they are also used by protocol
      layers at that time.
      
      The problem is that bpf_prog_run_save_cb() and bpf_prog_run_clear_cb()
      only look at the actual attached program to determine whether to zero
      or save/restore the skb->cb[] parts. There can be cases where the
      actual attached program does not access the skb->cb[], but the program
      tail calls into another program which does access this area. In such
      a case, the zero or save/restore is currently not performed.
      
      Since the programs we tail call into are unknown at verification time
      and can dynamically change, we need to assume that whenever the attached
      program performs a tail call, that later programs could access the
      skb->cb[], and therefore we need to always set cb_access to 1.
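To illustrate why the flag matters, here is a minimal user-space model of a save/restore wrapper keyed on cb_access. It is a sketch only: `struct fake_skb`, `struct fake_prog`, `run_save_cb()`, and `callee_scribbles_cb()` are hypothetical stand-ins, not the kernel's bpf_prog_run_save_cb().

```c
#include <stdbool.h>
#include <string.h>

#define CB_LEN 20 /* the 20 bytes of skb->cb[] mentioned above */

/* Hypothetical stand-ins for sk_buff and bpf_prog. */
struct fake_skb {
    unsigned char cb[CB_LEN];
};

struct fake_prog {
    bool cb_access;
    unsigned int (*run)(struct fake_skb *skb);
};

/* Models a tail-called program that writes to the cb area. */
static unsigned int callee_scribbles_cb(struct fake_skb *skb)
{
    memset(skb->cb, 0x5A, CB_LEN);
    return 1;
}

/* Save and zero cb before the run, restore it afterwards,
 * but only when the program is flagged as touching cb. */
static unsigned int run_save_cb(const struct fake_prog *prog,
                                struct fake_skb *skb)
{
    unsigned char saved[CB_LEN];
    unsigned int ret;

    if (prog->cb_access) {
        memcpy(saved, skb->cb, CB_LEN);
        memset(skb->cb, 0, CB_LEN);
    }
    ret = prog->run(skb);
    if (prog->cb_access)
        memcpy(skb->cb, saved, CB_LEN);
    return ret;
}
```

With cb_access cleared, the protocol layer's bytes are overwritten by the callee, which is the situation the commit avoids by setting the bit whenever a tail call is present.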
      
      Fixes: ff936a04 ("bpf: fix cb access in socket filter programs")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: drop non loopback packets claiming to originate from ::1 · 0aa8c13e
      Florian Westphal authored
      We lack a saddr check for ::1. This causes security issues, e.g. with ACLs
      permitting connections from ::1 on the assumption that these originate
      from the local machine.
      
      Assuming a source address of ::1 is local seems reasonable.
      RFC4291 doesn't allow such a source address either, so drop such packets.
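A minimal user-space sketch of the check, assuming the decision depends only on the source address and on whether the packet arrived via loopback. `should_drop()` and its `ingress_is_loopback` parameter are hypothetical; only `IN6_IS_ADDR_LOOPBACK` is a standard macro from <netinet/in.h>.

```c
#include <netinet/in.h>
#include <stdbool.h>

/* Drop a packet that claims ::1 as its source unless it really
 * arrived over a loopback path. Hypothetical helper for illustration. */
static bool should_drop(const struct in6_addr *saddr, bool ingress_is_loopback)
{
    return IN6_IS_ADDR_LOOPBACK(saddr) && !ingress_is_loopback;
}
```

For example, a source of ::1 seen on a non-loopback interface is dropped, while the same source on loopback, or any other source address, passes this particular check.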
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mediatek-tx-bugs' · 71947f0f
      David S. Miller authored
      Sean Wang says:
      
      ====================
      mediatek: Fix crash caused by reporting inconsistent skb->len to BQL
      
      Changes since v1:
      - fix an inconsistent enumeration that could easily cause a bug
      
      The series fixes a kernel BUG caused by an inconsistent SKB length being
      reported to BQL. The inconsistent length comes from a hardware bug that
      results in different port numbers being carried on the TXD within the
      lifecycle of an SKB. Patch 2) therefore tracks in software, rather than in
      hardware, which port the SKB belongs to. Patch 1) addresses another issue
      I found that causes an inconsistency between TXD and SKB that the initial
      logic does not expect, so it is also corrected in this series.
      
      The log for the kernel BUG caused by the issue is posted below.
      
      [  120.825955] kernel BUG at ... lib/dynamic_queue_limits.c:26!
      [  120.837684] Internal error: Oops - BUG: 0 [#1] SMP ARM
      [  120.842778] Modules linked in:
      [  120.845811] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-191576-gdbcef47 #35
      [  120.853488] Hardware name: Mediatek Cortex-A7 (Device Tree)
      [  120.859012] task: c1007480 task.stack: c1000000
      [  120.863510] PC is at dql_completed+0x108/0x17c
      [  120.867915] LR is at 0x46
      [  120.870512] pc : [<c03c19c8>]    lr : [<00000046>]    psr: 80000113
      [  120.870512] sp : c1001d58  ip : c1001d80  fp : c1001d7c
      [  120.881895] r10: 0000003e  r9 : df6b3400  r8 : 0ed86506
      [  120.887075] r7 : 00000001  r6 : 00000001  r5 : 0ed8654c  r4 : df0135d8
      [  120.893546] r3 : 00000001  r2 : df016800  r1 : 0000fece  r0 : df6b3480
      [  120.900018] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [  120.907093] Control: 10c5387d  Table: 9e27806a  DAC: 00000051
      [  120.912789] Process swapper/0 (pid: 0, stack limit = 0xc1000218)
      [  120.918744] Stack: (0xc1001d58 to 0xc1002000)
      
      ....
      
      [  121.085331] 1fc0: 00000000 c0a52a28 00000000 c10855d4 c1003c58 c0a52a24 c100885c 8000406a
      [  121.093444] 1fe0: 410fc073 00000000 00000000 c1001ff8 8000807c c0a009cc 00000000 00000000
      [  121.101575] [<c03c19c8>] (dql_completed) from [<c04cb010>] (mtk_napi_tx+0x1d0/0x37c)
      [  121.109263] [<c04cb010>] (mtk_napi_tx) from [<c05e28cc>] (net_rx_action+0x24c/0x3b8)
      [  121.116951] [<c05e28cc>] (net_rx_action) from [<c010152c>] (__do_softirq+0xe4/0x35c)
      [  121.124638] [<c010152c>] (__do_softirq) from [<c012a624>] (irq_exit+0xe8/0x150)
      [  121.131895] [<c012a624>] (irq_exit) from [<c017750c>] (__handle_domain_irq+0x70/0xc4)
      [  121.139666] [<c017750c>] (__handle_domain_irq) from [<c0101404>] (gic_handle_irq+0x58/0x9c)
      [  121.147953] [<c0101404>] (gic_handle_irq) from [<c010e18c>] (__irq_svc+0x6c/0x90)
      [  121.155373] Exception stack(0xc1001ef8 to 0xc1001f40)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: fix inconsistency of port number carried in TXD · 134d2152
      Sean Wang authored
      Fix a port inconsistency on the TXD due to a hardware bug that causes
      different port numbers to be carried on the same TXD between tx_map()
      and tx_unmap() during an iperf test. This confuses the BQL logic and
      leads to a kernel panic when both GMACs run concurrently.
      Signed-off-by: Sean Wang <sean.wang@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: fix inconsistency between TXD and the used buffer · 81d2dd09
      Sean Wang authored
      Fix inconsistency between the TXD descriptor and the used buffer that
      would cause unexpected logic at mtk_tx_unmap() during skb housekeeping.
      Signed-off-by: Sean Wang <sean.wang@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: micrel: fix crash when statistic requested for KSZ9031 phy · bfe72442
      Grygorii Strashko authored
      Currently, the command:
      	ethtool --phy-statistics eth0
      causes a system crash with the message "Unable to handle kernel NULL pointer
      dereference at virtual address 00000010" from:
      
       (kszphy_get_stats) from [<c069f1d8>] (ethtool_get_phy_stats+0xd8/0x210)
       (ethtool_get_phy_stats) from [<c06a0738>] (dev_ethtool+0x5b8/0x228c)
       (dev_ethtool) from [<c06b5484>] (dev_ioctl+0x3fc/0x964)
       (dev_ioctl) from [<c0679f7c>] (sock_ioctl+0x170/0x2c0)
       (sock_ioctl) from [<c02419d4>] (do_vfs_ioctl+0xa8/0x95c)
       (do_vfs_ioctl) from [<c02422c4>] (SyS_ioctl+0x3c/0x64)
       (SyS_ioctl) from [<c0107d60>] (ret_fast_syscall+0x0/0x44)
      
      The reason: the phy_driver structure for the KSZ9031 phy has no .probe()
      callback defined. As a result, the struct phy_device *phydev->priv pointer
      is never initialized (it stays NULL).
      This issue also affects the following phys:
       KSZ8795, KSZ886X, KSZ8873MLL, KSZ9031, KSZ9021, KSZ8061, KS8737
      
      Fix it by:
      - adding the .probe() = kszphy_probe() callback to the KSZ9031 and KSZ9021
      phys. kszphy_probe() can be reused as it doesn't do any phy-specific
      settings.
      - removing the statistic callbacks from the other phys (KSZ8795, KSZ886X,
      KSZ8873MLL, KSZ8061, KS8737) as they don't have corresponding
      statistic counters.
      
      Fixes: 2b2427d0 ("phy: micrel: Add ethtool statistics counters")
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: vrf: Fix setting NLM_F_EXCL flag when adding l3mdev rule · 426c87ca
      David Ahern authored
      Only need 1 l3mdev FIB rule. Fix setting NLM_F_EXCL in the nlmsghdr.
      
      Fixes: 1aa6c4f6 ("net: vrf: Add l3mdev rules on first device create")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: thunderx: Fix set_max_bgx_per_node for 81xx rgx · b47a57a2
      George Cherian authored
      Add the PCI_SUBSYS_DEVID_81XX_RGX and use the same to set
      the max bgx per node count.
      
      This fixes the issue introduced by the following commit:
      78aacb6f net: thunderx: Fix invalid mac addresses for node1 interfaces
      With that commit, max_bgx_per_node for 81xx is set to 2 instead of 3,
      because of which num_vfs is always calculated as zero.
      Signed-off-by: George Cherian <george.cherian@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-timestamp: avoid use-after-free in ip_recv_error · 1862d620
      Willem de Bruijn authored
      Syzkaller reported a use-after-free in ip_recv_error at line
      
          info->ipi_ifindex = skb->dev->ifindex;
      
      This function is called on dequeue from the error queue, at which
      point the device pointer may no longer be valid.
      
      Save ifindex on enqueue in __skb_complete_tx_timestamp, when the
      pointer is valid or NULL. Store it in temporary storage skb->cb.
      
      It is safe to reference skb->dev here, as called from device drivers
      or dev_queue_xmit. The exception is when called from tcp_ack_tstamp;
      in that case it is NULL and ifindex is set to 0 (invalid).
      
      Do not return a pktinfo cmsg if ifindex is 0. This maintains the
      current behavior of not returning a cmsg if skb->dev was NULL.
      
      On dequeue, the ipv4 path will cast from sock_exterr_skb to
      in_pktinfo. Both have ifindex as their first element, so no explicit
      conversion is needed. This is by design, introduced in commit
      0b922b7a ("net: original ingress device index in PKTINFO"). For
      ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo.
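The enqueue/dequeue split described above can be modelled in a few lines of user-space C. This is a sketch under assumptions: `struct fake_dev`, `struct fake_skb`, `on_enqueue()`, and `on_dequeue()` are hypothetical names, and `saved_ifindex` stands in for the value the commit stores in skb->cb.

```c
#include <stddef.h>

/* Hypothetical stand-ins for net_device and sk_buff. */
struct fake_dev {
    int ifindex;
};

struct fake_skb {
    struct fake_dev *dev;
    int saved_ifindex; /* models the copy kept in skb->cb */
};

/* On enqueue the device pointer is still valid or NULL: copy the index. */
static void on_enqueue(struct fake_skb *skb)
{
    skb->saved_ifindex = skb->dev ? skb->dev->ifindex : 0;
}

/* On dequeue never touch skb->dev; report pktinfo only for a nonzero
 * saved index. Returns 1 if a cmsg would be emitted, 0 otherwise. */
static int on_dequeue(const struct fake_skb *skb, int *ifindex_out)
{
    if (skb->saved_ifindex == 0)
        return 0;
    *ifindex_out = skb->saved_ifindex;
    return 1;
}
```

Because the dequeue path reads only the saved copy, it stays correct even if the device has gone away between enqueue and dequeue.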
      
      Fixes: 829ae9d6 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fix a deadlock in ip_ra_control · 1215e51e
      WANG Cong authored
      Similar to commit 87e9f031
      ("ipv4: fix a potential deadlock in mcast getsockopt() path"),
      there is a deadlock scenario for IP_ROUTER_ALERT too:
      
             CPU0                    CPU1
             ----                    ----
        lock(rtnl_mutex);
                                     lock(sk_lock-AF_INET);
                                     lock(rtnl_mutex);
        lock(sk_lock-AF_INET);
      
      Fix this by always locking RTNL first on all setsockopt() paths.
      
      Note, after this patch ip_ra_lock is no longer needed either.
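The lock-ordering rule can be illustrated with POSIX mutexes. This is an analogy rather than kernel code: the two `pthread_mutex_t` objects only model rtnl_mutex and the socket lock, and `set_router_alert()` is a hypothetical function. Every path takes the locks in the same order, so the ABBA pattern shown in the diagram cannot occur.

```c
#include <pthread.h>

/* User-space models of the two locks from the diagram above. */
static pthread_mutex_t rtnl_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sk_lock = PTHREAD_MUTEX_INITIALIZER;

static int router_alert_value;

/* Hypothetical setsockopt-like path: always RTNL first, then the
 * socket lock, released in reverse order. */
static void set_router_alert(int value)
{
    pthread_mutex_lock(&rtnl_mutex);
    pthread_mutex_lock(&sk_lock);
    router_alert_value = value;
    pthread_mutex_unlock(&sk_lock);
    pthread_mutex_unlock(&rtnl_mutex);
}
```

If some other path instead took sk_lock before rtnl_mutex, two threads could each hold one lock while waiting for the other, which is the deadlock the commit removes.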
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Tested-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: limit the number of receive queues · 271a8b42
      Bert Kenward authored
      The number of rx queues is determined by the rss_cpus parameter
      or the cpu topology. If that is higher than EFX_MAX_RX_QUEUES the
      driver can corrupt state.
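A minimal sketch of the clamp, assuming the queue count comes either from a module parameter or from the number of CPUs. `MAX_RX_QUEUES` is an illustrative stand-in for EFX_MAX_RX_QUEUES (the value 32 is an assumption for this example), and `pick_rx_queue_count()` is a hypothetical helper.

```c
/* Illustrative limit; stands in for EFX_MAX_RX_QUEUES. */
#define MAX_RX_QUEUES 32u

/* Use the rss_cpus parameter if set, else fall back to the CPU count,
 * and never exceed the size of the driver's fixed queue arrays. */
static unsigned int pick_rx_queue_count(unsigned int rss_cpus,
                                        unsigned int online_cpus)
{
    unsigned int count = rss_cpus ? rss_cpus : online_cpus;

    if (count > MAX_RX_QUEUES)
        count = MAX_RX_QUEUES;
    return count;
}
```

With this limit, a request for 64 queues yields 32 here, so any array sized by the maximum is never indexed out of bounds.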
      
      Fixes: 8ceee660 ("New driver "sfc" for Solarstorm SFC4000 controller.")
      Signed-off-by: Bert Kenward <bkenward@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 15 Apr, 2017 3 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 1bf4b126
      Linus Torvalds authored
      Pull input fixes from Dmitry Torokhov:
       "Just a small update to xpad driver to recognize yet another gamepad,
        and another change making sure userio.h is exported"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: xpad - add support for Razer Wildcat gamepad
        uapi: add missing install of userio.h
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7e703ecc
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Things seem to be settling down as far as networking is concerned,
        let's hope this trend continues...
      
         1) Add iov_iter_revert() and use it to fix the behavior of
            skb_copy_datagram_msg() et al., from Al Viro.
      
         2) Fix the protocol used in the synthetic SKB we cons up for the
            purposes of doing a simulated route lookup for RTM_GETROUTE
            requests. From Florian Larysch.
      
         3) Don't add noop_qdisc to the per-device qdisc hashes, from Cong
            Wang.
      
         4) Don't call netdev_change_features with the team lock held, from
            Xin Long.
      
         5) Revert TCP F-RTO extension to catch more spurious timeouts because
            it interacts very badly with some middle-boxes. From Yuchung
            Cheng.
      
         6) Fix the loss of error values in l2tp {s,g}etsockopt calls, from
            Guillaume Nault.
      
         7) ctnetlink uses bit positions where it should be using bit masks,
            fix from Liping Zhang.
      
         8) Missing RCU locking in netfilter helper code, from Gao Feng.
      
         9) Avoid double frees and use-after-frees in tcp_disconnect(), from
            Eric Dumazet.
      
        10) Don't do a changelink before we register the netdevice in
            bridging, from Ido Schimmel.
      
        11) Lock the ipv6 device address list properly, from Rabin Vincent"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
        netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage
        netfilter: nft_hash: do not dump the auto generated seed
        drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201
        ipv6: Fix idev->addr_list corruption
        net: xdp: don't export dev_change_xdp_fd()
        bridge: netlink: register netdevice before executing changelink
        bridge: implement missing ndo_uninit()
        bpf: reference may_access_skb() from __bpf_prog_run()
        tcp: clear saved_syn in tcp_disconnect()
        netfilter: nf_ct_expect: use proper RCU list traversal/update APIs
        netfilter: ctnetlink: skip dumping expect when nfct_help(ct) is NULL
        netfilter: make it safer during the inet6_dev->addr_list traversal
        netfilter: ctnetlink: make it safer when checking the ct helper name
        netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_find
        netfilter: ctnetlink: using bit to represent the ct event
        netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
        net: tcp: Increase TCP_MIB_OUTRSTS even though fail to alloc skb
        l2tp: don't mask errors in pppol2tp_getsockopt()
        l2tp: don't mask errors in pppol2tp_setsockopt()
        tcp: restrict F-RTO to work-around broken middle-boxes
        ...
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 91174391
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A set of small fixes for x86:
      
         - fix locking in RDT to prevent memory leaks and freeing in use
           memory
      
         - prevent setting invalid values for vdso32_enabled which cause
           inconsistencies for user space resulting in application crashes.
      
         - plug a race in the vdso32 code between fork and sysctl which causes
           inconsistencies for user space resulting in application crashes.
      
         - make MPX signal delivery work in compat mode
      
         - make the dmesg output of traps and faults readable again"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/intel_rdt: Fix locking in rdtgroup_schemata_write()
        x86/debug: Fix the printk() debug output of signal_fault(), do_trap() and do_general_protection()
        x86/vdso: Plug race between mapping and ELF header setup
        x86/vdso: Ensure vdso32_enabled gets set to valid values only
        x86/signals: Fix lower/upper bound reporting in compat siginfo
  3. 14 Apr, 2017 25 commits