1. 31 Mar, 2018 28 commits
    • net: hns: Fix a skb used after free bug · a8f4be01
      Yunsheng Lin authored
      commit 27463ad9 upstream.
      
      The skb may be freed in hns_nic_net_xmit_hw(), which still returns
      NETDEV_TX_OK, causing hns_nic_net_xmit() to use a freed skb.
      
      BUG: KASAN: use-after-free in hns_nic_net_xmit_hw+0x62c/0x940...
      	[17659.112635]      alloc_debug_processing+0x18c/0x1a0
      	[17659.117208]      __slab_alloc+0x52c/0x560
      	[17659.120909]      kmem_cache_alloc_node+0xac/0x2c0
      	[17659.125309]      __alloc_skb+0x6c/0x260
      	[17659.128837]      tcp_send_ack+0x8c/0x280
      	[17659.132449]      __tcp_ack_snd_check+0x9c/0xf0
      	[17659.136587]      tcp_rcv_established+0x5a4/0xa70
      	[17659.140899]      tcp_v4_do_rcv+0x27c/0x620
      	[17659.144687]      tcp_prequeue_process+0x108/0x170
      	[17659.149085]      tcp_recvmsg+0x940/0x1020
      	[17659.152787]      inet_recvmsg+0x124/0x180
      	[17659.156488]      sock_recvmsg+0x64/0x80
      	[17659.160012]      SyS_recvfrom+0xd8/0x180
      	[17659.163626]      __sys_trace_return+0x0/0x4
      	[17659.167506] INFO: Freed in kfree_skbmem+0xa0/0xb0 age=23 cpu=1 pid=13
      	[17659.174000]      free_debug_processing+0x1d4/0x2c0
      	[17659.178486]      __slab_free+0x240/0x390
      	[17659.182100]      kmem_cache_free+0x24c/0x270
      	[17659.186062]      kfree_skbmem+0xa0/0xb0
      	[17659.189587]      __kfree_skb+0x28/0x40
      	[17659.193025]      napi_gro_receive+0x168/0x1c0
      	[17659.197074]      hns_nic_rx_up_pro+0x58/0x90
      	[17659.201038]      hns_nic_rx_poll_one+0x518/0xbc0
      	[17659.205352]      hns_nic_common_poll+0x94/0x140
      	[17659.209576]      net_rx_action+0x458/0x5e0
      	[17659.213363]      __do_softirq+0x1b8/0x480
      	[17659.217062]      run_ksoftirqd+0x64/0x80
      	[17659.220679]      smpboot_thread_fn+0x224/0x310
      	[17659.224821]      kthread+0x150/0x170
      	[17659.228084]      ret_from_fork+0x10/0x40
      
      	BUG: KASAN: use-after-free in hns_nic_net_xmit+0x8c/0xc0...
      	[17751.080490]      __slab_alloc+0x52c/0x560
      	[17751.084188]      kmem_cache_alloc+0x244/0x280
      	[17751.088238]      __build_skb+0x40/0x150
      	[17751.091764]      build_skb+0x28/0x100
      	[17751.095115]      __alloc_rx_skb+0x94/0x150
      	[17751.098900]      __napi_alloc_skb+0x34/0x90
      	[17751.102776]      hns_nic_rx_poll_one+0x180/0xbc0
      	[17751.107097]      hns_nic_common_poll+0x94/0x140
      	[17751.111333]      net_rx_action+0x458/0x5e0
      	[17751.115123]      __do_softirq+0x1b8/0x480
      	[17751.118823]      run_ksoftirqd+0x64/0x80
      	[17751.122437]      smpboot_thread_fn+0x224/0x310
      	[17751.126575]      kthread+0x150/0x170
      	[17751.129838]      ret_from_fork+0x10/0x40
      	[17751.133454] INFO: Freed in kfree_skbmem+0xa0/0xb0 age=19 cpu=7 pid=43
      	[17751.139951]      free_debug_processing+0x1d4/0x2c0
      	[17751.144436]      __slab_free+0x240/0x390
      	[17751.148051]      kmem_cache_free+0x24c/0x270
      	[17751.152014]      kfree_skbmem+0xa0/0xb0
      	[17751.155543]      __kfree_skb+0x28/0x40
      	[17751.159022]      napi_gro_receive+0x168/0x1c0
      	[17751.163074]      hns_nic_rx_up_pro+0x58/0x90
      	[17751.167041]      hns_nic_rx_poll_one+0x518/0xbc0
      	[17751.171358]      hns_nic_common_poll+0x94/0x140
      	[17751.175585]      net_rx_action+0x458/0x5e0
      	[17751.179373]      __do_softirq+0x1b8/0x480
      	[17751.183076]      run_ksoftirqd+0x64/0x80
      	[17751.186691]      smpboot_thread_fn+0x224/0x310
      	[17751.190826]      kthread+0x150/0x170
      	[17751.194093]      ret_from_fork+0x10/0x40
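
The ownership problem behind these reports can be illustrated with a minimal userspace sketch. It is not the driver code: `struct pkt`, `struct tx_stats`, `xmit_hw()` and `xmit()` are hypothetical stand-ins, and moving the accounting into the low-level function is only one way to avoid touching the buffer after it may have been freed.

```c
#include <assert.h>
#include <stdlib.h>

enum { TX_OK = 0 };

struct pkt { size_t len; };                        /* stand-in for struct sk_buff */
struct tx_stats { size_t bytes; size_t packets; }; /* hypothetical counters */

/* Hypothetical low-level transmit. It owns p from here on: the buffer is
 * released whether the packet is sent or dropped, and TX_OK is returned
 * in both cases. */
static int xmit_hw(struct pkt *p, struct tx_stats *st, int drop)
{
    assert(p != NULL && st != NULL);
    if (!drop) {
        st->bytes += p->len;   /* account while p is still valid */
        st->packets++;
    }
    free(p);
    return TX_OK;
}

/* The caller never reads p after handing it over. */
static int xmit(struct pkt *p, struct tx_stats *st, int drop)
{
    return xmit_hw(p, st, drop);
}
```

Reading `p->len` in `xmit()` after the call would be the userspace analogue of the use-after-free reported above.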
      
      Fixes: 13ac695e ("net:hns: Add support of Hip06 SoC to the Hislicon Network Subsystem")
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: lipeng <lipeng321@huawei.com>
      Reported-by: Jun He <hjat2005@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Erick Reyes <erickreyes@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kcm: lock lower socket in kcm_attach · 406996f3
      Tom Herbert authored
      
      [ Upstream commit 2cc683e8 ]
      
      Need to lock lower socket in order to provide mutual exclusion
      with kcm_unattach.
      
      v2: Add Reported-by for syzbot
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Reported-by: syzbot+ea75c0ffcd353d32515f064aaebefc5279e6161e@syzkaller.appspotmail.com
      Signed-off-by: Tom Herbert <tom@quantonium.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: systemport: Rewrite __bcm_sysport_tx_reclaim() · 002f4557
      Florian Fainelli authored
      
      [ Upstream commit 484d802d ]
      
      There is no need for complex checking between the last consumed index
      and current consumed index, a simple subtraction will do.
      
      This also eliminates the possibility of a permanent transmit queue stall
      under the following conditions:
      
      - one CPU bursts ring->size worth of traffic (up to 256 buffers), to the
        point where we run out of free descriptors, so we stop the transmit
        queue at the end of bcm_sysport_xmit()
      
      - because of our locking, we have the transmit process disable
        interrupts which means we can be blocking the TX reclamation process
      
      - when TX reclamation finally runs, we will be computing the difference
        between ring->c_index (last consumed index by SW) and what the HW
        reports through its register
      
      - this register is masked with (ring->size - 1) = 0xff, which will lead
        to stripping the upper bits of the index (register is 16-bits wide)
      
      - we will be computing last_tx_cn as 0, which means there is no work to
        be done, and we never wake-up the transmit queue, leaving it
        permanently disabled
      
      A practical example is e.g: ring->c_index aka last_c_index = 12, we
      pushed 256 entries, HW consumer index = 268, we mask it with 0xff = 12,
      so last_tx_cn == 0, nothing happens.
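
The arithmetic of this example can be checked with a small standalone sketch. The function names are hypothetical and the code is not taken from the driver; `pending_masked()` mimics the masked comparison described in the list above, while `pending_subtract()` shows what a plain subtraction on the 16-bit index yields.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256u /* ring->size from the example above */

/* Masks the hardware index before comparing, as in the failing scenario. */
static unsigned int pending_masked(uint16_t hw_c_index, uint16_t last_c_index)
{
    unsigned int hw = hw_c_index & (RING_SIZE - 1);
    unsigned int last = last_c_index & (RING_SIZE - 1);

    if (hw >= last)
        return hw - last;
    return RING_SIZE - last + hw;
}

/* Plain 16-bit subtraction: wrap-around is handled by unsigned arithmetic. */
static unsigned int pending_subtract(uint16_t hw_c_index, uint16_t last_c_index)
{
    return (uint16_t)(hw_c_index - last_c_index);
}

/* Numbers from the commit message: last index 12, 256 entries pushed. */
void check_commit_example(void)
{
    assert(pending_masked(268, 12) == 0);     /* looks like no work: stall */
    assert(pending_subtract(268, 12) == 256); /* all 256 entries reclaimable */
}
```

The cast to `uint16_t` also keeps the result correct when the hardware counter wraps past 0xffff.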
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/qeth: on channel error, reject further cmd requests · 4751804d
      Julian Wiedmann authored
      
      [ Upstream commit a6c3d939 ]
      
      When the IRQ handler determines that one of the cmd IO channels has
      failed and schedules recovery, block any further cmd requests from
      being submitted. The request would inevitably stall, and prevent the
      recovery from making progress until the request times out.
      
      This sort of error was observed after Live Guest Relocation, where
      the pending IO on the READ channel intentionally gets terminated to
      kick-start recovery. Simultaneously the guest executed SIOCETHTOOL,
      triggering qeth to issue a QUERY CARD INFO command. The command
      then stalled in the inoperable WRITE channel.

      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/qeth: lock read device while queueing next buffer · 3426c365
      Julian Wiedmann authored
      
      [ Upstream commit 17bf8c9b ]
      
      For calling ccw_device_start(), issue_next_read() needs to hold the
      device's ccwlock.
      This is satisfied for the IRQ handler path (where qeth_irq() gets called
      under the ccwlock), but we need explicit locking for the initial call by
      the MPC initialization.

      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/qeth: when thread completes, wake up all waiters · 9ff2636b
      Julian Wiedmann authored
      
      [ Upstream commit 1063e432 ]
      
      qeth_wait_for_threads() is potentially called by multiple users, make
      sure to notify all of them after qeth_clear_thread_running_bit()
      adjusted the thread_running_mask. With no timeout, callers would
      otherwise stall.

      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/qeth: free netdevice when removing a card · 4593b4c0
      Julian Wiedmann authored
      
      [ Upstream commit 6be68739 ]
      
      On removal, a qeth card's netdevice is currently not properly freed
      because the call chain looks as follows:
      
      qeth_core_remove_device(card)
      	lx_remove_device(card)
      		unregister_netdev(card->dev)
      		card->dev = NULL			!!!
      	qeth_core_free_card(card)
      		if (card->dev)				!!!
      			free_netdev(card->dev)
      
      Fix it by free'ing the netdev straight after unregistering. This also
      fixes the sysfs-driven layer switch case (qeth_dev_layer2_store()),
      where the need to free the current netdevice was not considered at all.
      
      Note that free_netdev() takes care of the netif_napi_del() for us too.
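
The ordering described above can be sketched in userspace as follows. All types and functions here are hypothetical stand-ins (this is not the qeth code); the counter only exists to show that the device is released exactly once.

```c
#include <assert.h>
#include <stdlib.h>

struct netdev { int registered; };
struct card { struct netdev *dev; };

static int netdev_frees; /* counts calls to the free_netdev() stand-in */

static void free_netdev_sim(struct netdev *dev)
{
    free(dev);
    netdev_frees++;
}

/* Fixed ordering: unregister, free, then clear the pointer. */
static void remove_device(struct card *card)
{
    assert(card != NULL && card->dev != NULL);
    card->dev->registered = 0;   /* unregister_netdev() stand-in */
    free_netdev_sim(card->dev);
    card->dev = NULL;
}

/* card->dev is NULL by now, so this check no longer hides a leak. */
static void free_card(struct card *card)
{
    if (card->dev)
        free_netdev_sim(card->dev);
    free(card);
}
```

Without the `free_netdev_sim()` call in `remove_device()`, the pointer would be cleared first and `free_card()` would skip the release, which is the leak the commit describes.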
      
      Fixes: 4a71df50 ("qeth: new qeth device driver")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • soc/fsl/qbman: fix issue in qman_delete_cgr_safe() · a85c525b
      Madalin Bucur authored
      
      [ Upstream commit 96f413f4 ]
      
      The wait_for_completion() call in qman_delete_cgr_safe()
      was triggering a "scheduling while atomic" bug. Replace the
      kthread with a smp_call_function_single() call to fix it.

      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • team: Fix double free in error path · b1403114
      Arkadi Sharshevsky authored
      
      [ Upstream commit cbcc607e ]
      
      The __send_and_alloc_skb() receives a skb ptr as a parameter but in
      case it fails the skb is not valid:
      - Send failed and released the skb internally.
      - Allocation failed.
      
      The current code tries to release the skb in case of failure which
      causes redundant freeing.
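
A minimal userspace sketch of this ownership rule, under stated assumptions: `send_and_alloc()` is a hypothetical analogue of `__send_and_alloc_skb()`, and clearing the caller's pointer on failure is an illustration choice, not something the commit message states.

```c
#include <assert.h>
#include <stdlib.h>

struct buf { int payload; };

static int free_calls; /* counts releases so a double free would show up */

static void buf_free(struct buf *b)
{
    free(b);
    free_calls++;
}

/* Sends *pb, then allocates a fresh buffer into *pb. The send consumes the
 * old buffer whether it succeeds or not, so on failure *pb is left NULL. */
static int send_and_alloc(struct buf **pb, int send_fails)
{
    assert(pb != NULL && *pb != NULL);
    buf_free(*pb);
    *pb = NULL;
    if (send_fails)
        return -1;
    *pb = calloc(1, sizeof(**pb));
    return *pb ? 0 : -1;
}

static int send_options(int send_fails)
{
    struct buf *b = calloc(1, sizeof(*b));
    int err;

    if (!b)
        return -1;
    err = send_and_alloc(&b, send_fails);
    if (err)
        return err;   /* no buf_free(b) here: the buffer is already gone */
    buf_free(b);
    return 0;
}
```

Calling `buf_free(b)` in the error branch would be the redundant release described above.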
      
      Fixes: 9b00cf2d ("team: implement multipart netlink messages for options transfers")
      Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • skbuff: Fix not waking applications when errors are enqueued · d5862b05
      Vinicius Costa Gomes authored
      
      [ Upstream commit 6e5d58fd ]
      
      When errors are enqueued to the error queue via sock_queue_err_skb()
      function, it is possible that the waiting application is not notified.
      
      Calling 'sk->sk_data_ready()' would not notify applications that
      selected only POLLERR events in poll() (for example).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Randy E. Witt <randy.e.witt@intel.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: Only honor ifindex in IP_PKTINFO if non-0 · 5f02dcec
      David Ahern authored
      
      [ Upstream commit 2cbb4ea7 ]
      
      Only allow ifindex from IP_PKTINFO to override SO_BINDTODEVICE settings
      if the index is actually set in the message.
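
The rule can be sketched with two hypothetical stand-in structures: `struct pktinfo_sim` for the IP_PKTINFO payload and `struct ipcm_sim` for the per-message output settings, whose `oif` field is assumed to start out as the SO_BINDTODEVICE index. This is an illustration, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct pktinfo_sim { int ifindex; }; /* stand-in for the IP_PKTINFO payload */
struct ipcm_sim { int oif; };        /* stand-in for per-message settings */

/* Override the output interface only when the message names one. */
static void apply_pktinfo(struct ipcm_sim *ipc, const struct pktinfo_sim *info)
{
    assert(ipc != NULL && info != NULL);
    if (info->ifindex)
        ipc->oif = info->ifindex;
}
```

With an index of 0 in the message, the bound device index in `oif` is left untouched.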

      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netlink: avoid a double skb free in genlmsg_mcast() · 455fc99c
      Nicolas Dichtel authored
      
      [ Upstream commit 02a2385f ]
      
      nlmsg_multicast() always consumes the skb, thus the original skb must be
      freed only when this function is called with a clone.
      
      Fixes: cb9f7a9a ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
      Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/iucv: Free memory obtained by kzalloc · 7d91487e
      Arvind Yadav authored
      
      [ Upstream commit fa6a91e9 ]
      
      Free memory by calling put_device(), if afiucv_iucv_init is not
      successful.

      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: fec: Fix unbalanced PM runtime calls · c3860a38
      Florian Fainelli authored
      
      [ Upstream commit a069215c ]
      
      When unbinding/removing the driver, we will run into the following warnings:
      
      [  259.655198] fec 400d1000.ethernet: 400d1000.ethernet supply phy not found, using dummy regulator
      [  259.665065] fec 400d1000.ethernet: Unbalanced pm_runtime_enable!
      [  259.672770] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00
      [  259.683062] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1
      [  259.696239] libphy: fec_enet_mii_bus: probed
      
      Avoid these warnings by balancing the runtime PM calls during fec_drv_remove().

      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY interface · 1f7a3957
      SZ Lin (林上智) authored
      
      [ Upstream commit f9db5069 ]
      
      According to AM335x TRM[1] 14.3.6.2, AM437x TRM[2] 15.3.6.2 and
      DRA7 TRM[3] 24.11.4.8.7.3.3, in-band mode in EXT_EN(bit18) register is only
      available when PHY is configured in RGMII mode with 10Mbps speed. It will
      cause some networking issues without RGMII mode, such as carrier sense
      errors and low throughput. TI also mentioned this issue in their forum[4].
      
      This patch adds the check mechanism for PHY interface with RGMII interface
      type, the in-band mode can only be set in RGMII mode with 10Mbps speed.
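
The condition can be sketched as a small pure function. The enum and function names are hypothetical and no real register is written; only the bit position (18) and the "RGMII at 10Mbps" rule come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define EXT_EN (UINT32_C(1) << 18) /* in-band mode bit named in the text */

enum phy_iface { IFACE_MII, IFACE_RMII, IFACE_RGMII };

/* Returns the in-band bits to OR into the MAC control value. */
static uint32_t in_band_bits(enum phy_iface iface, int speed_mbps)
{
    assert(speed_mbps == 10 || speed_mbps == 100 || speed_mbps == 1000);
    if (iface == IFACE_RGMII && speed_mbps == 10)
        return EXT_EN;
    return 0;
}
```

A 10Mbps link on a non-RGMII interface yields 0 here, which is the case the added check is meant to cover.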
      
      References:
      [1]: https://www.ti.com/lit/ug/spruh73p/spruh73p.pdf
      [2]: http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf
      [3]: http://www.ti.com/lit/ug/spruic2b/spruic2b.pdf
      [4]: https://e2e.ti.com/support/arm/sitara_arm/f/791/p/640765/2392155

      Suggested-by: Holsety Chen (陳憲輝) <Holsety.Chen@moxa.com>
      Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
      Signed-off-by: Schuyler Patton <spatton@ti.com>
      Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ethernet: arc: Fix a potential memory leak if an optional regulator is deferred · e9f83a8b
      Christophe JAILLET authored
      
      [ Upstream commit 00777fac ]
      
      If the optional regulator is deferred, we must release some resources.
      They will be re-allocated when the probe function is called again.
      
      Fixes: 6eacf311 ("ethernet: arc: Add support for Rockchip SoC layer device tree bindings")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • l2tp: do not accept arbitrary sockets · 84fc2d7c
      Eric Dumazet authored
      
      [ Upstream commit 17cfe79a ]
      
      syzkaller found an issue caused by lack of sufficient checks
      in l2tp_tunnel_create()
      
      RAW sockets can not be considered as UDP ones for instance.
      
      In another patch, we shall replace all pr_err() by less intrusive
      pr_debug() so that syzkaller can find other bugs faster.
      Acked-by: Guillaume Nault <g.nault@alphalink.fr>
      Acked-by: James Chapman <jchapman@katalix.com>
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
      dst_release: dst:00000000d53d0d0f refcnt:-1
      Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242
      
      CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x24d lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report+0x23b/0x360 mm/kasan/report.c:412
       __asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435
       setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
       l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596
       pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707
       SYSC_connect+0x213/0x4a0 net/socket.c:1640
       SyS_connect+0x24/0x30 net/socket.c:1621
       do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option() · c5e6439e
      Lorenzo Bianconi authored
      
      [ Upstream commit 9f62c15f ]
      
      Fix the following slab-out-of-bounds kasan report in
      ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
      linear and the accessed data are not in the linear data region of orig_skb.
      
      [ 1503.122508] ==================================================================
      [ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990
      [ 1503.123036] Read of size 1184 at addr ffff8800298ab6b0 by task netperf/1932
      
      [ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124
      [ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
      [ 1503.123527] Call Trace:
      [ 1503.123579]  <IRQ>
      [ 1503.123638]  print_address_description+0x6e/0x280
      [ 1503.123849]  kasan_report+0x233/0x350
      [ 1503.123946]  memcpy+0x1f/0x50
      [ 1503.124037]  ndisc_send_redirect+0x94e/0x990
      [ 1503.125150]  ip6_forward+0x1242/0x13b0
      [...]
      [ 1503.153890] Allocated by task 1932:
      [ 1503.153982]  kasan_kmalloc+0x9f/0xd0
      [ 1503.154074]  __kmalloc_track_caller+0xb5/0x160
      [ 1503.154198]  __kmalloc_reserve.isra.41+0x24/0x70
      [ 1503.154324]  __alloc_skb+0x130/0x3e0
      [ 1503.154415]  sctp_packet_transmit+0x21a/0x1810
      [ 1503.154533]  sctp_outq_flush+0xc14/0x1db0
      [ 1503.154624]  sctp_do_sm+0x34e/0x2740
      [ 1503.154715]  sctp_primitive_SEND+0x57/0x70
      [ 1503.154807]  sctp_sendmsg+0xaa6/0x1b10
      [ 1503.154897]  sock_sendmsg+0x68/0x80
      [ 1503.154987]  ___sys_sendmsg+0x431/0x4b0
      [ 1503.155078]  __sys_sendmsg+0xa4/0x130
      [ 1503.155168]  do_syscall_64+0x171/0x3f0
      [ 1503.155259]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [ 1503.155436] Freed by task 1932:
      [ 1503.155527]  __kasan_slab_free+0x134/0x180
      [ 1503.155618]  kfree+0xbc/0x180
      [ 1503.155709]  skb_release_data+0x27f/0x2c0
      [ 1503.155800]  consume_skb+0x94/0xe0
      [ 1503.155889]  sctp_chunk_put+0x1aa/0x1f0
      [ 1503.155979]  sctp_inq_pop+0x2f8/0x6e0
      [ 1503.156070]  sctp_assoc_bh_rcv+0x6a/0x230
      [ 1503.156164]  sctp_inq_push+0x117/0x150
      [ 1503.156255]  sctp_backlog_rcv+0xdf/0x4a0
      [ 1503.156346]  __release_sock+0x142/0x250
      [ 1503.156436]  release_sock+0x80/0x180
      [ 1503.156526]  sctp_sendmsg+0xbb0/0x1b10
      [ 1503.156617]  sock_sendmsg+0x68/0x80
      [ 1503.156708]  ___sys_sendmsg+0x431/0x4b0
      [ 1503.156799]  __sys_sendmsg+0xa4/0x130
      [ 1503.156889]  do_syscall_64+0x171/0x3f0
      [ 1503.156980]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [ 1503.157158] The buggy address belongs to the object at ffff8800298ab600
                      which belongs to the cache kmalloc-1024 of size 1024
      [ 1503.157444] The buggy address is located 176 bytes inside of
                      1024-byte region [ffff8800298ab600, ffff8800298aba00)
      [ 1503.157702] The buggy address belongs to the page:
      [ 1503.157820] page:ffffea0000a62a00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
      [ 1503.158053] flags: 0x4000000000008100(slab|head)
      [ 1503.158171] raw: 4000000000008100 0000000000000000 0000000000000000 00000001800e000e
      [ 1503.158350] raw: dead000000000100 dead000000000200 ffff880036002600 0000000000000000
      [ 1503.158523] page dumped because: kasan: bad access detected
      
      [ 1503.158698] Memory state around the buggy address:
      [ 1503.158816]  ffff8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1503.158988]  ffff8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1503.159165] >ffff8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1503.159338]                    ^
      [ 1503.159436]  ffff8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1503.159610]  ffff8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1503.159785] ==================================================================
      [ 1503.159964] Disabling lock debugging due to kernel taint
      
      The test scenario to trigger the issue consists of 4 devices:
      - H0: data sender, connected to LAN0
      - H1: data receiver, connected to LAN1
      - GW0 and GW1: routers between LAN0 and LAN1. Both of them have an
        ethernet connection on LAN0 and LAN1
      On H{0,1} set GW0 as default gateway while on GW0 set GW1 as next hop for
      data from LAN0 to LAN1.
      Moreover create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent
      data streams (TCP/UDP/SCTP) from H0 to H1 through ip6ip6 tunnel (send
      buffer size is set to 16K). While data streams are active flush the route
      cache on HA multiple times.
      I have not been able to identify a given commit that introduced the issue
      since, using the reproducer described above, the kasan report has been
      triggered from 4.14 and I have not gone back further.

      Reported-by: Jianlin Shi <jishi@redhat.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dccp: check sk for closed state in dccp_sendmsg() · 1fdc00c1
      Alexey Kodanev authored
      
      [ Upstream commit 67f93df7 ]
      
      dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
      therefore if DCCP socket is disconnected and dccp_sendmsg() is
      called after it, it will cause a NULL pointer dereference in
      dccp_write_xmit().
      
      This crash and the reproducer was reported by syzbot. Looks like
      it is reproduced if commit 69c64866 ("dccp: CVE-2017-8824:
      use-after-free in DCCP code") is applied.
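
The failure mode and the guard named in the commit title can be sketched in userspace. Everything here is a hypothetical stand-in for the DCCP structures, and returning `-ENOTCONN` for a closed socket is an assumption made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sock_state { ST_OPEN, ST_CLOSED };

struct sock_sim {
    enum sock_state state;
    int (*tx_handler)(int len); /* stand-in for the tx CCID handler */
};

static int tx_ok(int len)
{
    return len;
}

/* Disconnect clears the handler, as described for dccp_disconnect(). */
static void disconnect_sim(struct sock_sim *sk)
{
    assert(sk != NULL);
    sk->tx_handler = NULL;
    sk->state = ST_CLOSED;
}

/* Check the state first so the cleared handler is never called. */
static int sendmsg_sim(struct sock_sim *sk, int len)
{
    assert(sk != NULL);
    if (sk->state == ST_CLOSED)
        return -ENOTCONN;
    return sk->tx_handler(len);
}
```

Without the state check, the call after `disconnect_sim()` would go through a NULL function pointer.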
      
      Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: Fix hlist corruptions in inet_evict_bucket() · 1562f147
      Kirill Tkhai authored
      
      [ Upstream commit a5600024 ]
      
      inet_evict_bucket() iterates global list, and
      several tasks may call it in parallel. All of
      them hash the same fq->list_evictor to different
      lists, which leads to list corruption.
      
      This patch hashes fq to the expired list
      only if another task has not already done
      so. Since inet_frag_alloc() allocates fq
      using kmem_cache_zalloc(), we may rely on
      list_evictor being initially unhashed.

      The problem seems to have existed before async
      pernet_operations, as it was possible for the
      exit method to be executed in parallel with
      inet_frags::frags_work, so I add two Fixes tags.
      This also may go to stable.
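
The "hash only if still unhashed" idea can be sketched with a minimal hlist-like structure. The types and helpers below are simplified stand-ins written for this illustration, and the sketch is single-threaded: it ignores the locking that the real code needs around this check.

```c
#include <assert.h>
#include <stddef.h>

struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

/* A zeroed node (as after a zeroing allocator) counts as unhashed. */
static int hnode_unhashed(const struct hnode *n)
{
    return n->pprev == NULL;
}

static void hhead_add(struct hnode *n, struct hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Adds n to the caller's expired list unless some list already holds it.
 * Returns 1 if this caller took the node, 0 otherwise. */
static int evict_once(struct hnode *n, struct hhead *expired)
{
    assert(n != NULL && expired != NULL);
    if (!hnode_unhashed(n))
        return 0;
    hhead_add(n, expired);
    return 1;
}
```

A second caller with its own list gets 0 back and leaves the node where it is, so the node never ends up linked into two lists.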
      
      Fixes: d1fe1944 "inet: frag: don't re-use chainlist for evictor"
      Fixes: f84c6821 "net: Convert pernet_subsys, registered from inet_init()"
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: use skb_to_full_sk() in skb_update_prio() · 28984ba0
      Eric Dumazet authored
      
      [ Upstream commit 4dcb31d4 ]
      
      Andrei Vagin reported a KASAN: slab-out-of-bounds error in
      skb_update_prio()
      
      Since SYNACK might be attached to a request socket, we need to
      get back to the listener socket.
      Since this listener is manipulated without locks, add const
      qualifiers to sock_cgroup_prioidx() so that the const can also
      be used in skb_update_prio()
      
      Also add the const qualifier to sock_cgroup_classid() for consistency.
      
      Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ieee802154: 6lowpan: fix possible NULL deref in lowpan_device_event() · e7d79566
      Eric Dumazet authored
      
      [ Upstream commit ca0edb13 ]
      
      A tun device type can trivially be set to arbitrary value using
      TUNSETLINK ioctl().
      
      Therefore, lowpan_device_event() must really check that ieee802154_ptr
      is not NULL.
      
      Fixes: 2c88b528 ("ieee802154: 6lowpan: remove check on null")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Aring <alex.aring@gmail.com>
      Cc: Stefan Schmidt <stefan@osg.samsung.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Stefan Schmidt <stefan@osg.samsung.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sch_netem: fix skb leak in netem_enqueue() · e927ffbf
      Alexey Kodanev authored
      
      [ Upstream commit 35d889d1 ]
      
      When we exceed current packets limit and we have more than one
      segment in the list returned by skb_gso_segment(), netem drops
      only the first one, skipping the rest, hence kmemleak reports:
      
      unreferenced object 0xffff880b5d23b600 (size 1024):
        comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s)
        hex dump (first 32 bytes):
          00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00  ..#]............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000d8a19b9d>] __alloc_skb+0xc9/0x520
          [<000000001709b32f>] skb_segment+0x8c8/0x3710
          [<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830
          [<00000000c921cba1>] inet_gso_segment+0x476/0x1370
          [<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510
          [<000000002182660a>] __skb_gso_segment+0x1dd/0x620
          [<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem]
          [<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120
          [<00000000fc5f7327>] ip_finish_output2+0x998/0xf00
          [<00000000d309e9d3>] ip_output+0x1aa/0x2c0
          [<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670
          [<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0
          [<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540
          [<0000000013d06d02>] tasklet_action+0x1ca/0x250
          [<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3
          [<00000000e7ed027c>] irq_exit+0x1e2/0x210
      
      Fix it by adding the rest of the segments, if any, to skb 'to_free'
      list. Add new __qdisc_drop_all() and qdisc_drop_all() functions
      because they can be useful in the future if we need to drop segmented
      GSO packets in other places.
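
      The difference between dropping one segment and dropping the whole
      list can be sketched in plain C. This is only an illustration: struct
      seg, drop_first(), drop_all() and chain_len() are hypothetical
      stand-ins, not the kernel's struct sk_buff or the real
      __qdisc_drop_all() helper.

```c
#include <stddef.h>

/* Hypothetical stand-in for one packet segment (not struct sk_buff). */
struct seg {
    struct seg *next;
};

/* Leaky variant: only the first segment reaches the to_free list.
 * Overwriting head->next orphans every segment that followed it. */
static void drop_first(struct seg *head, struct seg **to_free)
{
    head->next = *to_free;
    *to_free = head;
}

/* Fixed variant: splice the whole chain onto the to_free list by
 * linking the chain's tail to the previous list head. */
static void drop_all(struct seg *head, struct seg **to_free)
{
    struct seg *tail = head;

    while (tail->next)
        tail = tail->next;
    tail->next = *to_free;
    *to_free = head;
}

/* Count the entries on a NULL-terminated chain. */
static size_t chain_len(const struct seg *s)
{
    size_t n = 0;

    for (; s; s = s->next)
        n++;
    return n;
}
```

      With a three-segment chain, drop_all() leaves three entries on the
      to_free list, while drop_first() leaves one and loses the other two.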
      
      Fixes: 6071bd1a ("netem: Segment GSO packets on enqueue")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e927ffbf
    • Paul Blakey's avatar
      rhashtable: Fix rhlist duplicates insertion · ad621704
      Paul Blakey authored
      
      [ Upstream commit d3dcf8eb ]
      
      When inserting duplicate objects (those with the same key),
      current rhlist implementation messes up the chain pointers by
      updating the bucket pointer instead of prev next pointer to the
      newly inserted node. This causes missing elements on removal and
      traversal.
      
      Fix that by properly updating pprev pointer to point to
      the correct rhash_head next pointer.
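
      The pointer-to-pointer idea behind the fix can be sketched as follows.
      This is a simplified illustration with a hypothetical struct node and
      replace_in_chain() helper, not the rhashtable code itself: the link
      that must be rewritten is whichever pointer currently refers to the
      old node, which is the bucket head only when the old node is first in
      the chain.

```c
#include <stddef.h>

/* Hypothetical chain node; the real code uses struct rhash_head. */
struct node {
    int key;
    struct node *next;
};

/* Replace 'old' with 'repl' in the chain starting at *bucket.
 * pprev always points at the link that refers to the current node:
 * first the bucket pointer, later the previous node's next field.
 * The buggy pattern would be an unconditional "*bucket = repl",
 * which cuts off every node that precedes 'old'.
 * Returns 0 on success, -1 if 'old' is not on the chain. */
static int replace_in_chain(struct node **bucket, struct node *old,
                            struct node *repl)
{
    struct node **pprev = bucket;

    while (*pprev && *pprev != old)
        pprev = &(*pprev)->next;
    if (!*pprev)
        return -1;

    repl->next = old->next;
    *pprev = repl;
    return 0;
}
```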
      
      Issue: 1241076
      Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
      Fixes: ca26893f ('rhashtable: Add rhlist interface')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad621704
    • Guillaume Nault's avatar
      ppp: avoid loop in xmit recursion detection code · fe3627f6
      Guillaume Nault authored
      
      [ Upstream commit 6d066734 ]
      
      We already detect situations where a PPP channel sends packets back to
      its upper PPP device. While this is enough to avoid deadlocking on xmit
      locks, this doesn't prevent packets from looping between the channel
      and the unit.
      
      The problem is that ppp_start_xmit() enqueues packets in ppp->file.xq
      before checking for xmit recursion. Therefore, __ppp_xmit_process()
      might dequeue a packet from ppp->file.xq and send it on the channel
      which, in turn, loops it back on the unit. Then ppp_start_xmit()
      queues the packet back to ppp->file.xq and __ppp_xmit_process() picks
      it up and sends it again through the channel. Therefore, the packet
      will loop between __ppp_xmit_process() and ppp_start_xmit() until some
      other part of the xmit path drops it.
      
      For L2TP, we rapidly fill the skb's headroom and pppol2tp_xmit() drops
      the packet after a few iterations. But PPTP reallocates the headroom
      if necessary, letting the loop run and exhaust the machine resources
      (as reported in https://bugzilla.kernel.org/show_bug.cgi?id=199109).
      
      Fix this by letting __ppp_xmit_process() enqueue the skb to
      ppp->file.xq, so that we can check for recursion before adding it to
      the queue. Now ppp_xmit_process() can drop the packet when recursion is
      detected.
      
      __ppp_channel_push() is a bit special. It calls __ppp_xmit_process()
      without having any actual packet to send. This is used by
      ppp_output_wakeup() to re-enable transmission on the parent unit (for
      implementations like ppp_async.c, where the .start_xmit() function
      might not consume the skb, leaving it in ppp->xmit_pending and
      disabling transmission).
      Therefore, __ppp_xmit_process() needs to handle the case where skb is
      NULL, dequeuing as many packets as possible from ppp->file.xq.
      
      Reported-by: xu heng <xuheng333@zoho.com>
      Fixes: 55454a56 ("ppp: avoid dealock on recursive xmit")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe3627f6
    • Roman Mashak's avatar
      net sched actions: return explicit error when tunnel_key mode is not specified · 4f2f7a07
      Roman Mashak authored
      
      [ Upstream commit 51d4740f ]
      
      If the set/unset mode of the tunnel_key action is not provided, ->init() still
      returns 0, and the caller proceeds with a bogus 'struct tc_action *' object,
      which results in a crash:
      
      % tc actions add action tunnel_key src_ip 1.1.1.1 dst_ip 2.2.2.1 id 7 index 1
      
      [   35.805515] general protection fault: 0000 [#1] SMP PTI
      [   35.806161] Modules linked in: act_tunnel_key kvm_intel kvm irqbypass
      crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
      crypto_simd glue_helper cryptd serio_raw
      [   35.808233] CPU: 1 PID: 428 Comm: tc Not tainted 4.16.0-rc4+ #286
      [   35.808929] RIP: 0010:tcf_action_init+0x90/0x190
      [   35.809457] RSP: 0018:ffffb8edc068b9a0 EFLAGS: 00010206
      [   35.810053] RAX: 1320c000000a0003 RBX: 0000000000000001 RCX: 0000000000000000
      [   35.810866] RDX: 0000000000000070 RSI: 0000000000007965 RDI: ffffb8edc068b910
      [   35.811660] RBP: ffffb8edc068b9d0 R08: 0000000000000000 R09: ffffb8edc068b808
      [   35.812463] R10: ffffffffc02bf040 R11: 0000000000000040 R12: ffffb8edc068bb38
      [   35.813235] R13: 0000000000000000 R14: 0000000000000000 R15: ffffb8edc068b910
      [   35.814006] FS:  00007f3d0d8556c0(0000) GS:ffff91d1dbc40000(0000)
      knlGS:0000000000000000
      [   35.814881] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   35.815540] CR2: 000000000043f720 CR3: 0000000019248001 CR4: 00000000001606a0
      [   35.816457] Call Trace:
      [   35.817158]  tc_ctl_action+0x11a/0x220
      [   35.817795]  rtnetlink_rcv_msg+0x23d/0x2e0
      [   35.818457]  ? __slab_alloc+0x1c/0x30
      [   35.819079]  ? __kmalloc_node_track_caller+0xb1/0x2b0
      [   35.819544]  ? rtnl_calcit.isra.30+0xe0/0xe0
      [   35.820231]  netlink_rcv_skb+0xce/0x100
      [   35.820744]  netlink_unicast+0x164/0x220
      [   35.821500]  netlink_sendmsg+0x293/0x370
      [   35.822040]  sock_sendmsg+0x30/0x40
      [   35.822508]  ___sys_sendmsg+0x2c5/0x2e0
      [   35.823149]  ? pagecache_get_page+0x27/0x220
      [   35.823714]  ? filemap_fault+0xa2/0x640
      [   35.824423]  ? page_add_file_rmap+0x108/0x200
      [   35.825065]  ? alloc_set_pte+0x2aa/0x530
      [   35.825585]  ? finish_fault+0x4e/0x70
      [   35.826140]  ? __handle_mm_fault+0xbc1/0x10d0
      [   35.826723]  ? __sys_sendmsg+0x41/0x70
      [   35.827230]  __sys_sendmsg+0x41/0x70
      [   35.827710]  do_syscall_64+0x68/0x120
      [   35.828195]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      [   35.828859] RIP: 0033:0x7f3d0ca4da67
      [   35.829331] RSP: 002b:00007ffc9f284338 EFLAGS: 00000246 ORIG_RAX:
      000000000000002e
      [   35.830304] RAX: ffffffffffffffda RBX: 00007ffc9f284460 RCX: 00007f3d0ca4da67
      [   35.831247] RDX: 0000000000000000 RSI: 00007ffc9f2843b0 RDI: 0000000000000003
      [   35.832167] RBP: 000000005aa6a7a9 R08: 0000000000000001 R09: 0000000000000000
      [   35.833075] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
      [   35.833997] R13: 00007ffc9f2884c0 R14: 0000000000000001 R15: 0000000000674640
      [   35.834923] Code: 24 30 bb 01 00 00 00 45 31 f6 eb 5e 8b 50 08 83 c2 07 83 e2
      fc 83 c2 70 49 8b 07 48 8b 40 70 48 85 c0 74 10 48 89 14 24 4c 89 ff <ff> d0 48
      8b 14 24 48 01 c2 49 01 d6 45 85 ed 74 05 41 83 47 2c
      [   35.837442] RIP: tcf_action_init+0x90/0x190 RSP: ffffb8edc068b9a0
      [   35.838291] ---[ end trace a095c06ee4b97a26 ]---
      
      Fixes: d0f6dd8a ("net/sched: Introduce act_tunnel_key")
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f2f7a07
    • Greg Kroah-Hartman's avatar
      Revert "genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs" · 6c9ca571
      Greg Kroah-Hartman authored
      This reverts commit f2596a98 which is
      commit 382bd4de upstream.
      
      It causes too many problems with the stable tree, and would require too
      many other things to be backported, so just revert it.
      
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6c9ca571
    • Johannes Thumshirn's avatar
      scsi: sg: don't return bogus Sg_requests · 6505dd1f
      Johannes Thumshirn authored
      commit 48ae8484 upstream.
      
      If the list search in sg_get_rq_mark() fails to find a valid request, we
      return a bogus element. This then can later lead to a GPF in
      sg_remove_scat().
      
      So don't return bogus Sg_requests in sg_get_rq_mark() but NULL in case
      the list search doesn't find a valid request.
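
      A minimal userspace sketch of the pattern, assuming a circular list
      with a sentinel head in the style of the kernel's struct list_head.
      The names struct list_node, struct request, REQUEST_OF() and
      find_done_request() are hypothetical and only loosely modeled on
      sg_get_rq_mark(); the point is that when the loop runs off the end,
      the cursor is the sentinel, so converting it to a request would yield
      a bogus pointer and NULL must be returned instead.

```c
#include <stddef.h>

/* Hypothetical minimal intrusive list link, in the style of list_head. */
struct list_node {
    struct list_node *next;
};

/* Hypothetical request record embedding the list link. */
struct request {
    int pack_id;
    int done;
    struct list_node entry;
};

/* Recover the enclosing request from its embedded list link. */
#define REQUEST_OF(node) \
    ((struct request *)((char *)(node) - offsetof(struct request, entry)))

/* Search a circular list whose 'head' is a sentinel, not a request.
 * A pack_id of -1 matches any completed request. */
static struct request *find_done_request(struct list_node *head, int pack_id)
{
    struct list_node *pos;

    for (pos = head->next; pos != head; pos = pos->next) {
        struct request *rq = REQUEST_OF(pos);

        if (rq->done && (pack_id == -1 || rq->pack_id == pack_id))
            return rq;
    }

    /* No match: pos == head here, so REQUEST_OF(pos) would not point
     * at a real request. Report the failure explicitly. */
    return NULL;
}
```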
      
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Doug Gilbert <dgilbert@interlog.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Acked-by: Doug Gilbert <dgilbert@interlog.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Tony Battersby <tonyb@cybernetics.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6505dd1f
  2. 28 Mar, 2018 12 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.91 · c44cfe06
      Greg Kroah-Hartman authored
      c44cfe06
    • Daniel Borkmann's avatar
      bpf, x64: increase number of passes · c9e30719
      Daniel Borkmann authored
      commit 6007b080 upstream.
      
      In Cilium some of the main programs we run today are hitting 9 passes
      on x64's JIT compiler, and we've had cases already where we surpassed
      the limit where the JIT then punts the program to the interpreter
      instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON
      or insertion failures due to the prog array owner being JITed but the
      program to insert not (both must have the same JITed/non-JITed property).
      
      In one concrete case the program image shrunk from 12,767 bytes down to
      10,288 bytes where the image converged after 16 steps. I've measured
      that this took 340us in the JIT until it converges on my i7-6600U. Thus,
      increase the original limit we had from day one where the JIT covered
      cBPF only back then before we run into the case (as similar with the
      complexity limit) where we trip over this and hit program rejections.
      Also add a cond_resched() into the compilation loop, the JIT process
      runs without any locks and may sleep anyway.
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c9e30719
    • Chenbo Feng's avatar
      bpf: skip unnecessary capability check · 3eb88807
      Chenbo Feng authored
      commit 0fa4fe85 upstream.
      
      The current check statement in BPF syscall will do a capability check
      for CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
      code path will trigger unnecessary security hooks on capability checking
      and cause false alarms on unprivileged processes trying to get CAP_SYS_ADMIN
      access. This can be resolved by simply switching the order of the checks,
      since CAP_SYS_ADMIN is not required anyway if the unprivileged bpf syscall is
      allowed.
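
      The effect of the reordering follows from C's short-circuit
      evaluation of &&. In the sketch below, fake_capable() is a
      hypothetical stand-in for capable(CAP_SYS_ADMIN) that counts its
      invocations, and the two denied_*() helpers are illustrative names
      for the old and new ordering of the condition.

```c
/* Counts how often the (fake) capability check is invoked. */
static int capable_calls;

/* Hypothetical stand-in for capable(CAP_SYS_ADMIN): always refuses,
 * but records that the security hook was triggered. */
static int fake_capable(void)
{
    capable_calls++;
    return 0;
}

/* Old order: the capability check runs first, so the hook fires even
 * when unprivileged use is allowed and the result is irrelevant. */
static int denied_old(int unprivileged_disabled)
{
    return !fake_capable() && unprivileged_disabled;
}

/* New order: the cheap flag is tested first; && short-circuits, so
 * the capability check only runs when it can change the outcome. */
static int denied_new(int unprivileged_disabled)
{
    return unprivileged_disabled && !fake_capable();
}
```

      Both helpers return the same verdict for every input; only the number
      of capability checks differs.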
      
      Signed-off-by: Chenbo Feng <fengc@google.com>
      Acked-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3eb88807
    • Daniel Borkmann's avatar
      kbuild: disable clang's default use of -fmerge-all-constants · 733a4e1a
      Daniel Borkmann authored
      commit 87e0d4f0 upstream.
      
      Prasad reported that he has seen crashes in BPF subsystem with netd
      on Android with arm64 in the form of (note, the taint is unrelated):
      
        [ 4134.721483] Unable to handle kernel paging request at virtual address 800000001
        [ 4134.820925] Mem abort info:
        [ 4134.901283]   Exception class = DABT (current EL), IL = 32 bits
        [ 4135.016736]   SET = 0, FnV = 0
        [ 4135.119820]   EA = 0, S1PTW = 0
        [ 4135.201431] Data abort info:
        [ 4135.301388]   ISV = 0, ISS = 0x00000021
        [ 4135.359599]   CM = 0, WnR = 0
        [ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000
        [ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000
        [ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP
        [ 4135.674610] Modules linked in:
        [ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S      W       4.14.19+ #1
        [ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000
        [ 4135.731599] PC is at bpf_prog_add+0x20/0x68
        [ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
        [ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145
        [ 4135.769062] sp : ffffff801d4e3ce0
        [...]
        [ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
        [ 4136.273746] Call trace:
        [...]
        [ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
        [ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
        [ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
        [ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
        [ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
        [ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
      
      Android's netd was basically pinning the uid cookie BPF map in BPF
      fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
      again resulting in above panic. Issue is that the map was wrongly
      identified as a prog! Above kernel was compiled with clang 4.0,
      and it turns out that clang decided to merge the bpf_prog_iops and
      bpf_map_iops into a single memory location, such that the two i_ops
      could then not be distinguished anymore.
      
      Reason for this miscompilation is that clang has the more aggressive
      -fmerge-all-constants enabled by default. In fact, clang source code
      has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
      to do so:
      
        Pointers with different bases cannot represent the same object.
        (Note that clang defaults to -fmerge-all-constants, which can
        lead to inconsistent results for comparisons involving the address
        of a constant; this generally doesn't matter in practice.)
      
      The issue never appeared with gcc however, since gcc does not enable
      -fmerge-all-constants by default and even *explicitly* states in
      its option description that using this flag results in non-conforming
      behavior, quote from man gcc:
      
        Languages like C or C++ require each variable, including multiple
        instances of the same variable in recursive calls, to have distinct
        locations, so using this option results in non-conforming behavior.
      
      There are also various clang bug reports open on that matter [1],
      where clang developers acknowledge the non-conforming behavior,
      and refer to disabling it with -fno-merge-all-constants. But even
      if this gets fixed in clang today, there are already users out there
      that triggered this. Thus, fix this issue by explicitly adding
      -fno-merge-all-constants to the kernel's Makefile to generically
      disable this optimization, since potentially other places in the
      kernel could subtly break as well.
      
      Note, there is also a flag called -fmerge-constants (not supported
      by clang), which is more conservative and only applies to strings
      and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
      gcc's code, the two flags -fmerge-{all-,}constants share the same
      variable internally, so when disabling it via -fno-merge-all-constants,
      then we really don't merge any const data (e.g. strings), and text
      size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
      
        $ gcc -fverbose-asm -O2 foo.c -S -o foo.S
          -> foo.S lists -fmerge-constants under options enabled
        $ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
          -> foo.S doesn't list -fmerge-constants under options enabled
        $ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
          -> foo.S lists -fmerge-constants under options enabled
      
      Thus, as a workaround we need to set both -fno-merge-all-constants
      *and* -fmerge-constants in the Makefile in order for text size to
      stay as is.
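
      The failure mode can be illustrated with a small sketch. The names
      struct ops, prog_ops, map_ops and classify() are hypothetical and
      only mimic the way the two i_ops tables are told apart by address. On
      a conforming build the two objects have distinct addresses and
      classify() works; if a compiler merged the two identical constants
      into one location, the first comparison would match for both.

```c
/* Hypothetical operations table; both instances have identical
 * contents, which is what makes them candidates for merging. */
struct ops {
    int (*lookup)(int);
};

static int generic_lookup(int x)
{
    return x;
}

static const struct ops prog_ops = { .lookup = generic_lookup };
static const struct ops map_ops  = { .lookup = generic_lookup };

enum kind { KIND_PROG, KIND_MAP, KIND_UNKNOWN };

/* Identify the object type purely by the address of its ops table.
 * This relies on distinct objects having distinct addresses. */
static enum kind classify(const struct ops *ops)
{
    if (ops == &prog_ops)
        return KIND_PROG;
    if (ops == &map_ops)
        return KIND_MAP;
    return KIND_UNKNOWN;
}
```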
      
        [1] https://bugs.llvm.org/show_bug.cgi?id=18538
      
      Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chenbo Feng <fengc@google.com>
      Cc: Richard Smith <richard-llvm@metafoo.co.uk>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: linux-kernel@vger.kernel.org
      Tested-by: Prasad Sodagudi <psodagud@codeaurora.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      733a4e1a
    • Shuah Khan's avatar
      selftests: x86: sysret_ss_attrs doesn't build on a PIE build · 353f71fe
      Shuah Khan authored
      commit 3346a6a4 upstream.
      
      sysret_ss_attrs fails to compile, causing the x86 test run to fail on systems
      configured to build using PIE by default. Add -no-pie to fix it.
      
      Relocation might still fail if relocated above 4G. For now this change
      fixes the build and runs x86 tests.
      
      tools/testing/selftests/x86$ make
      gcc -m64 -o .../tools/testing/selftests/x86/single_step_syscall_64 -O2
      -g -std=gnu99 -pthread -Wall  single_step_syscall.c -lrt -ldl
      gcc -m64 -o .../tools/testing/selftests/x86/sysret_ss_attrs_64 -O2 -g
      -std=gnu99 -pthread -Wall  sysret_ss_attrs.c thunks.S -lrt -ldl
      /usr/bin/ld: /tmp/ccS6pvIh.o: relocation R_X86_64_32S against `.text'
      can not be used when making a shared object; recompile with -fPIC
      /usr/bin/ld: final link failed: Nonrepresentable section on output
      collect2: error: ld returned 1 exit status
      Makefile:49: recipe for target
      '.../tools/testing/selftests/x86/sysret_ss_attrs_64' failed
      make: *** [.../tools/testing/selftests/x86/sysret_ss_attrs_64] Error 1
      
      Suggested-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      353f71fe
    • Dave Hansen's avatar
      x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey' · 1443abc9
      Dave Hansen authored
      commit 91c49c2d upstream.
      
      'si_pkey' is now #defined to be the name of the new siginfo field that
      protection keys uses.  Rename it not to conflict.
      
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171111001231.DFFC8285@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1443abc9
    • Eric W. Biederman's avatar
      signal/testing: Don't look for __SI_FAULT in userspace · f41f8156
      Eric W. Biederman authored
      commit d12fe87e upstream.
      
      Fix the debug print statements in these tests where they reference
      si_codes and in particular __SI_FAULT.  __SI_FAULT is a kernel
      internal value and should never be seen by userspace.
      
      While I am in there also fix si_code_str.  si_codes are an enumeration;
      they are not a bitmap, so == and not & is the appropriate operation to
      test for an si_code.
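
      A short sketch of why & gives wrong answers for enumerated codes. The
      CODE_* constants are hypothetical names; their values 1, 2 and 3 are
      chosen to mirror the Linux values of SEGV_MAPERR, SEGV_ACCERR and
      SEGV_BNDERR, so that one code shares bits with the others.

```c
/* Hypothetical codes mirroring the Linux SEGV_* numbering. */
enum {
    CODE_MAPERR = 1,
    CODE_ACCERR = 2,
    CODE_BNDERR = 3   /* binary 11: shares bits with both codes above */
};

/* Wrong for enumerations: treats the code as a set of flag bits. */
static int matches_bitwise(int si_code, int wanted)
{
    return (si_code & wanted) != 0;
}

/* Correct for enumerations: the code either is the value or is not. */
static int matches_exact(int si_code, int wanted)
{
    return si_code == wanted;
}
```

      With these values, a bitwise test reports CODE_BNDERR as if it were
      CODE_MAPERR, while the equality test does not.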
      
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Fixes: 5f23f6d0 ("x86/pkeys: Add self-tests")
      Fixes: e754aedc ("x86/mpx, selftests: Add MPX self test")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f41f8156
    • Andy Lutomirski's avatar
      selftests/x86/protection_keys: Fix syscall NR redefinition warnings · 93b48392
      Andy Lutomirski authored
      commit 693cb558 upstream.
      
      On new enough glibc, the pkey syscalls numbers are available.  Check
      first before defining them to avoid warnings like:
      
      protection_keys.c:198:0: warning: "SYS_pkey_alloc" redefined
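
      The guard pattern described above might look like the sketch below.
      It assumes a Linux system where <sys/syscall.h> is available; the
      fallback value 330 is the x86-64 number for pkey_alloc and is an
      assumption that would be wrong on other architectures. The helper
      pkey_alloc_nr() is a hypothetical name used only to expose the
      result.

```c
#include <sys/syscall.h>

/* Only supply a fallback when the C library headers did not already
 * define the syscall number; redefining it triggers the warning. */
#ifndef SYS_pkey_alloc
# define SYS_pkey_alloc 330   /* assumption: x86-64 syscall number */
#endif

/* Hypothetical helper returning whichever definition is in effect. */
static long pkey_alloc_nr(void)
{
    return SYS_pkey_alloc;
}
```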
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1fbef53a9e6befb7165ff855fc1a7d4788a191d6.1509794321.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      93b48392
    • Dave Hansen's avatar
      selftests, x86, protection_keys: fix wrong offset in siginfo · 26e9852f
      Dave Hansen authored
      commit 2195bff0 upstream.
      
      The siginfo contains a bunch of information about the fault.
      For protection keys, it tells us which protection key's
      permissions were violated.
      
      The wrong offset in here leads to reading garbage and thus
      failures in the tests.
      
      We should probably eventually move this over to using the
      kernel's headers defining the siginfo instead of a hard-coded
      offset.  But, for now, just do the simplest fix.
      
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      26e9852f
    • Nadav Amit's avatar
      staging: lustre: ptlrpc: kfree used instead of kvfree · 1e0fc7db
      Nadav Amit authored
      commit c3eec596 upstream.
      
      rq_reqbuf is allocated using kvmalloc() but released in one occasion
      using kfree() instead of kvfree().
      
      The issue was found using grep based on a similar bug.
      
      Fixes: d7e09d03 ("add Lustre file system client support")
      Fixes: ee0ec194 ("lustre: ptlrpc: Replace uses of OBD_{ALLOC,FREE}_LARGE")
      
      Cc: Peng Tao <bergwolf@gmail.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: James Simmons <jsimmons@infradead.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e0fc7db
    • Linus Walleij's avatar
      iio: ABI: Fix name of timestamp sysfs file · 162daa27
      Linus Walleij authored
      commit b9a35893 upstream.
      
      The name of the file is "current_timestamp_clock" not
      "timestamp_clock".
      
      Fixes: bc2b7dab ("iio:core: timestamping clock selection support")
      Cc: Gregor Boirie <gregor.boirie@parrot.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      162daa27
    • Kan Liang's avatar
      perf/x86/intel/uncore: Fix multi-domain PCI CHA enumeration bug on Skylake servers · 9c0d0a0c
      Kan Liang authored
      commit 320b0651 upstream.
      
      The number of CHAs is miscalculated on multi-domain PCI Skylake server systems,
      resulting in an uncore driver initialization error.
      
      Gary Kroening explains:
      
       "For systems with a single PCI segment, it is sufficient to look for the
        bus number to change in order to determine that all of the CHa's have
        been counted for a single socket.
      
        However, for multi PCI segment systems, each socket is given a new
        segment and the bus number does NOT change.  So looking only for the
        bus number to change ends up counting all of the CHa's on all sockets
        in the system.  This leads to writing CPU MSRs beyond a valid range and
        causes an error in ivbep_uncore_msr_init_box()."
      
      To fix this bug, query the number of CHAs from the CAPID6 register:
      it should read bits 27:0 in the CAPID6 register located at
      Device 30, Function 3, Offset 0x9C. These 28 bits form a bit vector
      of available LLC slices and the CHAs that manage those slices.
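
      Counting CHAs from such a bit vector amounts to a population count
      over the low 28 bits. The sketch below is illustrative only:
      CAPID6_CHA_MASK and count_chas() are hypothetical names, and reading
      the register from PCI config space is left out entirely.

```c
#include <stdint.h>

/* Bits 27:0 of the register value form the CHA bit vector. */
#define CAPID6_CHA_MASK 0x0FFFFFFFu

/* Count the set bits in the low 28 bits of a CAPID6 value.
 * Each iteration clears the lowest set bit (v &= v - 1). */
static unsigned int count_chas(uint32_t capid6)
{
    uint32_t v = capid6 & CAPID6_CHA_MASK;
    unsigned int n = 0;

    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}
```

      Bits 31:28 are masked off first, so they never contribute to the
      count.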
      
      Reported-by: Kroening, Gary <gary.kroening@hpe.com>
      Tested-by: Kroening, Gary <gary.kroening@hpe.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: abanman@hpe.com
      Cc: dimitri.sivanich@hpe.com
      Cc: hpa@zytor.com
      Cc: mike.travis@hpe.com
      Cc: russ.anderson@hpe.com
      Fixes: cd34cd97 ("perf/x86/intel/uncore: Add Skylake server uncore support")
      Link: http://lkml.kernel.org/r/1520967094-13219-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c0d0a0c