1. 24 Nov, 2017 12 commits
    • sctp: do not peel off an assoc from one netns to another one · 362d2ce0
      Xin Long authored
      
      [ Upstream commit df80cd9b ]
      
      Currently, when an association is peeled off to a sock in another
      netns, the transports in this assoc are not rehashed and keep using
      the old key in the hashtable.
      
      As a transport uses sk->net as the hash key when it is inserted into
      the hashtable, closing the sock would fail to remove these transports
      from the hashtable because of the new netns, even though all
      transports are being freed. A later lookup of an assoc that
      dereferences those transports could then cause a use-after-free.
      
      This issue has been present since the very beginning. ChunYu found it
      with syzkaller fuzz testing, using this call sequence:
      
        socket$inet6_sctp()
        bind$inet6()
        sendto$inet6()
        unshare(0x40000000)
        getsockopt$inet_sctp6_SCTP_GET_ASSOC_ID_LIST()
        getsockopt$inet_sctp6_SCTP_SOCKOPT_PEELOFF()
      
      This patch blocks the call when it would peel an assoc off from one
      netns to another, so that the netns of the transports cannot go out
      of sync with the key in the hashtable.
      
      Note that this patch didn't fix it by rehashing transports, as it's
      difficult to handle the situation when the tuple is already in use
      in the new netns. Besides, no one would like to peel off one assoc
      to another netns, considering ipaddrs, ifaces, etc. are usually
      different.
      Reported-by: ChunYu Wang <chunwang@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • af_netlink: ensure that NLMSG_DONE never fails in dumps · 99aa74ce
      Jason A. Donenfeld authored
      
      [ Upstream commit 0642840b ]
      
      The way people generally use netlink_dump is that they fill in the skb
      as much as possible, breaking when nla_put returns an error. Then, they
      get called again and start filling out the next skb, and again, and so
      forth. The mechanism at work here is the ability for the iterative
      dumping function to detect when the skb is filled up and not fill it
      past the brim, waiting for a fresh skb for the rest of the data.
      
      However, if the attributes are small and nicely packed, it is possible
      that a dump callback function successfully fills in attributes until the
      skb is of size 4080 (libmnl's default page-sized receive buffer size).
      The dump function completes, satisfied, and then, if it happens to be
      that this is actually the last skb, and no further ones are to be sent,
      then netlink_dump will add on the NLMSG_DONE part:
      
        nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);
      
      It is very important that netlink_dump does this, of course. However, in
      this example, that call to nlmsg_put_answer will fail, because the
      previous filling by the dump function did not leave it enough room. And
      how could it possibly have done so? All of the nla_put variety of
      functions simply check to see if the skb has enough tailroom,
      independent of the context it is in.
      
      In order to keep the important assumptions of all netlink dump users, it
      is therefore important to give them an skb that has this end part of the
      tail already reserved, so that the call to nlmsg_put_answer does not
      fail. Otherwise, library authors are forced to find some bizarre sized
      receive buffer that has a large modulo relative to the common sizes of
      messages received, which is ugly and buggy.
      
      This patch thus saves the NLMSG_DONE for an additional message, for the
      case that things are dangerously close to the brim. This requires
      keeping track of the errno from ->dump() across calls.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
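The NLMSG_DONE behavior described in the commit above can be modelled with a small user-space sketch. This is not the kernel's netlink_dump(): the function name, the fixed record size, and the byte counts are all assumptions made for illustration. It only shows the idea that the terminating message is deferred to an extra buffer when the records have filled the current one to the brim.

```c
#include <stddef.h>

/*
 * Toy model (hypothetical, not kernel code): count how many buffers of
 * capacity `cap` are needed to deliver `nrec` records of `rec_len` bytes
 * followed by one terminating DONE message of `done_len` bytes.
 * Returns 0 for parameters that could never make progress.
 */
size_t count_dump_messages(size_t cap, size_t rec_len, size_t nrec,
                           size_t done_len)
{
    size_t messages = 0;
    size_t remaining = nrec;
    int done_sent = 0;

    if (rec_len == 0 || rec_len > cap || done_len > cap)
        return 0;

    while (!done_sent) {
        size_t used = 0;

        /* The dump callback packs records until the next one no longer fits. */
        while (remaining > 0 && used + rec_len <= cap) {
            used += rec_len;
            remaining--;
        }

        /* Append DONE only if there is room; otherwise defer it. */
        if (remaining == 0 && used + done_len <= cap)
            done_sent = 1;

        messages++;
    }
    return messages;
}
```

With a capacity of 100 bytes and 20-byte records, four records leave room for a 20-byte DONE in the same buffer, while five records fill the buffer exactly and push DONE into a second one.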
    • vlan: fix a use-after-free in vlan_device_event() · 080ecd2b
      Cong Wang authored
      
      [ Upstream commit 052d41c0 ]
      
      After refcnt reaches zero, vlan_vid_del() could free
      dev->vlan_info via RCU:
      
      	RCU_INIT_POINTER(dev->vlan_info, NULL);
      	call_rcu(&vlan_info->rcu, vlan_info_rcu_free);
      
      However, the pointer 'grp' still points to that memory
      since it is set before vlan_vid_del():
      
              vlan_info = rtnl_dereference(dev->vlan_info);
              if (!vlan_info)
                      goto out;
              grp = &vlan_info->grp;
      
      Depending on when that RCU callback is scheduled, we could
      trigger a use-after-free in vlan_group_for_each_dev()
      right after this vlan_vid_del().
      
      Fix it by moving vlan_vid_del() before setting grp. This
      is also symmetric to the vlan_vid_add() we call in
      vlan_device_event().
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Fixes: efc73f4b ("net: Fix memory leak - vlan_info struct")
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Girish Moodalbail <girish.moodalbail@oracle.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: Girish Moodalbail <girish.moodalbail@oracle.com>
      Tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: usb: asix: fill null-ptr-deref in asix_suspend · 58baa36d
      Andrey Konovalov authored
      
      [ Upstream commit 8f562462 ]
      
      When asix_suspend() is called dev->driver_priv might not have been
      assigned a value, so we need to check that it's not NULL.
      
      A similar issue is present in asix_resume(); this patch fixes it as well.
      
      Found by syzkaller.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      Modules linked in:
      CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
      task: ffff88006bb36300 task.stack: ffff88006bba8000
      RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
      RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
      RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
      RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
      R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
      FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
      Call Trace:
       usb_suspend_interface drivers/usb/core/driver.c:1209
       usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
       usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
       __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
       rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
       rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
       __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
       pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
       usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
       hub_port_connect drivers/usb/core/hub.c:4903
       hub_port_connect_change drivers/usb/core/hub.c:5009
       port_event drivers/usb/core/hub.c:5115
       hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
       process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
       worker_thread+0x221/0x1850 kernel/workqueue.c:2253
       kthread+0x3a1/0x470 kernel/kthread.c:231
       ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
      Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
      00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
      3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
      RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
      ---[ end trace dfc4f5649284342c ]---
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
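The NULL check described in the asix commit above might look roughly like the following user-space sketch. The structures are hypothetical stand-ins, not the driver's real types: only the name driver_priv comes from the commit text, and the suspend callback field is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical private data; the callback field is an assumption. */
struct toy_priv {
    void (*suspend)(void *dev);
};

/* Hypothetical device; driver_priv may not have been assigned yet. */
struct toy_dev {
    struct toy_priv *driver_priv;
};

/* No-op callback used only to exercise the sketch. */
void toy_noop(void *dev)
{
    (void)dev;
}

/*
 * Call the suspend callback only when the private data exists.
 * Returns 1 if the callback ran, 0 if it was skipped.
 */
int toy_suspend(struct toy_dev *dev)
{
    struct toy_priv *priv = dev->driver_priv;

    if (priv && priv->suspend) {
        priv->suspend(dev);
        return 1;
    }
    return 0;
}
```

The same guard would apply to a resume path that reads the same pointer.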
    • qmi_wwan: Add missing skb_reset_mac_header-call · 4ad82095
      Kristian Evensen authored
      
      [ Upstream commit 0de0add1 ]
      
      When we receive a packet on a QMI device in raw IP mode, we should call
      skb_reset_mac_header() to ensure that skb->mac_header contains a valid
      offset in the packet. While it shouldn't really matter, since the packets
      have no MAC header and the interface is configured as such, it seems
      certain parts of the network stack expect a "good" value in skb->mac_header.
      
      Without the skb_reset_mac_header() call added in this patch, for example
      shaping traffic (using tc) triggers the following oops on the first
      received packet:
      
      [  303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0
      [  303.655045] Kernel bug detected[#1]:
      [  303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0
      [  303.664339] task: 8fdf05e0 task.stack: 8f15c000
      [  303.668844] $ 0   : 00000000 00000001 0000007a 00000000
      [  303.674062] $ 4   : 8149a2fc 8149a2fc 8149ce20 00000000
      [  303.679284] $ 8   : 00000030 3878303a 31623465 20303235
      [  303.684510] $12   : ded731e3 2626a277 00000000 03bd0000
      [  303.689747] $16   : 8ef62b40 00000043 8f137918 804db5fc
      [  303.694978] $20   : 00000001 00000004 8fc13800 00000003
      [  303.700215] $24   : 00000001 8024ab10
      [  303.705442] $28   : 8f15c000 8fc19cf0 00000043 802cc920
      [  303.710664] Hi    : 00000000
      [  303.713533] Lo    : 74e58000
      [  303.716436] epc   : 802cc920 skb_panic+0x58/0x5c
      [  303.721046] ra    : 802cc920 skb_panic+0x58/0x5c
      [  303.725639] Status: 11007c03 KERNEL EXL IE
      [  303.729823] Cause : 50800024 (ExcCode 09)
      [  303.733817] PrId  : 0001992f (MIPS 1004Kc)
      [  303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i
      Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4)
      [  303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0
      [  303.970871]         8e4b1520 8fec1800 00000043 802cd2a4 6e000045 00000043 00000000 8ef62000
      [  303.979219]         8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664
      [  303.987568]         8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003
      [  303.995934]         00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700
      [  304.004324]         ...
      [  304.006767] Call Trace:
      [  304.009241] [<802cc920>] skb_panic+0x58/0x5c
      [  304.013504] [<802cd2a4>] skb_push+0x78/0x90
      [  304.017783] [<8f137918>] 0x8f137918
      [  304.021269] Code: 00602825  0c02a3b4  24842888 <000c000d> 8c870060  8c8200a0  0007382b  00070336  8c88005c
      [  304.031034]
      [  304.032805] ---[ end trace b778c482b3f0bda9 ]---
      [  304.041384] Kernel panic - not syncing: Fatal exception in interrupt
      [  304.051975] Rebooting in 3 seconds..
      
      While the oops is for a 4.9-kernel, I was able to trigger the same oops with
      net-next as of yesterday.
      
      Fixes: 32f7adf6 ("net: qmi_wwan: support "raw IP" mode")
      Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: qmi_wwan: fix divide by 0 on bad descriptors · 02a0c063
      Bjørn Mork authored
      
      [ Upstream commit 7fd07833 ]
      
      A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
      cause a divide error in usbnet_probe:
      
      divide error: 0000 [#1] PREEMPT SMP KASAN
      Modules linked in:
      CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
      task: ffff88006bef5c00 task.stack: ffff88006bf60000
      RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
      RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
      RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
      RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
      RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
      R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
      R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
      Call Trace:
       usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
       qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
       usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
       really_probe drivers/base/dd.c:413
       driver_probe_device+0x522/0x740 drivers/base/dd.c:557
      
      Fix by simply ignoring the bogus descriptor, as it is optional
      for QMI devices anyway.
      
      Fixes: 423ce8ca ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: cdc_ether: fix divide by 0 on bad descriptors · f3766218
      Bjørn Mork authored
      
      [ Upstream commit 2cb80187 ]
      
      Setting dev->hard_mtu to 0 will cause a divide error in
      usbnet_probe. Protect against devices with bogus CDC Ethernet
      functional descriptors by ignoring a zero wMaxSegmentSize.
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Acked-by: Oliver Neukum <oneukum@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
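The two divide-by-zero fixes above share one idea: a zero wMaxSegmentSize is treated as bogus and ignored, so hard_mtu keeps its previous value. A minimal user-space sketch of that idea follows. The function names and the queue-length formula are hypothetical; only hard_mtu and wMaxSegmentSize are names taken from the commit messages.

```c
#include <stdint.h>

/*
 * Decide which hard MTU to use. A zero wMaxSegmentSize is ignored, so
 * the caller's current value survives. (Hypothetical helper.)
 */
uint32_t apply_max_segment_size(uint32_t current_hard_mtu,
                                uint16_t w_max_segment_size)
{
    if (w_max_segment_size == 0)
        return current_hard_mtu; /* bogus descriptor: keep the old value */
    return w_max_segment_size;
}

/*
 * Hypothetical queue-length computation that divides by hard_mtu, which
 * is why hard_mtu must never become zero. The caller must pass a
 * non-zero hard_mtu.
 */
uint32_t queue_len(uint32_t budget_bytes, uint32_t hard_mtu)
{
    return budget_bytes / hard_mtu;
}
```

As long as the starting hard_mtu is non-zero, passing the result of apply_max_segment_size() to queue_len() cannot divide by zero, whatever the device reports.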
    • bonding: discard lowest hash bit for 802.3ad layer3+4 · 6f239c06
      Hangbin Liu authored
      
      [ Upstream commit b5f86218 ]
      
      After commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range
      in connect()"), we will try to use even ports for connect(). Then if an
      application (seen clearly with iperf) opens multiple streams to the same
      destination IP and port, each stream will be given an even source port.
      
      So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
      will always hash all these streams to the same interface, and the total
      throughput will be limited to a single slave.
      
      Changing the TCP code would affect TCP behavior as a whole just for the
      sake of bonding. Paolo Abeni suggested fixing this by changing only the
      bonding code, which is more reasonable and has less impact.
      
      Fix this by discarding the lowest hash bit because it contains little entropy.
      After the fix we can re-balance between slaves.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
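The effect of discarding the lowest hash bit, as described in the bonding commit above, can be seen in a small user-space sketch. It is not the bonding driver's hash: the raw source port stands in for the flow hash, and both function names are hypothetical. The point is only that even-only values always map to the same slave under a plain modulo, and spread out again once the lowest bit is shifted away.

```c
#include <stdint.h>

/* Hypothetical stand-in for the flow hash: only the source port varies. */
static uint32_t toy_hash(uint16_t src_port)
{
    return src_port;
}

/*
 * Pick a slave index from the hash. When discard_low_bit is non-zero,
 * the lowest bit is dropped first, modelling the behavior after the fix.
 * slave_cnt must be non-zero.
 */
unsigned int pick_slave(uint16_t src_port, unsigned int slave_cnt,
                        int discard_low_bit)
{
    uint32_t hash = toy_hash(src_port);

    if (discard_low_bit)
        hash >>= 1;
    return hash % slave_cnt;
}
```

With two slaves, the even ports 40000 and 40002 both map to slave 0 when the lowest bit is kept, but map to slaves 0 and 1 once it is discarded.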
    • netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed · afd9fa66
      Ye Yin authored
      
      [ Upstream commit 2b5ec1a5 ]
      
      When ipvs runs in two different network namespaces on the same host and
      one ipvs forwards traffic to the ipvs in the other network namespace, the
      'ipvs_property' flag makes the second ipvs take no effect. So we should
      clear 'ipvs_property' when the SKB's network namespace changes.
      
      Fixes: 621e84d6 ("dev: introduce skb_scrub_packet()")
      Signed-off-by: Ye Yin <hustcat@gmail.com>
      Signed-off-by: Wei Zhou <chouryzhou@gmail.com>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: do not mangle skb->cb[] in tcp_make_synack() · 3920a5bd
      Eric Dumazet authored
      
      [ Upstream commit 3b117750 ]
      
      Christoph Paasch sent a patch to address the following issue :
      
      tcp_make_synack() leaves some TCP private info in skb->cb[],
      then sends the packet by means other than tcp_transmit_skb().
      
      tcp_transmit_skb() makes sure to clear skb->cb[] so as not to confuse
      the IPv4/IPv6 stacks, but we have no such cleanup for SYNACK.
      
      tcp_make_synack() should not use tcp_init_nondata_skb() :
      
      tcp_init_nondata_skb() really should be limited to skbs put in write/rtx
      queues (the ones that are only sent via tcp_transmit_skb())
      
      This patch fixes the issue and should even save a few CPU cycles ;)
      
      Fixes: 971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Reviewed-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: vrf: correct FRA_L3MDEV encode type · 58b21b02
      Jeff Barnhill authored
      
      [ Upstream commit 18129a24 ]
      
      FRA_L3MDEV is defined as U8, but is being added as a U32 attribute. On
      big-endian architectures, this results in the l3mdev entry not being
      added to the FIB rules.
      
      Fixes: 1aa6c4f6 ("net: vrf: Add l3mdev rules on first device create")
      Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
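The following user-space sketch shows one plausible way the U8/U32 mismatch in the commit above becomes visible only on big-endian machines. It does not use the kernel's netlink helpers: all three function names are hypothetical, and the byte order is simulated explicitly so the sketch behaves the same on any host. It assumes the reader of a U8 attribute looks only at the first payload byte.

```c
#include <stdint.h>

/* Store v as a 4-byte payload in the requested byte order. */
void put_u32_payload(uint8_t out[4], uint32_t v, int big_endian)
{
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;

        out[i] = (uint8_t)(v >> shift);
    }
}

/* A reader expecting a u8 attribute looks only at the first payload byte. */
uint8_t get_u8_payload(const uint8_t *payload)
{
    return payload[0];
}

/* Write a value as u32, then read it back the way a u8 reader would. */
uint8_t read_as_u8_after_u32_write(uint32_t v, int big_endian)
{
    uint8_t buf[4];

    put_u32_payload(buf, v, big_endian);
    return get_u8_payload(buf);
}
```

Writing the value 1 in little-endian order happens to put it in the first byte, so the mismatch goes unnoticed. In big-endian order the first byte is 0, and the flag appears unset.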
    • tcp_nv: fix division by zero in tcpnv_acked() · b0e50c4e
      Konstantin Khlebnikov authored
      
      [ Upstream commit 4eebff27 ]
      
      Average RTT could become zero. This happened in real life at least twice.
      This patch treats zero as 1us.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: Lawrence Brakmo <Brakmo@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
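Treating a zero average RTT as 1us, as the tcp_nv commit above describes, amounts to clamping the divisor before use. The sketch below is a user-space illustration only. The function names and the rate formula are hypothetical and are not taken from tcpnv_acked().

```c
#include <stdint.h>

/* Treat a zero average RTT as 1 microsecond so it is safe as a divisor. */
uint32_t clamp_avg_rtt_us(uint32_t avg_rtt_us)
{
    return avg_rtt_us ? avg_rtt_us : 1;
}

/*
 * Hypothetical rate estimate in bytes per second: bytes in flight
 * divided by the (clamped) average RTT in microseconds.
 */
uint64_t bytes_per_sec(uint32_t bytes_in_flight, uint32_t avg_rtt_us)
{
    return (uint64_t)bytes_in_flight * 1000000u /
           clamp_avg_rtt_us(avg_rtt_us);
}
```

With the clamp in place, an RTT sample of zero yields a very large but finite rate instead of a divide error.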
  2. 21 Nov, 2017 28 commits