1. 05 Jan, 2016 23 commits
    • Johannes Berg's avatar
      mac80211: fix driver RSSI event calculations · 24a3387f
      Johannes Berg authored
      commit 8ec6d978 upstream.
      
      The ifmgd->ave_beacon_signal value cannot be taken as is for
      comparisons, it must be divided by since it's represented
      like that for better accuracy of the EWMA calculations. This
      would lead to invalid driver RSSI events. Fix the used value.
      
      Fixes: 615f7b9b ("mac80211: add driver RSSI threshold events")
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      24a3387f
    • Andrew Cooper's avatar
      x86/cpu: Fix SMAP check in PVOPS environments · 6119d487
      Andrew Cooper authored
      commit 581b7f15 upstream.
      
      There appears to be no formal statement of what pv_irq_ops.save_fl() is
      supposed to return precisely.  Native returns the full flags, while lguest and
      Xen only return the Interrupt Flag, and both have comments by the
      implementations stating that only the Interrupt Flag is looked at.  This may
      have been true when initially implemented, but no longer is.
      
      To make matters worse, the Xen PVOP leaves the upper bits undefined, making
      the BUG_ON() undefined behaviour.  Experimentally, this now trips for 32bit PV
      guests on Broadwell hardware.  The BUG_ON() is consistent for an individual
      build, but not consistent for all builds.  It has also been a sitting timebomb
      since SMAP support was introduced.
      
      Use native_save_fl() instead, which will obtain an accurate view of the AC
      flag.
      Signed-off-by: default avatarAndrew Cooper <andrew.cooper3@citrix.com>
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Tested-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <lguest@lists.ozlabs.org>
      Cc: Xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/1433323874-6927-1-git-send-email-andrew.cooper3@citrix.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6119d487
    • Borislav Petkov's avatar
      x86/cpu: Call verify_cpu() after having entered long mode too · fed70fc9
      Borislav Petkov authored
      commit 04633df0 upstream.
      
      When we get loaded by a 64-bit bootloader, kernel entry point is
      startup_64 in head_64.S. We don't trust any and all bootloaders because
      some will fiddle with CPU configuration so we go ahead and massage each
      CPU into sanity again.
      
      For example, some dell BIOSes have this XD disable feature which set
      IA32_MISC_ENABLE[34] and disable NX. This might be some dumb workaround
      for other OSes but Linux sure doesn't need it.
      
      A similar thing is present in the Surface 3 firmware - see
      https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit
      only on the BSP:
      
        # rdmsr -a 0x1a0
        400850089
        850089
        850089
        850089
      
      I know, right?!
      
      There's not even an off switch in there.
      
      So fix all those cases by sanitizing the 64-bit entry point too. For
      that, make verify_cpu() callable in 64-bit mode also.
      Requested-and-debugged-by: default avatar"H. Peter Anvin" <hpa@zytor.com>
      Reported-and-tested-by: default avatarBastien Nocera <bugzilla@hadess.net>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1446739076-21303-1-git-send-email-bp@alien8.deSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fed70fc9
    • Krzysztof Mazur's avatar
      x86/setup: Fix low identity map for >= 2GB kernel range · 590861b7
      Krzysztof Mazur authored
      commit 68accac3 upstream.
      
      The commit f5f3497c extended the low identity mapping. However, if
      the kernel uses more than 2 GB (VMSPLIT_2G_OPT or VMSPLIT_1G memory
      split), the normal memory mapping is overwritten by the low identity
      mapping causing a crash. To avoid overwritting, limit the low identity
      map to cover only memory before kernel range (PAGE_OFFSET).
      
      Fixes: f5f3497c "x86/setup: Extend low identity map to cover whole kernel range
      Signed-off-by: default avatarKrzysztof Mazur <krzysiek@podlesie.net>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Laszlo Ersek <lersek@redhat.com>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Link: http://lkml.kernel.org/r/1446815916-22105-1-git-send-email-krzysiek@podlesie.netSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      590861b7
    • Paolo Bonzini's avatar
      x86/setup: Extend low identity map to cover whole kernel range · 2c131230
      Paolo Bonzini authored
      commit f5f3497c upstream.
      
      On 32-bit systems, the initial_page_table is reused by
      efi_call_phys_prolog as an identity map to call
      SetVirtualAddressMap.  efi_call_phys_prolog takes care of
      converting the current CPU's GDT to a physical address too.
      
      For PAE kernels the identity mapping is achieved by aliasing the
      first PDPE for the kernel memory mapping into the first PDPE
      of initial_page_table.  This makes the EFI stub's trick "just work".
      
      However, for non-PAE kernels there is no guarantee that the identity
      mapping in the initial_page_table extends as far as the GDT; in this
      case, accesses to the GDT will cause a page fault (which quickly becomes
      a triple fault).  Fix this by copying the kernel mappings from
      swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
      identity mapping.
      
      For some reason, this is only reproducible with QEMU's dynamic translation
      mode, and not for example with KVM.  However, even under KVM one can clearly
      see that the page table is bogus:
      
          $ qemu-system-i386 -pflash OVMF.fd -M q35 vmlinuz0 -s -S -daemonize
          $ gdb
          (gdb) target remote localhost:1234
          (gdb) hb *0x02858f6f
          Hardware assisted breakpoint 1 at 0x2858f6f
          (gdb) c
          Continuing.
      
          Breakpoint 1, 0x02858f6f in ?? ()
          (gdb) monitor info registers
          ...
          GDT=     0724e000 000000ff
          IDT=     fffbb000 000007ff
          CR0=0005003b CR2=ff896000 CR3=032b7000 CR4=00000690
          ...
      
      The page directory is sane:
      
          (gdb) x/4wx 0x32b7000
          0x32b7000:	0x03398063	0x03399063	0x0339a063	0x0339b063
          (gdb) x/4wx 0x3398000
          0x3398000:	0x00000163	0x00001163	0x00002163	0x00003163
          (gdb) x/4wx 0x3399000
          0x3399000:	0x00400003	0x00401003	0x00402003	0x00403003
      
      but our particular page directory entry is empty:
      
          (gdb) x/1wx 0x32b7000 + (0x724e000 >> 22) * 4
          0x32b7070:	0x00000000
      
      [ It appears that you can skate past this issue if you don't receive
        any interrupts while the bogus GDT pointer is loaded, or if you avoid
        reloading the segment registers in general.
      
        Andy Lutomirski provides some additional insight:
      
         "AFAICT it's entirely permissible for the GDTR and/or LDT
          descriptor to point to unmapped memory.  Any attempt to use them
          (segment loads, interrupts, IRET, etc) will try to access that memory
          as if the access came from CPL 0 and, if the access fails, will
          generate a valid page fault with CR2 pointing into the GDT or
          LDT."
      
        Up until commit 23a0d4e8 ("efi: Disable interrupts around EFI
        calls, not in the epilog/prolog calls") interrupts were disabled
        around the prolog and epilog calls, and the functional GDT was
        re-installed before interrupts were re-enabled.
      
        Which explains why no one has hit this issue until now. ]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reported-by: default avatarLaszlo Ersek <lersek@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      [ Updated changelog. ]
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2c131230
    • Peter Ujfalusi's avatar
      ARM: common: edma: Fix channel parameter for irq callbacks · 3a78adf1
      Peter Ujfalusi authored
      commit 696d8b70 upstream.
      
      In case when the interrupt happened for the second eDMA the channel
      number was incorrectly passed to the client driver.
      Signed-off-by: default avatarPeter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3a78adf1
    • Marek Szyprowski's avatar
      ARM: 8427/1: dma-mapping: add support for offset parameter in dma_mmap() · e58fe788
      Marek Szyprowski authored
      commit 7e312103 upstream.
      
      IOMMU-based dma_mmap() implementation lacked proper support for offset
      parameter used in mmap call (it always assumed that mapping starts from
      offset zero). This patch adds support for offset parameter to IOMMU-based
      implementation.
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e58fe788
    • Marek Szyprowski's avatar
      ARM: 8426/1: dma-mapping: add missing range check in dma_mmap() · 3e8936ea
      Marek Szyprowski authored
      commit 371f0f08 upstream.
      
      dma_mmap() function in IOMMU-based dma-mapping implementation lacked
      a check for valid range of mmap parameters (offset and buffer size), what
      might have caused access beyond the allocated buffer. This patch fixes
      this issue.
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3e8936ea
    • Dmitry Tunin's avatar
      Bluetooth: ath3k: Add support of 04ca:300d AR3012 device · 941cb9aa
      Dmitry Tunin authored
      commit 7e730c7f upstream.
      
      BugLink: https://bugs.launchpad.net/bugs/1394368
      
      This device requires new firmware files
       AthrBT_0x11020100.dfu and ramps_0x11020100_40.dfu added to
      /lib/firmware/ar3k/ that are not included in linux-firmware yet.
      
      T: Bus=02 Lev=01 Prnt=01 Port=04 Cnt=03 Dev#= 5 Spd=12 MxCh= 0
      D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
      P: Vendor=04ca ProdID=300d Rev= 0.01
      C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms
      E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
      E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
      E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
      I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
      E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
      I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
      E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
      I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
      E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
      I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
      E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
      I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
      E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms
      Signed-off-by: default avatarDmitry Tunin <hanipouspilot@gmail.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      941cb9aa
    • Eric Dumazet's avatar
      ipv6: sctp: implement sctp_v6_destroy_sock() · d7ca7e9c
      Eric Dumazet authored
      [ Upstream commit 602dd62d ]
      
      Dmitry Vyukov reported a memory leak using IPV6 SCTP sockets.
      
      We need to call inet6_destroy_sock() to properly release
      inet6 specific fields.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d7ca7e9c
    • Konstantin Khlebnikov's avatar
      net/neighbour: fix crash at dumping device-agnostic proxy entries · 18e7027f
      Konstantin Khlebnikov authored
      [ Upstream commit 6adc5fd6 ]
      
      Proxy entries could have null pointer to net-device.
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Fixes: 84920c14 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      18e7027f
    • Eric Dumazet's avatar
      ipv6: add complete rcu protection around np->opt · 71781d1f
      Eric Dumazet authored
      [ Upstream commit 45f6fad8 ]
      
      This patch addresses multiple problems :
      
      UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
      while socket is not locked : Other threads can change np->opt
      concurrently. Dmitry posted a syzkaller
      (http://github.com/google/syzkaller) program desmonstrating
      use-after-free.
      
      Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
      and dccp_v6_request_recv_sock() also need to use RCU protection
      to dereference np->opt once (before calling ipv6_dup_options())
      
      This patch adds full RCU protection to np->opt
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      71781d1f
    • Michal Kubeček's avatar
      ipv6: distinguish frag queues by device for multicast and link-local packets · eac843e4
      Michal Kubeček authored
      [ Upstream commit 264640fc ]
      
      If a fragmented multicast packet is received on an ethernet device which
      has an active macvlan on top of it, each fragment is duplicated and
      received both on the underlying device and the macvlan. If some
      fragments for macvlan are processed before the whole packet for the
      underlying device is reassembled, the "overlapping fragments" test in
      ip6_frag_queue() discards the whole fragment queue.
      
      To resolve this, add device ifindex to the search key and require it to
      match reassembling multicast packets and packets to link-local
      addresses.
      
      Note: similar patch has been already submitted by Yoshifuji Hideaki in
      
        http://patchwork.ozlabs.org/patch/220979/
      
      but got lost and forgotten for some reason.
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      eac843e4
    • Aaro Koskinen's avatar
      broadcom: fix PHY_ID_BCM5481 entry in the id table · f0456018
      Aaro Koskinen authored
      [ Upstream commit 3c25a860 ]
      
      Commit fcb26ec5 ("broadcom: move all PHY_ID's to header")
      updated broadcom_tbl to use PHY_IDs, but incorrectly replaced 0x0143bca0
      with PHY_ID_BCM5482 (making a duplicate entry, and completely omitting
      the original). Fix that.
      
      Fixes: fcb26ec5 ("broadcom: move all PHY_ID's to header")
      Signed-off-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f0456018
    • Nikolay Aleksandrov's avatar
      net: ip6mr: fix static mfc/dev leaks on table destruction · a8c59bc8
      Nikolay Aleksandrov authored
      [ Upstream commit 4c698046 ]
      
      Similar to ipv4, when destroying an mrt table the static mfc entries and
      the static devices are kept, which leads to devices that can never be
      destroyed (because of refcnt taken) and leaked memory. Make sure that
      everything is cleaned up on netns destruction.
      
      Fixes: 8229efda ("netns: ip6mr: enable namespace support in ipv6 multicast forwarding code")
      CC: Benjamin Thery <benjamin.thery@bull.net>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a8c59bc8
    • Nikolay Aleksandrov's avatar
      net: ipmr: fix static mfc/dev leaks on table destruction · 536db34c
      Nikolay Aleksandrov authored
      [ Upstream commit 0e615e96 ]
      
      When destroying an mrt table the static mfc entries and the static
      devices are kept, which leads to devices that can never be destroyed
      (because of refcnt taken) and leaked memory, for example:
      unreferenced object 0xffff880034c144c0 (size 192):
        comm "mfc-broken", pid 4777, jiffies 4320349055 (age 46001.964s)
        hex dump (first 32 bytes):
          98 53 f0 34 00 88 ff ff 98 53 f0 34 00 88 ff ff  .S.4.....S.4....
          ef 0a 0a 14 01 02 03 04 00 00 00 00 01 00 00 00  ................
        backtrace:
          [<ffffffff815c1b9e>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff811ea6e0>] kmem_cache_alloc+0x190/0x300
          [<ffffffff815931cb>] ip_mroute_setsockopt+0x5cb/0x910
          [<ffffffff8153d575>] do_ip_setsockopt.isra.11+0x105/0xff0
          [<ffffffff8153e490>] ip_setsockopt+0x30/0xa0
          [<ffffffff81564e13>] raw_setsockopt+0x33/0x90
          [<ffffffff814d1e14>] sock_common_setsockopt+0x14/0x20
          [<ffffffff814d0b51>] SyS_setsockopt+0x71/0xc0
          [<ffffffff815cdbf6>] entry_SYSCALL_64_fastpath+0x16/0x7a
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Make sure that everything is cleaned on netns destruction.
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      536db34c
    • Daniel Borkmann's avatar
      net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds · 84786621
      Daniel Borkmann authored
      [ Upstream commit 6900317f ]
      
      David and HacKurx reported a following/similar size overflow triggered
      in a grsecurity kernel, thanks to PaX's gcc size overflow plugin:
      
      (Already fixed in later grsecurity versions by Brad and PaX Team.)
      
      [ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314
                     cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr;
      [ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7
      [ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, [...]
      [ 1002.296153]  ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8
      [ 1002.296162]  ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8
      [ 1002.296169]  ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60
      [ 1002.296176] Call Trace:
      [ 1002.296190]  [<ffffffff818129ba>] dump_stack+0x45/0x57
      [ 1002.296200]  [<ffffffff8121f838>] report_size_overflow+0x38/0x60
      [ 1002.296209]  [<ffffffff816a979e>] scm_detach_fds+0x2ce/0x300
      [ 1002.296220]  [<ffffffff81791899>] unix_stream_read_generic+0x609/0x930
      [ 1002.296228]  [<ffffffff81791c9f>] unix_stream_recvmsg+0x4f/0x60
      [ 1002.296236]  [<ffffffff8178dc00>] ? unix_set_peek_off+0x50/0x50
      [ 1002.296243]  [<ffffffff8168fac7>] sock_recvmsg+0x47/0x60
      [ 1002.296248]  [<ffffffff81691522>] ___sys_recvmsg+0xe2/0x1e0
      [ 1002.296257]  [<ffffffff81693496>] __sys_recvmsg+0x46/0x80
      [ 1002.296263]  [<ffffffff816934fc>] SyS_recvmsg+0x2c/0x40
      [ 1002.296271]  [<ffffffff8181a3ab>] entry_SYSCALL_64_fastpath+0x12/0x85
      
      Further investigation showed that this can happen when an *odd* number of
      fds are being passed over AF_UNIX sockets.
      
      In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)),
      where i is the number of successfully passed fds, differ by 4 bytes due
      to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary
      on 64 bit. The padding is used to align subsequent cmsg headers in the
      control buffer.
      
      When the control buffer passed in from the receiver side *lacks* these 4
      bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will
      overflow in scm_detach_fds():
      
        int cmlen = CMSG_LEN(i * sizeof(int));  <--- cmlen w/o tail-padding
        err = put_user(SOL_SOCKET, &cm->cmsg_level);
        if (!err)
          err = put_user(SCM_RIGHTS, &cm->cmsg_type);
        if (!err)
          err = put_user(cmlen, &cm->cmsg_len);
        if (!err) {
          cmlen = CMSG_SPACE(i * sizeof(int));  <--- cmlen w/ 4 byte extra tail-padding
          msg->msg_control += cmlen;
          msg->msg_controllen -= cmlen;         <--- iff no tail-padding space here ...
        }                                            ... wrap-around
      
      F.e. it will wrap to a length of 18446744073709551612 bytes in case the
      receiver passed in msg->msg_controllen of 20 bytes, and the sender
      properly transferred 1 fd to the receiver, so that its CMSG_LEN results
      in 20 bytes and CMSG_SPACE in 24 bytes.
      
      In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an
      issue in my tests as alignment seems always on 4 byte boundary. Same
      should be in case of native 32 bit, where we end up with 4 byte boundaries
      as well.
      
      In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving
      a single fd would mean that on successful return, msg->msg_controllen is
      being set by the kernel to 24 bytes instead, thus more than the input
      buffer advertised. It could f.e. become an issue if such application later
      on zeroes or copies the control buffer based on the returned msg->msg_controllen
      elsewhere.
      
      Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253).
      
      Going over the code, it seems like msg->msg_controllen is not being read
      after scm_detach_fds() in scm_recv() anymore by the kernel, good!
      
      Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg())
      and unix_stream_recvmsg(). Both return back to their recvmsg() caller,
      and ___sys_recvmsg() places the updated length, that is, new msg_control -
      old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen
      in the example).
      
      Long time ago, Wei Yongjun fixed something related in commit 1ac70e7a
      ("[NET]: Fix function put_cmsg() which may cause usr application memory
      overflow").
      
      RFC3542, section 20.2. says:
      
        The fields shown as "XX" are possible padding, between the cmsghdr
        structure and the data, and between the data and the next cmsghdr
        structure, if required by the implementation. While sending an
        application may or may not include padding at the end of last
        ancillary data in msg_controllen and implementations must accept both
        as valid. On receiving a portable application must provide space for
        padding at the end of the last ancillary data as implementations may
        copy out the padding at the end of the control message buffer and
        include it in the received msg_controllen. When recvmsg() is called
        if msg_controllen is too small for all the ancillary data items
        including any trailing padding after the last item an implementation
        may set MSG_CTRUNC.
      
      Since we didn't place MSG_CTRUNC for already quite a long time, just do
      the same as in 1ac70e7a to avoid an overflow.
      
      Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix
      error in SCM_RIGHTS code sample"). Some people must have copied this (?),
      thus it got triggered in the wild (reported several times during boot by
      David and HacKurx).
      
      No Fixes tag this time as pre 2002 (that is, pre history tree).
      Reported-by: default avatarDavid Sterba <dave@jikos.cz>
      Reported-by: default avatarHacKurx <hackurx@gmail.com>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: Eric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      84786621
    • Eric Dumazet's avatar
      tcp: initialize tp->copied_seq in case of cross SYN connection · 9c43328b
      Eric Dumazet authored
      [ Upstream commit 142a2e7e ]
      
      Dmitry provided a syzkaller (http://github.com/google/syzkaller)
      generated program that triggers the WARNING at
      net/ipv4/tcp.c:1729 in tcp_recvmsg() :
      
      WARN_ON(tp->copied_seq != tp->rcv_nxt &&
              !(flags & (MSG_PEEK | MSG_TRUNC)));
      
      His program is specifically attempting a Cross SYN TCP exchange,
      that we support (for the pleasure of hackers ?), but it looks we
      lack proper tcp->copied_seq initialization.
      
      Thanks again Dmitry for your report and testings.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9c43328b
    • Eric Dumazet's avatar
      tcp: md5: fix lockdep annotation · 42a3f2ae
      Eric Dumazet authored
      [ Upstream commit 1b8e6a01 ]
      
      When a passive TCP is created, we eventually call tcp_md5_do_add()
      with sk pointing to the child. It is not owner by the user yet (we
      will add this socket into listener accept queue a bit later anyway)
      
      But we do own the spinlock, so amend the lockdep annotation to avoid
      following splat :
      
      [ 8451.090932] net/ipv4/tcp_ipv4.c:923 suspicious rcu_dereference_protected() usage!
      [ 8451.090932]
      [ 8451.090932] other info that might help us debug this:
      [ 8451.090932]
      [ 8451.090934]
      [ 8451.090934] rcu_scheduler_active = 1, debug_locks = 1
      [ 8451.090936] 3 locks held by socket_sockopt_/214795:
      [ 8451.090936]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff855c6ac1>] __netif_receive_skb_core+0x151/0xe90
      [ 8451.090947]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff85618143>] ip_local_deliver_finish+0x43/0x2b0
      [ 8451.090952]  #2:  (slock-AF_INET){+.-...}, at: [<ffffffff855acda5>] sk_clone_lock+0x1c5/0x500
      [ 8451.090958]
      [ 8451.090958] stack backtrace:
      [ 8451.090960] CPU: 7 PID: 214795 Comm: socket_sockopt_
      
      [ 8451.091215] Call Trace:
      [ 8451.091216]  <IRQ>  [<ffffffff856fb29c>] dump_stack+0x55/0x76
      [ 8451.091229]  [<ffffffff85123b5b>] lockdep_rcu_suspicious+0xeb/0x110
      [ 8451.091235]  [<ffffffff8564544f>] tcp_md5_do_add+0x1bf/0x1e0
      [ 8451.091239]  [<ffffffff85645751>] tcp_v4_syn_recv_sock+0x1f1/0x4c0
      [ 8451.091242]  [<ffffffff85642b27>] ? tcp_v4_md5_hash_skb+0x167/0x190
      [ 8451.091246]  [<ffffffff85647c78>] tcp_check_req+0x3c8/0x500
      [ 8451.091249]  [<ffffffff856451ae>] ? tcp_v4_inbound_md5_hash+0x11e/0x190
      [ 8451.091253]  [<ffffffff85647170>] tcp_v4_rcv+0x3c0/0x9f0
      [ 8451.091256]  [<ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0
      [ 8451.091260]  [<ffffffff856181b6>] ip_local_deliver_finish+0xb6/0x2b0
      [ 8451.091263]  [<ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0
      [ 8451.091267]  [<ffffffff85618d38>] ip_local_deliver+0x48/0x80
      [ 8451.091270]  [<ffffffff85618510>] ip_rcv_finish+0x160/0x700
      [ 8451.091273]  [<ffffffff8561900e>] ip_rcv+0x29e/0x3d0
      [ 8451.091277]  [<ffffffff855c74b7>] __netif_receive_skb_core+0xb47/0xe90
      
      Fixes: a8afca03 ("tcp: md5: protects md5sig_info with RCU")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      42a3f2ae
    • Bjørn Mork's avatar
      net: qmi_wwan: add XS Stick W100-2 from 4G Systems · 94f06ca7
      Bjørn Mork authored
      [ Upstream commit 68242a5a ]
      
      Thomas reports
      "
      4gsystems sells two total different LTE-surfsticks under the same name.
      ..
      The newer version of XS Stick W100 is from "omega"
      ..
      Under windows the driver switches to the same ID, and uses MI03\6 for
      network and MI01\6 for modem.
      ..
      echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
      echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id
      
      T:  Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=1c9e ProdID=9b01 Rev=02.32
      S:  Manufacturer=USB Modem
      S:  Product=USB Modem
      S:  SerialNumber=
      C:  #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      I:  If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
      
      Now all important things are there:
      
      wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at)
      
      There is also ttyUSB0, but it is not usable, at least not for at.
      
      The device works well with qmi and ModemManager-NetworkManager.
      "
      Reported-by: default avatarThomas Schäfer <tschaefer@t-online.de>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      94f06ca7
    • Neil Horman's avatar
      snmp: Remove duplicate OUTMCAST stat increment · 1165ac07
      Neil Horman authored
      [ Upstream commit 41033f02 ]
      
      the OUTMCAST stat is double incremented, getting bumped once in the mcast code
      itself, and again in the common ip output path.  Remove the mcast bump, as its
      not needed
      
      Validated by the reporter, with good results
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarClaus Jensen <claus.jensen@microsemi.com>
      CC: Claus Jensen <claus.jensen@microsemi.com>
      CC: David Miller <davem@davemloft.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1165ac07
    • lucien's avatar
      sctp: translate host order to network order when setting a hmacid · 228b4355
      lucien authored
      [ Upstream commit ed5a377d ]
      
      now sctp auth cannot work well when setting a hmacid manually, which
      is caused by that we didn't use the network order for hmacid, so fix
      it by adding the transformation in sctp_auth_ep_set_hmacs.
      
      even we set hmacid with the network order in userspace, it still
      can't work, because of this condition in sctp_auth_ep_set_hmacs():
      
      		if (id > SCTP_AUTH_HMAC_ID_MAX)
      			return -EOPNOTSUPP;
      
      so this wasn't working before and thus it won't break compatibility.
      
      Fixes: 65b07e5d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      228b4355
    • Daniel Borkmann's avatar
      packet: infer protocol from ethernet header if unset · fbc3cdbe
      Daniel Borkmann authored
      [ Upstream commit c72219b7 ]
      
      In case no struct sockaddr_ll has been passed to packet
      socket's sendmsg() when doing a TX_RING flush run, then
      skb->protocol is set to po->num instead, which is the protocol
      passed via socket(2)/bind(2).
      
      Applications only xmitting can go the path of allocating the
      socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the
      TX_RING with sll_protocol of 0. That way, register_prot_hook()
      is neither called on creation nor on bind time, which saves
      cycles when there's no interest in capturing anyway.
      
      That leaves us however with po->num 0 instead and therefore
      the TX_RING flush run sets skb->protocol to 0 as well. Eric
      reported that this leads to problems when using tools like
      trafgen over bonding device. I.e. the bonding's hash function
      could invoke the kernel's flow dissector, which depends on
      skb->protocol being properly set. In the current situation, all
      the traffic is then directed to a single slave.
      
      Fix it up by inferring skb->protocol from the Ethernet header
      when not set and we have ARPHRD_ETHER device type. This is only
      done in case of SOCK_RAW and where we have a dev->hard_header_len
      length. In case of ARPHRD_ETHER devices, this is guaranteed to
      cover ETH_HLEN, and therefore being accessed on the skb after
      the skb_store_bits().
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fbc3cdbe
  2. 15 Dec, 2015 2 commits
    • Daniel Borkmann's avatar
      packet: do skb_probe_transport_header when we actually have data · be67618b
      Daniel Borkmann authored
      [ Upstream commit efdfa2f7 ]
      
      In tpacket_fill_skb() commit c1aad275 ("packet: set transport
      header before doing xmit") and later on 40893fd0 ("net: switch
      to use skb_probe_transport_header()") was probing for a transport
      header on the skb from a ring buffer slot, but at a time, where
      the skb has _not even_ been filled with data yet. So that call into
      the flow dissector is pretty useless. Lets do it after we've set
      up the skb frags.
      
      Fixes: c1aad275 ("packet: set transport header before doing xmit")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      be67618b
    • Rainer Weikusat's avatar
      unix: avoid use-after-free in ep_remove_wait_queue · 9964b4c4
      Rainer Weikusat authored
      [ Upstream commit 7d267278 ]
      
      Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
      An AF_UNIX datagram socket being the client in an n:1 association with
      some server socket is only allowed to send messages to the server if the
      receive queue of this socket contains at most sk_max_ack_backlog
      datagrams. This implies that prospective writers might be forced to go
      to sleep despite none of the message presently enqueued on the server
      receive queue were sent by them. In order to ensure that these will be
      woken up once space becomes again available, the present unix_dgram_poll
      routine does a second sock_poll_wait call with the peer_wait wait queue
      of the server socket as queue argument (unix_dgram_recvmsg does a wake
      up on this queue after a datagram was received). This is inherently
      problematic because the server socket is only guaranteed to remain alive
      for as long as the client still holds a reference to it. In case the
      connection is dissolved via connect or by the dead peer detection logic
      in unix_dgram_sendmsg, the server socket may be freed despite "the
      polling mechanism" (in particular, epoll) still has a pointer to the
      corresponding peer_wait queue. There's no way to forcibly deregister a
      wait queue with epoll.
      
      Based on an idea by Jason Baron, the patch below changes the code such
      that a wait_queue_t belonging to the client socket is enqueued on the
      peer_wait queue of the server whenever the peer receive queue full
      condition is detected by either a sendmsg or a poll. A wake up on the
      peer queue is then relayed to the ordinary wait queue of the client
      socket via wake function. The connection to the peer wait queue is again
      dissolved if either a wake up is about to be relayed or the client
      socket reconnects or a dead peer is detected or the client socket is
      itself closed. This enables removing the second sock_poll_wait from
      unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
      that no blocked writer sleeps forever.
      Signed-off-by: default avatarRainer Weikusat <rweikusat@mobileactivedefense.com>
      Fixes: ec0d215f ("af_unix: fix 'poll for write'/connected DGRAM sockets")
      Reviewed-by: default avatarJason Baron <jbaron@akamai.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9964b4c4
  3. 14 Dec, 2015 3 commits
  4. 30 Nov, 2015 1 commit
  5. 23 Nov, 2015 1 commit
  6. 18 Nov, 2015 10 commits
    • Yasuaki Ishimatsu's avatar
      x86/mm/hotplug: Modify PGD entry when removing memory · 5f2f9512
      Yasuaki Ishimatsu authored
      commit 9661d5bc upstream.
      
      When hot-adding/removing memory, sync_global_pgds() is called
      for synchronizing PGD to PGD entries of all processes MM.  But
      when hot-removing memory, sync_global_pgds() does not work
      correctly.
      
      At first, sync_global_pgds() checks whether target PGD is none
      or not.  And if PGD is none, the PGD is skipped.  But when
      hot-removing memory, PGD may be none since PGD may be cleared by
      free_pud_table().  So when sync_global_pgds() is called after
      hot-removing memory, sync_global_pgds() should not skip PGD even
      if the PGD is none.  And sync_global_pgds() must clear PGD
      entries of all processes MM.
      
      Currently sync_global_pgds() does not clear PGD entries of all
      processes MM when hot-removing memory.  So when hot adding
      memory which is same memory range as removed memory after
      hot-removing memory, following call traces are shown:
      
       kernel BUG at arch/x86/mm/init_64.c:206!
       ...
       [<ffffffff815e0c80>] kernel_physical_mapping_init+0x1b2/0x1d2
       [<ffffffff815ced94>] init_memory_mapping+0x1d4/0x380
       [<ffffffff8104aebd>] arch_add_memory+0x3d/0xd0
       [<ffffffff815d03d9>] add_memory+0xb9/0x1b0
       [<ffffffff81352415>] acpi_memory_device_add+0x1af/0x28e
       [<ffffffff81325dc4>] acpi_bus_device_attach+0x8c/0xf0
       [<ffffffff813413b9>] acpi_ns_walk_namespace+0xc8/0x17f
       [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7
       [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7
       [<ffffffff813418ed>] acpi_walk_namespace+0x95/0xc5
       [<ffffffff81326b4c>] acpi_bus_scan+0x9a/0xc2
       [<ffffffff81326bff>] acpi_scan_bus_device_check+0x8b/0x12e
       [<ffffffff81326cb5>] acpi_scan_device_check+0x13/0x15
       [<ffffffff81320122>] acpi_os_execute_deferred+0x25/0x32
       [<ffffffff8107e02b>] process_one_work+0x17b/0x460
       [<ffffffff8107edfb>] worker_thread+0x11b/0x400
       [<ffffffff8107ece0>] ? rescuer_thread+0x400/0x400
       [<ffffffff81085aef>] kthread+0xcf/0xe0
       [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
       [<ffffffff815fc76c>] ret_from_fork+0x7c/0xb0
       [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
      
      This patch clears PGD entries of all processes MM when
      sync_global_pgds() is called after hot-removing memory
      Signed-off-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Acked-by: default avatarToshi Kani <toshi.kani@hp.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5f2f9512
    • Yasuaki Ishimatsu's avatar
      x86/mm/hotplug: Pass sync_global_pgds() a correct argument in remove_pagetable() · 3434ce3d
      Yasuaki Ishimatsu authored
      commit 5255e0a7 upstream.
      
      When hot-adding memory after hot-removing memory, following call
      traces are shown:
      
        kernel BUG at arch/x86/mm/init_64.c:206!
        ...
       [<ffffffff815e0c80>] kernel_physical_mapping_init+0x1b2/0x1d2
       [<ffffffff815ced94>] init_memory_mapping+0x1d4/0x380
       [<ffffffff8104aebd>] arch_add_memory+0x3d/0xd0
       [<ffffffff815d03d9>] add_memory+0xb9/0x1b0
       [<ffffffff81352415>] acpi_memory_device_add+0x1af/0x28e
       [<ffffffff81325dc4>] acpi_bus_device_attach+0x8c/0xf0
       [<ffffffff813413b9>] acpi_ns_walk_namespace+0xc8/0x17f
       [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7
       [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7
       [<ffffffff813418ed>] acpi_walk_namespace+0x95/0xc5
       [<ffffffff81326b4c>] acpi_bus_scan+0x9a/0xc2
       [<ffffffff81326bff>] acpi_scan_bus_device_check+0x8b/0x12e
       [<ffffffff81326cb5>] acpi_scan_device_check+0x13/0x15
       [<ffffffff81320122>] acpi_os_execute_deferred+0x25/0x32
       [<ffffffff8107e02b>] process_one_work+0x17b/0x460
       [<ffffffff8107edfb>] worker_thread+0x11b/0x400
       [<ffffffff8107ece0>] ? rescuer_thread+0x400/0x400
       [<ffffffff81085aef>] kthread+0xcf/0xe0
       [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
       [<ffffffff815fc76c>] ret_from_fork+0x7c/0xb0
       [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
      
      The patch-set fixes the issue.
      
      This patch (of 2):
      
      remove_pagetable() gets start argument and passes the argument
      to sync_global_pgds().  In this case, the argument must not be
      modified.  If the argument is modified and passed to
      sync_global_pgds(), sync_global_pgds() does not correctly
      synchronize PGD to PGD entries of all processes MM since
      synchronized range of memory [start, end] is wrong.
      
      Unfortunately the start argument is modified in
      remove_pagetable().  So this patch fixes the issue.
      Signed-off-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Acked-by: default avatarToshi Kani <toshi.kani@hp.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3434ce3d
    • Marcelo Leitner's avatar
      ipv6: addrconf: validate new MTU before applying it · 49f9add0
      Marcelo Leitner authored
      commit 77751427 upstream.
      
      Currently we don't check if the new MTU is valid or not and this allows
      one to configure a smaller than minimum allowed by RFCs or even bigger
      than interface own MTU, which is a problem as it may lead to packet
      drops.
      
      If you have a daemon like NetworkManager running, this may be exploited
      by remote attackers by forging RA packets with an invalid MTU, possibly
      leading to a DoS. (NetworkManager currently only validates for values
      too small, but not for too big ones.)
      
      The fix is just to make sure the new value is valid. That is, between
      IPV6_MIN_MTU and interface's MTU.
      
      Note that similar check is already performed at
      ndisc_router_discovery(), for when kernel itself parses the RA.
      Signed-off-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      49f9add0
    • Nadav Amit's avatar
      KVM: x86: Use new is_noncanonical_address in _linearize · 2b27106c
      Nadav Amit authored
      commit 4be4de7e upstream.
      
      Replace the current canonical address check with the new function which is
      identical.
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2b27106c
    • Nadav Amit's avatar
      KVM: x86: Fix far-jump to non-canonical check · e4bcfa44
      Nadav Amit authored
      commit 7e46dddd upstream.
      
      Commit d1442d85 ("KVM: x86: Handle errors when RIP is set during far
      jumps") introduced a bug that caused the fix to be incomplete.  Due to
      incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit
      segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may
      not trigger #GP.  As we know, this imposes a security problem.
      
      In addition, the condition for two warnings was incorrect.
      
      Fixes: d1442d85Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      [Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e4bcfa44
    • Paolo Bonzini's avatar
      KVM: svm: unconditionally intercept #DB · 4c6a0e0e
      Paolo Bonzini authored
      commit cbdb967a upstream.
      
      This is needed to avoid the possibility that the guest triggers
      an infinite stream of #DB exceptions (CVE-2015-8104).
      
      VMX is not affected: because it does not save DR6 in the VMCS,
      it already intercepts #DB unconditionally.
      Reported-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4c6a0e0e
    • Eric Northup's avatar
      KVM: x86: work around infinite loop in microcode when #AC is delivered · 0ccaee7b
      Eric Northup authored
      commit 54a20552 upstream.
      
      It was found that a guest can DoS a host by triggering an infinite
      stream of "alignment check" (#AC) exceptions.  This causes the
      microcode to enter an infinite loop where the core never receives
      another interrupt.  The host kernel panics pretty quickly due to the
      effects (CVE-2015-5307).
      Signed-off-by: default avatarEric Northup <digitaleric@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0ccaee7b
    • Nadav Amit's avatar
      KVM: x86: Defining missing x86 vectors · 97a51976
      Nadav Amit authored
      commit c9cdd085 upstream.
      
      Defining XE, XM and VE vector numbers.
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      97a51976
    • David Howells's avatar
      KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring · bd6e0469
      David Howells authored
      commit f05819df upstream.
      
      The following sequence of commands:
      
          i=`keyctl add user a a @s`
          keyctl request2 keyring foo bar @t
          keyctl unlink $i @s
      
      tries to invoke an upcall to instantiate a keyring if one doesn't already
      exist by that name within the user's keyring set.  However, if the upcall
      fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
      other error code.  When the key is garbage collected, the key destroy
      function is called unconditionally and keyring_destroy() uses list_empty()
      on keyring->type_data.link - which is in a union with reject_error.
      Subsequently, the kernel tries to unlink the keyring from the keyring names
      list - which oopses like this:
      
      	BUG: unable to handle kernel paging request at 00000000ffffff8a
      	IP: [<ffffffff8126e051>] keyring_destroy+0x3d/0x88
      	...
      	Workqueue: events key_garbage_collector
      	...
      	RIP: 0010:[<ffffffff8126e051>] keyring_destroy+0x3d/0x88
      	RSP: 0018:ffff88003e2f3d30  EFLAGS: 00010203
      	RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
      	RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
      	RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
      	R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
      	R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
      	...
      	CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
      	...
      	Call Trace:
      	 [<ffffffff8126c756>] key_gc_unused_keys.constprop.1+0x5d/0x10f
      	 [<ffffffff8126ca71>] key_garbage_collector+0x1fa/0x351
      	 [<ffffffff8105ec9b>] process_one_work+0x28e/0x547
      	 [<ffffffff8105fd17>] worker_thread+0x26e/0x361
      	 [<ffffffff8105faa9>] ? rescuer_thread+0x2a8/0x2a8
      	 [<ffffffff810648ad>] kthread+0xf3/0xfb
      	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
      	 [<ffffffff815f2ccf>] ret_from_fork+0x3f/0x70
      	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
      
      Note the value in RAX.  This is a 32-bit representation of -ENOKEY.
      
      The solution is to only call ->destroy() if the key was successfully
      instantiated.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bd6e0469
    • David Howells's avatar
      KEYS: Fix race between key destruction and finding a keyring by name · f4562591
      David Howells authored
      commit 94c4554b upstream.
      
      There appears to be a race between:
      
       (1) key_gc_unused_keys() which frees key->security and then calls
           keyring_destroy() to unlink the name from the name list
      
       (2) find_keyring_by_name() which calls key_permission(), thus accessing
           key->security, on a key before checking to see whether the key usage is 0
           (ie. the key is dead and might be cleaned up).
      
      Fix this by calling ->destroy() before cleaning up the core key data -
      including key->security.
      Reported-by: default avatarPetr Matousek <pmatouse@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f4562591