1. 17 Jan, 2018 10 commits
    • MIPS: Also verify sizeof `elf_fpreg_t' with PTRACE_SETREGSET · 725679dc
      Maciej W. Rozycki authored
      commit 006501e0 upstream.
      
      Complement commit d614fd58 ("mips/ptrace: Preserve previous
      registers for short regset write") and like with the PTRACE_GETREGSET
      ptrace(2) request also apply a BUILD_BUG_ON check for the size of the
      `elf_fpreg_t' type in the PTRACE_SETREGSET request handler.
      
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Fixes: d614fd58 ("mips/ptrace: Preserve previous registers for short regset write")
      Cc: James Hogan <james.hogan@mips.com>
      Cc: Paul Burton <Paul.Burton@mips.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/17929/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Fix an FCSR access API regression with NT_PRFPREG and MSA · 9584ae52
      Maciej W. Rozycki authored
      commit be07a6a1 upstream.
      
      Fix a commit 72b22bba ("MIPS: Don't assume 64-bit FP registers for
      FP regset") public API regression, then activated by commit 1db1af84
      ("MIPS: Basic MSA context switching support"), that caused the FCSR
      register not to be read or written for CONFIG_CPU_HAS_MSA kernel
      configurations (regardless of actual presence or absence of the MSA
      feature in a given processor) with ptrace(2) PTRACE_GETREGSET and
      PTRACE_SETREGSET requests nor recorded in core dumps.
      
      This is because with !CONFIG_CPU_HAS_MSA configurations the whole of
      `elf_fpregset_t' array is bulk-copied as it is, which includes the FCSR
      in one half of the last, 33rd slot, whereas with CONFIG_CPU_HAS_MSA
      configurations array elements are copied individually, and then only the
      leading 32 FGR slots while the remaining slot is ignored.
      
      Correct the code then such that only FGR slots are copied in the
      respective !MSA and MSA helpers and then the FCSR slot is handled
      separately in common code.  Use `ptrace_setfcr31' to update the FCSR
      too, so that the read-only mask is respected.
      
      Retrieving a correct value of FCSR is important in debugging not only
      for the human to be able to get the right interpretation of the
      situation, but for correct operation of GDB as well.  This is because
      the condition code bits in FCSR are used by GDB to determine the
      location to place a breakpoint at when single-stepping through an FPU
      branch instruction.  If such a breakpoint is placed incorrectly (i.e.
      with the condition reversed), then it will be missed, likely causing the
      debuggee to run away from the control of GDB and consequently breaking
      the process of investigation.
      
      Fortunately GDB continues using the older PTRACE_GETFPREGS ptrace(2)
      request which is unaffected, so the regression only really hits with
      post-mortem debug sessions using a core dump file, in which case
      execution, and consequently single-stepping through branches is not
      possible.  Of course core files created by buggy kernels out there will
      have the value of FCSR recorded clobbered, but such core files cannot be
      corrected and the person using them simply will have to be aware that
      the value of FCSR retrieved is not reliable.
      
      Which also means we can likely get away without defining a replacement
      API which would ensure a correct value of FCSR to be retrieved, or none
      at all.
      
      This is based on previous work by Alex Smith, extensively rewritten.
      
      Signed-off-by: Alex Smith <alex@alex-smith.me.uk>
      Signed-off-by: James Hogan <james.hogan@mips.com>
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Fixes: 72b22bba ("MIPS: Don't assume 64-bit FP registers for FP regset")
      Cc: Paul Burton <Paul.Burton@mips.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/17928/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
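The split described in the FCSR fix (FGR slots copied by a helper, FCSR handled separately with a read-only mask) might be modelled in user space roughly as follows. This is only an illustrative sketch: `struct fp_regset`, `struct fpu_state`, `copy_fgrs_in`, `set_fcsr` and `regset_write` are hypothetical names, the layout of the 33rd slot is simplified, and none of this is the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_FGR 32  /* the leading 32 FP general register slots */

/* Hypothetical model of the regset payload: 32 FGR slots followed by a
 * 33rd slot, one half of which carries the FCSR. */
struct fp_regset {
    uint64_t fgr[NUM_FGR];
    uint32_t fcsr;
    uint32_t pad;
};

/* Hypothetical model of the per-task FPU context. */
struct fpu_state {
    uint64_t fgr[NUM_FGR];
    uint32_t fcsr;
};

/* Helper that copies the FGR slots only and never touches the FCSR. */
static void copy_fgrs_in(struct fpu_state *fpu, const struct fp_regset *rs)
{
    memcpy(fpu->fgr, rs->fgr, sizeof(fpu->fgr));
}

/* Update the FCSR while keeping every bit covered by ro_mask unchanged. */
static void set_fcsr(struct fpu_state *fpu, uint32_t value, uint32_t ro_mask)
{
    fpu->fcsr = (fpu->fcsr & ro_mask) | (value & ~ro_mask);
}

/* Common code: FGRs first, then the FCSR slot handled separately. */
static void regset_write(struct fpu_state *fpu, const struct fp_regset *rs,
                         uint32_t ro_mask)
{
    assert(fpu != NULL && rs != NULL);
    copy_fgrs_in(fpu, rs);
    set_fcsr(fpu, rs->fcsr, ro_mask);
}
```

Keeping the FCSR update in one common function means both configurations go through the same masking step, which is the property the commit message asks for.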
    • MIPS: Consistently handle buffer counter with PTRACE_SETREGSET · a6972f8b
      Maciej W. Rozycki authored
      commit 80b3ffce upstream.
      
      Update commit d614fd58 ("mips/ptrace: Preserve previous registers
      for short regset write") bug and consistently consume all data supplied
      to `fpr_set_msa' with the ptrace(2) PTRACE_SETREGSET request, such that
      a zero data buffer counter is returned where insufficient data has been
      given to fill a whole number of FP general registers.
      
      In reality this is not going to happen, as the caller is supposed to
      only supply data covering a whole number of registers and it is verified
      in `ptrace_regset' and again asserted in `fpr_set', however structuring
      code such that the presence of trailing partial FP general register data
      causes `fpr_set_msa' to return with a non-zero data buffer counter makes
      it appear that this trailing data will be used if there are subsequent
      writes made to FP registers, which is going to be the case with the FCSR
      once the missing write to that register has been fixed.
      
      Fixes: d614fd58 ("mips/ptrace: Preserve previous registers for short regset write")
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Cc: James Hogan <james.hogan@mips.com>
      Cc: Paul Burton <Paul.Burton@mips.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/17927/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Guard against any partial write attempt with PTRACE_SETREGSET · e68049f6
      Maciej W. Rozycki authored
      commit dc24d0ed upstream.
      
      Complement commit d614fd58 ("mips/ptrace: Preserve previous
      registers for short regset write") and ensure that no partial register
      write attempt is made with PTRACE_SETREGSET, as we do not preinitialize
      any temporaries used to hold incoming register data and consequently
      random data could be written.
      
      It is the responsibility of the caller, such as `ptrace_regset', to
      arrange for writes to span whole registers only, so here we only assert
      that it has indeed happened.
      
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Fixes: 72b22bba ("MIPS: Don't assume 64-bit FP registers for FP regset")
      Cc: James Hogan <james.hogan@mips.com>
      Cc: Paul Burton <Paul.Burton@mips.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/17926/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
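The whole-register condition asserted by the partial-write guard can be illustrated with a small check. This is a sketch under stated assumptions: `fpreg_t` is a hypothetical 64-bit stand-in for `elf_fpreg_t', and `spans_whole_registers` is an invented helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for elf_fpreg_t; assumed to be 64 bits wide here. */
typedef uint64_t fpreg_t;

/* Return non-zero when a write starting at byte offset pos and covering
 * count bytes begins and ends on register boundaries, i.e. it never
 * touches only part of a register. */
static int spans_whole_registers(size_t pos, size_t count)
{
    return pos % sizeof(fpreg_t) == 0 && count % sizeof(fpreg_t) == 0;
}
```

A write that fails this check would leave part of a temporary uninitialized, which is the situation the commit guards against.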
    • MIPS: Factor out NT_PRFPREG regset access helpers · b1e808b9
      Maciej W. Rozycki authored
      commit a03fe725 upstream.
      
      In preparation to fix a commit 72b22bba ("MIPS: Don't assume 64-bit
      FP registers for FP regset") FCSR access regression factor out
      NT_PRFPREG regset access helpers for the non-MSA and the MSA variants
      respectively, to avoid having to deal with excessive indentation in the
      actual fix.
      
      No functional change, however use `target->thread.fpu.fpr[0]' rather
      than `target->thread.fpu.fpr[i]' for FGR holding type size determination
      as there's no `i' variable to refer to anymore, and for the factored out
      `i' variable declaration use `unsigned int' rather than `unsigned' as
      its type, following the common style.
      
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Fixes: 72b22bba ("MIPS: Don't assume 64-bit FP registers for FP regset")
      Cc: James Hogan <james.hogan@mips.com>
      Cc: Paul Burton <Paul.Burton@mips.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/17925/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Validate PR_SET_FP_MODE prctl(2) requests against the ABI of the task · 1e918a43
      Maciej W. Rozycki authored
      commit b67336ee upstream.
      
      Fix an API loophole introduced with commit 9791554b ("MIPS,prctl:
      add PR_[GS]ET_FP_MODE prctl options for MIPS"), where the caller of
      prctl(2) is incorrectly allowed to make a change to CP0.Status.FR or
      CP0.Config5.FRE register bits even if CONFIG_MIPS_O32_FP64_SUPPORT has
      not been enabled, despite that an executable requesting the mode
      requested via ELF file annotation would not be allowed to run in the
      first place, or for n32 and n64 ABI tasks which do not have non-default
      modes defined at all.  Add suitable checks to `mips_set_process_fp_mode'
      and bail out if an invalid mode change has been requested for the ABI in
      effect, even if the FPU hardware or emulation would otherwise allow it.
      
      Always succeed however without taking any further action if the mode
      requested is the same as one already in effect, regardless of whether
      any mode change, should it be requested, would actually be allowed for
      the task concerned.
      
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Fixes: 9791554b ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS")
      Reviewed-by: Paul Burton <paul.burton@mips.com>
      Cc: James Hogan <james.hogan@mips.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/17800/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/srpt: Disable RDMA access by the initiator · 6c2c83eb
      Bart Van Assche authored
      commit bec40c26 upstream.
      
      With the SRP protocol all RDMA operations are initiated by the target.
      Since no RDMA operations are initiated by the initiator, do not grant
      the initiator permission to submit RDMA reads or writes to the target.
      
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: gs_usb: fix return value of the "set_bittiming" callback · a71d6de9
      Wolfgang Grandegger authored
      commit d5b42e66 upstream.
      
      The "set_bittiming" callback treats a positive return value as error!
      For that reason "can_changelink()" will quit silently after setting
      the bittiming values without processing ctrlmode, restart-ms, etc.
      
      Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
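The return-value convention behind the gs_usb fix can be sketched as below. Everything here is a hypothetical model: `fake_transfer` stands in for a transfer routine that reports a byte count on success, and `change_link` only imitates the behaviour attributed to "can_changelink()" in the commit message (any non-zero result stops further processing).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transfer helper: returns the number of bytes transferred
 * on success, or a negative error code (-5 is used as an example). */
static int fake_transfer(int nbytes, int fail)
{
    return fail ? -5 : nbytes;
}

/* Callback contract assumed here: 0 on success, negative on error.
 * A positive byte count is therefore folded into 0. */
static int set_bittiming(int nbytes, int fail)
{
    int rc = fake_transfer(nbytes, fail);

    return rc > 0 ? 0 : rc;
}

/* Caller model: a non-zero result from the callback ends processing
 * before the remaining settings (ctrlmode and so on) are applied. */
static int change_link(int nbytes, int fail, int *ctrlmode_applied)
{
    int err;

    assert(ctrlmode_applied != NULL);
    *ctrlmode_applied = 0;

    err = set_bittiming(nbytes, fail);
    if (err)
        return err;

    *ctrlmode_applied = 1;
    return 0;
}
```

Without the fold in `set_bittiming`, the positive byte count would take the early-return path and the later settings would be skipped.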
    • KVM: Fix stack-out-of-bounds read in write_mmio · eb91461d
      Wanpeng Li authored
      commit e39d200f upstream.
      
      Reported by syzkaller:
      
        BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
        Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298
      
        CPU: 6 PID: 32298 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #18
        Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
        Call Trace:
         dump_stack+0xab/0xe1
         print_address_description+0x6b/0x290
         kasan_report+0x28a/0x370
         write_mmio+0x11e/0x270 [kvm]
         emulator_read_write_onepage+0x311/0x600 [kvm]
         emulator_read_write+0xef/0x240 [kvm]
         emulator_fix_hypercall+0x105/0x150 [kvm]
         em_hypercall+0x2b/0x80 [kvm]
         x86_emulate_insn+0x2b1/0x1640 [kvm]
         x86_emulate_instruction+0x39a/0xb90 [kvm]
         handle_exception+0x1b4/0x4d0 [kvm_intel]
         vcpu_enter_guest+0x15a0/0x2640 [kvm]
         kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
         kvm_vcpu_ioctl+0x479/0x880 [kvm]
         do_vfs_ioctl+0x142/0x9a0
         SyS_ioctl+0x74/0x80
         entry_SYSCALL_64_fastpath+0x23/0x9a
      
      The path that patches vmmcall writes the 3-byte opcode 0F 01 C1 (vmcall)
      to guest memory; however, the write_mmio tracepoint always prints 8 bytes
      through *(u64 *)val, since kvm splits the mmio access into 8-byte chunks.
      This leaks 5 bytes from the kernel stack (CVE-2017-17741).  This patch
      fixes it by accessing only the bytes that we operate on.
      
      Before patch:
      
      syz-executor-5567  [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f
      
      After patch:
      
      syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f
      
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Tested-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
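The idea of reading only the bytes actually written, rather than a full 8-byte word, can be shown with a small user-space sketch. `mmio_trace_value` is a hypothetical helper, not the function changed by the patch; it assembles the value in little-endian order so the result matches the trace lines quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the value to log from at most 8 bytes of val, least significant
 * byte first. Bytes beyond len are never read, so whatever follows the
 * buffer cannot end up in the logged value. */
static uint64_t mmio_trace_value(const void *val, size_t len)
{
    const unsigned char *p = val;
    uint64_t v = 0;
    size_t i;

    assert(val != NULL);
    if (len > sizeof(v))
        len = sizeof(v);

    for (i = 0; i < len; i++)
        v |= (uint64_t)p[i] << (8 * i);

    return v;
}
```

With the three opcode bytes 0F 01 C1 and a length of 3, this yields 0xc1010f, the value shown in the "After patch" trace line.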
    • dm bufio: fix shrinker scans when (nr_to_scan < retain_target) · cbb1cc72
      Suren Baghdasaryan authored
      commit fbc7c07e upstream.
      
      When system is under memory pressure it is observed that dm bufio
      shrinker often reclaims only one buffer per scan. This change fixes
      the following two issues in dm bufio shrinker that cause this behavior:
      
      1. ((nr_to_scan - freed) <= retain_target) condition is used to
      terminate slab scan process. This assumes that nr_to_scan is equal
      to the LRU size, which might not be correct because do_shrink_slab()
      in vmscan.c calculates nr_to_scan using multiple inputs.
      As a result when nr_to_scan is less than retain_target (64) the scan
      will terminate after the first iteration, effectively reclaiming one
      buffer per scan and making scans very inefficient. This hurts vmscan
      performance especially because mutex is acquired/released every time
      dm_bufio_shrink_scan() is called.
      New implementation uses ((LRU size - freed) <= retain_target)
      condition for scan termination. LRU size can be safely determined
      inside __scan() because this function is called after dm_bufio_lock().
      
      2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to
      determine number of freeable objects in the slab. However dm_bufio
      always retains retain_target buffers in its LRU and will terminate
      a scan when this mark is reached. Therefore returning the entire LRU size
      from dm_bufio_shrink_count() is misleading because that does not
      represent the number of freeable objects that slab will reclaim during
      a scan. Returning (LRU size - retain_target) better represents the
      number of freeable objects in the slab. This way do_shrink_slab()
      returns 0 when (LRU size < retain_target) and vmscan will not try to
      scan this shrinker avoiding scans that will not reclaim any memory.
      
      Test: tested using Android device running
      <AOSP>/system/extras/alloc-stress that generates memory pressure
      and causes intensive shrinker scans
      
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
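The two conditions described in the dm bufio change can be modelled with plain counters. This is a simplified sketch, assuming every buffer on the LRU can be evicted; `shrink_count` and `shrink_scan` are hypothetical stand-ins, not the driver's functions.

```c
/* Number of objects reported as freeable: the LRU size minus the buffers
 * that are always retained, and never less than zero. */
static unsigned long shrink_count(unsigned long lru_size,
                                  unsigned long retain_target)
{
    return lru_size > retain_target ? lru_size - retain_target : 0;
}

/* Free up to nr_to_scan buffers, stopping once the number of buffers left
 * on the LRU drops to retain_target. The termination test uses the LRU
 * size, not nr_to_scan, so a small nr_to_scan no longer ends the scan
 * after a single buffer. */
static unsigned long shrink_scan(unsigned long lru_size,
                                 unsigned long nr_to_scan,
                                 unsigned long retain_target)
{
    unsigned long freed = 0;

    while (nr_to_scan > 0) {
        if (lru_size - freed <= retain_target)
            break;
        freed++;
        nr_to_scan--;
    }
    return freed;
}
```

For example, with 100 buffers on the LRU and a retain target of 64, a request to scan 10 frees 10, while a request to scan 1000 stops after 36.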
  2. 10 Jan, 2018 23 commits
  3. 05 Jan, 2018 7 commits
    • Linux 4.4.110 · b3e3db15
      Greg Kroah-Hartman authored
    • kaiser: Set _PAGE_NX only if supported · b33c3c64
      Guenter Roeck authored
      This resolves a crash if loaded under qemu + haxm under windows.
      See https://www.spinics.net/lists/kernel/msg2689835.html for details.
      Here is a boot log (the log is from chromeos-4.4, but Tao Wu says that
      the same log is also seen with vanilla v4.4.110-rc1).
      
      [    0.712750] Freeing unused kernel memory: 552K
      [    0.721821] init: Corrupted page table at address 57b029b332e0
      [    0.722761] PGD 80000000bb238067 PUD bc36a067 PMD bc369067 PTE 45d2067
      [    0.722761] Bad pagetable: 000b [#1] PREEMPT SMP 
      [    0.722761] Modules linked in:
      [    0.722761] CPU: 1 PID: 1 Comm: init Not tainted 4.4.96 #31
      [    0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [    0.722761] task: ffff8800bc290000 ti: ffff8800bc28c000 task.ti: ffff8800bc28c000
      [    0.722761] RIP: 0010:[<ffffffff83f4129e>]  [<ffffffff83f4129e>] __clear_user+0x42/0x67
      [    0.722761] RSP: 0000:ffff8800bc28fcf8  EFLAGS: 00010202
      [    0.722761] RAX: 0000000000000000 RBX: 00000000000001a4 RCX: 00000000000001a4
      [    0.722761] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000057b029b332e0
      [    0.722761] RBP: ffff8800bc28fd08 R08: ffff8800bc290000 R09: ffff8800bb2f4000
      [    0.722761] R10: ffff8800bc290000 R11: ffff8800bb2f4000 R12: 000057b029b332e0
      [    0.722761] R13: 0000000000000000 R14: 000057b029b33340 R15: ffff8800bb1e2a00
      [    0.722761] FS:  0000000000000000(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
      [    0.722761] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [    0.722761] CR2: 000057b029b332e0 CR3: 00000000bb2f8000 CR4: 00000000000006e0
      [    0.722761] Stack:
      [    0.722761]  000057b029b332e0 ffff8800bb95fa80 ffff8800bc28fd18 ffffffff83f4120c
      [    0.722761]  ffff8800bc28fe18 ffffffff83e9e7a1 ffff8800bc28fd68 0000000000000000
      [    0.722761]  ffff8800bc290000 ffff8800bc290000 ffff8800bc290000 ffff8800bc290000
      [    0.722761] Call Trace:
      [    0.722761]  [<ffffffff83f4120c>] clear_user+0x2e/0x30
      [    0.722761]  [<ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7
      [    0.722761]  [<ffffffff83de2088>] search_binary_handler+0x86/0x19c
      [    0.722761]  [<ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98
      [    0.722761]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.722761]  [<ffffffff83de40be>] do_execve+0x23/0x25
      [    0.722761]  [<ffffffff83c002e3>] run_init_process+0x2b/0x2d
      [    0.722761]  [<ffffffff844fec4d>] kernel_init+0x6d/0xda
      [    0.722761]  [<ffffffff84505b2f>] ret_from_fork+0x3f/0x70
      [    0.722761]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.722761] Code: 86 84 be 12 00 00 00 e8 87 0d e8 ff 66 66 90 48 89 d8 48 c1
      eb 03 4c 89 e7 83 e0 07 48 89 d9 be 08 00 00 00 31 d2 48 85 c9 74 0a <48> 89 17
      48 01 f7 ff c9 75 f6 48 89 c1 85 c9 74 09 88 17 48 ff 
      [    0.722761] RIP  [<ffffffff83f4129e>] __clear_user+0x42/0x67
      [    0.722761]  RSP <ffff8800bc28fcf8>
      [    0.722761] ---[ end trace def703879b4ff090 ]---
      [    0.722761] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/kernel/locking/rwsem.c:21
      [    0.722761] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: init
      [    0.722761] CPU: 1 PID: 1 Comm: init Tainted: G      D         4.4.96 #31
      [    0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [    0.722761]  0000000000000086 dcb5d76098c89836 ffff8800bc28fa30 ffffffff83f34004
      [    0.722761]  ffffffff84839dc2 0000000000000015 ffff8800bc28fa40 ffffffff83d57dc9
      [    0.722761]  ffff8800bc28fa68 ffffffff83d57e6a ffffffff84a53640 0000000000000000
      [    0.722761] Call Trace:
      [    0.722761]  [<ffffffff83f34004>] dump_stack+0x4d/0x63
      [    0.722761]  [<ffffffff83d57dc9>] ___might_sleep+0x13a/0x13c
      [    0.722761]  [<ffffffff83d57e6a>] __might_sleep+0x9f/0xa6
      [    0.722761]  [<ffffffff84502788>] down_read+0x20/0x31
      [    0.722761]  [<ffffffff83cc5d9b>] __blocking_notifier_call_chain+0x35/0x63
      [    0.722761]  [<ffffffff83cc5ddd>] blocking_notifier_call_chain+0x14/0x16
      [    0.800374] usb 1-1: new full-speed USB device number 2 using uhci_hcd
      [    0.722761]  [<ffffffff83cefe97>] profile_task_exit+0x1a/0x1c
      [    0.802309]  [<ffffffff83cac84e>] do_exit+0x39/0xe7f
      [    0.802309]  [<ffffffff83ce5938>] ? vprintk_default+0x1d/0x1f
      [    0.802309]  [<ffffffff83d7bb95>] ? printk+0x57/0x73
      [    0.802309]  [<ffffffff83c46e25>] oops_end+0x80/0x85
      [    0.802309]  [<ffffffff83c7b747>] pgtable_bad+0x8a/0x95
      [    0.802309]  [<ffffffff83ca7f4a>] __do_page_fault+0x8c/0x352
      [    0.802309]  [<ffffffff83eefba5>] ? file_has_perm+0xc4/0xe5
      [    0.802309]  [<ffffffff83ca821c>] do_page_fault+0xc/0xe
      [    0.802309]  [<ffffffff84507682>] page_fault+0x22/0x30
      [    0.802309]  [<ffffffff83f4129e>] ? __clear_user+0x42/0x67
      [    0.802309]  [<ffffffff83f4127f>] ? __clear_user+0x23/0x67
      [    0.802309]  [<ffffffff83f4120c>] clear_user+0x2e/0x30
      [    0.802309]  [<ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7
      [    0.802309]  [<ffffffff83de2088>] search_binary_handler+0x86/0x19c
      [    0.802309]  [<ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98
      [    0.802309]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.802309]  [<ffffffff83de40be>] do_execve+0x23/0x25
      [    0.802309]  [<ffffffff83c002e3>] run_init_process+0x2b/0x2d
      [    0.802309]  [<ffffffff844fec4d>] kernel_init+0x6d/0xda
      [    0.802309]  [<ffffffff84505b2f>] ret_from_fork+0x3f/0x70
      [    0.802309]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.830559] Kernel panic - not syncing: Attempted to kill init!  exitcode=0x00000009
      [    0.830559] 
      [    0.831305] Kernel Offset: 0x2c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [    0.831305] ---[ end Kernel panic - not syncing: Attempted to kill init!  exitcode=0x00000009
      
      The crash part of this problem may be solved with the following patch
      (thanks to Hugh for the hint). There is still another problem, though -
      with this patch applied, the qemu session aborts with "VCPU Shutdown
      request", whatever that means.
      
      Cc: lepton <ytht.net@gmail.com>
      Signed-off-by: Guenter Roeck <groeck@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/kasan: Clear kasan_zero_page after TLB flush · 2b24fe5c
      Andrey Ryabinin authored
      commit 69e0210f upstream.
      
      Currently we clear kasan_zero_page before __flush_tlb_all(). This
      works with the current implementation of native_flush_tlb[_global]()
      because it doesn't do any writes to kasan shadow memory.
      But any subtle change made in native_flush_tlb*() could break this.
      Also, the current code doesn't seem to work for paravirt guests (lguest).
      
      Only after the TLB flush can we be sure that kasan_zero_page is not
      used as early shadow anymore (instrumented code will not write to it).
      So it should be cleared only after the TLB flush.
      
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1452516679-32040-2-git-send-email-aryabinin@virtuozzo.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Jamie Iles <jamie.iles@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap · 755bd549
      Andy Lutomirski authored
      commit dac16fba upstream.
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/9d37826fdc7e2d2809efe31d5345f97186859284.1449702533.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Jamie Iles <jamie.iles@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader · 64e23980
      Andy Lutomirski authored
      commit 6b078f5d upstream.
      
      The pvclock vdso code was too abstracted to understand easily
      and excessively paranoid.  Simplify it for a huge speedup.
      
      This opens the door for additional simplifications, as the vdso
      no longer accesses the pvti for any vcpu other than vcpu 0.
      
      Before, vclock_gettime using kvm-clock took about 45ns on my
      machine. With this change, it takes 29ns, which is almost as
      fast as the pure TSC implementation.
      
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/6b51dcc41f1b101f963945c5ec7093d72bdac429.1449702533.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Jamie Iles <jamie.iles@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KPTI: Report when enabled · bfd51a4d
      Kees Cook authored
      Make sure dmesg reports when KPTI is enabled.
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KPTI: Rename to PAGE_TABLE_ISOLATION · 3e1457d6
      Kees Cook authored
      This renames CONFIG_KAISER to CONFIG_PAGE_TABLE_ISOLATION.
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>