1. 17 Jan, 2018 6 commits
    • Bart Van Assche's avatar
      IB/srpt: Disable RDMA access by the initiator · 30191718
      Bart Van Assche authored
      commit bec40c26 upstream.
      
      With the SRP protocol all RDMA operations are initiated by the target.
      Since no RDMA operations are initiated by the initiator, do not grant
      the initiator permission to submit RDMA reads or writes to the target.
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      30191718
    • Wolfgang Grandegger's avatar
      can: gs_usb: fix return value of the "set_bittiming" callback · 02f201f7
      Wolfgang Grandegger authored
      commit d5b42e66 upstream.
      
      The "set_bittiming" callback treats a positive return value as error!
      For that reason "can_changelink()" will quit silently after setting
      the bittiming values without processing ctrlmode, restart-ms, etc.
      Signed-off-by: default avatarWolfgang Grandegger <wg@grandegger.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02f201f7
    • Wanpeng Li's avatar
      KVM: Fix stack-out-of-bounds read in write_mmio · c781e3be
      Wanpeng Li authored
      commit e39d200f upstream.
      
      Reported by syzkaller:
      
        BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
        Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298
      
        CPU: 6 PID: 32298 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #18
        Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
        Call Trace:
         dump_stack+0xab/0xe1
         print_address_description+0x6b/0x290
         kasan_report+0x28a/0x370
         write_mmio+0x11e/0x270 [kvm]
         emulator_read_write_onepage+0x311/0x600 [kvm]
         emulator_read_write+0xef/0x240 [kvm]
         emulator_fix_hypercall+0x105/0x150 [kvm]
         em_hypercall+0x2b/0x80 [kvm]
         x86_emulate_insn+0x2b1/0x1640 [kvm]
         x86_emulate_instruction+0x39a/0xb90 [kvm]
         handle_exception+0x1b4/0x4d0 [kvm_intel]
         vcpu_enter_guest+0x15a0/0x2640 [kvm]
         kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
         kvm_vcpu_ioctl+0x479/0x880 [kvm]
         do_vfs_ioctl+0x142/0x9a0
         SyS_ioctl+0x74/0x80
         entry_SYSCALL_64_fastpath+0x23/0x9a
      
      The path of patched vmmcall will patch 3 bytes opcode 0F 01 C1(vmcall)
      to the guest memory, however, write_mmio tracepoint always prints 8 bytes
      through *(u64 *)val since kvm splits the mmio access into 8 bytes. This
      leaks 5 bytes from the kernel stack (CVE-2017-17741).  This patch fixes
      it by just accessing the bytes which we operate on.
      
      Before patch:
      
      syz-executor-5567  [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f
      
      After patch:
      
      syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Reviewed-by: default avatarDarren Kenny <darren.kenny@oracle.com>
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Tested-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c781e3be
    • Vasanthakumar Thiagarajan's avatar
      ath10k: rebuild crypto header in rx data frames · c5ab9ee1
      Vasanthakumar Thiagarajan authored
      commit 7eccb738 upstream.
      
      Rx data frames notified through HTT_T2H_MSG_TYPE_RX_IND and
      HTT_T2H_MSG_TYPE_RX_FRAG_IND expect PN/TSC check to be done
      on host (mac80211) rather than firmware. Rebuild cipher header
      in every received data frames (that are notified through those
      HTT interfaces) from the rx_hdr_status tlv available in the
      rx descriptor of the first msdu. Skip setting RX_FLAG_IV_STRIPPED
      flag for the packets which requires mac80211 PN/TSC check support
      and set appropriate RX_FLAG for stripped crypto tail. Hw QCA988X,
      QCA9887, QCA99X0, QCA9984, QCA9888 and QCA4019 currently need the
      rebuilding of cipher header to perform PN/TSC check for replay
      attack.
      
      Please note that removing crypto tail for CCMP-256, GCMP and GCMP-256 ciphers
      in raw mode needs to be fixed. Since Rx with these ciphers in raw
      mode does not work in the current form even without this patch and
      removing crypto tail for these chipers needs clean up, raw mode related
      issues in CCMP-256, GCMP and GCMP-256 can be addressed in follow up
      patches.
      Tested-by: default avatarManikanta Pubbisetty <mpubbise@qti.qualcomm.com>
      Signed-off-by: default avatarVasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5ab9ee1
    • David Spinadel's avatar
      mac80211: Add RX flag to indicate ICV stripped · 234c8e60
      David Spinadel authored
      commit cef0acd4 upstream.
      
      Add a flag that indicates that the WEP ICV was stripped from an
      RX packet, allowing the device to not transfer that if it's
      already checked.
      Signed-off-by: default avatarDavid Spinadel <david.spinadel@intel.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Cc: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      234c8e60
    • Suren Baghdasaryan's avatar
      dm bufio: fix shrinker scans when (nr_to_scan < retain_target) · b58aa24e
      Suren Baghdasaryan authored
      commit fbc7c07e upstream.
      
      When system is under memory pressure it is observed that dm bufio
      shrinker often reclaims only one buffer per scan. This change fixes
      the following two issues in dm bufio shrinker that cause this behavior:
      
      1. ((nr_to_scan - freed) <= retain_target) condition is used to
      terminate slab scan process. This assumes that nr_to_scan is equal
      to the LRU size, which might not be correct because do_shrink_slab()
      in vmscan.c calculates nr_to_scan using multiple inputs.
      As a result when nr_to_scan is less than retain_target (64) the scan
      will terminate after the first iteration, effectively reclaiming one
      buffer per scan and making scans very inefficient. This hurts vmscan
      performance especially because mutex is acquired/released every time
      dm_bufio_shrink_scan() is called.
      New implementation uses ((LRU size - freed) <= retain_target)
      condition for scan termination. LRU size can be safely determined
      inside __scan() because this function is called after dm_bufio_lock().
      
      2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to
      determine number of freeable objects in the slab. However dm_bufio
      always retains retain_target buffers in its LRU and will terminate
      a scan when this mark is reached. Therefore returning the entire LRU size
      from dm_bufio_shrink_count() is misleading because that does not
      represent the number of freeable objects that slab will reclaim during
      a scan. Returning (LRU size - retain_target) better represents the
      number of freeable objects in the slab. This way do_shrink_slab()
      returns 0 when (LRU size < retain_target) and vmscan will not try to
      scan this shrinker avoiding scans that will not reclaim any memory.
      
      Test: tested using Android device running
      <AOSP>/system/extras/alloc-stress that generates memory pressure
      and causes intensive shrinker scans
      Signed-off-by: default avatarSuren Baghdasaryan <surenb@google.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      b58aa24e
  2. 10 Jan, 2018 22 commits
  3. 05 Jan, 2018 12 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.75 · 9f747558
      Greg Kroah-Hartman authored
      9f747558
    • Guenter Roeck's avatar
      kaiser: Set _PAGE_NX only if supported · 92fd81f7
      Guenter Roeck authored
      This resolves a crash if loaded under qemu + haxm under windows.
      See https://www.spinics.net/lists/kernel/msg2689835.html for details.
      Here is a boot log (the log is from chromeos-4.4, but Tao Wu says that
      the same log is also seen with vanilla v4.4.110-rc1).
      
      [    0.712750] Freeing unused kernel memory: 552K
      [    0.721821] init: Corrupted page table at address 57b029b332e0
      [    0.722761] PGD 80000000bb238067 PUD bc36a067 PMD bc369067 PTE 45d2067
      [    0.722761] Bad pagetable: 000b [#1] PREEMPT SMP 
      [    0.722761] Modules linked in:
      [    0.722761] CPU: 1 PID: 1 Comm: init Not tainted 4.4.96 #31
      [    0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [    0.722761] task: ffff8800bc290000 ti: ffff8800bc28c000 task.ti: ffff8800bc28c000
      [    0.722761] RIP: 0010:[<ffffffff83f4129e>]  [<ffffffff83f4129e>] __clear_user+0x42/0x67
      [    0.722761] RSP: 0000:ffff8800bc28fcf8  EFLAGS: 00010202
      [    0.722761] RAX: 0000000000000000 RBX: 00000000000001a4 RCX: 00000000000001a4
      [    0.722761] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000057b029b332e0
      [    0.722761] RBP: ffff8800bc28fd08 R08: ffff8800bc290000 R09: ffff8800bb2f4000
      [    0.722761] R10: ffff8800bc290000 R11: ffff8800bb2f4000 R12: 000057b029b332e0
      [    0.722761] R13: 0000000000000000 R14: 000057b029b33340 R15: ffff8800bb1e2a00
      [    0.722761] FS:  0000000000000000(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
      [    0.722761] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [    0.722761] CR2: 000057b029b332e0 CR3: 00000000bb2f8000 CR4: 00000000000006e0
      [    0.722761] Stack:
      [    0.722761]  000057b029b332e0 ffff8800bb95fa80 ffff8800bc28fd18 ffffffff83f4120c
      [    0.722761]  ffff8800bc28fe18 ffffffff83e9e7a1 ffff8800bc28fd68 0000000000000000
      [    0.722761]  ffff8800bc290000 ffff8800bc290000 ffff8800bc290000 ffff8800bc290000
      [    0.722761] Call Trace:
      [    0.722761]  [<ffffffff83f4120c>] clear_user+0x2e/0x30
      [    0.722761]  [<ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7
      [    0.722761]  [<ffffffff83de2088>] search_binary_handler+0x86/0x19c
      [    0.722761]  [<ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98
      [    0.722761]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.722761]  [<ffffffff83de40be>] do_execve+0x23/0x25
      [    0.722761]  [<ffffffff83c002e3>] run_init_process+0x2b/0x2d
      [    0.722761]  [<ffffffff844fec4d>] kernel_init+0x6d/0xda
      [    0.722761]  [<ffffffff84505b2f>] ret_from_fork+0x3f/0x70
      [    0.722761]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.722761] Code: 86 84 be 12 00 00 00 e8 87 0d e8 ff 66 66 90 48 89 d8 48 c1
      eb 03 4c 89 e7 83 e0 07 48 89 d9 be 08 00 00 00 31 d2 48 85 c9 74 0a <48> 89 17
      48 01 f7 ff c9 75 f6 48 89 c1 85 c9 74 09 88 17 48 ff 
      [    0.722761] RIP  [<ffffffff83f4129e>] __clear_user+0x42/0x67
      [    0.722761]  RSP <ffff8800bc28fcf8>
      [    0.722761] ---[ end trace def703879b4ff090 ]---
      [    0.722761] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/kernel/locking/rwsem.c:21
      [    0.722761] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: init
      [    0.722761] CPU: 1 PID: 1 Comm: init Tainted: G      D         4.4.96 #31
      [    0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [    0.722761]  0000000000000086 dcb5d76098c89836 ffff8800bc28fa30 ffffffff83f34004
      [    0.722761]  ffffffff84839dc2 0000000000000015 ffff8800bc28fa40 ffffffff83d57dc9
      [    0.722761]  ffff8800bc28fa68 ffffffff83d57e6a ffffffff84a53640 0000000000000000
      [    0.722761] Call Trace:
      [    0.722761]  [<ffffffff83f34004>] dump_stack+0x4d/0x63
      [    0.722761]  [<ffffffff83d57dc9>] ___might_sleep+0x13a/0x13c
      [    0.722761]  [<ffffffff83d57e6a>] __might_sleep+0x9f/0xa6
      [    0.722761]  [<ffffffff84502788>] down_read+0x20/0x31
      [    0.722761]  [<ffffffff83cc5d9b>] __blocking_notifier_call_chain+0x35/0x63
      [    0.722761]  [<ffffffff83cc5ddd>] blocking_notifier_call_chain+0x14/0x16
      [    0.800374] usb 1-1: new full-speed USB device number 2 using uhci_hcd
      [    0.722761]  [<ffffffff83cefe97>] profile_task_exit+0x1a/0x1c
      [    0.802309]  [<ffffffff83cac84e>] do_exit+0x39/0xe7f
      [    0.802309]  [<ffffffff83ce5938>] ? vprintk_default+0x1d/0x1f
      [    0.802309]  [<ffffffff83d7bb95>] ? printk+0x57/0x73
      [    0.802309]  [<ffffffff83c46e25>] oops_end+0x80/0x85
      [    0.802309]  [<ffffffff83c7b747>] pgtable_bad+0x8a/0x95
      [    0.802309]  [<ffffffff83ca7f4a>] __do_page_fault+0x8c/0x352
      [    0.802309]  [<ffffffff83eefba5>] ? file_has_perm+0xc4/0xe5
      [    0.802309]  [<ffffffff83ca821c>] do_page_fault+0xc/0xe
      [    0.802309]  [<ffffffff84507682>] page_fault+0x22/0x30
      [    0.802309]  [<ffffffff83f4129e>] ? __clear_user+0x42/0x67
      [    0.802309]  [<ffffffff83f4127f>] ? __clear_user+0x23/0x67
      [    0.802309]  [<ffffffff83f4120c>] clear_user+0x2e/0x30
      [    0.802309]  [<ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7
      [    0.802309]  [<ffffffff83de2088>] search_binary_handler+0x86/0x19c
      [    0.802309]  [<ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98
      [    0.802309]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.802309]  [<ffffffff83de40be>] do_execve+0x23/0x25
      [    0.802309]  [<ffffffff83c002e3>] run_init_process+0x2b/0x2d
      [    0.802309]  [<ffffffff844fec4d>] kernel_init+0x6d/0xda
      [    0.802309]  [<ffffffff84505b2f>] ret_from_fork+0x3f/0x70
      [    0.802309]  [<ffffffff844febe0>] ? rest_init+0x87/0x87
      [    0.830559] Kernel panic - not syncing: Attempted to kill init!  exitcode=0x00000009
      [    0.830559] 
      [    0.831305] Kernel Offset: 0x2c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [    0.831305] ---[ end Kernel panic - not syncing: Attempted to kill init!  exitcode=0x00000009
      
      The crash part of this problem may be solved with the following patch
      (thanks to Hugh for the hint). There is still another problem, though -
      with this patch applied, the qemu session aborts with "VCPU Shutdown
      request", whatever that means.
      
      Cc: lepton <ytht.net@gmail.com>
      Signed-off-by: default avatarGuenter Roeck <groeck@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92fd81f7
    • Kees Cook's avatar
      KPTI: Report when enabled · ea6cd39d
      Kees Cook authored
      Make sure dmesg reports when KPTI is enabled.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea6cd39d
    • Kees Cook's avatar
      KPTI: Rename to PAGE_TABLE_ISOLATION · e71fac01
      Kees Cook authored
      This renames CONFIG_KAISER to CONFIG_PAGE_TABLE_ISOLATION.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e71fac01
    • Borislav Petkov's avatar
      x86/kaiser: Move feature detection up · 59094faf
      Borislav Petkov authored
      
      ... before the first use of kaiser_enabled as otherwise funky
      things happen:
      
        about to get started...
        (XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000]
        (XEN) Pagetable walk from ffff88022a449090:
        (XEN)  L4[0x110] = 0000000229e0e067 0000000000001e0e
        (XEN)  L3[0x008] = 0000000000000000 ffffffffffffffff
        (XEN) domain_crash_sync called from entry.S: fault at ffff82d08033fd08
        entry.o#create_bounce_frame+0x135/0x14d
        (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
        (XEN) ----[ Xen-4.9.1_02-3.21  x86_64  debug=n   Not tainted ]----
        (XEN) CPU:    0
        (XEN) RIP:    e033:[<ffffffff81007460>]
        (XEN) RFLAGS: 0000000000000286   EM: 1   CONTEXT: pv guest (d0v0)
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      59094faf
    • Jiri Kosina's avatar
      kaiser: disabled on Xen PV · 402e63de
      Jiri Kosina authored
      
      Kaiser cannot be used on paravirtualized MMUs (namely reading and writing CR3).
      This does not work with KAISER as the CR3 switch from and to user space PGD
      would require to map the whole XEN_PV machinery into both.
      
      More importantly, enabling KAISER on Xen PV doesn't make too much sense, as PV
      guests use distinct %cr3 values for kernel and user already.
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      402e63de
    • Borislav Petkov's avatar
      x86/kaiser: Reenable PARAVIRT · 2c272175
      Borislav Petkov authored
      
      Now that the required bits have been addressed, reenable
      PARAVIRT.
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c272175
    • Thomas Gleixner's avatar
      x86/paravirt: Dont patch flush_tlb_single · 1817d2c2
      Thomas Gleixner authored
      
      commit a0357954 upstream
      
      native_flush_tlb_single() will be changed with the upcoming
      PAGE_TABLE_ISOLATION feature. This requires to have more code in
      there than INVLPG.
      
      Remove the paravirt patching for it.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: default avatarJuergen Gross <jgross@suse.com>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Cc: michael.schwarz@iaik.tugraz.at
      Cc: moritz.lipp@iaik.tugraz.at
      Cc: richard.fellner@student.tugraz.at
      Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1817d2c2
    • Hugh Dickins's avatar
      kaiser: kaiser_flush_tlb_on_return_to_user() check PCID · fe5cb75f
      Hugh Dickins authored
      
      Let kaiser_flush_tlb_on_return_to_user() do the X86_FEATURE_PCID
      check, instead of each caller doing it inline first: nobody needs
      to optimize for the noPCID case, it's clearer this way, and better
      suits later changes.  Replace those no-op X86_CR3_PCID_KERN_FLUSH lines
      by a BUILD_BUG_ON() in load_new_mm_cr3(), in case something changes.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe5cb75f
    • Hugh Dickins's avatar
      kaiser: asm/tlbflush.h handle noPGE at lower level · b72c26e9
      Hugh Dickins authored
      
      I found asm/tlbflush.h too twisty, and think it safer not to avoid
      __native_flush_tlb_global_irq_disabled() in the kaiser_enabled case,
      but instead let it handle kaiser_enabled along with cr3: it can just
      use __native_flush_tlb() for that, no harm in re-disabling preemption.
      
      (This is not the same change as Kirill and Dave have suggested for
      upstream, flipping PGE in cr4: that's neat, but needs a cpu_has_pge
      check; cr3 is enough for kaiser, and thought to be cheaper than cr4.)
      
      Also delete the X86_FEATURE_INVPCID invpcid_flush_all_nonglobals()
      preference from __native_flush_tlb(): unlike the invpcid_flush_all()
      preference in __native_flush_tlb_global(), it's not seen in upstream
      4.14, and was recently reported to be surprisingly slow.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b72c26e9
    • Hugh Dickins's avatar
      kaiser: drop is_atomic arg to kaiser_pagetable_walk() · 8c2f8a5c
      Hugh Dickins authored
      
      I have not observed a might_sleep() warning from setup_fixmap_gdt()'s
      use of kaiser_add_mapping() in our tree (why not?), but like upstream
      we have not provided a way for that to pass is_atomic true down to
      kaiser_pagetable_walk(), and at startup it's far from a likely source
      of trouble: so just delete the walk's is_atomic arg and might_sleep().
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c2f8a5c
    • Hugh Dickins's avatar
      kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush · 169b369f
      Hugh Dickins authored
      
      Now that we're playing the ALTERNATIVE game, use that more efficient
      method: instead of user-mapping an extra page, and reading an extra
      cacheline each time for x86_cr3_pcid_noflush.
      
      Neel has found that __stringify(bts $X86_CR3_PCID_NOFLUSH_BIT, %rax)
      is a working substitute for the "bts $63, %rax" in these ALTERNATIVEs;
      but the one line with $63 in looks clearer, so let's stick with that.
      
      Worried about what happens with an ALTERNATIVE between the jump and
      jump label in another ALTERNATIVE?  I was, but have checked the
      combinations in SWITCH_KERNEL_CR3_NO_STACK at entry_SYSCALL_64,
      and it does a good job.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      169b369f