1. 06 Jun, 2018 9 commits
    • Thomas Gleixner's avatar
      x86/apic: Provide apic_ack_irq() · c0255770
      Thomas Gleixner authored
      apic_ack_edge() is explicitely for handling interrupt affinity cleanup when
      interrupt remapping is not available or disable.
      
      Remapped interrupts and also some of the platform specific special
      interrupts, e.g. UV, invoke ack_APIC_irq() directly.
      
      To address the issue of failing an affinity update with -EBUSY the delayed
      affinity mechanism can be reused, but ack_APIC_irq() does not handle
      that. Adding this to ack_APIC_irq() is not possible, because that function
      is also used for exceptions and directly handled interrupts like IPIs.
      
      Create a new function, which just contains the conditional invocation of
      irq_move_irq() and the final ack_APIC_irq().
      
      Reuse the new function in apic_ack_edge().
      
      Preparatory change for the real fix.
      
      Fixes: dccfe314 ("x86/vector: Simplify vector move cleanup")
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarSong Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de
      c0255770
    • Thomas Gleixner's avatar
      genirq/migration: Avoid out of line call if pending is not set · d340ebd6
      Thomas Gleixner authored
      The upcoming fix for the -EBUSY return from affinity settings requires to
      use the irq_move_irq() functionality even on irq remapped interrupts. To
      avoid the out of line call, move the check for the pending bit into an
      inline helper.
      
      Preparatory change for the real fix. No functional change.
      
      Fixes: dccfe314 ("x86/vector: Simplify vector move cleanup")
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de
      d340ebd6
    • Thomas Gleixner's avatar
      genirq/generic_pending: Do not lose pending affinity update · a33a5d2d
      Thomas Gleixner authored
      The generic pending interrupt mechanism moves interrupts from the interrupt
      handler on the original target CPU to the new destination CPU. This is
      required for x86 and ia64 due to the way the interrupt delivery and
      acknowledge works if the interrupts are not remapped.
      
      However that update can fail for various reasons. Some of them are valid
      reasons to discard the pending update, but the case, when the previous move
      has not been fully cleaned up is not a legit reason to fail.
      
      Check the return value of irq_do_set_affinity() for -EBUSY, which indicates
      a pending cleanup, and rearm the pending move in the irq dexcriptor so it's
      tried again when the next interrupt arrives.
      
      Fixes: 996c5912 ("x86/irq: Plug vector cleanup race")
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarSong Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Link: https://lkml.kernel.org/r/20180604162224.386544292@linutronix.de
      a33a5d2d
    • Thomas Gleixner's avatar
      x86/apic/vector: Prevent hlist corruption and leaks · 80ae7b1a
      Thomas Gleixner authored
      Several people observed the WARN_ON() in irq_matrix_free() which triggers
      when the caller tries to free an vector which is not in the allocation
      range. Song provided the trace information which allowed to decode the root
      cause.
      
      The rework of the vector allocation mechanism failed to preserve a sanity
      check, which prevents setting a new target vector/CPU when the previous
      affinity change has not fully completed.
      
      As a result a half finished affinity change can be overwritten, which can
      cause the leak of a irq descriptor pointer on the previous target CPU and
      double enqueue of the hlist head into the cleanup lists of two or more
      CPUs. After one CPU cleaned up its vector the next CPU will invoke the
      cleanup handler with vector 0, which triggers the out of range warning in
      the matrix allocator.
      
      Prevent this by checking the apic_data of the interrupt whether the
      move_in_progress flag is false and the hlist node is not hashed. Return
      -EBUSY if not.
      
      This prevents the damage and restores the behaviour before the vector
      allocation rework, but due to other changes in that area it also widens the
      chance that user space can observe -EBUSY. In theory this should be fine,
      but actually not all user space tools handle -EBUSY correctly. Addressing
      that is not part of this fix, but will be addressed in follow up patches.
      
      Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
      Reported-by: default avatarDmitry Safonov <0x7f454c46@gmail.com>
      Reported-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Reported-by: default avatarSong Liu <liu.song.a23@gmail.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarSong Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: https://lkml.kernel.org/r/20180604162224.303870257@linutronix.de
      80ae7b1a
    • Dou Liyang's avatar
      x86/vector: Fix the args of vector_alloc tracepoint · 838d76d6
      Dou Liyang authored
      The vector_alloc tracepont reversed the reserved and ret aggs, that made
      the trace print wrong. Exchange them.
      
      Fixes: 8d1e3dca ("x86/vector: Add tracepoints for vector management")
      Signed-off-by: default avatarDou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: hpa@zytor.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180601065031.21872-1-douly.fnst@cn.fujitsu.com
      838d76d6
    • Dou Liyang's avatar
      x86/idt: Simplify the idt_setup_apic_and_irq_gates() · 33662812
      Dou Liyang authored
      The idt_setup_apic_and_irq_gates() sets the gates from
      FIRST_EXTERNAL_VECTOR up to FIRST_SYSTEM_VECTOR first. then secondly, from
      FIRST_SYSTEM_VECTOR to NR_VECTORS, it takes both APIC=y and APIC=n into
      account.
      
      But for APIC=n, the FIRST_SYSTEM_VECTOR is equal to NR_VECTORS, all
      vectors has been set at the first step.
      
      Simplify the second step, make it just work for APIC=y.
      Signed-off-by: default avatarDou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20180523023555.2933-1-douly.fnst@cn.fujitsu.com
      33662812
    • Varsha Rao's avatar
      x86/platform/uv: Remove extra parentheses · cce2946b
      Varsha Rao authored
      Remove extra parentheses to fix the extraneous parentheses clang
      warning.
      Suggested-by: default avatarLukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: default avatarVarsha Rao <rvarsha016@gmail.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Nicholas Mc Guire <der.herr@hofr.at>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Link: https://lkml.kernel.org/r/20180520080012.8215-1-rvarsha016@gmail.com
      cce2946b
    • Kirill A. Shutemov's avatar
      x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME · 94d49eb3
      Kirill A. Shutemov authored
      AMD SME claims one bit from physical address to indicate whether the
      page is encrypted or not. To achieve that we clear out the bit from
      __PHYSICAL_MASK.
      
      The capability to adjust __PHYSICAL_MASK is required beyond AMD SME.
      For instance for upcoming Intel Multi-Key Total Memory Encryption.
      
      Factor it out into a separate feature with own Kconfig handle.
      
      It also helps with overhead of AMD SME. It saves more than 3k in .text
      on defconfig + AMD_MEM_ENCRYPT:
      
      	add/remove: 3/2 grow/shrink: 5/110 up/down: 189/-3753 (-3564)
      
      We would need to return to this once we have infrastructure to patch
      constants in code. That's good candidate for it.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarTom Lendacky <thomas.lendacky@amd.com>
      Cc: linux-mm@kvack.org
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Link: https://lkml.kernel.org/r/20180518113028.79825-1-kirill.shutemov@linux.intel.com
      94d49eb3
    • Arnd Bergmann's avatar
      x86: Mark native_set_p4d() as __always_inline · 046c0dbe
      Arnd Bergmann authored
      When CONFIG_OPTIMIZE_INLINING is enabled, the function native_set_p4d()
      may not be fully inlined into the caller, resulting in a false-positive
      warning about an access to the __pgtable_l5_enabled variable from a
      non-__init function, despite the original caller being an __init function:
      
      WARNING: vmlinux.o(.text.unlikely+0x1429): Section mismatch in reference from the function native_set_p4d() to the variable .init.data:__pgtable_l5_enabled
      WARNING: vmlinux.o(.text.unlikely+0x1429): Section mismatch in reference from the function native_p4d_clear() to the variable .init.data:__pgtable_l5_enabled
      
      The function native_set_p4d() references the variable __initdata
      __pgtable_l5_enabled.  This is often because native_set_p4d lacks a
      __initdata annotation or the annotation of __pgtable_l5_enabled is wrong.
      
      Marking the native_set_p4d function and its caller native_p4d_clear()
      avoids this problem.
      
      I did not bisect the original cause, but I assume this is related to the
      recent rework that turned pgtable_l5_enabled() into an inline function,
      which in turn caused the compiler to make different inlining decisions.
      
      Fixes: ad3fe525 ("x86/mm: Unify pgtable_l5_enabled usage in early boot code")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Link: https://lkml.kernel.org/r/20180605113715.1133726-1-arnd@arndb.de
      046c0dbe
  2. 04 Jun, 2018 1 commit
  3. 03 Jun, 2018 6 commits
    • Linus Torvalds's avatar
      Linux 4.17 · 29dcea88
      Linus Torvalds authored
      29dcea88
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 325e14f9
      Linus Torvalds authored
      Pull vfs fixes from Al Viro.
      
       - fix io_destroy()/aio_complete() race
      
       - the vfs_open() change to get rid of open_check_o_direct() boilerplate
         was nice, but buggy. Al has a patch avoiding a revert, but that's
         definitely not a last-day fodder, so for now revert it is...
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        Revert "fs: fold open_check_o_direct into do_dentry_open"
        fix io_destroy()/aio_complete() race
      325e14f9
    • Al Viro's avatar
      Revert "fs: fold open_check_o_direct into do_dentry_open" · af04fadc
      Al Viro authored
      This reverts commit cab64df1.
      
      Having vfs_open() in some cases drop the reference to
      struct file combined with
      
      	error = vfs_open(path, f, cred);
      	if (error) {
      		put_filp(f);
      		return ERR_PTR(error);
      	}
      	return f;
      
      is flat-out wrong.  It used to be
      
      		error = vfs_open(path, f, cred);
      		if (!error) {
      			/* from now on we need fput() to dispose of f */
      			error = open_check_o_direct(f);
      			if (error) {
      				fput(f);
      				f = ERR_PTR(error);
      			}
      		} else {
      			put_filp(f);
      			f = ERR_PTR(error);
      		}
      
      and sure, having that open_check_o_direct() boilerplate gotten rid of is
      nice, but not that way...
      
      Worse, another call chain (via finish_open()) is FUBAR now wrt
      FILE_OPENED handling - in that case we get error returned, with file
      already hit by fput() *AND* FILE_OPENED not set.  Guess what happens in
      path_openat(), when it hits
      
      	if (!(opened & FILE_OPENED)) {
      		BUG_ON(!error);
      		put_filp(file);
      	}
      
      The root cause of all that crap is that the callers of do_dentry_open()
      have no way to tell which way did it fail; while that could be fixed up
      (by passing something like int *opened to do_dentry_open() and have it
      marked if we'd called ->open()), it's probably much too late in the
      cycle to do so right now.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      af04fadc
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 874cd339
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
      
       - two patches addressing the problem that the scheduler allows under
         certain conditions user space tasks to be scheduled on CPUs which are
         not yet fully booted which causes a few subtle and hard to debug
         issue
      
       - add a missing runqueue clock update in the deadline scheduler which
         triggers a warning under certain circumstances
      
       - fix a silly typo in the scheduler header file
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/headers: Fix typo
        sched/deadline: Fix missing clock update
        sched/core: Require cpu_active() in select_task_rq(), for user tasks
        sched/core: Fix rules for running on online && !active CPUs
      874cd339
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 26bdace7
      Linus Torvalds authored
      Pull perf tooling fixes from Thomas Gleixner:
      
       - fix 'perf test Session topology' segfault on s390 (Thomas Richter)
      
       - fix NULL return handling in bpf__prepare_load() (YueHaibing)
      
       - fix indexing on Coresight ETM packet queue decoder (Mathieu Poirier)
      
       - fix perf.data format description of NRCPUS header (Arnaldo Carvalho
         de Melo)
      
       - update perf.data documentation section on cpu topology
      
       - handle uncore event aliases in small groups properly (Kan Liang)
      
       - add missing perf_sample.addr into python sample dictionary (Leo Yan)
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf tools: Fix perf.data format description of NRCPUS header
        perf script python: Add addr into perf sample dict
        perf data: Update documentation section on cpu topology
        perf cs-etm: Fix indexing for decoder packet queue
        perf bpf: Fix NULL return handling in bpf__prepare_load()
        perf test: "Session topology" dumps core on s390
        perf parse-events: Handle uncore event aliases in small groups properly
      26bdace7
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 918fe1b3
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Infinite loop in _decode_session6(), from Eric Dumazet.
      
       2) Pass correct argument to nla_strlcpy() in netfilter, also from Eric
          Dumazet.
      
       3) Out of bounds memory access in ipv6 srh code, from Mathieu Xhonneux.
      
       4) NULL deref in XDP_REDIRECT handling of tun driver, from Toshiaki
          Makita.
      
       5) Incorrect idr release in cls_flower, from Paul Blakey.
      
       6) Probe error handling fix in davinci_emac, from Dan Carpenter.
      
       7) Memory leak in XPS configuration, from Alexander Duyck.
      
       8) Use after free with cloned sockets in kcm, from Kirill Tkhai.
      
       9) MTU handling fixes fo ip_tunnel and ip6_tunnel, from Nicolas
          Dichtel.
      
      10) Fix UAPI hole in bpf data structure for 32-bit compat applications,
          from Daniel Borkmann.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
        bpf: fix uapi hole for 32 bit compat applications
        net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
        ip6_tunnel: remove magic mtu value 0xFFF8
        ip_tunnel: restore binding to ifaces with a large mtu
        net: dsa: b53: Add BCM5389 support
        kcm: Fix use-after-free caused by clonned sockets
        net-sysfs: Fix memory leak in XPS configuration
        ixgbe: fix parsing of TC actions for HW offload
        net: ethernet: davinci_emac: fix error handling in probe()
        net/ncsi: Fix array size in dumpit handler
        cls_flower: Fix incorrect idr release when failing to modify rule
        net/sonic: Use dma_mapping_error()
        xfrm Fix potential error pointer dereference in xfrm_bundle_create.
        vhost_net: flush batched heads before trying to busy polling
        tun: Fix NULL pointer dereference in XDP redirect
        be2net: Fix error detection logic for BE3
        net: qmi_wwan: Add Netgear Aircard 779S
        mlxsw: spectrum: Forbid creation of VLAN 1 over port/LAG
        atm: zatm: fix memcmp casting
        iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs
        ...
      918fe1b3
  4. 02 Jun, 2018 15 commits
  5. 01 Jun, 2018 9 commits