Commits · 4f3f0dd2a75b3a25198492cd815ea82de9cef8a7 · nexedi / linux

12 Apr, 2017 33 commits

metag/usercopy: Set flags before ADDZ · 4f3f0dd2

James Hogan authored Apr 04, 2017

commit fd40eee1 upstream.

The fixup code for the copy_to_user rapf loops reads TXStatus.LSM_STEP
to decide how far to rewind the source pointer. There is a special case
for the last execution of an MGETL/MGETD, since it leaves LSM_STEP=0
even though the number of MGETLs/MGETDs attempted was 4. This uses ADDZ
which is conditional upon the Z condition flag, but the AND instruction
which masked the TXStatus.LSM_STEP field didn't set the condition flags
based on the result.

Fix that now by using ANDS which does set the flags, and also marking
the condition codes as clobbered by the inline assembly.

Fixes: 373cd784 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4f3f0dd2

metag/usercopy: Zero rest of buffer from copy_from_user · 3dc0fe51

James Hogan authored Mar 31, 2017

commit 563ddc10 upstream.

Currently we try to zero the destination for a failed read from userland
in fixup code in the usercopy.c macros. The rest of the destination
buffer is then zeroed from __copy_user_zeroing(), which is used for both
copy_from_user() and __copy_from_user().

Unfortunately we fail to zero in the fixup code as D1Ar1 is set to 0
before the fixup code entry labels, and __copy_from_user() shouldn't even
be zeroing the rest of the buffer.

Move the zeroing out into copy_from_user() and rename
__copy_user_zeroing() to raw_copy_from_user() since it no longer does
any zeroing. This also conveniently matches the name needed for
RAW_COPY_USER support in a later patch.

Fixes: 373cd784 ("metag: Memory handling")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3dc0fe51

metag/usercopy: Add early abort to copy_to_user · 4a93ac81

James Hogan authored Mar 31, 2017

commit fb8ea062 upstream.

When copying to userland on Meta, if any faults are encountered
immediately abort the copy instead of continuing on and repeatedly
faulting, and worse potentially copying further bytes successfully to
subsequent valid pages.

Fixes: 373cd784 ("metag: Memory handling")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4a93ac81

metag/usercopy: Fix alignment error checking · 49a292dc

James Hogan authored Mar 31, 2017

commit 22572119 upstream.

Fix the error checking of the alignment adjustment code in
raw_copy_from_user(), which mistakenly considers it safe to skip the
error check when aligning the source buffer on a 2 or 4 byte boundary.

If the destination buffer was unaligned it may have started to copy
using byte or word accesses, which could well be at the start of a new
(valid) source page. This would result in it appearing to have copied 1
or 2 bytes at the end of the first (invalid) page rather than none at
all.

Fixes: 373cd784 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

49a292dc

metag/usercopy: Drop unused macros · 2bb52b47

James Hogan authored Mar 31, 2017

commit ef62a2d8 upstream.

Metag's lib/usercopy.c has a bunch of copy_from_user macros for larger
copies between 5 and 16 bytes which are completely unused. Before fixing
zeroing lets drop these macros so there is less to fix.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2bb52b47

brcmfmac: use local iftype avoiding use-after-free of virtual interface · 9afc076d

Arend Van Spriel authored Mar 28, 2017

commit d77facb8 upstream.

A use-after-free was found using KASAN. In brcmf_p2p_del_if() the virtual
interface is removed using call to brcmf_remove_interface(). After that
the virtual interface instance has been freed and should not be referenced.
Solve this by storing the nl80211 iftype in local variable, which is used
in a couple of places anyway.
Reported-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9afc076d

mac80211: unconditionally start new netdev queues with iTXQ support · c0321505

Johannes Berg authored Mar 29, 2017

commit 7d65f829 upstream.

When internal mac80211 TXQs aren't supported, netdev queues must
always started out started even when driver queues are stopped
while the interface is added. This is necessary because with the
internal TXQ support netdev queues are never stopped and packet
scheduling/dropping is done in mac80211.

Fixes: 80a83cfc ("mac80211: skip netdev queue control with software queuing")
Reported-and-tested-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c0321505

ring-buffer: Fix return value check in test_ringbuffer() · 703cebf6

Wei Yongjun authored Jun 17, 2016

commit 62277de7 upstream.

In case of error, the function kthread_run() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check
should be replaced with IS_ERR().

Link: http://lkml.kernel.org/r/1466184839-14927-1-git-send-email-weiyj_lk@163.com

Fixes: 6c43e554 ("ring-buffer: Add ring buffer startup selftest")
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

703cebf6

xfs: Honor FALLOC_FL_KEEP_SIZE when punching ends of files · f7db1899

Calvin Owens authored Apr 03, 2017

commit 3dd09d5a upstream.

When punching past EOF on XFS, fallocate(mode=PUNCH_HOLE|KEEP_SIZE) will
round the file size up to the nearest multiple of PAGE_SIZE:

  calvinow@vm-disks/generic-xfs-1 ~$ dd if=/dev/urandom of=test bs=2048 count=1
  calvinow@vm-disks/generic-xfs-1 ~$ stat test
    Size: 2048            Blocks: 8          IO Block: 4096   regular file
  calvinow@vm-disks/generic-xfs-1 ~$ fallocate -n -l 2048 -o 2048 -p test
  calvinow@vm-disks/generic-xfs-1 ~$ stat test
    Size: 4096            Blocks: 8          IO Block: 4096   regular file

Commit 3c2bdc91 ("xfs: kill xfs_zero_remaining_bytes") replaced
xfs_zero_remaining_bytes() with calls to iomap helpers. The new helpers
don't enforce that [pos,offset) lies strictly on [0,i_size) when being
called from xfs_free_file_space(), so by "leaking" these ranges into
xfs_zero_range() we get this buggy behavior.

Fix this by reintroducing the checks xfs_zero_remaining_bytes() did
against i_size at the bottom of xfs_free_file_space().
Reported-by: Aaron Gao <gzh@fb.com>
Fixes: 3c2bdc91 ("xfs: kill xfs_zero_remaining_bytes")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f7db1899

orangefs: move features validation to fix filesystem hang · dc62935c

Martin Brandenburg authored Apr 06, 2017

commit cefdc26e upstream.

Without this fix (and another to the userspace component itself
described later), the kernel will be unable to process any OrangeFS
requests after the userspace component is restarted (due to a crash or
at the administrator's behest).

The bug here is that inside orangefs_remount, the orangefs_request_mutex
is locked.  When the userspace component restarts while the filesystem
is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device,
which causes the kernel to send it a few requests aimed at synchronizing
the state between the two.  While this is happening the
orangefs_request_mutex is locked to prevent any other requests going
through.

This is only half of the bugfix.  The other half is in the userspace
component which outright ignores(!) requests made before it considers
the filesystem remounted, which is after the ioctl returns.  Of course
the ioctl doesn't return until after the userspace component responds to
the request it ignores.  The userspace component has been changed to
allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status.

Mike Marshall says:
 "I've tested this patch against the fixed userspace part. This patch is
  real important, I hope it can make it into 4.11...

  Here's what happens when the userspace daemon is restarted, without
  the patch:

    =============================================
    [ INFO: possible recursive locking detected ]
    [   4.10.0-00007-ge98bdb30 #1 Not tainted    ]
    ---------------------------------------------
    pvfs2-client-co/29032 is trying to acquire lock:
     (orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs]
                  but task is already holding lock:
     (orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs]

    CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb30 #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
    Call Trace:
     __lock_acquire+0x7eb/0x1290
     lock_acquire+0xe8/0x1d0
     mutex_lock_killable_nested+0x6f/0x6e0
     service_operation+0x3c7/0x7b0 [orangefs]
     orangefs_remount+0xea/0x150 [orangefs]
     dispatch_ioctl_command+0x227/0x330 [orangefs]
     orangefs_devreq_ioctl+0x29/0x70 [orangefs]
     do_vfs_ioctl+0xa3/0x6e0
     SyS_ioctl+0x79/0x90"
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dc62935c

Kbuild: use cc-disable-warning consistently for maybe-uninitialized · c4fe79a4

Arnd Bergmann authored Jan 13, 2017

commit b334e19a upstream.

In commit a76bcf55 ("Kbuild: enable -Wmaybe-uninitialized warning
for "make W=1""), I reverted another change that happened to fix a problem
with old compilers, and now we get this report again with old compilers
(prior to gcc-4.8) and GCOV enabled:

   cc1: warnings being treated as errors
   drivers/gpu/drm/i915/intel_ringbuffer.c: In function 'intel_ring_setup_status_page':
   drivers/gpu/drm/i915/intel_ringbuffer.c:438: error: 'mmio.reg' may be used uninitialized in this function
   At top level:
>> cc1: error: unrecognized command line option "-Wno-maybe-uninitialized"

The problem is that we turn off the warning conditionally in a number
of places as we should, but one of them does it unconditionally.
Instead, change it to call cc-disable-warning as we do elsewhere.

The original patch that caused it was merged into linux-4.7, then
4.8 removed the change and 4.9 brought it back, so we probably want
a backport to 4.9 once this is merged.

Use a ':=' assignment instead of '=' to force the cc-disable-warning
call to only be evaluated once instead of every time.

Fixes: a76bcf55 ("Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"")
Fixes: e72e2dfe ("gcov: disable -Wmaybe-uninitialized warning")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c4fe79a4

ACPI / gpio: do not fall back to parsing _CRS when we get a deferral · e4a62348

Dmitry Torokhov authored Mar 23, 2017

commit 693bdaa1 upstream.

If, while locating GPIOs by name, we get probe deferral, we should
immediately report it to caller rather than trying to fall back to parsing
unnamed GPIOs from _CRS block.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-and-Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e4a62348

dm verity fec: fix bufio leaks · ab83597b

Sami Tolvanen authored Mar 31, 2017

commit 86e3e83b upstream.

Buffers read through dm_bufio_read() were not released in all code paths.

Fixes: a739ff3f ("dm verity: add support for forward error correction")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ab83597b

dm verity fec: limit error correction recursion · 2ff08764

Sami Tolvanen authored Mar 15, 2017

commit f1a880a9 upstream.

If the hash tree itself is sufficiently corrupt in addition to data blocks,
it's possible for error correction to end up in a deep recursive loop,
which eventually causes a kernel panic.  This change limits the
recursion to a reasonable level during a single I/O operation.

Fixes: a739ff3f ("dm verity: add support for forward error correction")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2ff08764

ptrace: fix PTRACE_LISTEN race corrupting task->state · d9fa4351

bsegall@google.com authored Apr 07, 2017

commit 5402e97a upstream.

In PT_SEIZED + LISTEN mode STOP/CONT signals cause a wakeup against
__TASK_TRACED.  If this races with the ptrace_unfreeze_traced at the end
of a PTRACE_LISTEN, this can wake the task /after/ the check against
__TASK_TRACED, but before the reset of state to TASK_TRACED.  This
causes it to instead clobber TASK_WAKING, allowing a subsequent wakeup
against TRACED while the task is still on the rq wake_list, corrupting
it.

Oleg said:
 "The kernel can crash or this can lead to other hard-to-debug problems.
  In short, "task->state = TASK_TRACED" in ptrace_unfreeze_traced()
  assumes that nobody else can wake it up, but PTRACE_LISTEN breaks the
  contract. Obviusly it is very wrong to manipulate task->state if this
  task is already running, or WAKING, or it sleeps again"

[akpm@linux-foundation.org: coding-style fixes]
Fixes: 9899d11f ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL")
Link: http://lkml.kernel.org/r/xm26y3vfhmkp.fsf_-_@bsegall-linux.mtv.corp.google.comSigned-off-by: Ben Segall <bsegall@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d9fa4351

mm/page_alloc.c: fix print order in show_free_areas() · e3b08ebe

Alexander Polakov authored Apr 07, 2017

commit 1f06b81a upstream.

Fixes: 11fb9989 ("mm: move most file-based accounting to the node")
Link: http://lkml.kernel.org/r/1490377730.30219.2.camel@beget.ruSigned-off-by: Alexander Polyakov <apolyakov@beget.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e3b08ebe

Reset TreeId to zero on SMB2 TREE_CONNECT · 8446cb1a

Jan-Marek Glogowski authored Feb 20, 2017

commit 806a28ef upstream.

Currently the cifs module breaks the CIFS specs on reconnect as
described in http://msdn.microsoft.com/en-us/library/cc246529.aspx:

"TreeId (4 bytes): Uniquely identifies the tree connect for the
command. This MUST be 0 for the SMB2 TREE_CONNECT Request."
Signed-off-by: Jan-Marek Glogowski <glogow@fbihome.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Tested-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8446cb1a

cfg80211: check rdev resume callback only for registered wiphy · 57e1e90d

Arend Van Spriel authored Mar 28, 2017

commit b3ef5520 upstream.

We got the following use-after-free KASAN report:

 BUG: KASAN: use-after-free in wiphy_resume+0x591/0x5a0 [cfg80211]
	 at addr ffff8803fc244090
 Read of size 8 by task kworker/u16:24/2587
 CPU: 6 PID: 2587 Comm: kworker/u16:24 Tainted: G    B 4.9.13-debug+
 Hardware name: Dell Inc. XPS 15 9550/0N7TVV, BIOS 1.2.19 12/22/2016
 Workqueue: events_unbound async_run_entry_fn
  ffff880425d4f9d8 ffffffffaeedb541 ffff88042b80ef00 ffff8803fc244088
  ffff880425d4fa00 ffffffffae84d7a1 ffff880425d4fa98 ffff8803fc244080
  ffff88042b80ef00 ffff880425d4fa88 ffffffffae84da3a ffffffffc141f7d9
 Call Trace:
  [<ffffffffaeedb541>] dump_stack+0x85/0xc4
  [<ffffffffae84d7a1>] kasan_object_err+0x21/0x70
  [<ffffffffae84da3a>] kasan_report_error+0x1fa/0x500
  [<ffffffffc141f7d9>] ? cfg80211_bss_age+0x39/0xc0 [cfg80211]
  [<ffffffffc141f83a>] ? cfg80211_bss_age+0x9a/0xc0 [cfg80211]
  [<ffffffffae48d46d>] ? trace_hardirqs_on+0xd/0x10
  [<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211]
  [<ffffffffae84def1>] __asan_report_load8_noabort+0x61/0x70
  [<ffffffffc13fb100>] ? wiphy_suspend+0xbb0/0xc70 [cfg80211]
  [<ffffffffc13fb751>] ? wiphy_resume+0x591/0x5a0 [cfg80211]
  [<ffffffffc13fb751>] wiphy_resume+0x591/0x5a0 [cfg80211]
  [<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211]
  [<ffffffffaf3b206e>] dpm_run_callback+0x6e/0x4f0
  [<ffffffffaf3b31b2>] device_resume+0x1c2/0x670
  [<ffffffffaf3b367d>] async_resume+0x1d/0x50
  [<ffffffffae3ee84e>] async_run_entry_fn+0xfe/0x610
  [<ffffffffae3d0666>] process_one_work+0x716/0x1a50
  [<ffffffffae3d05c9>] ? process_one_work+0x679/0x1a50
  [<ffffffffafdd7b6d>] ? _raw_spin_unlock_irq+0x3d/0x60
  [<ffffffffae3cff50>] ? pwq_dec_nr_in_flight+0x2b0/0x2b0
  [<ffffffffae3d1a80>] worker_thread+0xe0/0x1460
  [<ffffffffae3d19a0>] ? process_one_work+0x1a50/0x1a50
  [<ffffffffae3e54c2>] kthread+0x222/0x2e0
  [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
  [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
  [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
  [<ffffffffafdd86aa>] ret_from_fork+0x2a/0x40
 Object at ffff8803fc244088, in cache kmalloc-1024 size: 1024
 Allocated:
 PID = 71
  save_stack_trace+0x1b/0x20
  save_stack+0x46/0xd0
  kasan_kmalloc+0xad/0xe0
  kasan_slab_alloc+0x12/0x20
  __kmalloc_track_caller+0x134/0x360
  kmemdup+0x20/0x50
  brcmf_cfg80211_attach+0x10b/0x3a90 [brcmfmac]
  brcmf_bus_start+0x19a/0x9a0 [brcmfmac]
  brcmf_pcie_setup+0x1f1a/0x3680 [brcmfmac]
  brcmf_fw_request_nvram_done+0x44c/0x11b0 [brcmfmac]
  request_firmware_work_func+0x135/0x280
  process_one_work+0x716/0x1a50
  worker_thread+0xe0/0x1460
  kthread+0x222/0x2e0
  ret_from_fork+0x2a/0x40
 Freed:
 PID = 2568
  save_stack_trace+0x1b/0x20
  save_stack+0x46/0xd0
  kasan_slab_free+0x71/0xb0
  kfree+0xe8/0x2e0
  brcmf_cfg80211_detach+0x62/0xf0 [brcmfmac]
  brcmf_detach+0x14a/0x2b0 [brcmfmac]
  brcmf_pcie_remove+0x140/0x5d0 [brcmfmac]
  brcmf_pcie_pm_leave_D3+0x198/0x2e0 [brcmfmac]
  pci_pm_resume+0x186/0x220
  dpm_run_callback+0x6e/0x4f0
  device_resume+0x1c2/0x670
  async_resume+0x1d/0x50
  async_run_entry_fn+0xfe/0x610
  process_one_work+0x716/0x1a50
  worker_thread+0xe0/0x1460
  kthread+0x222/0x2e0
  ret_from_fork+0x2a/0x40
 Memory state around the buggy address:
  ffff8803fc243f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff8803fc244000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >ffff8803fc244080: fc fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                          ^
  ffff8803fc244100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8803fc244180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

What is happening is that brcmf_pcie_resume() detects a device that
is no longer responsive and it decides to unbind resulting in a
wiphy_unregister() and wiphy_free() call. Now the wiphy instance
remains allocated, because PM needs to call wiphy_resume() for it.
However, brcmfmac already does a kfree() for the struct
cfg80211_registered_device::ops field. Change the checks in
wiphy_resume() to only access the struct cfg80211_registered_device::ops
if the wiphy instance is still registered at this time.
Reported-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

57e1e90d

arm64: mm: unaligned access by user-land should be received as SIGBUS · 3715dbf7

Victor Kamensky authored Apr 03, 2017

commit 09a6adf5 upstream.

After 52d7523d (arm64: mm: allow the kernel to handle alignment faults on
user accesses) commit user-land accesses that produce unaligned exceptions
like in case of aarch32 ldm/stm/ldrd/strd instructions operating on
unaligned memory received by user-land as SIGSEGV. It is wrong, it should
be reported as SIGBUS as it was before 52d7523d commit.

Changed do_bad_area function to take signal and code parameters out of esr
value using fault_info table, so in case of do_alignment_fault fault
user-land will receive SIGBUS. Wrapped access to fault_info table into
esr_to_fault_info function.

Fixes: 52d7523d (arm64: mm: allow the kernel to handle alignment faults on user accesses)
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3715dbf7

iio: bmg160: reset chip when probing · 71b44ef8

Quentin Schulz authored Mar 21, 2017

commit 4bdc9029 upstream.

The gyroscope chip might need to be reset to be used.

Without the chip being reset, the driver stopped at the first
regmap_read (to get the CHIP_ID) and failed to probe.

The datasheet of the gyroscope says that a minimum wait of 30ms after
the reset has to be done.

This patch has been checked on a BMX055 and the datasheet of the BMG160
and the BMI055 give the same reset register and bits.
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

71b44ef8

kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd · ac303c64

Suzuki K Poulose authored Apr 03, 2017

commit 8b3405e3 upstream.

In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
unmap_stage2_range() on the entire memory range for the guest. This could
cause problems with other callers (e.g, munmap on a memslot) trying to
unmap a range. And since we have to unmap the entire Guest memory range
holding a spinlock, make sure we yield the lock if necessary, after we
unmap each PUD range.

Fixes: commit d5d8184d ("KVM: ARM: Memory virtualization setup")
Cc: Paolo Bonzini <pbonzin@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[ Avoid vCPU starvation and lockup detector warnings ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ac303c64

arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region · a1ea3189

Marc Zyngier authored Mar 16, 2017

commit 72f31048 upstream.

We don't hold the mmap_sem while searching for VMAs (via find_vma), in
kvm_arch_prepare_memory_region, which can end up in expected failures.

Fixes: commit 8eef9123 ("arm/arm64: KVM: map MMIO regions at creation time")
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Eric Auger <eric.auger@rehat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
[ Handle dirty page logging failure case ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a1ea3189

arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm · 48f2825a

Marc Zyngier authored Mar 16, 2017

commit 90f6e150 upstream.

We don't hold the mmap_sem while searching for the VMAs when
we try to unmap each memslot for a VM. Fix this properly to
avoid unexpected results.

Fixes: commit 957db105 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

48f2825a

staging: android: ashmem: lseek failed due to no FMODE_LSEEK. · 8f8de8d2

Shuxiao Zhang authored Apr 06, 2017

commit 97fbfef6 upstream.

vfs_llseek will check whether the file mode has
FMODE_LSEEK, no return failure. But ashmem can be
lseek, so add FMODE_LSEEK to ashmem file.

Comment From Greg Hackmann:
	ashmem_llseek() passes the llseek() call through to the backing
	shmem file.  91360b02 ("ashmem: use vfs_llseek()") changed
	this from directly calling the file's llseek() op into a VFS
	layer call.  This also adds a check for the FMODE_LSEEK bit, so
	without that bit ashmem_llseek() now always fails with -ESPIPE.

Fixes: 91360b02 ("ashmem: use vfs_llseek()")
Signed-off-by: Shuxiao Zhang <zhangshuxiao@xiaomi.com>
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8f8de8d2

sysfs: be careful of error returns from ops->show() · d9eedab3

NeilBrown authored Apr 03, 2017

commit c8a139d0 upstream.

ops->show() can return a negative error code.
Commit 65da3484 ("sysfs: correctly handle short reads on PREALLOC attrs.")
(in v4.4) caused this to be stored in an unsigned 'size_t' variable, so errors
would look like large numbers.
As a result, if an error is returned, sysfs_kf_read() will return the
value of 'count', typically 4096.

Commit 17d0774f ("sysfs: correctly handle read offset on PREALLOC attrs")
(in v4.8) extended this error to use the unsigned large 'len' as a size for
memmove().
Consequently, if ->show returns an error, then the first read() on the
sysfs file will return 4096 and could return uninitialized memory to
user-space.
If the application performs a subsequent read, this will trigger a memmove()
with extremely large count, and is likely to crash the machine is bizarre ways.

This bug can currently only be triggered by reading from an md
sysfs attribute declared with __ATTR_PREALLOC() during the
brief period between when mddev_put() deletes an mddev from
the ->all_mddevs list, and when mddev_delayed_delete() - which is
scheduled on a workqueue - completes.
Before this, an error won't be returned by the ->show()
After this, the ->show() won't be called.

I can reproduce it reliably only by putting delay like
	usleep_range(500000,700000);
early in mddev_delayed_delete(). Then after creating an
md device md0 run
  echo clear > /sys/block/md0/md/array_state; cat /sys/block/md0/md/array_state

The bug can be triggered without the usleep.

Fixes: 65da3484 ("sysfs: correctly handle short reads on PREALLOC attrs.")
Fixes: 17d0774f ("sysfs: correctly handle read offset on PREALLOC attrs")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d9eedab3

drm/vmwgfx: fix integer overflow in vmw_surface_define_ioctl() · 4ddd24d5

Li Qiang authored Mar 27, 2017

commit e7e11f99 upstream.

In vmw_surface_define_ioctl(), the 'num_sizes' is the sum of the
'req->mip_levels' array. This array can be assigned any value from
the user space. As both the 'num_sizes' and the array is uint32_t,
it is easy to make 'num_sizes' overflow. The later 'mip_levels' is
used as the loop count. This can lead an oob write. Add the check of
'req->mip_levels' to avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4ddd24d5

drm/vmwgfx: Remove getparam error message · 28c84df7

Thomas Hellstrom authored Mar 27, 2017

commit 53e16798 upstream.

The mesa winsys sometimes uses unimplemented parameter requests to
check for features. Remove the error message to avoid bloating the
kernel log.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

28c84df7

drm/ttm, drm/vmwgfx: Relax permission checking when opening surfaces · b116797b

Thomas Hellstrom authored Mar 27, 2017

commit fe25deb7 upstream.

Previously, when a surface was opened using a legacy (non prime) handle,
it was verified to have been created by a client in the same master realm.
Relax this so that opening is also allowed recursively if the client
already has the surface open.

This works around a regression in svga mesa where opening of a shared
surface is used recursively to obtain surface information.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b116797b

drm/vmwgfx: avoid calling vzalloc with a 0 size in vmw_get_cap_3d_ioctl() · 604d2eac

Murray McAllister authored Mar 27, 2017

commit 63774069 upstream.

In vmw_get_cap_3d_ioctl(), a user can supply 0 for a size that is
used in vzalloc(). This eventually calls dump_stack() (in warn_alloc()),
which can leak useful addresses to dmesg.

Add check to avoid a size of 0.
Signed-off-by: Murray McAllister <murray.mcallister@insomniasec.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

604d2eac

drm/vmwgfx: NULL pointer dereference in vmw_surface_define_ioctl() · 73ab7251

Murray McAllister authored Mar 27, 2017

commit 36274ab8 upstream.

Before memory allocations vmw_surface_define_ioctl() checks the
upper-bounds of a user-supplied size, but does not check if the
supplied size is 0.

Add check to avoid NULL pointer dereferences.
Signed-off-by: Murray McAllister <murray.mcallister@insomniasec.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

73ab7251

drm/vmwgfx: Type-check lookups of fence objects · 92cc4816

Thomas Hellstrom authored Mar 27, 2017

commit f7652afa upstream.

A malicious caller could otherwise hand over handles to other objects
causing all sorts of interesting problems.

Testing done: Ran a Fedora 25 desktop using both Xorg and
gnome-shell/Wayland.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

92cc4816

ppdev: fix registering same device name · 9c2b46e7

Sudip Mukherjee authored Mar 06, 2017

commit 9a69645d upstream.

Usually every parallel port will have a single pardev registered with
it. But ppdev driver is an exception. This userspace parallel port
driver allows to create multiple parrallel port devices for a single
parallel port. And as a result we were having a big warning like:
"sysfs: cannot create duplicate filename '/devices/parport0/ppdev0.0'".
And with that many parallel port printers stopped working.

We have been using the minor number as the id field while registering
a parralel port device with a parralel port. But when there are
multiple parrallel port device for one single parallel port, they all
tried to register with the same name like 'pardev0.0' and everything
started failing.
Use an incremented index as the id instead of the minor number.

Fixes: 8b7d3a9d ("ppdev: use new parport device model")
Cc: stable <stable@vger.kernel.org> # v4.9+
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1414656
Bugzilla: https://bugs.archlinux.org/task/52322Tested-by: James Feeney <james@nurealm.net>
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9c2b46e7

ppdev: check before attaching port · bf5202b5

Sudip Mukherjee authored Nov 12, 2016

commit dd5c472a upstream.

After parport starts using the device model, all pardevice drivers
should decide in their match_port callback function if they want to
attach with that particulatr port. ppdev has been converted to use the
new parport device-model code but pp_attach() tried to attach with all
the ports.
Create a new array of pointer and use that to remember the ports we
have attached. And use that information to skip attaching ports which
we have already attached.
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bf5202b5

08 Apr, 2017 7 commits

Linux 4.9.21 · 37feaf80
Greg Kroah-Hartman authored Apr 08, 2017

37feaf80

nvme/pci: Disable on removal when disconnected · 02b23e05

Keith Busch authored Feb 10, 2017

commit 6db28eda upstream.

If the device is not present, the driver should disable the queues
immediately. Prior to this, the driver was relying on the watchdog timer
to kill the queues if requests were outstanding to the device, and that
just delays removal up to one second.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

02b23e05

nvme/core: Fix race kicking freed request_queue · a5e39a7f

Keith Busch authored Feb 10, 2017

commit f33447b9 upstream.

If a namespace has already been marked dead, we don't want to kick the
request_queue again since we may have just freed it from another thread.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a5e39a7f

padata: avoid race in reordering · eb8c62a3

Jason A. Donenfeld authored Mar 23, 2017

commit de5540d0 upstream.

Under extremely heavy uses of padata, crashes occur, and with list
debugging turned on, this happens instead:

[87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
__list_add+0xae/0x130
[87487.301868] list_add corruption. prev->next should be next
(ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
[87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
[87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
[87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
[87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
[87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
[87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
[87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
[87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120

padata_reorder calls list_add_tail with the list to which its adding
locked, which seems correct:

spin_lock(&squeue->serial.lock);
list_add_tail(&padata->list, &squeue->serial.list);
spin_unlock(&squeue->serial.lock);

This therefore leaves only place where such inconsistency could occur:
if padata->list is added at the same time on two different threads.
This pdata pointer comes from the function call to
padata_get_next(pd), which has in it the following block:

next_queue = per_cpu_ptr(pd->pqueue, cpu);
padata = NULL;
reorder = &next_queue->reorder;
if (!list_empty(&reorder->list)) {
       padata = list_entry(reorder->list.next,
                           struct padata_priv, list);
       spin_lock(&reorder->lock);
       list_del_init(&padata->list);
       atomic_dec(&pd->reorder_objects);
       spin_unlock(&reorder->lock);

       pd->processed++;

       goto out;
}
out:
return padata;

I strongly suspect that the problem here is that two threads can race
on reorder list. Even though the deletion is locked, call to
list_entry is not locked, which means it's feasible that two threads
pick up the same padata object and subsequently call list_add_tail on
them at the same time. The fix is thus be hoist that lock outside of
that block.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

eb8c62a3

blk: Ensure users for current->bio_list can see the full list. · 5959cded

NeilBrown authored Mar 10, 2017

commit f5fe1b51 upstream.

Commit 79bd9959 ("blk: improve order of bio handling in generic_make_request()")
changed current->bio_list so that it did not contain *all* of the
queued bios, but only those submitted by the currently running
make_request_fn.

There are two places which walk the list and requeue selected bios,
and others that check if the list is empty.  These are no longer
correct.

So redefine current->bio_list to point to an array of two lists, which
contain all queued bios, and adjust various code to test or walk both
lists.
Signed-off-by: NeilBrown <neilb@suse.com>
Fixes: 79bd9959 ("blk: improve order of bio handling in generic_make_request()")
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5959cded

blk: improve order of bio handling in generic_make_request() · d5986e00

NeilBrown authored Mar 08, 2017

commit 79bd9959 upstream.

To avoid recursion on the kernel stack when stacked block devices
are in use, generic_make_request() will, when called recursively,
queue new requests for later handling. They will be handled when the
make_request_fn for the current bio completes.

If any bios are submitted by a make_request_fn, these will ultimately
be handled seqeuntially. If the handling of one of those generates
further requests, they will be added to the end of the queue.

This strict first-in-first-out behaviour can lead to deadlocks in
various ways, normally because a request might need to wait for a
previous request to the same device to complete. This can happen when
they share a mempool, and can happen due to interdependencies
particular to the device. Both md and dm have examples where this happens.

These deadlocks can be erradicated by more selective ordering of bios.
Specifically by handling them in depth-first order. That is: when the
handling of one bio generates one or more further bios, they are
handled immediately after the parent, before any siblings of the
parent. That way, when generic_make_request() calls make_request_fn
for some particular device, we can be certain that all previously
submited requests for that device have been completely handled and are
not waiting for anything in the queue of requests maintained in
generic_make_request().

An easy way to achieve this would be to use a last-in-first-out stack
instead of a queue. However this will change the order of consecutive
bios submitted by a make_request_fn, which could have unexpected consequences.
Instead we take a slightly more complex approach.
A fresh queue is created for each call to a make_request_fn. After it completes,
any bios for a different device are placed on the front of the main queue, followed
by any bios for the same device, followed by all bios that were already on
the queue before the make_request_fn was called.
This provides the depth-first approach without reordering bios on the same level.

This, by itself, it not enough to remove all deadlocks. It just makes
it possible for drivers to take the extra step required themselves.

To avoid deadlocks, drivers must never risk waiting for a request
after submitting one to generic_make_request. This includes never
allocing from a mempool twice in the one call to a make_request_fn.

A common pattern in drivers is to call bio_split() in a loop, handling
the first part and then looping around to possibly split the next part.
Instead, a driver that finds it needs to split a bio should queue
(with generic_make_request) the second part, handle the first part,
and then return. The new code in generic_make_request will ensure the
requests to underlying bios are processed first, then the second bio
that was split off. If it splits again, the same process happens. In
each case one bio will be completely handled before the next one is attempted.

With this is place, it should be possible to disable the
punt_bios_to_recover() recovery thread for many block devices, and
eventually it may be possible to remove it completely.

Ref: http://www.spinics.net/lists/raid/msg54680.htmlTested-by: Jinpu Wang <jinpu.wang@profitbricks.com>
Inspired-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d5986e00

mm: workingset: fix premature shadow node shrinking with cgroups · e3a55294

Johannes Weiner authored Mar 31, 2017

commit 0cefabda upstream.

Commit 0a6b76dd ("mm: workingset: make shadow node shrinker memcg
aware") enabled cgroup-awareness in the shadow node shrinker, but forgot
to also enable cgroup-awareness in the list_lru the shadow nodes sit on.

Consequently, all shadow nodes are sitting on a global (per-NUMA node)
list, while the shrinker applies the limits according to the amount of
cache in the cgroup its shrinking. The result is excessive pressure on
the shadow nodes from cgroups that have very little cache.

Enable memcg-mode on the shadow node LRUs, such that per-cgroup limits
are applied to per-cgroup lists.

Fixes: 0a6b76dd ("mm: workingset: make shadow node shrinker memcg aware")
Link: http://lkml.kernel.org/r/20170322005320.8165-1-hannes@cmpxchg.orgSigned-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@tarantool.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e3a55294