Commits · b40c9ac7286e6a5578649098861c7bfcd40451de · nexedi / linux

27 Jul, 2016 40 commits

drm/vmwgfx: Work around mode set failure in 2D VMs · b40c9ac7

Sinclair Yeh authored Jun 29, 2016

commit 7c20d213 upstream.

In a low-memory 2D VM, fbdev can take up a large percentage of
available memory, making them unavailable for other DRM clients.

Since we do not take fbdev into account when filtering modes,
we end up claiming to support more modes than we actually do.

As a result, users get a black screen when setting a mode too
large for current available memory.  In a low-memory VM
configuration, users can get a black screen for a mode as low
as 1024x768.

The current mode filtering mechanism keys off of
SVGA_REG_SUGGESTED_GBOBJECT_MEM_SIZE_KB, i.e. the maximum amount
of surface memory we have.  Since this value is a performance
suggestion, not a hard limit, and since there should not be much
of a performance impact for a 2D VM, rather than filtering out
more modes, we will just allow ourselves to exceed the SVGA's
performance suggestion.

Also changed assumed bpp to 32 from 16 to make sure we can
actually support all the modes listed.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b40c9ac7

drm/vmwgfx: Add an option to change assumed FB bpp · a216ed8d

Sinclair Yeh authored Jun 29, 2016

commit 04319d89 upstream.

Offer an option for advanced users who want larger modes at 16bpp.

This becomes necessary after the fix: "Work around mode set
failure in 2D VMs."  Without this patch, there would be no way
for existing advanced users to get to a high res mode, and the
regression is they will likely get a black screen after a software
update on their current VM.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a216ed8d

drm/ttm: Make ttm_bo_mem_compat available · 6c42c30a

Sinclair Yeh authored Jun 29, 2016

commit 94477bff upstream.

There are cases where it is desired to see if a proposed placement
is compatible with a buffer object before calling ttm_bo_validate().
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6c42c30a

drm: atmel-hlcdc: actually disable scaling when no scaling is required · c6a2cb3a

Boris Brezillon authored May 27, 2016

commit 1b7e38b9 upstream.

The driver is only enabling scaling, but never disabling it, thus, if you
enable the scaling feature once it stays enabled forever.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Alex Vazquez <avazquez.dev@gmail.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Fixes: 1a396789 ("drm: add Atmel HLCDC Display Controller support")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c6a2cb3a

drm: make drm_atomic_set_mode_prop_for_crtc() more reliable · f956468c

Tomi Valkeinen authored May 31, 2016

commit 6709887c upstream.

drm_atomic_set_mode_prop_for_crtc() does not clear the state->mode, so
old data may be left there when a new mode is set, possibly causing odd
issues.

This patch improves the situation by always clearing the state->mode
first.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f956468c

drm: add missing drm_mode_set_crtcinfo call · ec00d4d7

Tomi Valkeinen authored May 31, 2016

commit b201e743 upstream.

When setting mode via MODE_ID property,
drm_atomic_set_mode_prop_for_crtc() does not call
drm_mode_set_crtcinfo() which possibly causes:

"[drm:drm_calc_timestamping_constants [drm]] *ERROR* crtc 32: Can't
calculate constants, dotclock = 0!"

Whether the error is seen depends on the previous data in state->mode,
as state->mode is not cleared when setting new mode.

This patch adds drm_mode_set_crtcinfo() call to
drm_mode_convert_umode(), which is called in both legacy and atomic
paths. This should be fine as there's no reason to call
drm_mode_convert_umode() without also setting the crtc related fields.

drm_mode_set_crtcinfo() is removed from the legacy drm_mode_setcrtc() as
that is no longer needed.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ec00d4d7

drm/i915: Update CDCLK_FREQ register on BDW after changing cdclk frequency · 86383e48

Ville Syrjälä authored Apr 26, 2016

commit a04e23d4 upstream.

Update CDCLK_FREQ on BDW after changing the cdclk frequency. Not sure
if this is a late addition to the spec, or if I simply overlooked this
step when writing the original code.

This is what Bspec has to say about CDCLK_FREQ:
"Program this field to the CD clock frequency minus one. This is used to
 generate a divided down clock for miscellaneous timers in display."

And the "Broadwell Sequences for Changing CD Clock Frequency" section
clarifies this further:
"For CD clock 337.5 MHz, program 337 decimal.
 For CD clock 450 MHz, program 449 decimal.
 For CD clock 540 MHz, program 539 decimal.
 For CD clock 675 MHz, program 674 decimal."

Cc: stable@vger.kernel.org
Cc: Mika Kahola <mika.kahola@intel.com>
Fixes: b432e5cf ("drm/i915: BDW clock change support")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461689194-6079-2-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Mika Kahola <mika.kahola@intel.com>
(cherry picked from commit 7f1052a8)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

86383e48

drm/i915: Update ifdeffery for mutex->owner · edc185a4

Chris Wilson authored Jul 11, 2016

commit b1924006 upstream.

In commit 7608a43d ("locking/mutexes: Use MUTEX_SPIN_ON_OWNER when
appropriate") the owner field in the mutex was updated from being
dependent upon CONFIG_SMP to using optimistic spin. Update our peek
function to suite.

Fixes:7608a43d ("locking/mutexes: Use MUTEX_SPIN_ON_OWNER...")
Reported-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1468244777-4888-1-git-send-email-chris@chris-wilson.co.ukReviewed-by: Matthew Auld <matthew.auld@intel.com>
(cherry picked from commit 4f074a53)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

edc185a4

drm/i915: Refresh cached DP port register value on resume · 3ea2a7e9

Ville Syrjälä authored May 13, 2016

commit 664a84d2 upstream.

During hibernation the cached DP port register value will be left with
whatever value we have there when we create the hibernation image.
Currently that means the port (and eDP PLL) will be off in the cached
value. However when we resume there is no guarantee that the value
in the actual register will match the cached value. If i915 isn't
loaded in the kernel that loads the hibernation image, the port may
well be on (eg. left on by the BIOS). The encoder state readout
does the right thing in this case and updates our encoder state
to reflect the actual hardware state. However the post-resume modeset
will then use the stale cached port register value in
intel_dp_link_down() and potentially confuse the hardware.

This was caught by the following assert
WARNING: CPU: 3 PID: 5288 at ../drivers/gpu/drm/i915/intel_dp.c:2184 assert_edp_pll+0x99/0xa0 [i915]
eDP PLL state assertion failure (expected on, current off)
on account of the eDP PLL getting prematurely turned off when
shutting down the port, since the DP_PLL_ENABLE bit wasn't set
in the cached register value.

Presumably I introduced this problem in
commit 6fec7662 ("drm/i915: Use intel_dp->DP in eDP PLL setup")
as before that we didn't update the cached value after shuttting the
port down. That's assuming the port got enabled at least once prior
to hibernating. If that didn't happen then the cached value would
still have been totally out of sync with reality (eg. first boot w/o
eDP on, then hibernate, and then resume with eDP on).

So, let's fix this properly and refresh the cached register value from
the hardware register during resume.

DDI platforms shouldn't use the cached value during port disable at
least, so shouldn't have this particular issue. They might still have
issues if we skip the initial modeset and then try to retrain the link
or something. But untangling this DP vs. DDI mess is a bigger topic,
so let's jut punt on DDI for now.

Cc: Jani Nikula <jani.nikula@intel.com>
Fixes: 6fec7662 ("drm/i915: Use intel_dp->DP in eDP PLL setup")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463162036-27931-1-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit 64989ca4)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3ea2a7e9

drm/i915/ilk: Don't disable SSC source if it's in use · b17d254d

Lyude authored Jun 14, 2016

commit 476490a9 upstream.

Thanks to Ville Syrjälä for pointing me towards the cause of this issue.

Unfortunately one of the sideaffects of having the refclk for a DPLL set
to SSC is that as long as it's set to SSC, the GPU will prevent us from
powering down any of the pipes or transcoders using it. A couple of
BIOSes enable SSC in both PCH_DREF_CONTROL and in the DPLL
configurations. This causes issues on the first modeset, since we don't
expect SSC to be left on and as a result, can't successfully power down
the pipes or the transcoders using it. Here's an example from this Dell
OptiPlex 990:

[drm:intel_modeset_init] SSC enabled by BIOS, overriding VBT which says disabled
[drm:intel_modeset_init] 2 display pipes available.
[drm:intel_update_cdclk] Current CD clock rate: 400000 kHz
[drm:intel_update_max_cdclk] Max CD clock rate: 400000 kHz
[drm:intel_update_max_cdclk] Max dotclock rate: 360000 kHz
vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[drm:intel_crt_reset] crt adpa set to 0xf40000
[drm:intel_dp_init_connector] Adding DP connector on port C
[drm:intel_dp_aux_init] registering DPDDC-C bus for card0-DP-1
[drm:ironlake_init_pch_refclk] has_panel 0 has_lvds 0 has_ck505 0
[drm:ironlake_init_pch_refclk] Disabling SSC entirely
… later we try committing the first modeset …
[drm:intel_dump_pipe_config] [CRTC:26][modeset] config ffff88041b02e800 for pipe A
[drm:intel_dump_pipe_config] cpu_transcoder: A
…
[drm:intel_dump_pipe_config] dpll_hw_state: dpll: 0xc4016001, dpll_md: 0x0, fp0: 0x20e08, fp1: 0x30d07
[drm:intel_dump_pipe_config] planes on this crtc
[drm:intel_dump_pipe_config] STANDARD PLANE:23 plane: 0.0 idx: 0 enabled
[drm:intel_dump_pipe_config]     FB:42, fb = 800x600 format = 0x34325258
[drm:intel_dump_pipe_config]     scaler:0 src (0, 0) 800x600 dst (0, 0) 800x600
[drm:intel_dump_pipe_config] CURSOR PLANE:25 plane: 0.1 idx: 1 disabled, scaler_id = 0
[drm:intel_dump_pipe_config] STANDARD PLANE:27 plane: 0.1 idx: 2 disabled, scaler_id = 0
[drm:intel_get_shared_dpll] CRTC:26 allocated PCH DPLL A
[drm:intel_get_shared_dpll] using PCH DPLL A for pipe A
[drm:ilk_audio_codec_disable] Disable audio codec on port C, pipe A
[drm:intel_disable_pipe] disabling pipe A
------------[ cut here ]------------
WARNING: CPU: 1 PID: 130 at drivers/gpu/drm/i915/intel_display.c:1146 intel_disable_pipe+0x297/0x2d0 [i915]
pipe_off wait timed out
…
---[ end trace 94fc8aa03ae139e8 ]---
[drm:intel_dp_link_down]
[drm:ironlake_crtc_disable [i915]] *ERROR* failed to disable transcoder A

Later modesets succeed since they reset the DPLL's configuration anyway,
but this is enough to get stuck with a big fat warning in dmesg.

A better solution would be to add refcounts for the SSC source, but for
now leaving the source clock on should suffice.

Changes since v4:
 - Fix calculation of final for systems with LVDS panels (fixes BUG() on
   CI test suite)
Changes since v3:
 - Move temp variable into loop
 - Move checks for using_ssc_source to after we've figured out has_ck505
 - Add using_ssc_source to debug output
Changes since v2:
 - Fix debug output for when we disable the CPU source
Changes since v1:
 - Leave the SSC source clock on instead of just shutting it off on all
   of the DPLL configurations.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1465916649-10228-1-git-send-email-cpaul@redhat.comSigned-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b17d254d

drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern · 4b69c00a

Ben Skeggs authored Jul 06, 2016

commit 21721504 upstream.

Fixes a regression caused by a stupid thinko from "disp/sor/gf119: both
links use the same training register".
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4b69c00a

drm/nouveau: fix for disabled fbdev emulation · 15dc6a48

Dmitrii Tcvetkov authored Jun 20, 2016

commit 52dfcc5c upstream.

Hello,

after this commit:

commit f045f459
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Thu Jun 2 12:23:31 2016 +1000
    drm/nouveau/fbcon: fix out-of-bounds memory accesses

kernel started to oops when loading nouveau module when using GTX 780 Ti
video adapter. This patch fixes the problem.

Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=120591Signed-off-by: Dmitrii Tcvetkov <demfloro@demfloro.ru>
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: f045f459 ("nouveau_fbcon_init()")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

15dc6a48

drm/nouveau/fbcon: fix out-of-bounds memory accesses · fbf9b544

Ben Skeggs authored Jun 02, 2016

commit f045f459 upstream.

Reported by KASAN.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fbf9b544

drm/nouveau/gr/gf100-: update sm error decoding from gk20a nvgpu headers · c8c3b35d

Ben Skeggs authored Jun 01, 2016

commit 383d0a41 upstream.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c8c3b35d

drm/nouveau/disp/sor/gf119: both links use the same training register · 921daff2

Ben Skeggs authored Jun 03, 2016

commit a8953c52 upstream.

It appears that, for whatever reason, both link A and B use the same
register to control the training pattern.  It's a little odd, as the
GPUs before this (Tesla/Fermi1) have per-link registers, as do newer
GPUs (Maxwell).

Fixes the third DP output on NVS 510 (GK107).
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

921daff2

virtio_balloon: fix PFN format for virtio-1 · 7233bb86

Michael S. Tsirkin authored May 17, 2016

commit 87c9403b upstream.

Everything should be LE when using virtio-1, but
the linux balloon driver does not seem to care about that.
Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7233bb86

drm/dp/mst: Always clear proposed vcpi table for port. · b752a271

Andrey Grodzovsky authored May 25, 2016

commit fd2d2bac upstream.

Not clearing mst manager's proposed vcpis table for destroyed connectors when the manager is stopped leaves it pointing to unrefernced memory, this causes pagefault when the manager is restarted when plugging back a branch.

Fixes: 91a25e46 ("drm/dp/mst: deallocate payload on port destruction")
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Lyude <cpaul@redhat.com>
Cc: Mykola Lysenko <Mykola.Lysenko@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b752a271

drm/amdkfd: destroy dbgmgr in notifier release · 83a6e52a

Oded Gabbay authored May 26, 2016

commit bc4755a4 upstream.

amdkfd need to destroy the debug manager in case amdkfd's notifier
function is called before the unbind function, because in that case,
the unbind function will exit without destroying debug manager.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

83a6e52a

drm/amdkfd: unbind only existing processes · cf2e8061

Oded Gabbay authored May 26, 2016

commit 121b78e6 upstream.

When unbinding a process from a device (initiated by amd_iommu_v2), the
driver needs to make sure that process still exists in the process table.
There is a possibility that amdkfd's own notifier handler -
kfd_process_notifier_release() - was called before the unbind function
and it already removed the process from the process table.

v2:
Because there can be only one process with the specified pasid, and
because *p can't be NULL inside the hash_for_each_rcu macro, it is more
reasonable to just put the whole code inside the if statement that
compares the pasid value. That way, when we exit hash_for_each_rcu, we
simply exit the function as well.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cf2e8061

ubi: Make recover_peb power cut aware · ca8a32b2

Richard Weinberger authored Jun 21, 2016

commit 972228d8 upstream.

recover_peb() was never power cut aware,
if a power cut happened right after writing the VID header
upon next attach UBI would blindly use the new partial written
PEB and all data from the old PEB is lost.

In order to make recover_peb() power cut aware, write the new
VID with a proper crc and copy_flag set such that the UBI attach
process will detect whether the new PEB is completely written
or not.
We cannot directly use ubi_eba_atomic_leb_change() since we'd
have to unlock the LEB which is facing a write error.
Reported-by: Jörg Pfähler <pfaehler@isse.de>
Reviewed-by: Jörg Pfähler <pfaehler@isse.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ca8a32b2

drm/amdgpu/gfx7: fix broken condition check · 69eab50d

Alex Deucher authored Jun 13, 2016

commit 8b18300c upstream.

Wrong operator.
Reported-by: David Binderman <linuxdev.baldrick@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

69eab50d

drm/radeon: fix asic initialization for virtualized environments · bc326bfa

Alex Deucher authored Jun 13, 2016

commit 05082b8b upstream.

When executing in a PCI passthrough based virtuzliation environment, the
hypervisor will usually attempt to send a PCIe bus reset signal to the
ASIC when the VM reboots. In this scenario, the card is not correctly
initialized, but we still consider it to be posted. Therefore, in a
passthrough based environemnt we should always post the card to guarantee
it is in a good state for driver initialization.

Ported from amdgpu commit:
amdgpu: fix asic initialization for virtualized environments

Cc: Andres Rodriguez <andres.rodriguez@amd.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bc326bfa

btrfs: account for non-CoW'd blocks in btrfs_abort_transaction · 13226e1e

Jeff Mahoney authored Jun 08, 2016

commit 64c12921 upstream.

The test for !trans->blocks_used in btrfs_abort_transaction is
insufficient to determine whether it's safe to drop the transaction
handle on the floor.  btrfs_cow_block, informed by should_cow_block,
can return blocks that have already been CoW'd in the current
transaction.  trans->blocks_used is only incremented for new block
allocations. If an operation overlaps the blocks in the current
transaction entirely and must abort the transaction, we'll happily
let it clean up the trans handle even though it may have modified
the blocks and will commit an incomplete operation.

In the long-term, I'd like to do closer tracking of when the fs
is actually modified so we can still recover as gracefully as possible,
but that approach will need some discussion.  In the short term,
since this is the only code using trans->blocks_used, let's just
switch it to a bool indicating whether any blocks were used and set
it when should_cow_block returns false.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

13226e1e

percpu: fix synchronization between synchronous map extension and chunk destruction · 3bb1138e

Tejun Heo authored May 25, 2016

commit 6710e594 upstream.

For non-atomic allocations, pcpu_alloc() can try to extend the area
map synchronously after dropping pcpu_lock; however, the extension
wasn't synchronized against chunk destruction and the chunk might get
freed while extension is in progress.

This patch fixes the bug by putting most of non-atomic allocations
under pcpu_alloc_mutex to synchronize against pcpu_balance_work which
is responsible for async chunk management including destruction.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: 1a4d7607 ("percpu: implement asynchronous chunk population")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3bb1138e

percpu: fix synchronization between chunk->map_extend_work and chunk destruction · c26ae537

Tejun Heo authored May 25, 2016

commit 4f996e23 upstream.

Atomic allocations can trigger async map extensions which is serviced
by chunk->map_extend_work.  pcpu_balance_work which is responsible for
destroying idle chunks wasn't synchronizing properly against
chunk->map_extend_work and may end up freeing the chunk while the work
item is still in flight.

This patch fixes the bug by rolling async map extension operations
into pcpu_balance_work.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: 9c824b6a ("percpu: make sure chunk->map array has available space")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c26ae537

af_unix: fix hard linked sockets on overlay · 0da3127a

Miklos Szeredi authored May 20, 2016

commit eb0a4a47 upstream.

Overlayfs uses separate inodes even in the case of hard links on the
underlying filesystems.  This is a problem for AF_UNIX socket
implementation which indexes sockets based on the inode.  This resulted in
hard linked sockets not working.

The fix is to use the real, underlying inode.

Test case follows:

-- ovl-sock-test.c --
#include <unistd.h>
#include <err.h>
#include <sys/socket.h>
#include <sys/un.h>

#define SOCK "test-sock"
#define SOCK2 "test-sock2"

int main(void)
{
	int fd, fd2;
	struct sockaddr_un addr = {
		.sun_family = AF_UNIX,
		.sun_path = SOCK,
	};
	struct sockaddr_un addr2 = {
		.sun_family = AF_UNIX,
		.sun_path = SOCK2,
	};

	unlink(SOCK);
	unlink(SOCK2);
	if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		err(1, "socket");
	if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1)
		err(1, "bind");
	if (listen(fd, 0) == -1)
		err(1, "listen");
	if (link(SOCK, SOCK2) == -1)
		err(1, "link");
	if ((fd2 = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		err(1, "socket");
	if (connect(fd2, (struct sockaddr *) &addr2, sizeof(addr2)) == -1)
		err (1, "connect");
	return 0;
}
----
Reported-by: Alexander Morozov <alexandr.morozov@docker.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0da3127a

vfs: add d_real_inode() helper · c6517079

Miklos Szeredi authored May 20, 2016

commit a1180844 upstream.

Needed by the following fix.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c6517079

arm64: Rework valid_user_regs · cff5b236

Mark Rutland authored Mar 01, 2016

commit dbd4d7ca upstream.

We validate pstate using PSR_MODE32_BIT, which is part of the
user-provided pstate (and cannot be trusted). Also, we conflate
validation of AArch32 and AArch64 pstate values, making the code
difficult to reason about.

Instead, validate the pstate value based on the associated task. The
task may or may not be current (e.g. when using ptrace), so this must be
passed explicitly by callers. To avoid circular header dependencies via
sched.h, is_compat_task is pulled out of asm/ptrace.h.

To make the code possible to reason about, the AArch64 and AArch32
validation is split into separate functions. Software must respect the
RES0 policy for SPSR bits, and thus the kernel mirrors the hardware
policy (RAZ/WI) for bits as-yet unallocated. When these acquire an
architected meaning writes may be permitted (potentially with additional
validation).
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[ rebased for v4.1+
  This avoids a user-triggerable Oops() if a task is switched to a mode
  not supported by the kernel (e.g. switching a 64-bit task to AArch32).
]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com> [backport]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cff5b236

ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg() · de0f9fa7

Junichi Nomura authored Jun 10, 2016

commit ae4ea9a2 upstream.

Commit 7ea0ed2b ("ipmi: Make the message handler easier to use for
SMI interfaces") changed handle_new_recv_msgs() to call handle_one_recv_msg()
for a smi_msg while the smi_msg is still connected to waiting_rcv_msgs list.
That could lead to following list corruption problems:

1) low-level function treats smi_msg as not connected to list

  handle_one_recv_msg() could end up calling smi_send(), which
  assumes the msg is not connected to list.

  For example, the following sequence could corrupt list by
  doing list_add_tail() for the entry still connected to other list.

    handle_new_recv_msgs()
      msg = list_entry(waiting_rcv_msgs)
      handle_one_recv_msg(msg)
        handle_ipmb_get_msg_cmd(msg)
          smi_send(msg)
            spin_lock(xmit_msgs_lock)
            list_add_tail(msg)
            spin_unlock(xmit_msgs_lock)

2) race between multiple handle_new_recv_msgs() instances

  handle_new_recv_msgs() once releases waiting_rcv_msgs_lock before calling
  handle_one_recv_msg() then retakes the lock and list_del() it.

  If others call handle_new_recv_msgs() during the window shown below
  list_del() will be done twice for the same smi_msg.

  handle_new_recv_msgs()
    spin_lock(waiting_rcv_msgs_lock)
    msg = list_entry(waiting_rcv_msgs)
    spin_unlock(waiting_rcv_msgs_lock)
  |
  | handle_one_recv_msg(msg)
  |
    spin_lock(waiting_rcv_msgs_lock)
    list_del(msg)
    spin_unlock(waiting_rcv_msgs_lock)

Fixes: 7ea0ed2b ("ipmi: Make the message handler easier to use for SMI interfaces")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
[Added a comment to describe why this works.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Ye Feng <yefeng.yl@alibaba-inc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

de0f9fa7

drm/mgag200: Black screen fix for G200e rev 4 · 084ad7f7

Mathieu Larouche authored May 27, 2016

commit d3922b69 upstream.

- Fixed black screen for some resolutions of G200e rev4
- Fixed testm & testn which had predetermined value.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

084ad7f7

iommu/amd: Fix unity mapping initialization race · e205592b

Joerg Roedel authored Jul 01, 2016

commit 522e5cb7 upstream.

There is a race condition in the AMD IOMMU init code that
causes requested unity mappings to be blocked by the IOMMU
for a short period of time. This results on boot failures
and IO_PAGE_FAULTs on some machines.

Fix this by making sure the unity mappings are installed
before all other DMA is blocked.

Fixes: aafd8ba0 ('iommu/amd: Implement add_device and remove_device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e205592b

iommu/vt-d: Enable QI on all IOMMUs before setting root entry · 72803a72

Joerg Roedel authored Jun 17, 2016

commit a4c34ff1 upstream.

This seems to be required on some X58 chipsets on systems
with more than one IOMMU. QI does not work until it is
enabled on all IOMMUs in the system.
Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Fixes: 5f0a7f76 ('iommu/vt-d: Make root entry visible for hardware right after allocation')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

72803a72

iommu/arm-smmu: Wire up map_sg for arm-smmu-v3 · c9566f67

Jean-Philippe Brucker authored Jun 03, 2016

commit 9aeb26cf upstream.

The map_sg callback is missing from arm_smmu_ops, but is required by
iommu.h. Similarly to most other IOMMU drivers, connect it to
default_iommu_map_sg.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c9566f67

base: make module_create_drivers_dir race-free · c705db27

Jiri Slaby authored Jun 10, 2016

commit 7e1b1fc4 upstream.

Modules which register drivers via standard path (driver_register) in
parallel can cause a warning:
WARNING: CPU: 2 PID: 3492 at ../fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
sysfs: cannot create duplicate filename '/module/saa7146/drivers'
Modules linked in: hexium_gemini(+) mxb(+) ...
...
Call Trace:
...
 [<ffffffff812e63a2>] sysfs_warn_dup+0x62/0x80
 [<ffffffff812e6487>] sysfs_create_dir_ns+0x77/0x90
 [<ffffffff8140f2c4>] kobject_add_internal+0xb4/0x340
 [<ffffffff8140f5b8>] kobject_add+0x68/0xb0
 [<ffffffff8140f631>] kobject_create_and_add+0x31/0x70
 [<ffffffff8157a703>] module_add_driver+0xc3/0xd0
 [<ffffffff8155e5d4>] bus_add_driver+0x154/0x280
 [<ffffffff815604c0>] driver_register+0x60/0xe0
 [<ffffffff8145bed0>] __pci_register_driver+0x60/0x70
 [<ffffffffa0273e14>] saa7146_register_extension+0x64/0x90 [saa7146]
 [<ffffffffa0033011>] hexium_init_module+0x11/0x1000 [hexium_gemini]
...

As can be (mostly) seen, driver_register causes this call sequence:
  -> bus_add_driver
    -> module_add_driver
      -> module_create_drivers_dir
The last one creates "drivers" directory in /sys/module/<...>. When
this is done in parallel, the directory is attempted to be created
twice at the same time.

This can be easily reproduced by loading mxb and hexium_gemini in
parallel:
while :; do
  modprobe mxb &
  modprobe hexium_gemini
  wait
  rmmod mxb hexium_gemini saa7146_vv saa7146
done

saa7146 calls pci_register_driver for both mxb and hexium_gemini,
which means /sys/module/saa7146/drivers is to be created for both of
them.

Fix this by a new mutex in module_create_drivers_dir which makes the
test-and-create "drivers" dir atomic.

I inverted the condition and removed 'return' to avoid multiple
unlocks or a goto.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: fe480a26 (Modules: only add drivers/ direcory if needed)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c705db27

tracing: Handle NULL formats in hold_module_trace_bprintk_format() · bc64a839

Steven Rostedt (Red Hat) authored Jun 17, 2016

commit 70c8217a upstream.

If a task uses a non constant string for the format parameter in
trace_printk(), then the trace_printk_fmt variable is set to NULL. This
variable is then saved in the __trace_printk_fmt section.

The function hold_module_trace_bprintk_format() checks to see if duplicate
formats are used by modules, and reuses them if so (saves them to the list
if it is new). But this function calls lookup_format() that does a strcmp()
to the value (which is now NULL) and can cause a kernel oops.

This wasn't an issue till 3debb0a9 ("tracing: Fix trace_printk() to print
when not using bprintk()") which added "__used" to the trace_printk_fmt
variable, and before that, the kernel simply optimized it out (no NULL value
was saved).

The fix is simply to handle the NULL pointer in lookup_format() and have the
caller ignore the value if it was NULL.

Link: http://lkml.kernel.org/r/1464769870-18344-1-git-send-email-zhengjun.xing@intel.comReported-by: xingzhen <zhengjun.xing@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 3debb0a9 ("tracing: Fix trace_printk() to print when not using bprintk()")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bc64a839

HID: multitouch: enable palm rejection for Windows Precision Touchpad · 2f839c95

Allen Hung authored Jun 23, 2016

commit 6dd2e27a upstream.

The usage Confidence is mandary to Windows Precision Touchpad devices. If
it is examined in input_mapping on a WIndows Precision Touchpad, a new add
quirk MT_QUIRK_CONFIDENCE desgned for such devices will be applied to the
device. A touch with the confidence bit is not set is determined as
invalid.

Tested on Dell XPS13 9343
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Tested-by: Andy Lutomirski <luto@kernel.org> # XPS 13 9350, BIOS 1.4.3
Signed-off-by: Allen Hung <allen_hung@dell.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2f839c95

HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands · 300851ff

Scott Bauer authored Jun 23, 2016

commit 93a2001b upstream.

This patch validates the num_values parameter from userland during the
HIDIOCGUSAGES and HIDIOCSUSAGES commands. Previously, if the report id was set
to HID_REPORT_ID_UNKNOWN, we would fail to validate the num_values parameter
leading to a heap overflow.
Signed-off-by: Scott Bauer <sbauer@plzdonthack.me>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

300851ff

HID: elo: kill not flush the work · 2d7a2ff1

Oliver Neukum authored May 31, 2016

commit ed596a4a upstream.

Flushing a work that reschedules itself is not a sensible operation. It needs
to be killed. Failure to do so leads to a kernel panic in the timer code.
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2d7a2ff1

KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode. · 44dd5cec

Quentin Casasnovas authored Jun 18, 2016

commit ff30ef40 upstream.

I couldn't get Xen to boot a L2 HVM when it was nested under KVM - it was
getting a GP(0) on a rather unspecial vmread from Xen:

     (XEN) ----[ Xen-4.7.0-rc  x86_64  debug=n  Not tainted ]----
     (XEN) CPU:    1
     (XEN) RIP:    e008:[<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
     (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d1v0)
     (XEN) rax: ffff82d0801e6288   rbx: ffff83003ffbfb7c   rcx: fffffffffffab928
     (XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: ffff83000bdd0000
     (XEN) rbp: ffff83000bdd0000   rsp: ffff83003ffbfab0   r8:  ffff830038813910
     (XEN) r9:  ffff83003faf3958   r10: 0000000a3b9f7640   r11: ffff83003f82d418
     (XEN) r12: 0000000000000000   r13: ffff83003ffbffff   r14: 0000000000004802
     (XEN) r15: 0000000000000008   cr0: 0000000080050033   cr4: 00000000001526e0
     (XEN) cr3: 000000003fc79000   cr2: 0000000000000000
     (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
     (XEN) Xen code around <ffff82d0801e629e> (vmx_get_segment_register+0x14e/0x450):
     (XEN)  00 00 41 be 02 48 00 00 <44> 0f 78 74 24 08 0f 86 38 56 00 00 b8 08 68 00
     (XEN) Xen stack trace from rsp=ffff83003ffbfab0:

     ...

     (XEN) Xen call trace:
     (XEN)    [<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
     (XEN)    [<ffff82d0801f3695>] get_page_from_gfn_p2m+0x165/0x300
     (XEN)    [<ffff82d0801bfe32>] hvmemul_get_seg_reg+0x52/0x60
     (XEN)    [<ffff82d0801bfe93>] hvm_emulate_prepare+0x53/0x70
     (XEN)    [<ffff82d0801ccacb>] handle_mmio+0x2b/0xd0
     (XEN)    [<ffff82d0801be591>] emulate.c#_hvm_emulate_one+0x111/0x2c0
     (XEN)    [<ffff82d0801cd6a4>] handle_hvm_io_completion+0x274/0x2a0
     (XEN)    [<ffff82d0801f334a>] __get_gfn_type_access+0xfa/0x270
     (XEN)    [<ffff82d08012f3bb>] timer.c#add_entry+0x4b/0xb0
     (XEN)    [<ffff82d08012f80c>] timer.c#remove_entry+0x7c/0x90
     (XEN)    [<ffff82d0801c8433>] hvm_do_resume+0x23/0x140
     (XEN)    [<ffff82d0801e4fe7>] vmx_do_resume+0xa7/0x140
     (XEN)    [<ffff82d080164aeb>] context_switch+0x13b/0xe40
     (XEN)    [<ffff82d080128e6e>] schedule.c#schedule+0x22e/0x570
     (XEN)    [<ffff82d08012c0cc>] softirq.c#__do_softirq+0x5c/0x90
     (XEN)    [<ffff82d0801602c5>] domain.c#idle_loop+0x25/0x50
     (XEN)
     (XEN)
     (XEN) ****************************************
     (XEN) Panic on CPU 1:
     (XEN) GENERAL PROTECTION FAULT
     (XEN) [error_code=0000]
     (XEN) ****************************************

Tracing my host KVM showed it was the one injecting the GP(0) when
emulating the VMREAD and checking the destination segment permissions in
get_vmx_mem_address():

     3)               |    vmx_handle_exit() {
     3)               |      handle_vmread() {
     3)               |        nested_vmx_check_permission() {
     3)               |          vmx_get_segment() {
     3)   0.074 us    |            vmx_read_guest_seg_base();
     3)   0.065 us    |            vmx_read_guest_seg_selector();
     3)   0.066 us    |            vmx_read_guest_seg_ar();
     3)   1.636 us    |          }
     3)   0.058 us    |          vmx_get_rflags();
     3)   0.062 us    |          vmx_read_guest_seg_ar();
     3)   3.469 us    |        }
     3)               |        vmx_get_cs_db_l_bits() {
     3)   0.058 us    |          vmx_read_guest_seg_ar();
     3)   0.662 us    |        }
     3)               |        get_vmx_mem_address() {
     3)   0.068 us    |          vmx_cache_reg();
     3)               |          vmx_get_segment() {
     3)   0.074 us    |            vmx_read_guest_seg_base();
     3)   0.068 us    |            vmx_read_guest_seg_selector();
     3)   0.071 us    |            vmx_read_guest_seg_ar();
     3)   1.756 us    |          }
     3)               |          kvm_queue_exception_e() {
     3)   0.066 us    |            kvm_multiple_exception();
     3)   0.684 us    |          }
     3)   4.085 us    |        }
     3)   9.833 us    |      }
     3) + 10.366 us   |    }

Cross-checking the KVM/VMX VMREAD emulation code with the Intel Software
Developper Manual Volume 3C - "VMREAD - Read Field from Virtual-Machine
Control Structure", I found that we're enforcing that the destination
operand is NOT located in a read-only data segment or any code segment when
the L1 is in long mode - BUT that check should only happen when it is in
protected mode.

Shuffling the code a bit to make our emulation follow the specification
allows me to boot a Xen dom0 in a nested KVM and start HVM L2 guests
without problems.

Fixes: f9eb4af6 ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions")
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Eugene Korenevsky <ekorenevsky@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

44dd5cec

kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES · 54f87e16

Xiubo Li authored Jun 15, 2016

commit caf1ff26 upstream.

These days, we experienced one guest crash with 8 cores and 3 disks,
with qemu error logs as bellow:

qemu-system-x86_64: /build/qemu-2.0.0/kvm-all.c:984:
kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

And then we found one patch(bdf026317d) in qemu tree, which said
could fix this bug.

Execute the following script will reproduce the BUG quickly:

irq_affinity.sh
========================================================================

vda_irq_num=25
vdb_irq_num=27
while [ 1 ]
do
    for irq in {1,2,4,8,10,20,40,80}
        do
            echo $irq > /proc/irq/$vda_irq_num/smp_affinity
            echo $irq > /proc/irq/$vdb_irq_num/smp_affinity
            dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct
            dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct
        done
done
========================================================================

The following qemu log is added in the qemu code and is displayed when
this bug reproduced:

kvm_irqchip_commit_routes: max gsi: 1008, nr_allocated_irq_routes: 1024,
irq_routes->nr: 1024, gsi_count: 1024.

That's to say when irq_routes->nr == 1024, there are 1024 routing entries,
but in the kernel code when routes->nr >= 1024, will just return -EINVAL;

The nr is the number of the routing entries which is in of
[1 ~ KVM_MAX_IRQ_ROUTES], not the index in [0 ~ KVM_MAX_IRQ_ROUTES - 1].

This patch fix the BUG above.
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com>
Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

54f87e16