Commits · ac8f3bf8832a405cc6e4dccb1d26d5cb2994d234 · nexedi / linux

13 Dec, 2015 2 commits

n_tty: Fix poll() after buffer-limited eof push read · ac8f3bf8

Peter Hurley authored Nov 27, 2015

commit 40d5e090 ("n_tty: Fix EOF push handling") fixed EOF push
for reads. However, that approach still allows a condition mismatch
between poll() and read(), where poll() returns POLLIN but read()
blocks. This state can happen when a previous read() returned because
the user buffer was full and the next character was an EOF not at the
beginning of the line. While the next read() will properly identify
the condition and advance the read buffer tail without improperly
indicating an EOF file condition (ie., read() will not mistakenly
return 0), poll() will mistakenly indicate POLLIN.

Although a possible solution would be to peek at the input buffer
in n_tty_poll(), the better solution in this patch is to eat the
EOF during the previous read() (ie., fix the problem by eliminating
the condition).

The current canon line buffer copy limits the scan for next end-of-line
to the smaller of either,
   a. the remaining user buffer size
   b. completed lines in the input buffer
When the remaining user buffer size is exactly one less than the
end-of-line marked by EOF push, the EOF is not scanned nor skipped
but left for subsequent reads. In the example below, the scan
index 'eol' has stopped at the EOF because it is past the scan
limit of 5 (not because it has found the next set bit in read_flags)

   user buffer [*nr = 5]    _ _ _ _ _

   read_flags               0 0 0 0 0   1
   input buffer             h e l l o [EOF]
                            ^           ^
                           /           /
                         tail        eol

   result: found = 0, tail += 5, *nr += 5

Instead, allow the scan to peek ahead 1 byte (while still limiting the
scan to completed lines in the input buffer). For the example above,

   result: found = 1, tail += 6, *nr += 5

Because the scan limit is now bumped +1 byte, when the scan is
completed, the tail advance and the user buffer copy limit is
re-clamped to *nr when EOF is _not_ found.

Fixes: 40d5e090 ("n_tty: Fix EOF push handling")
Cc: <stable@vger.kernel.org> # 3.12+
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ac8f3bf8

serial: 8250_uniphier: fix dl_read and dl_write functions · 7be047e0

Masahiro Yamada authored Oct 19, 2015

The register offset must be shifted by regshift, otherwise the
baudrate is not set. I missed the issue probably because the
divisor register was already set by the boot loader.

Fixes: 1a8d2903 ("serial: 8250_uniphier: add UniPhier serial driver")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7be047e0

06 Dec, 2015 10 commits

Linux 4.4-rc4 · 527e9316
Linus Torvalds authored Dec 06, 2015

527e9316

staging/lustre: remove IOC_LIBCFS_PING_TEST ioctl · d035e336

James Simmons authored Dec 04, 2015

The ioctl IOC_LIBCFS_PING_TEST has not been used in ages.  The recent
nidstring changes which moved all the nidstring operations from libcfs
to the LNet layer but this ioctl code was still using an nidstring
operation that was causing a circular dependency loop between libcfs and
LNet.
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d035e336

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d8cd93ea

Linus Torvalds authored Dec 06, 2015

Pull vfs fixes from Al Viro:
 "A couple of fixes (-stable fodder) + dead code removal after the
  overlayfs fix.

  I agree that it's better to separate from the fix part to make
  backporting easier, but IMO it's not worth delaying said dead code
  removal until the next window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Don't reset ->total_link_count on nested calls of vfs_path_lookup()
  ovl: get rid of the dead code left from broken (and disabled) optimizations
  ovl: fix permission checking for setattr

d8cd93ea

Don't reset ->total_link_count on nested calls of vfs_path_lookup() · 2788cc47

Al Viro authored Dec 06, 2015

we already zero it on outermost set_nameidata(), so initialization in
path_init() is pointless and wrong.  The same DoS exists on pre-4.2
kernels, but there a slightly different fix will be needed.

Cc: stable@vger.kernel.org # v4.2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2788cc47

ovl: get rid of the dead code left from broken (and disabled) optimizations · 0f7ff2da
Al Viro authored Dec 06, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
0f7ff2da

ovl: fix permission checking for setattr · acff81ec

Miklos Szeredi authored Dec 04, 2015

[Al Viro] The bug is in being too enthusiastic about optimizing ->setattr()
away - instead of "copy verbatim with metadata" + "chmod/chown/utimes"
(with the former being always safe and the latter failing in case of
insufficient permissions) it tries to combine these two.  Note that copyup
itself will have to do ->setattr() anyway; _that_ is where the elevated
capabilities are right.  Having these two ->setattr() (one to set verbatim
copy of metadata, another to do what overlayfs ->setattr() had been asked
to do in the first place) combined is where it breaks.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

acff81ec

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fb7b26e4

Linus Torvalds authored Dec 06, 2015

Pull scheduler fixes from Thomas Gleixner:
 "This updates contains the following changes:

   - Fix a signal handling regression in the bit wait functions.

   - Avoid false positive warnings in the wakeup path.

   - Initialize the scheduler root domain properly.

   - Handle gtime calculations in proc/$PID/stat proper.

   - Add more documentation for the barriers in try_to_wake_up().

   - Fix a subtle race in try_to_wake_up() which might cause a task to
     be scheduled on two cpus

   - Compile static helper function only when it is used"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule()
  sched/core: Better document the try_to_wake_up() barriers
  sched/cputime: Fix invalid gtime in proc
  sched/core: Clear the root_domain cpumasks in init_rootdomain()
  sched/core: Remove false-positive warning from wake_up_process()
  sched/wait: Fix signal handling in bit wait helpers
  sched/rt: Hide the push_irq_work_func() declaration

fb7b26e4

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 69d2ca60

Linus Torvalds authored Dec 06, 2015

Pull x86 fixes from Thoma Gleixner:
 "Another round of fixes for x86:

   - Move the initialization of the microcode driver to late_initcall to
     make sure everything that init function needs is available.

   - Make sure that lockdep knows about interrupts being off in the
     entry code before calling into c-code.

   - Undo the cpu hotplug init delay regression.

   - Use the proper conditionals in the mpx instruction decoder.

   - Fixup restart_syscall for x32 tasks.

   - Fix the hugepage regression on PAE kernels which was introduced
     with the latest PAT changes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/signal: Fix restart_syscall number for x32 tasks
  x86/mpx: Fix instruction decoder condition
  x86/mm: Fix regression with huge pages on PAE
  x86 smpboot: Re-enable init_udelay=0 by default on modern CPUs
  x86/entry/64: Fix irqflag tracing wrt context tracking
  x86/microcode: Initialize the driver late when facilities are up

69d2ca60

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 19190f5e

Linus Torvalds authored Dec 06, 2015

Pull SCSI fixes from James Bottomley:
 "This is quite a bumper crop of fixes: three from Arnd correcting
  various build issues in some configurations, a lock recursion in
  qla2xxx.  Two potentially exploitable issues in hpsa and mvsas, a
  potential null deref in st, a revert of a bdi registration fix that
  turned out to cause even more problems, a set of fixes to allow people
  who only defined MPT2SAS to still work after the mpt2/mpt3sas merger
  and a couple of fixes for issues turned up by the hyper-v storvsc
  driver"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  mpt3sas: fix Kconfig dependency problem for mpt2sas back compatibility
  Revert "scsi: Fix a bdi reregistration race"
  mpt3sas: Add dummy Kconfig option for backwards compatibility
  Fix a memory leak in scsi_host_dev_release()
  block/sd: Fix device-imposed transfer length limits
  scsi_debug: fix prevent_allow+verify regressions
  MAINTAINERS: Add myself as co-maintainer of the SCSI subsystem.
  sd: Make discard granularity match logical block size when LBPRZ=1
  scsi: hpsa: select CONFIG_SCSI_SAS_ATTR
  scsi: advansys needs ISA dma api for ISA support
  scsi_sysfs: protect against double execution of __scsi_remove_device()
  st: fix potential null pointer dereference.
  scsi: report 'INQUIRY result too short' once per host
  advansys: fix big-endian builds
  qla2xxx: Fix rwlock recursion
  hpsa: logical vs bitwise AND typo
  mvsas: don't allow negative timeouts
  mpt3sas: Fix use sas_is_tlr_enabled API before enabling MPI2_SCSIIO_CONTROL_TLR_ON flag

19190f5e

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · a2dbb7b5

Linus Torvalds authored Dec 05, 2015

Pull drm fixes from Dave Airlie:
 "A bunch of change across the board, the main things are some vblank
  fallout in radeon and nouveau required some work, but I think this
  should fix it all.  There is also one drm fix for an oops in vmwgfx
  with how we pass the drm master around.

  The rest is just some amdgpu, i915, imx and rockchip fixes.

  Probably more than I'd like at this point, but hopefully things settle
  down now"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (40 commits)
  drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)
  drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)
  drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
  drm/amdgpu: add spin lock to protect freed list in vm (v2)
  drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2
  drm/amdgpu: take a BO reference for the user fence
  drm/amdgpu: take a BO reference in the display code
  drm/amdgpu: set snooped flags only on system addresses v2
  drm/nouveau: Fix pre-nv50 pageflip events (v4)
  drm: Fix an unwanted master inheritance v2
  drm/amdgpu: fix race condition in amd_sched_entity_push_job
  drm/amdgpu: add err check for pin userptr
  drm/i915: take a power domain reference while checking the HDMI live status
  drm/i915: add MISSING_CASE to a few port/aux power domain helpers
  drm/i915/ddi: fix intel_display_port_aux_power_domain() after HDMI detect
  drm/i915: Introduce a gmbus power domain
  drm/i915: Clean up AUX power domain handling
  drm/rockchip: Use CRTC vblank event interface
  drm/rockchip: Fix module autoload for OF platform driver
  drm/rockchip: vop: fix window origin calculation
  ...

a2dbb7b5

05 Dec, 2015 4 commits

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 9cfe5212

Linus Torvalds authored Dec 05, 2015

Pull crypto fixes from Herbert Xu:
 "This fixes a couple of crypto drivers that were using memcmp to verify
  authentication tags.  They now use crypto_memneq instead"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: talitos - Fix timing leak in ESP ICV verification
  crypto: nx - Fix timing leak in GCM and CCM decryption

9cfe5212

x86/signal: Fix restart_syscall number for x32 tasks · 22eab110

Dmitry V. Levin authored Dec 01, 2015

When restarting a syscall with regs->ax == -ERESTART_RESTARTBLOCK,
regs->ax is assigned to a restart_syscall number. For x32 tasks, this
syscall number must have __X32_SYSCALL_BIT set, otherwise it will be
an x86_64 syscall number instead of a valid x32 syscall number. This
issue has been there since the introduction of x32.

Reported-by: strace/tests/restart_syscall.test
Reported-and-tested-by: Elvira Khabirova <lineprinter0@gmail.com>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Cc: Elvira Khabirova <lineprinter0@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20151130215436.GA25996@altlinux.orgSigned-off-by: Thomas Gleixner <tglx@linutronix.de>

22eab110

x86/mpx: Fix instruction decoder condition · 8e8efe03

Dave Hansen authored Nov 30, 2015

MPX decodes instructions in order to tell which bounds register
was violated.  Part of this decoding involves looking at the "REX
prefix" which is a special instrucion prefix used to retrofit
support for new registers in to old instructions.

The X86_REX_*() macros are defined to return actual bit values:

	#define X86_REX_R(rex) ((rex) & 4)

*not* boolean values.  However, the MPX code was checking for
them like they were booleans.  This might have led to us
mis-decoding the "REX prefix" and giving false information out to
userspace about bounds violations.  X86_REX_B() actually is bit 1,
so this is really only broken for the X86_REX_X() case.

Fix the conditionals up to tolerate the non-boolean values.

Fixes: fcc7ffd6 "x86, mpx: Decode MPX instruction to get bound violation information"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20151201003113.D800C1E0@viggo.jf.intel.comSigned-off-by: Thomas Gleixner <tglx@linutronix.de>

8e8efe03

Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-next · df4d4aa9

Dave Airlie authored Dec 05, 2015

A few more last minute fixes for 4.4 on top of my pull request from
earlier this week.  The big change here is a vblank regression fix due to
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks
were missed".  Beyond that, a hotplug fix and a few VM fixes.

* 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)
  drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)
  drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
  drm/amdgpu: add spin lock to protect freed list in vm (v2)
  drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2
  drm/amdgpu: take a BO reference for the user fence
  drm/amdgpu: take a BO reference in the display code
  drm/amdgpu: set snooped flags only on system addresses v2
  drm/amdgpu: fix race condition in amd_sched_entity_push_job
  drm/amdgpu: add err check for pin userptr
  add blacklist for thinkpad T40p
  drm/amdgpu: fix VM page table reference counting
  drm/amdgpu: fix userptr flags check

df4d4aa9

04 Dec, 2015 24 commits

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 849ee3d4

Linus Torvalds authored Dec 04, 2015

Pull Ceph fix from Sage Weil:
 "This addresses a refcounting bug that leads to a use-after-free"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: don't put snap_context twice in rbd_queue_workfn()

849ee3d4

drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3) · 8e36f9d3

Alex Deucher authored Dec 03, 2015

commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many
vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core
more fragile to drivers which don't update hw vblank counters and
vblank timestamps in sync with firing of the vblank irq and
essentially at leading edge of vblank.

This exposed a problem with radeon-kms/amdgpu-kms which do not
satisfy above requirements:

The vblank irq fires a few scanlines before start of vblank, but
programmed pageflips complete at start of vblank and
vblank timestamps update at start of vblank, whereas the
hw vblank counter increments only later, at start of vsync.

This leads to problems like off by one errors for vblank counter
updates, vblank counters apparently going backwards or vblank
timestamps apparently having time going backwards. The net result
is stuttering of graphics in games, or little hangs, as well as
total failure of timing sensitive applications.

See bug #93147 for an example of the regression on Linux 4.4-rc:

https://bugs.freedesktop.org/show_bug.cgi?id=93147

This patch tries to align all above events better from the
viewpoint of the drm core / of external callers to fix the problem:

1. The apparent start of vblank is shifted a few scanlines earlier,
so the vblank irq now always happens after start of this extended
vblank interval and thereby drm_update_vblank_count() always samples
the updated vblank count and timestamp of the new vblank interval.

To achieve this, the reporting of scanout positions by
radeon_get_crtc_scanoutpos() now operates as if the vblank starts
radeon_crtc->lb_vblank_lead_lines before the real start of the hw
vblank interval. This means that the vblank timestamps which are based
on these scanout positions will now update at this earlier start of
vblank.

2. The driver->get_vblank_counter() function will bump the returned
vblank count as read from the hw by +1 if the query happens after
the shifted earlier start of the vblank, but before the real hw increment
at start of vsync, so the counter appears to increment at start of vblank
in sync with the timestamp update.

3. Calls from vblank irq-context and regular non-irq calls are now
treated identical, always simulating the shifted vblank start, to
avoid inconsistent results for queries happening from vblank irq vs.
happening from drm_vblank_enable() or vblank_disable_fn().

4. The radeon_flip_work_func will delay mmio programming a pageflip until
the start of the real vblank iff it happens to execute inside the shifted
earlier start of the vblank, so pageflips now also appear to execute at
start of the shifted vblank, in sync with vblank counter and timestamp
updates. This to avoid some races between updates of vblank count and
timestamps that are used for swap scheduling and pageflip execution which
could cause pageflips to execute before the scheduled target vblank.

The lb_vblank_lead_lines "fudge" value is calculated as the size of
the display controllers line buffer in scanlines for the given video
mode: Vblank irq's are triggered by the line buffer logic when the line
buffer refill for a video frame ends, ie. when the line buffer source read
position enters the hw vblank. This means that a vblank irq could fire at
most as many scanlines before the current reported scanout position of the
crtc timing generator as the number of scanlines the line buffer can
maximally hold for a given video mode.

This patch has been successfully tested on a RV730 card with DCE-3 display
engine and on a evergreen card with DCE-4 display engine, in single-display
and dual-display configuration, with different video modes.

A similar patch is needed for amdgpu-kms to fix the same problem.

Limitations:

- Maybe replace the udelay() in the flip_work_func() by a suitable
  usleep_range() for a bit better efficiency? Will try that.

- Line buffer sizes in pixels are hard-coded on < DCE-4 to a value
  i just guessed to be high enough to work ok, lacking info on the true
  sizes atm.

Probably fixes: fdo#93147

Port of Mario's radeon fix to amdgpu.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(v1) Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>

(v2) Refine amdgpu_flip_work_func() for better efficiency.

     In amdgpu_flip_work_func, replace the busy waiting udelay(5)
     with event lock held by a more performance and energy efficient
     usleep_range() until at least predicted true start of hw vblank,
     with some slack for scheduler happiness. Release the event lock
     during waits to not delay other outputs in doing their stuff, as
     the waiting can last up to 200 usecs in some cases.

     Also small fix to code comment and formatting in that function.

(v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>

(v3) Fix crash in crtc disabled case

8e36f9d3

Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · fb39cbda

Linus Torvalds authored Dec 04, 2015

Pull libnvdimm fixes from Dan Williams:

 - NFIT parsing regression fixes from Linda.  The nvdimm hot-add
   implementation merged in 4.4-rc1 interpreted the specification in a
   way that breaks actual HPE platforms.  We are also closing the loop
   with the ACPI Working Group to get this clarification added to the
   spec.

 - Andy pointed out that his laptop without nvdimm resources is loading
   the e820-nvdimm module by default, fix that up to only load the
   module when an e820-type-12 range is present.

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nfit: Adjust for different _FIT and NFIT headers
  nfit: Fix the check for a successful NFIT merge
  nfit: Account for table size length variation
  libnvdimm, e820: skip module loading when no type-12

fb39cbda

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · db281766

Linus Torvalds authored Dec 04, 2015

Pull ARM KVM fixes from Paolo Bonzini:

 - a series of fixes to deal with the aliasing between the sp and xzr
   register

 - a fix for the cache flush fix that went in -rc3

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  ARM/arm64: KVM: correct PTE uncachedness check
  arm64: KVM: Get rid of old vcpu_reg()
  arm64: KVM: Correctly handle zero register in system register accesses
  arm64: KVM: Remove const from struct sys_reg_params
  arm64: KVM: Correctly handle zero register during MMIO

db281766

drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2) · 5b5561b3

Mario Kleiner authored Nov 25, 2015

commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many
vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core
more fragile to drivers which don't update hw vblank counters and
vblank timestamps in sync with firing of the vblank irq and
essentially at leading edge of vblank.

This exposed a problem with radeon-kms/amdgpu-kms which do not
satisfy above requirements:

The vblank irq fires a few scanlines before start of vblank, but
programmed pageflips complete at start of vblank and
vblank timestamps update at start of vblank, whereas the
hw vblank counter increments only later, at start of vsync.

This leads to problems like off by one errors for vblank counter
updates, vblank counters apparently going backwards or vblank
timestamps apparently having time going backwards. The net result
is stuttering of graphics in games, or little hangs, as well as
total failure of timing sensitive applications.

See bug #93147 for an example of the regression on Linux 4.4-rc:

https://bugs.freedesktop.org/show_bug.cgi?id=93147

This patch tries to align all above events better from the
viewpoint of the drm core / of external callers to fix the problem:

1. The apparent start of vblank is shifted a few scanlines earlier,
so the vblank irq now always happens after start of this extended
vblank interval and thereby drm_update_vblank_count() always samples
the updated vblank count and timestamp of the new vblank interval.

To achieve this, the reporting of scanout positions by
radeon_get_crtc_scanoutpos() now operates as if the vblank starts
radeon_crtc->lb_vblank_lead_lines before the real start of the hw
vblank interval. This means that the vblank timestamps which are based
on these scanout positions will now update at this earlier start of
vblank.

2. The driver->get_vblank_counter() function will bump the returned
vblank count as read from the hw by +1 if the query happens after
the shifted earlier start of the vblank, but before the real hw increment
at start of vsync, so the counter appears to increment at start of vblank
in sync with the timestamp update.

3. Calls from vblank irq-context and regular non-irq calls are now
treated identical, always simulating the shifted vblank start, to
avoid inconsistent results for queries happening from vblank irq vs.
happening from drm_vblank_enable() or vblank_disable_fn().

4. The radeon_flip_work_func will delay mmio programming a pageflip until
the start of the real vblank iff it happens to execute inside the shifted
earlier start of the vblank, so pageflips now also appear to execute at
start of the shifted vblank, in sync with vblank counter and timestamp
updates. This to avoid some races between updates of vblank count and
timestamps that are used for swap scheduling and pageflip execution which
could cause pageflips to execute before the scheduled target vblank.

The lb_vblank_lead_lines "fudge" value is calculated as the size of
the display controllers line buffer in scanlines for the given video
mode: Vblank irq's are triggered by the line buffer logic when the line
buffer refill for a video frame ends, ie. when the line buffer source read
position enters the hw vblank. This means that a vblank irq could fire at
most as many scanlines before the current reported scanout position of the
crtc timing generator as the number of scanlines the line buffer can
maximally hold for a given video mode.

This patch has been successfully tested on a RV730 card with DCE-3 display
engine and on a evergreen card with DCE-4 display engine, in single-display
and dual-display configuration, with different video modes.

A similar patch is needed for amdgpu-kms to fix the same problem.

Limitations:

- Line buffer sizes in pixels are hard-coded on < DCE-4 to a value
  i just guessed to be high enough to work ok, lacking info on the true
  sizes atm.

Fixes: fdo#93147
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

(v1) Tested-by: Dave Witbrodt <dawitbro@sbcglobal.net>

(v2) Refine radeon_flip_work_func() for better efficiency:

     In radeon_flip_work_func, replace the busy waiting udelay(5)
     with event lock held by a more performance and energy efficient
     usleep_range() until at least predicted true start of hw vblank,
     with some slack for scheduler happiness. Release the event lock
     during waits to not delay other outputs in doing their stuff, as
     the waiting can last up to 200 usecs in some cases.

     Retested on DCE-3 and DCE-4 to verify it still works nicely.

(v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5b5561b3

drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt · cb5d4166

Lyude authored Dec 03, 2015

HPD signals on DVI ports can be fired off before the pins required for
DDC probing actually make contact, due to the pins for HPD making
contact first. This results in a HPD signal being asserted but DDC
probing failing, resulting in hotplugging occasionally failing.

This is somewhat rare on most cards (depending on what angle you plug
the DVI connector in), but on some cards it happens constantly. The
Radeon R5 on the machine used for testing this patch for instance, runs
into this issue just about every time I try to hotplug a DVI monitor and
as a result hotplugging almost never works.

Rescheduling the hotplug work for a second when we run into an HPD
signal with a failing DDC probe usually gives enough time for the rest
of the connector's pins to make contact, and fixes this issue.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cb5d4166

drm/amdgpu: add spin lock to protect freed list in vm (v2) · 81d75a30

jimqu authored Dec 04, 2015

there is a protection fault about freed list when OCL test.
add a spin lock to protect it.

v2: drop changes in vm_fini
Signed-off-by: JimQu <jim.qu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

81d75a30

drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2 · 9c97b5ab

Christian König authored Dec 03, 2015

The gtt_end is already inclusive, we don't need to subtract one here.

v2 (chk): keep the fix for the VM code, cause here it really applies.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9c97b5ab

drm/amdgpu: take a BO reference for the user fence · f3f17692

Christian König authored Dec 03, 2015

No need for a GEM reference here.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f3f17692

drm/amdgpu: take a BO reference in the display code · e9d951a8

Christian König authored Dec 03, 2015

No need for the GEM reference here.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e9d951a8

Merge tag 'kvm-arm-for-v4.4-rc4' of... · 09922076

Paolo Bonzini authored Dec 04, 2015

Merge tag 'kvm-arm-for-v4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM fixes for v4.4-rc4

- A series of fixes to deal with the aliasing between the sp and xzr register
- A fix for the cache flush fix that went in -rc3

09922076

drm/amdgpu: set snooped flags only on system addresses v2 · 6d99905a

Christian König authored Dec 04, 2015

Not necessary for VRAM.

v2: no need to check if ttm is NULL.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6d99905a

Merge tag 'sound-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 8cdef969

Linus Torvalds authored Dec 04, 2015

Pull sound fixes from Takashi Iwai:
 "This time we've got a larger number of updates, mainly from ASoC
  world.  The only significant LOCs found here are for Realtek codecs,
  where most of changes are quite systematic replacements.

  There are also a few fixes in ASoC core side: one is the PM call order
  fix to ensure the DPAM resume working properly.  Another is the proper
  cleanup call after freeing DAPM widgets, and the correction of the
  wrong callback set in topology API.

  The rest are a wide range of driver-specific small fixes, including
  HD-audio"

* tag 'sound-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
  ALSA: hda - Add Conexant CX8200 (14f1:2008) codec entry
  ALSA: hda - Correct codec names for 14f1:50f1 and 14f1:50f3
  ALSA: hda - Skip ELD notification during system suspend
  ASoC: core: Change power state before rechecking endpoint
  ASoC: fix kernel-doc warnings in sound/soc/soc-ops.c
  ASoC: rt5645: Add dmi_system_id "Google Terra"
  ASoC: rockchip: Fix incorrect VDW value for 24 bit
  ASoC: fsl: clarify ac97 dependency
  ASoC: Intel: Skylake: fix memory leak
  ASoC: davinci-mcasp: Fix master capture only mode
  ASoC: es8328: Fix shifts for mixer switches
  ASoC: rt5645: Add dmi_system_id "Google Wizpig"
  ASoC: sti: set player private data
  ASoC: sti: rename ST proprietary DT properties
  ASoC: sti: remove wrong error message
  ASoC: Intel: Skylake: Add I2C depends for SKL machine
  ASoC: topology: fix info callback for TLV byte control
  ASoC: rt5670: fix wrong bit def for pll src
  ASoC: nau8825: add pm function
  ASoC: rt5645: Add struct dmi_system_id "Google Edgar" for Chrome OS
  ...

8cdef969

Merge tag 'pm+acpi-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · b1007e73

Linus Torvalds authored Dec 04, 2015

Pull power management and ACPI fixes from Rafael Wysocki:
 "These fix a recent regression in the ACPI PCI host bridge
  initialization code, clean up some recent changes (generic power
  domains framework, ACPI AML debugger support), fix three older but
  annoying bugs (PCI power management.  generic power domains framework,
  cpufreq) and a build problem (device properties framework), and update
  a stale MAINTAINERS entry (ACPI backlight driver).

  Specifics:

   - Fix a regression in the ACPI PCI host bridge initialization code
     introduced by the recent consolidation of the host bridge handling
     on x86 and ia64 that forgot to take one special piece of code
     related to NUMA on x86 into account (Liu Jiang).

   - Improve the Kconfig help description of the new ACPI AML debugger
     support option to avoid possible confusion (Peter Zijlstra).

   - Remove a piece of code in the generic power domains framework that
     should have been removed by one of the recent commits modifying
     that code (Ulf Hansson).

   - Reduce the log level of a PCI PM message that generates a lot of
     false-positive log noise for some drivers and improve the message
     itself while at it (Imre Deak).

   - Fix the OF-based domain lookup code in the generic power domains
     framework to make it drop references to DT nodes correctly (Eric
     Anholt).

   - Prevent the cpufreq core from setting the policy back to the
     default after a CPU offline/online cycle for cpufreq drivers
     providing the ->setpolicy callback (Srinivas Pandruvada).

   - Fix a build problem for CONFIG_ACPI unset in the device properties
     framework (Hanjun Guo).

   - Fix a stale file path in the ACPI backlight driver entry in
     MAINTAINERS (Dan Carpenter)"

* tag 'pm+acpi-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / Domains: Fix bad of_node_put() in failure paths of genpd_dev_pm_attach()
  cpufreq: use last policy after online for drivers with ->setpolicy
  PCI / PM: Tune down retryable runtime suspend error messages
  PM / Domains: Validate cases of a non-bound driver in genpd governor
  MAINTAINERS: ACPI / video: update a file name in drivers/acpi/
  ACPI / property: fix compile error for acpi_node_get_property_reference() when CONFIG_ACPI=n
  x86/PCI/ACPI: Fix regression caused by commit 4d6b4e69
  ACPI: Better describe ACPI_DEBUGGER

b1007e73

ARM/arm64: KVM: correct PTE uncachedness check · 0de58f85

Ard Biesheuvel authored Dec 03, 2015

Commit e6fab544 ("ARM/arm64: KVM: test properly for a PTE's
uncachedness") modified the logic to test whether a HYP or stage-2
mapping needs flushing, from [incorrectly] interpreting the page table
attributes to [incorrectly] checking whether the PFN that backs the
mapping is covered by host system RAM. The PFN number is part of the
output of the translation, not the input, so we have to use pte_pfn()
on the contents of the PTE, not __phys_to_pfn() on the HYP virtual
address or stage-2 intermediate physical address.

Fixes: e6fab544 ("ARM/arm64: KVM: test properly for a PTE's uncachedness")
Cc: stable@vger.kernel.org
Tested-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

0de58f85

arm64: KVM: Get rid of old vcpu_reg() · f6be563a

Pavel Fedin authored Dec 04, 2015

Using oldstyle vcpu_reg() accessor is proven to be inappropriate and
unsafe on ARM64. This patch converts the rest of use cases to new
accessors and completely removes vcpu_reg() on ARM64.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

f6be563a

arm64: KVM: Correctly handle zero register in system register accesses · 2ec5be3d

Pavel Fedin authored Dec 04, 2015

System register accesses also use zero register for Rt == 31, and
therefore using it will also result in getting SP value instead. This
patch makes them also using new accessors, introduced by the previous
patch. Since register value is no longer directly associated with storage
inside vCPU context structure, we introduce a dedicated storage for it in
struct sys_reg_params.

This refactor also gets rid of "massive hack" in kvm_handle_cp_64().
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

2ec5be3d

arm64: KVM: Remove const from struct sys_reg_params · 3fec037d

Pavel Fedin authored Dec 04, 2015

Further rework is going to introduce a dedicated storage for transfer
register value in struct sys_reg_params. Before doing this we have to
remove 'const' modifiers from it in all accessor functions and their
callers.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

3fec037d

arm64: KVM: Correctly handle zero register during MMIO · bc45a516

Pavel Fedin authored Dec 04, 2015

On ARM64 register index of 31 corresponds to both zero register and SP.
However, all memory access instructions, use ZR as transfer register. SP
is used only as a base register in indirect memory addressing, or by
register-register arithmetics, which cannot be trapped here.

Correct emulation is achieved by introducing new register accessor
functions, which can do special handling for reg_num == 31. These new
accessors intentionally do not rely on old vcpu_reg() on ARM64, because
it is to be removed. Since the affected code is shared by both ARM
flavours, implementations of these accessors are also added to ARM32 code.

This patch fixes setting MMIO register to a random value (actually SP)
instead of zero by something like:

 *((volatile int *)reg) = 0;

compilers tend to generate "str wzr, [xx]" here

[Marc: Fixed 32bit splat]
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

bc45a516

rbd: don't put snap_context twice in rbd_queue_workfn() · 70b16db8

Ilya Dryomov authored Nov 27, 2015

Commit 4e752f0a ("rbd: access snapshot context and mapping size
safely") moved ceph_get_snap_context() out of rbd_img_request_create()
and into rbd_queue_workfn(), adding a ceph_put_snap_context() to the
error path in rbd_queue_workfn().  However, rbd_img_request_create()
consumes a ref on snapc, so calling ceph_put_snap_context() after
a successful rbd_img_request_create() leads to an extra put.  Fix it.

Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>

70b16db8

Merge branches 'pm-domains' and 'pm-cpufreq' · d441fe25

Rafael J. Wysocki authored Dec 04, 2015

* pm-domains:
  PM / Domains: Fix bad of_node_put() in failure paths of genpd_dev_pm_attach()
  PM / Domains: Validate cases of a non-bound driver in genpd governor

* pm-cpufreq:
  cpufreq: use last policy after online for drivers with ->setpolicy

d441fe25

Merge branches 'acpica', 'acpi-video' and 'device-properties' · 3e5050e6

Rafael J. Wysocki authored Dec 04, 2015

* acpica:
  ACPI: Better describe ACPI_DEBUGGER

* acpi-video:
  MAINTAINERS: ACPI / video: update a file name in drivers/acpi/

* device-properties:
  ACPI / property: fix compile error for acpi_node_get_property_reference() when CONFIG_ACPI=n

3e5050e6

Merge branches 'acpi-pci' and 'pm-pci' · c09c9dd2

Rafael J. Wysocki authored Dec 04, 2015

* acpi-pci:
  x86/PCI/ACPI: Fix regression caused by commit 4d6b4e69

* pm-pci:
  PCI / PM: Tune down retryable runtime suspend error messages

c09c9dd2

sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule() · ecf7d01c

Peter Zijlstra authored Oct 07, 2015

Oleg noticed that its possible to falsely observe p->on_cpu == 0 such
that we'll prematurely continue with the wakeup and effectively run p on
two CPUs at the same time.

Even though the overlap is very limited; the task is in the middle of
being scheduled out; it could still result in corruption of the
scheduler data structures.

        CPU0                            CPU1

        set_current_state(...)

        <preempt_schedule>
          context_switch(X, Y)
            prepare_lock_switch(Y)
              Y->on_cpu = 1;
            finish_lock_switch(X)
              store_release(X->on_cpu, 0);

                                        try_to_wake_up(X)
                                          LOCK(p->pi_lock);

                                          t = X->on_cpu; // 0

          context_switch(Y, X)
            prepare_lock_switch(X)
              X->on_cpu = 1;
            finish_lock_switch(Y)
              store_release(Y->on_cpu, 0);
        </preempt_schedule>

        schedule();
          deactivate_task(X);
          X->on_rq = 0;

                                          if (X->on_rq) // false

                                          if (t) while (X->on_cpu)
                                            cpu_relax();

          context_switch(X, ..)
            finish_lock_switch(X)
              store_release(X->on_cpu, 0);

Avoid the load of X->on_cpu being hoisted over the X->on_rq load.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

ecf7d01c