1. 30 May, 2018 17 commits
    • Davidlohr Bueso's avatar
      ipc/shm: fix shmat() nil address after round-down when remapping · 9c798bc1
      Davidlohr Bueso authored
      commit 8f89c007 upstream.
      
      shmat()'s SHM_REMAP option forbids passing a nil address for; this is in
      fact the very first thing we check for.  Andrea reported that for
      SHM_RND|SHM_REMAP cases we can end up bypassing the initial addr check,
      but we need to check again if the address was rounded down to nil.  As
      of this patch, such cases will return -EINVAL.
      
      Link: http://lkml.kernel.org/r/20180503204934.kk63josdu6u53fbd@linux-n805Signed-off-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Reported-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Cc: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c798bc1
    • Davidlohr Bueso's avatar
      Revert "ipc/shm: Fix shmat mmap nil-page protection" · 2ef44a3c
      Davidlohr Bueso authored
      commit a73ab244 upstream.
      
      Patch series "ipc/shm: shmat() fixes around nil-page".
      
      These patches fix two issues reported[1] a while back by Joe and Andrea
      around how shmat(2) behaves with nil-page.
      
      The first reverts a commit that it was incorrectly thought that mapping
      nil-page (address=0) was a no no with MAP_FIXED.  This is not the case,
      with the exception of SHM_REMAP; which is address in the second patch.
      
      I chose two patches because it is easier to backport and it explicitly
      reverts bogus behaviour.  Both patches ought to be in -stable and ltp
      testcases need updated (the added testcase around the cve can be
      modified to just test for SHM_RND|SHM_REMAP).
      
      [1] lkml.kernel.org/r/20180430172152.nfa564pvgpk3ut7p@linux-n805
      
      This patch (of 2):
      
      Commit 95e91b83 ("ipc/shm: Fix shmat mmap nil-page protection")
      worked on the idea that we should not be mapping as root addr=0 and
      MAP_FIXED.  However, it was reported that this scenario is in fact
      valid, thus making the patch both bogus and breaks userspace as well.
      
      For example X11's libint10.so relies on shmat(1, SHM_RND) for lowmem
      initialization[1].
      
      [1] https://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/os-support/linux/int10/linux.c#n347
      Link: http://lkml.kernel.org/r/20180503203243.15045-2-dave@stgolabs.net
      Fixes: 95e91b83 ("ipc/shm: Fix shmat mmap nil-page protection")
      Signed-off-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Reported-by: default avatarJoe Lawrence <joe.lawrence@redhat.com>
      Reported-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ef44a3c
    • Michael J. Ruhl's avatar
      IB/hfi1: Use after free race condition in send context error path · 36017b0c
      Michael J. Ruhl authored
      commit f9e76ca3 upstream.
      
      A pio send egress error can occur when the PSM library attempts to
      to send a bad packet.  That issue is still being investigated.
      
      The pio error interrupt handler then attempts to progress the recovery
      of the errored pio send context.
      
      Code inspection reveals that the handling lacks the necessary locking
      if that recovery interleaves with a PSM close of the "context" object
      contains the pio send context.
      
      The lack of the locking can cause the recovery to access the already
      freed pio send context object and incorrectly deduce that the pio
      send context is actually a kernel pio send context as shown by the
      NULL deref stack below:
      
      [<ffffffff8143d78c>] _dev_info+0x6c/0x90
      [<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
      [<ffffffff816ab124>] ? __schedule+0x424/0x9b0
      [<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
      [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
      [<ffffffff810ab086>] worker_thread+0x126/0x3c0
      [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
      [<ffffffff810b252f>] kthread+0xcf/0xe0
      [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
      [<ffffffff816b8798>] ret_from_fork+0x58/0x90
      [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
      
      This is the best case scenario and other scenarios can corrupt the
      already freed memory.
      
      Fix by adding the necessary locking in the pio send context error
      handler.
      
      Cc: <stable@vger.kernel.org> # 4.9.x
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36017b0c
    • Thomas Hellstrom's avatar
      drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN|OUT] macros · 50af4036
      Thomas Hellstrom authored
      commit 938ae725 upstream.
      
      Depending on whether the kernel is compiled with frame-pointer or not,
      the temporary memory location used for the bp parameter in these macros
      is referenced relative to the stack pointer or the frame pointer.
      Hence we can never reference that parameter when we've modified either
      the stack pointer or the frame pointer, because then the compiler would
      generate an incorrect stack reference.
      
      Fix this by pushing the temporary memory parameter on a known location on
      the stack before modifying the stack- and frame pointers.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: default avatarBrian Paul <brianp@vmware.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50af4036
    • Joe Jin's avatar
      xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent · 3246d2e5
      Joe Jin authored
      commit 4855c92d upstream.
      
      When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
      but Dom Heap is increased by the same size. Tracing raidconfig we found
      that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
      to apply memory. If the memory allocated by Dom0 is not in the DMA area,
      it will exchange memory with Xen to meet the requiment. Later drivers
      call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
      the check condition (dev_addr + size - 1 <= dma_mask) is always false,
      it prevents calling xen_destroy_contiguous_region() to return the memory
      to the Xen DMA heap.
      
      This issue introduced by commit 6810df88 "xen-swiotlb: When doing
      coherent alloc/dealloc check before swizzling the MFNs.".
      Signed-off-by: default avatarJoe Jin <joe.jin@oracle.com>
      Tested-by: default avatarJohn Sobecki <john.sobecki@oracle.com>
      Reviewed-by: default avatarRzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3246d2e5
    • Sudip Mukherjee's avatar
      libata: blacklist Micron 500IT SSD with MU01 firmware · 1b9c86c9
      Sudip Mukherjee authored
      commit 136d769e upstream.
      
      While whitelisting Micron M500DC drives, the tweaked blacklist entry
      enabled queued TRIM from M500IT variants also. But these do not support
      queued TRIM. And while using those SSDs with the latest kernel we have
      seen errors and even the partition table getting corrupted.
      
      Some part from the dmesg:
      [    6.727384] ata1.00: ATA-9: Micron_M500IT_MTFDDAK060MBD, MU01, max UDMA/133
      [    6.727390] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
      [    6.741026] ata1.00: supports DRM functions and may not be fully accessible
      [    6.759887] ata1.00: configured for UDMA/133
      [    6.762256] scsi 0:0:0:0: Direct-Access     ATA      Micron_M500IT_MT MU01 PQ: 0 ANSI: 5
      
      and then for the error:
      [  120.860334] ata1.00: exception Emask 0x1 SAct 0x7ffc0007 SErr 0x0 action 0x6 frozen
      [  120.860338] ata1.00: irq_stat 0x40000008
      [  120.860342] ata1.00: failed command: SEND FPDMA QUEUED
      [  120.860351] ata1.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out
               res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout)
      [  120.860353] ata1.00: status: { DRDY }
      [  120.860543] ata1: hard resetting link
      [  121.166128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
      [  121.166376] ata1.00: supports DRM functions and may not be fully accessible
      [  121.186238] ata1.00: supports DRM functions and may not be fully accessible
      [  121.204445] ata1.00: configured for UDMA/133
      [  121.204454] ata1.00: device reported invalid CHS sector 0
      [  121.204541] sd 0:0:0:0: [sda] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
      [  121.204546] sd 0:0:0:0: [sda] tag#18 Sense Key : 0x5 [current]
      [  121.204550] sd 0:0:0:0: [sda] tag#18 ASC=0x21 ASCQ=0x4
      [  121.204555] sd 0:0:0:0: [sda] tag#18 CDB: opcode=0x93 93 08 00 00 00 00 00 04 28 80 00 00 00 30 00 00
      [  121.204559] print_req_error: I/O error, dev sda, sector 272512
      
      After few reboots with these errors, and the SSD is corrupted.
      After blacklisting it, the errors are not seen and the SSD does not get
      corrupted any more.
      
      Fixes: 243918be ("libata: Do not blacklist Micron M500DC")
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1b9c86c9
    • Tejun Heo's avatar
      libata: Blacklist some Sandisk SSDs for NCQ · 31eeaaf5
      Tejun Heo authored
      commit 322579dc upstream.
      
      Sandisk SSDs SD7SN6S256G and SD8SN8U256G are regularly locking up
      regularly under sustained moderate load with NCQ enabled.  Blacklist
      for now.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      31eeaaf5
    • Corneliu Doban's avatar
      mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register · 352f4375
      Corneliu Doban authored
      commit 5f651b87 upstream.
      
      When the host controller accepts only 32bit writes, the value of the
      16bit TRANSFER_MODE register, that has the same 32bit address as the
      16bit COMMAND register, needs to be saved and it will be written
      in a 32bit write together with the command as this will trigger the
      host to send the command on the SD interface.
      When sending the tuning command, TRANSFER_MODE is written and then
      sdhci_set_transfer_mode reads it back to clear AUTO_CMD12 bit and
      write it again resulting in wrong value to be written because the
      initial write value was saved in a shadow and the read-back returned
      a wrong value, from the register.
      Fix sdhci_iproc_readw to return the saved value of TRANSFER_MODE
      when a saved value exist.
      Same fix for read of BLOCK_SIZE and BLOCK_COUNT registers, that are
      saved for a different reason, although a scenario that will cause the
      mentioned problem on this registers is not probable.
      
      Fixes: b580c52d ("mmc: sdhci-iproc: add IPROC SDHCI driver")
      Signed-off-by: default avatarCorneliu Doban <corneliu.doban@broadcom.com>
      Signed-off-by: default avatarScott Branden <scott.branden@broadcom.com>
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      352f4375
    • Srinath Mannam's avatar
      mmc: sdhci-iproc: remove hard coded mmc cap 1.8v · 8d33d468
      Srinath Mannam authored
      commit 4c94238f upstream.
      
      Remove hard coded mmc cap 1.8v from platform data as it is board specific.
      The 1.8v DDR mmc caps can be enabled using DTS property for those
      boards that support it.
      
      Fixes: b17b4ab8 ("mmc: sdhci-iproc: define MMC caps in platform data")
      Signed-off-by: default avatarSrinath Mannam <srinath.mannam@broadcom.com>
      Signed-off-by: default avatarScott Branden <scott.branden@broadcom.com>
      Reviewed-by: default avatarRay Jui <ray.jui@broadcom.com>
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d33d468
    • Al Viro's avatar
      do d_instantiate/unlock_new_inode combinations safely · 2d2d3f1e
      Al Viro authored
      commit 1e2e547a upstream.
      
      For anything NFS-exported we do _not_ want to unlock new inode
      before it has grown an alias; original set of fixes got the
      ordering right, but missed the nasty complication in case of
      lockdep being enabled - unlock_new_inode() does
      	lockdep_annotate_inode_mutex_key(inode)
      which can only be done before anyone gets a chance to touch
      ->i_mutex.  Unfortunately, flipping the order and doing
      unlock_new_inode() before d_instantiate() opens a window when
      mkdir can race with open-by-fhandle on a guessed fhandle, leading
      to multiple aliases for a directory inode and all the breakage
      that follows from that.
      
      	Correct solution: a new primitive (d_instantiate_new())
      combining these two in the right order - lockdep annotate, then
      d_instantiate(), then the rest of unlock_new_inode().  All
      combinations of d_instantiate() with unlock_new_inode() should
      be converted to that.
      
      Cc: stable@kernel.org	# 2.6.29 and later
      Tested-by: default avatarMike Marshall <hubcap@omnibond.com>
      Reviewed-by: default avatarAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d2d3f1e
    • Ben Hutchings's avatar
      ALSA: timer: Fix pause event notification · 416808fb
      Ben Hutchings authored
      commit 3ae18097 upstream.
      
      Commit f65e0d29 ("ALSA: timer: Call notifier in the same spinlock")
      combined the start/continue and stop/pause functions, and in doing so
      changed the event code for the pause case to SNDRV_TIMER_EVENT_CONTINUE.
      Change it back to SNDRV_TIMER_EVENT_PAUSE.
      
      Fixes: f65e0d29 ("ALSA: timer: Call notifier in the same spinlock")
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      416808fb
    • Al Viro's avatar
      aio: fix io_destroy(2) vs. lookup_ioctx() race · b14cfa26
      Al Viro authored
      commit baf10564 upstream.
      
      kill_ioctx() used to have an explicit RCU delay between removing the
      reference from ->ioctx_table and percpu_ref_kill() dropping the refcount.
      At some point that delay had been removed, on the theory that
      percpu_ref_kill() itself contained an RCU delay.  Unfortunately, that was
      the wrong kind of RCU delay and it didn't care about rcu_read_lock() used
      by lookup_ioctx().  As the result, we could get ctx freed right under
      lookup_ioctx().  Tejun has fixed that in a6d7cff4 ("fs/aio: Add explicit
      RCU grace period when freeing kioctx"); however, that fix is not enough.
      
      Suppose io_destroy() from one thread races with e.g. io_setup() from another;
      CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2
      has picked it (under rcu_read_lock()).  Then CPU1 proceeds to drop the
      refcount, getting it to 0 and triggering a call of free_ioctx_users(),
      which proceeds to drop the secondary refcount and once that reaches zero
      calls free_ioctx_reqs().  That does
              INIT_RCU_WORK(&ctx->free_rwork, free_ioctx);
              queue_rcu_work(system_wq, &ctx->free_rwork);
      and schedules freeing the whole thing after RCU delay.
      
      In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the
      refcount from 0 to 1 and returned the reference to io_setup().
      
      Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get
      freed until after percpu_ref_get().  Sure, we'd increment the counter before
      ctx can be freed.  Now we are out of rcu_read_lock() and there's nothing to
      stop freeing of the whole thing.  Unfortunately, CPU2 assumes that since it
      has grabbed the reference, ctx is *NOT* going away until it gets around to
      dropping that reference.
      
      The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss.
      It's not costlier than what we currently do in normal case, it's safe to
      call since freeing *is* delayed and it closes the race window - either
      lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users
      won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx()
      fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see
      the object in question at all.
      
      Cc: stable@kernel.org
      Fixes: a6d7cff4 "fs/aio: Add explicit RCU grace period when freeing kioctx"
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b14cfa26
    • Al Viro's avatar
      affs_lookup(): close a race with affs_remove_link() · 5aba1dc0
      Al Viro authored
      commit 30da870c upstream.
      
      we unlock the directory hash too early - if we are looking at secondary
      link and primary (in another directory) gets removed just as we unlock,
      we could have the old primary moved in place of the secondary, leaving
      us to look into freed entry (and leaving our dentry with ->d_fsdata
      pointing to a freed entry).
      
      Cc: stable@vger.kernel.org # 2.4.4+
      Acked-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5aba1dc0
    • Colin Ian King's avatar
      KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable" · 211922cf
      Colin Ian King authored
      commit ba3696e9 upstream.
      
      Trivial fix to spelling mistake in debugfs_entries text.
      
      Fixes: 669e846e ("KVM/MIPS32: MIPS arch specific APIs for KVM")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kernel-janitors@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10+
      Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      211922cf
    • Maciej W. Rozycki's avatar
      MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs · 0ed5a213
      Maciej W. Rozycki authored
      commit 9a3a92cc upstream.
      
      Check the TIF_32BIT_FPREGS task setting of the tracee rather than the
      tracer in determining the layout of floating-point general registers in
      the floating-point context, correcting access to odd-numbered registers
      for o32 tracees where the setting disagrees between the two processes.
      
      Fixes: 597ce172 ("MIPS: Support for 64-bit FP with O32 binaries")
      Signed-off-by: default avatarMaciej W. Rozycki <macro@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.14+
      Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0ed5a213
    • Maciej W. Rozycki's avatar
      MIPS: ptrace: Expose FIR register through FP regset · b1e0cf61
      Maciej W. Rozycki authored
      commit 71e909c0 upstream.
      
      Correct commit 7aeb753b ("MIPS: Implement task_user_regset_view.")
      and expose the FIR register using the unused 4 bytes at the end of the
      NT_PRFPREG regset.  Without that register included clients cannot use
      the PTRACE_GETREGSET request to retrieve the complete FPU register set
      and have to resort to one of the older interfaces, either PTRACE_PEEKUSR
      or PTRACE_GETFPREGS, to retrieve the missing piece of data.  Also the
      register is irreversibly missing from core dumps.
      
      This register is architecturally hardwired and read-only so the write
      path does not matter.  Ignore data supplied on writes then.
      
      Fixes: 7aeb753b ("MIPS: Implement task_user_regset_view.")
      Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarMaciej W. Rozycki <macro@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.13+
      Patchwork: https://patchwork.linux-mips.org/patch/19273/Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b1e0cf61
    • NeilBrown's avatar
      MIPS: c-r4k: Fix data corruption related to cache coherence · aa9a00ef
      NeilBrown authored
      commit 55a2aa08 upstream.
      
      When DMA will be performed to a MIPS32 1004K CPS, the L1-cache for the
      range needs to be flushed and invalidated first.
      The code currently takes one of two approaches.
      1/ If the range is less than the size of the dcache, then HIT type
         requests flush/invalidate cache lines for the particular addresses.
         HIT-type requests a globalised by the CPS so this is safe on SMP.
      
      2/ If the range is larger than the size of dcache, then INDEX type
         requests flush/invalidate the whole cache. INDEX type requests affect
         the local cache only. CPS does not propagate them in any way. So this
         invalidation is not safe on SMP CPS systems.
      
      Data corruption due to '2' can quite easily be demonstrated by
      repeatedly "echo 3 > /proc/sys/vm/drop_caches" and then sha1sum a file
      that is several times the size of available memory. Dropping caches
      means that large contiguous extents (large than dcache) are more likely.
      
      This was not a problem before Linux-4.8 because option 2 was never used
      if CONFIG_MIPS_CPS was defined. The commit which removed that apparently
      didn't appreciate the full consequence of the change.
      
      We could, in theory, globalize the INDEX based flush by sending an IPI
      to other cores. These cache invalidation routines can be called with
      interrupts disabled and synchronous IPI require interrupts to be
      enabled. Asynchronous IPI may not trigger writeback soon enough. So we
      cannot use IPI in practice.
      
      We can already test if IPI would be needed for an INDEX operation with
      r4k_op_needs_ipi(R4K_INDEX). If this is true then we mustn't try the
      INDEX approach as we cannot use IPI. If this is false (e.g. when there
      is only one core and hence one L1 cache) then it is safe to use the
      INDEX approach without IPI.
      
      This patch avoids options 2 if r4k_op_needs_ipi(R4K_INDEX), and so
      eliminates the corruption.
      
      Fixes: c00ab489 ("MIPS: Remove cpu_has_safe_index_cacheops")
      Signed-off-by: default avatarNeilBrown <neil@brown.name>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 4.8+
      Patchwork: https://patchwork.linux-mips.org/patch/19259/Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa9a00ef
  2. 25 May, 2018 23 commits