1. 04 Mar, 2016 34 commits
    • Toshi Kani's avatar
      x86/mm: Fix vmalloc_fault() to handle large pages properly · 468eeda0
      Toshi Kani authored
      [ Upstream commit f4eafd8b ]
      
      A kernel page fault oops with the callstack below was observed
      when a read syscall was made to a pmem device after a huge amount
      (>512GB) of vmalloc ranges was allocated by ioremap() on a x86_64
      system:
      
           BUG: unable to handle kernel paging request at ffff880840000ff8
           IP: vmalloc_fault+0x1be/0x300
           PGD c7f03a067 PUD 0
           Oops: 0000 [#1] SM
           Call Trace:
              __do_page_fault+0x285/0x3e0
              do_page_fault+0x2f/0x80
              ? put_prev_entity+0x35/0x7a0
              page_fault+0x28/0x30
              ? memcpy_erms+0x6/0x10
              ? schedule+0x35/0x80
              ? pmem_rw_bytes+0x6a/0x190 [nd_pmem]
              ? schedule_timeout+0x183/0x240
              btt_log_read+0x63/0x140 [nd_btt]
               :
              ? __symbol_put+0x60/0x60
              ? kernel_read+0x50/0x80
              SyS_finit_module+0xb9/0xf0
              entry_SYSCALL_64_fastpath+0x1a/0xa4
      
      Since v4.1, ioremap() supports large page (pud/pmd) mappings in
      x86_64 and PAE.  vmalloc_fault() however assumes that the vmalloc
      range is limited to pte mappings.
      
      vmalloc faults do not normally happen in ioremap'd ranges since
      ioremap() sets up the kernel page tables, which are shared by
      user processes.  pgd_ctor() sets the kernel's PGD entries to
      user's during fork().  When allocation of the vmalloc ranges
      crosses a 512GB boundary, ioremap() allocates a new pud table
      and updates the kernel PGD entry to point it.  If user process's
      PGD entry does not have this update yet, a read/write syscall
      to the range will cause a vmalloc fault, which hits the Oops
      above as it does not handle a large page properly.
      
      Following changes are made to vmalloc_fault().
      
      64-bit:
      
       - No change for the PGD sync operation as it handles large
         pages already.
       - Add pud_huge() and pmd_huge() to the validation code to
         handle large pages.
       - Change pud_page_vaddr() to pud_pfn() since an ioremap range
         is not directly mapped (while the if-statement still works
         with a bogus addr).
       - Change pmd_page() to pmd_pfn() since an ioremap range is not
         backed by struct page (while the if-statement still works
         with a bogus addr).
      
      32-bit:
       - No change for the sync operation since the index3 PGD entry
         covers the entire vmalloc range, which is always valid.
         (A separate change to sync PGD entry is necessary if this
          memory layout is changed regardless of the page size.)
       - Add pmd_huge() to the validation code to handle large pages.
         This is for completeness since vmalloc_fault() won't happen
         in ioremap'd ranges as its PGD entry is always valid.
      Reported-by: default avatarHenning Schild <henning.schild@siemens.com>
      Signed-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Acked-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: <stable@vger.kernel.org> # 4.1+
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-mm@kvack.org
      Cc: linux-nvdimm@lists.01.org
      Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      468eeda0
    • Gerd Hoffmann's avatar
      drm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command · 1d1338cf
      Gerd Hoffmann authored
      [ Upstream commit 34855706 ]
      
      This avoids integer overflows on 32bit machines when calculating
      reloc_info size, as reported by Alan Cox.
      
      Cc: stable@vger.kernel.org
      Cc: gnomes@lxorguk.ukuu.org.uk
      Signed-off-by: default avatarGerd Hoffmann <kraxel@redhat.com>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      1d1338cf
    • Rasmus Villemoes's avatar
      drm/radeon: use post-decrement in error handling · 1a82e899
      Rasmus Villemoes authored
      [ Upstream commit bc3f5d8c ]
      
      We need to use post-decrement to get the pci_map_page undone also for
      i==0, and to avoid some very unpleasant behaviour if pci_map_page
      failed already at i==0.
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      1a82e899
    • Takashi Iwai's avatar
      ALSA: seq: Fix double port list deletion · ad4ad1ac
      Takashi Iwai authored
      [ Upstream commit 13d5e5d4 ]
      
      The commit [7f0973e9: ALSA: seq: Fix lockdep warnings due to
      double mutex locks] split the management of two linked lists (source
      and destination) into two individual calls for avoiding the AB/BA
      deadlock.  However, this may leave the possible double deletion of one
      of two lists when the counterpart is being deleted concurrently.
      It ends up with a list corruption, as revealed by syzkaller fuzzer.
      
      This patch fixes it by checking the list emptiness and skipping the
      deletion and the following process.
      
      BugLink: http://lkml.kernel.org/r/CACT4Y+bay9qsrz6dQu31EcGaH9XwfW7o3oBzSQUG9fMszoh=Sg@mail.gmail.com
      Fixes: 7f0973e9 ('ALSA: seq: Fix lockdep warnings due to 'double mutex locks)
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      ad4ad1ac
    • Arnd Bergmann's avatar
      tracing: Fix freak link error caused by branch tracer · a77b1f3d
      Arnd Bergmann authored
      [ Upstream commit b33c8ff4 ]
      
      In my randconfig tests, I came across a bug that involves several
      components:
      
      * gcc-4.9 through at least 5.3
      * CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files
      * CONFIG_PROFILE_ALL_BRANCHES overriding every if()
      * The optimized implementation of do_div() that tries to
        replace a library call with an division by multiplication
      * code in drivers/media/dvb-frontends/zl10353.c doing
      
              u32 adc_clock = 450560; /* 45.056 MHz */
              if (state->config.adc_clock)
                      adc_clock = state->config.adc_clock;
              do_div(value, adc_clock);
      
      In this case, gcc fails to determine whether the divisor
      in do_div() is __builtin_constant_p(). In particular, it
      concludes that __builtin_constant_p(adc_clock) is false, while
      __builtin_constant_p(!!adc_clock) is true.
      
      That in turn throws off the logic in do_div() that also uses
      __builtin_constant_p(), and instead of picking either the
      constant- optimized division, and the code in ilog2() that uses
      __builtin_constant_p() to figure out whether it knows the answer at
      compile time. The result is a link error from failing to find
      multiple symbols that should never have been called based on
      the __builtin_constant_p():
      
      dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN'
      dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod'
      ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined!
      ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined!
      
      This patch avoids the problem by changing __trace_if() to check
      whether the condition is known at compile-time to be nonzero, rather
      than checking whether it is actually a constant.
      
      I see this one link error in roughly one out of 1600 randconfig builds
      on ARM, and the patch fixes all known instances.
      
      Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.deAcked-by: default avatarNicolas Pitre <nico@linaro.org>
      Fixes: ab3c9c68 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y")
      Cc: stable@vger.kernel.org # v2.6.30+
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      a77b1f3d
    • Steven Rostedt (Red Hat)'s avatar
      tracepoints: Do not trace when cpu is offline · 999c2cea
      Steven Rostedt (Red Hat) authored
      [ Upstream commit f3775549 ]
      
      The tracepoint infrastructure uses RCU sched protection to enable and
      disable tracepoints safely. There are some instances where tracepoints are
      used in infrastructure code (like kfree()) that get called after a CPU is
      going offline, and perhaps when it is coming back online but hasn't been
      registered yet.
      
      This can probuce the following warning:
      
       [ INFO: suspicious RCU usage. ]
       4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
       -------------------------------
       include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
      
       other info that might help us debug this:
      
       RCU used illegally from offline CPU!  rcu_scheduler_active = 1, debug_locks = 1
       no locks held by swapper/8/0.
      
       stack backtrace:
        CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S              4.4.0-00006-g0fe53e8-dirty #34
        Call Trace:
        [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
        [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
        [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
        [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
        [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
        [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
        [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
        [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
        [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
        [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
        [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
        [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
      
      This warning is not a false positive either. RCU is not protecting code that
      is being executed while the CPU is offline.
      
      Instead of playing "whack-a-mole(TM)" and adding conditional statements to
      the tracepoints we find that are used in this instance, simply add a
      cpu_online() test to the tracepoint code where the tracepoint will be
      ignored if the CPU is offline.
      
      Use of raw_smp_processor_id() is fine, as there should never be a case where
      the tracepoint code goes from running on a CPU that is online and suddenly
      gets migrated to a CPU that is offline.
      
      Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.orgReported-by: default avatarDenis Kirjanov <kda@linux-powerpc.org>
      Fixes: 97e1c18e ("tracing: Kernel Tracepoints")
      Cc: stable@vger.kernel.org # v2.6.28+
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      999c2cea
    • Andy Shevchenko's avatar
      dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer · 6bdec2b3
      Andy Shevchenko authored
      [ Upstream commit ee1cdcda ]
      
      The commit 2895b2ca ("dmaengine: dw: fix cyclic transfer callbacks")
      re-enabled BLOCK interrupts with regard to make cyclic transfers work. However,
      this change becomes a regression for non-cyclic transfers as interrupt counters
      under stress test had been grown enormously (approximately per 4-5 bytes in the
      UART loop back test).
      
      Taking into consideration above enable BLOCK interrupts if and only if channel
      is programmed to perform cyclic transfer.
      
      Fixes: 2895b2ca ("dmaengine: dw: fix cyclic transfer callbacks")
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: default avatarMans Rullgard <mans@mansr.com>
      Tested-by: default avatarMans Rullgard <mans@mansr.com>
      Acked-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      6bdec2b3
    • Takashi Iwai's avatar
      ALSA: hda - Cancel probe work instead of flush at remove · 800b2663
      Takashi Iwai authored
      [ Upstream commit 0b8c8219 ]
      
      The commit [991f86d7: ALSA: hda - Flush the pending probe work at
      remove] introduced the sync of async probe work at remove for fixing
      the race.  However, this may lead to another hangup when the module
      removal is performed quickly before starting the probe work, because
      it issues flush_work() and it's blocked forever.
      
      The workaround is to use cancel_work_sync() instead of flush_work()
      there.
      
      Fixes: 991f86d7 ('ALSA: hda - Flush the pending probe work at remove')
      Cc: <stable@vger.kernel.org> # v3.17+
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      800b2663
    • Takashi Iwai's avatar
      ALSA: seq: Fix leak of pool buffer at concurrent writes · d26d8b48
      Takashi Iwai authored
      [ Upstream commit d99a36f4 ]
      
      When multiple concurrent writes happen on the ALSA sequencer device
      right after the open, it may try to allocate vmalloc buffer for each
      write and leak some of them.  It's because the presence check and the
      assignment of the buffer is done outside the spinlock for the pool.
      
      The fix is to move the check and the assignment into the spinlock.
      
      (The current implementation is suboptimal, as there can be multiple
       unnecessary vmallocs because the allocation is done before the check
       in the spinlock.  But the pool size is already checked beforehand, so
       this isn't a big problem; that is, the only possible path is the
       multiple writes before any pool assignment, and practically seen, the
       current coverage should be "good enough".)
      
      The issue was triggered by syzkaller fuzzer.
      
      BugLink: http://lkml.kernel.org/r/CACT4Y+bSzazpXNvtAr=WXaL8hptqjHwqEyFA+VN2AWEx=aurkg@mail.gmail.comReported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      d26d8b48
    • Gavin Shan's avatar
      powerpc/eeh: Fix stale cached primary bus · 71239452
      Gavin Shan authored
      [ Upstream commit 05ba75f8 ]
      
      When PE is created, its primary bus is cached to pe->bus. At later
      point, the cached primary bus is returned from eeh_pe_bus_get().
      However, we could get stale cached primary bus and run into kernel
      crash in one case: full hotplug as part of fenced PHB error recovery
      releases all PCI busses under the PHB at unplugging time and recreate
      them at plugging time. pe->bus is still dereferencing the PCI bus
      that was released.
      
      This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
      of pe->bus. pe->bus is updated when its first child EEH device is
      online and the flag is set. Before unplugging in full hotplug for
      error recovery, the flag is cleared.
      
      Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
      Cc: stable@vger.kernel.org #v3.11+
      Reported-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reported-by: default avatarPradipta Ghosh <pradghos@in.ibm.com>
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Tested-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      71239452
    • Andrey Konovalov's avatar
      ALSA: usb-audio: avoid freeing umidi object twice · 1ea63b62
      Andrey Konovalov authored
      [ Upstream commit 07d86ca9 ]
      
      The 'umidi' object will be free'd on the error path by snd_usbmidi_free()
      when tearing down the rawmidi interface. So we shouldn't try to free it
      in snd_usbmidi_create() after having registered the rawmidi interface.
      
      Found by KASAN.
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@gmail.com>
      Acked-by: default avatarClemens Ladisch <clemens@ladisch.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      1ea63b62
    • Hannes Reinecke's avatar
      bio: return EINTR if copying to user space got interrupted · f3dd341b
      Hannes Reinecke authored
      [ Upstream commit 2d99b55d ]
      
      Commit 35dc2483 introduced a check for
      current->mm to see if we have a user space context and only copies data
      if we do. Now if an IO gets interrupted by a signal data isn't copied
      into user space any more (as we don't have a user space context) but
      user space isn't notified about it.
      
      This patch modifies the behaviour to return -EINTR from bio_uncopy_user()
      to notify userland that a signal has interrupted the syscall, otherwise
      it could lead to a situation where the caller may get a buffer with
      no data returned.
      
      This can be reproduced by issuing SG_IO ioctl()s in one thread while
      constantly sending signals to it.
      
      Fixes: 35dc2483 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarHannes Reinecke <hare@suse.de>
      Cc: stable@vger.kernel.org # v.3.11+
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      f3dd341b
    • Ryan Ware's avatar
      EVM: Use crypto_memneq() for digest comparisons · d185fa45
      Ryan Ware authored
      [ Upstream commit 613317bd ]
      
      This patch fixes vulnerability CVE-2016-2085.  The problem exists
      because the vm_verify_hmac() function includes a use of memcmp().
      Unfortunately, this allows timing side channel attacks; specifically
      a MAC forgery complexity drop from 2^128 to 2^12.  This patch changes
      the memcmp() to the cryptographically safe crypto_memneq().
      Reported-by: default avatarXiaofei Rex Guo <xiaofei.rex.guo@intel.com>
      Signed-off-by: default avatarRyan Ware <ware@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      d185fa45
    • Eryu Guan's avatar
      ext4: don't read blocks from disk after extents being swapped · a285eee1
      Eryu Guan authored
      [ Upstream commit bcff2488 ]
      
      I notice ext4/307 fails occasionally on ppc64 host, reporting md5
      checksum mismatch after moving data from original file to donor file.
      
      The reason is that move_extent_per_page() calls __block_write_begin()
      and block_commit_write() to write saved data from original inode blocks
      to donor inode blocks, but __block_write_begin() not only maps buffer
      heads but also reads block content from disk if the size is not block
      size aligned.  At this time the physical block number in mapped buffer
      head is pointing to the donor file not the original file, and that
      results in reading wrong data to page, which get written to disk in
      following block_commit_write call.
      
      This also can be reproduced by the following script on 1k block size ext4
      on x86_64 host:
      
          mnt=/mnt/ext4
          donorfile=$mnt/donor
          testfile=$mnt/testfile
          e4compact=~/xfstests/src/e4compact
      
          rm -f $donorfile $testfile
      
          # reserve space for donor file, written by 0xaa and sync to disk to
          # avoid EBUSY on EXT4_IOC_MOVE_EXT
          xfs_io -fc "pwrite -S 0xaa 0 1m" -c "fsync" $donorfile
      
          # create test file written by 0xbb
          xfs_io -fc "pwrite -S 0xbb 0 1023" -c "fsync" $testfile
      
          # compute initial md5sum
          md5sum $testfile | tee md5sum.txt
          # drop cache, force e4compact to read data from disk
          echo 3 > /proc/sys/vm/drop_caches
      
          # test defrag
          echo "$testfile" | $e4compact -i -v -f $donorfile
          # check md5sum
          md5sum -c md5sum.txt
      
      Fix it by creating & mapping buffer heads only but not reading blocks
      from disk, because all the data in page is guaranteed to be up-to-date
      in mext_page_mkuptodate().
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarEryu Guan <guaneryu@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      a285eee1
    • Insu Yun's avatar
      ext4: fix potential integer overflow · 9d767979
      Insu Yun authored
      [ Upstream commit 46901760 ]
      
      Since sizeof(ext_new_group_data) > sizeof(ext_new_flex_group_data),
      integer overflow could be happened.
      Therefore, need to fix integer overflow sanitization.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarInsu Yun <wuninsu@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9d767979
    • David Sterba's avatar
      btrfs: properly set the termination value of ctx->pos in readdir · 3a68deb6
      David Sterba authored
      [ Upstream commit bc4ef759 ]
      
      The value of ctx->pos in the last readdir call is supposed to be set to
      INT_MAX due to 32bit compatibility, unless 'pos' is intentially set to a
      larger value, then it's LLONG_MAX.
      
      There's a report from PaX SIZE_OVERFLOW plugin that "ctx->pos++"
      overflows (https://forums.grsecurity.net/viewtopic.php?f=1&t=4284), on a
      64bit arch, where the value is 0x7fffffffffffffff ie. LLONG_MAX before
      the increment.
      
      We can get to that situation like that:
      
      * emit all regular readdir entries
      * still in the same call to readdir, bump the last pos to INT_MAX
      * next call to readdir will not emit any entries, but will reach the
        bump code again, finds pos to be INT_MAX and sets it to LLONG_MAX
      
      Normally this is not a problem, but if we call readdir again, we'll find
      'pos' set to LLONG_MAX and the unconditional increment will overflow.
      
      The report from Victor at
      (http://thread.gmane.org/gmane.comp.file-systems.btrfs/49500) with debugging
      print shows that pattern:
      
       Overflow: e
       Overflow: 7fffffff
       Overflow: 7fffffffffffffff
       PAX: size overflow detected in function btrfs_real_readdir
         fs/btrfs/inode.c:5760 cicus.935_282 max, count: 9, decl: pos; num: 0;
         context: dir_context;
       CPU: 0 PID: 2630 Comm: polkitd Not tainted 4.2.3-grsec #1
       Hardware name: Gigabyte Technology Co., Ltd. H81ND2H/H81ND2H, BIOS F3 08/11/2015
        ffffffff81901608 0000000000000000 ffffffff819015e6 ffffc90004973d48
        ffffffff81742f0f 0000000000000007 ffffffff81901608 ffffc90004973d78
        ffffffff811cb706 0000000000000000 ffff8800d47359e0 ffffc90004973ed8
       Call Trace:
        [<ffffffff81742f0f>] dump_stack+0x4c/0x7f
        [<ffffffff811cb706>] report_size_overflow+0x36/0x40
        [<ffffffff812ef0bc>] btrfs_real_readdir+0x69c/0x6d0
        [<ffffffff811dafc8>] iterate_dir+0xa8/0x150
        [<ffffffff811e6d8d>] ? __fget_light+0x2d/0x70
        [<ffffffff811dba3a>] SyS_getdents+0xba/0x1c0
       Overflow: 1a
        [<ffffffff811db070>] ? iterate_dir+0x150/0x150
        [<ffffffff81749b69>] entry_SYSCALL_64_fastpath+0x12/0x83
      
      The jump from 7fffffff to 7fffffffffffffff happens when new dir entries
      are not yet synced and are processed from the delayed list. Then the code
      could go to the bump section again even though it might not emit any new
      dir entries from the delayed list.
      
      The fix avoids entering the "bump" section again once we've finished
      emitting the entries, both for synced and delayed entries.
      
      References: https://forums.grsecurity.net/viewtopic.php?f=1&t=4284Reported-by: default avatarVictor <services@swwu.com>
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Tested-by: default avatarHolger Hoffstätte <holger.hoffstaette@googlemail.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      3a68deb6
    • Linus Walleij's avatar
      ARM: 8519/1: ICST: try other dividends than 1 · 1c35d875
      Linus Walleij authored
      [ Upstream commit e972c374 ]
      
      Since the dawn of time the ICST code has only supported divide
      by one or hang in an eternal loop. Luckily we were always dividing
      by one because the reference frequency for the systems using
      the ICSTs is 24MHz and the [min,max] values for the PLL input
      if [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop
      will always terminate immediately without assigning any divisor
      for the reference frequency.
      
      But for the code to make sense, let's insert the missing i++
      Reported-by: default avatarDavid Binderman <dcb314@hotmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      1c35d875
    • Stefan Haberland's avatar
      s390/dasd: fix refcount for PAV reassignment · 5ae3f5ad
      Stefan Haberland authored
      [ Upstream commit 9d862aba ]
      
      Add refcount to the DASD device when a summary unit check worker is
      scheduled. This prevents that the device is set offline with worker
      in place.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarStefan Haberland <stefan.haberland@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      5ae3f5ad
    • Stefan Haberland's avatar
      s390/dasd: prevent incorrect length error under z/VM after PAV changes · 2e35e9c4
      Stefan Haberland authored
      [ Upstream commit 020bf042 ]
      
      The channel checks the specified length and the provided amount of
      data for CCWs and provides an incorrect length error if the size does
      not match. Under z/VM with simulation activated the length may get
      changed. Having the suppress length indication bit set is stated as
      good CCW coding practice and avoids errors under z/VM.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarStefan Haberland <stefan.haberland@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      2e35e9c4
    • Anton Protopopov's avatar
      cifs: fix erroneous return value · 4f55047b
      Anton Protopopov authored
      [ Upstream commit 4b550af5 ]
      
      The setup_ntlmv2_rsp() function may return positive value ENOMEM instead
      of -ENOMEM in case of kmalloc failure.
      Signed-off-by: default avatarAnton Protopopov <a.s.protopopov@gmail.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      4f55047b
    • Nicolai Hähnle's avatar
      drm/radeon: hold reference to fences in radeon_sa_bo_new · 7071cc9b
      Nicolai Hähnle authored
      [ Upstream commit f6ff4f67 ]
      
      An arbitrary amount of time can pass between spin_unlock and
      radeon_fence_wait_any, so we need to ensure that nobody frees the
      fences from under us.
      
      Based on the analogous fix for amdgpu.
      Signed-off-by: default avatarNicolai Hähnle <nicolai.haehnle@amd.com>
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      7071cc9b
    • Tejun Heo's avatar
      workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup · f3d69fd8
      Tejun Heo authored
      [ Upstream commit d6e022f1 ]
      
      When looking up the pool_workqueue to use for an unbound workqueue,
      workqueue assumes that the target CPU is always bound to a valid NUMA
      node.  However, currently, when a CPU goes offline, the mapping is
      destroyed and cpu_to_node() returns NUMA_NO_NODE.
      
      This has always been broken but hasn't triggered often enough before
      874bbfe6 ("workqueue: make sure delayed work run in local cpu").
      After the commit, workqueue forcifully assigns the local CPU for
      delayed work items without explicit target CPU to fix a different
      issue.  This widens the window where CPU can go offline while a
      delayed work item is pending causing delayed work items dispatched
      with target CPU set to an already offlined CPU.  The resulting
      NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
      NULL pool_workqueue and thus crash.
      
      While 874bbfe6 has been reverted for a different reason making the
      bug less visible again, it can still happen.  Fix it by mapping
      NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
      This is a temporary workaround.  The long term solution is keeping CPU
      -> NODE mapping stable across CPU off/online cycles which is being
      worked on.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarMike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
      Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.comSigned-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      f3d69fd8
    • Lai Jiangshan's avatar
      workqueue: wq_pool_mutex protects the attrs-installation · d3c4dd88
      Lai Jiangshan authored
      [ Upstream commit 5b95e1af ]
      
      Current wq_pool_mutex doesn't proctect the attrs-installation, it results
      that ->unbound_attrs, ->numa_pwq_tbl[] and ->dfl_pwq can only be accessed
      under wq->mutex and causes some inconveniences. Example, wq_update_unbound_numa()
      has to acquire wq->mutex before fetching the wq->unbound_attrs->no_numa
      and the old_pwq.
      
      attrs-installation is a short operation, so this change will no cause any
      latency for other operations which also acquire the wq_pool_mutex.
      
      The only unprotected attrs-installation code is in apply_workqueue_attrs(),
      so this patch touches code less than comments.
      
      It is also a preparation patch for next several patches which read
      wq->unbound_attrs, wq->numa_pwq_tbl[] and wq->dfl_pwq with
      only wq_pool_mutex held.
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      d3c4dd88
    • Lai Jiangshan's avatar
      workqueue: split apply_workqueue_attrs() into 3 stages · 9e1a3771
      Lai Jiangshan authored
      [ Upstream commit 2d5f0764 ]
      
      Current apply_workqueue_attrs() includes pwqs-allocation and pwqs-installation,
      so when we batch multiple apply_workqueue_attrs()s as a transaction, we can't
      ensure the transaction must succeed or fail as a complete unit.
      
      To solve this, we split apply_workqueue_attrs() into three stages.
      The first stage does the preparation: allocation memory, pwqs.
      The second stage does the attrs-installaion and pwqs-installation.
      The third stage frees the allocated memory and (old or unused) pwqs.
      
      As the result, batching multiple apply_workqueue_attrs()s can
      succeed or fail as a complete unit:
      	1) batch do all the first stage for all the workqueues
      	2) only commit all when all the above succeed.
      
      This patch is a preparation for the next patch ("Allow modifying low level
      unbound workqueue cpumask") which will do a multiple apply_workqueue_attrs().
      
      The patch doesn't have functionality changed except two minor adjustment:
      	1) free_unbound_pwq() for the error path is removed, we use the
      	   heavier version put_pwq_unlocked() instead since the error path
      	   is rare. this adjustment simplifies the code.
      	2) the memory-allocation is also moved into wq_pool_mutex.
      	   this is needed to avoid to do the further splitting.
      
      tj: minor updates to comments.
      Suggested-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Mike Galbraith <bitbucket@online.de>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9e1a3771
    • Alexandra Yates's avatar
      ahci: Intel DNV device IDs SATA · de7b555f
      Alexandra Yates authored
      [ Upstream commit 342decff ]
      
      Adding Intel codename DNV platform device IDs for SATA.
      Signed-off-by: default avatarAlexandra Yates <alexandra.yates@linux.intel.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      de7b555f
    • Tony Lindgren's avatar
      phy: twl4030-usb: Fix unbalanced pm_runtime_enable on module reload · bb789d04
      Tony Lindgren authored
      [ Upstream commit 58a66dba ]
      
      If we reload phy-twl4030-usb, we get a warning about unbalanced
      pm_runtime_enable. Let's fix the issue and also fix idling of the
      device on unload before we attempt to shut it down.
      
      If we don't properly idle the PHY before shutting it down on removal,
      the twl4030 ends up consuming about 62mW of extra power compared to
      running idle with the module loaded.
      
      Cc: stable@vger.kernel.org
      Cc: Bin Liu <b-liu@ti.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Kishon Vijay Abraham I <kishon@ti.com>
      Cc: NeilBrown <neil@brown.name>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      bb789d04
    • Tony Lindgren's avatar
      phy: twl4030-usb: Relase usb phy on unload · feebbdd7
      Tony Lindgren authored
      [ Upstream commit b241d31e ]
      
      Otherwise rmmod omap2430; rmmod phy-twl4030-usb; modprobe omap2430
      will try to use a non-existing phy and oops:
      
      Unable to handle kernel paging request at virtual address b6f7c1f0
      ...
      [<c048a284>] (devm_usb_get_phy_by_node) from [<bf0758ac>]
      (omap2430_musb_init+0x44/0x2b4 [omap2430])
      [<bf0758ac>] (omap2430_musb_init [omap2430]) from [<bf055ec0>]
      (musb_init_controller+0x194/0x878 [musb_hdrc])
      
      Cc: stable@vger.kernel.org
      Cc: Bin Liu <b-liu@ti.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Kishon Vijay Abraham I <kishon@ti.com>
      Cc: NeilBrown <neil@brown.name>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      feebbdd7
    • Shawn Lin's avatar
      phy: core: fix wrong err handle for phy_power_on · 7822d8eb
      Shawn Lin authored
      [ Upstream commit b82fcabe ]
      
      If phy_pm_runtime_get_sync failed but we already
      enable regulator, current code return directly without
      doing regulator_disable. This patch fix this problem
      and cleanup err handle of phy_power_on to be more readable.
      
      Fixes: 3be88125 ("phy: core: Support regulator ...")
      Cc: <stable@vger.kernel.org> # v3.18+
      Cc: Roger Quadros <rogerq@ti.com>
      Cc: Axel Lin <axel.lin@ingics.com>
      Signed-off-by: default avatarShawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      7822d8eb
    • Tejun Heo's avatar
      Revert "workqueue: make sure delayed work run in local cpu" · 68fce03b
      Tejun Heo authored
      [ Upstream commit 041bd12e ]
      
      This reverts commit 874bbfe6.
      
      Workqueue used to implicity guarantee that work items queued without
      explicit CPU specified are put on the local CPU.  Recent changes in
      timer broke the guarantee and led to vmstat breakage which was fixed
      by 176bed1d ("vmstat: explicitly schedule per-cpu work on the CPU
      we need it to run on").
      
      vmstat is the most likely to expose the issue and it's quite possible
      that there are other similar problems which are a lot more difficult
      to trigger.  As a preventive measure, 874bbfe6 ("workqueue: make
      sure delayed work run in local cpu") was applied to restore the local
      CPU guarnatee.  Unfortunately, the change exposed a bug in timer code
      which got fixed by 22b886dd ("timers: Use proper base migration in
      add_timer_on()").  Due to code restructuring, the commit couldn't be
      backported beyond certain point and stable kernels which only had
      874bbfe6 started crashing.
      
      The local CPU guarantee was accidental more than anything else and we
      want to get rid of it anyway.  As, with the vmstat case fixed,
      874bbfe6 is causing more problems than it's fixing, it has been
      decided to take the chance and officially break the guarantee by
      reverting the commit.  A debug feature will be added to force foreign
      CPU assignment to expose cases relying on the guarantee and fixes for
      the individual cases will be backported to stable as necessary.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Fixes: 874bbfe6 ("workqueue: make sure delayed work run in local cpu")
      Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz
      Cc: stable@vger.kernel.org
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
      Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      68fce03b
    • Takashi Iwai's avatar
      ALSA: timer: Fix race at concurrent reads · 0163f1a7
      Takashi Iwai authored
      [ Upstream commit 4dff5c7b ]
      
      snd_timer_user_read() has a potential race among parallel reads, as
      qhead and qused are updated outside the critical section due to
      copy_to_user() calls.  Move them into the critical section, and also
      sanitize the relevant code a bit.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      0163f1a7
    • Takashi Iwai's avatar
      ALSA: hda - Fix bad dereference of jack object · dbe8ccc2
      Takashi Iwai authored
      [ Upstream commit 2ebab40e ]
      
      The hda_jack_tbl entries are managed by snd_array for allowing
      multiple jacks.  It's good per se, but the problem is that struct
      hda_jack_callback keeps the hda_jack_tbl pointer.  Since snd_array
      doesn't preserve each pointer at resizing the array, we can't keep the
      original pointer but have to deduce the pointer at each time via
      snd_array_entry() instead.  Actually, this resulted in the deference
      to the wrong pointer on codecs that have many pins such as CS4208.
      
      This patch replaces the pointer to the NID value as the search key.
      As an unexpected good side effect, this even simplifies the code, as
      only NID is needed in most cases.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      dbe8ccc2
    • Takashi Iwai's avatar
      ALSA: timer: Fix race between stop and interrupt · 9b728394
      Takashi Iwai authored
      [ Upstream commit ed8b1d6d ]
      
      A slave timer element also unlinks at snd_timer_stop() but it takes
      only slave_active_lock.  When a slave is assigned to a master,
      however, this may become a race against the master's interrupt
      handling, eventually resulting in a list corruption.  The actual bug
      could be seen with a syzkaller fuzzer test case in BugLink below.
      
      As a fix, we need to take timeri->timer->lock when timer isn't NULL,
      i.e. assigned to a master, while the assignment to a master itself is
      protected by slave_active_lock.
      
      BugLink: http://lkml.kernel.org/r/CACT4Y+Y_Bm+7epAb=8Wi=AaWd+DYS7qawX52qxdCfOfY49vozQ@mail.gmail.com
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9b728394
    • Linus Walleij's avatar
      ARM: 8517/1: ICST: avoid arithmetic overflow in icst_hz() · ca54f8d1
      Linus Walleij authored
      [ Upstream commit 5070fb14 ]
      
      When trying to set the ICST 307 clock to 25174000 Hz I ran into
      this arithmetic error: the icst_hz_to_vco() correctly figure out
      DIVIDE=2, RDW=100 and VDW=99 yielding a frequency of
      25174000 Hz out of the VCO. (I replicated the icst_hz() function
      in a spreadsheet to verify this.)
      
      However, when I called icst_hz() on these VCO settings it would
      instead return 4122709 Hz. This causes an error in the common
      clock driver for ICST as the common clock framework will call
      .round_rate() on the clock which will utilize icst_hz_to_vco()
      followed by icst_hz() suggesting the erroneous frequency, and
      then the clock gets set to this.
      
      The error did not manifest in the old clock framework since
      this high frequency was only used by the CLCD, which calls
      clk_set_rate() without first calling clk_round_rate() and since
      the old clock framework would not call clk_round_rate() before
      setting the frequency, the correct values propagated into
      the VCO.
      
      After some experimenting I figured out that it was due to a simple
      arithmetic overflow: the divisor for 24Mhz reference frequency
      as reference becomes 24000000*2*(99+8)=0x132212400 and the "1"
      in bit 32 overflows and is lost.
      
      But introducing an explicit 64-by-32 bit do_div() and casting
      the divisor into (u64) we get the right frequency back, and the
      right frequency gets set.
      
      Tested on the ARM Versatile.
      
      Cc: stable@vger.kernel.org
      Cc: linux-clk@vger.kernel.org
      Cc: Pawel Moll <pawel.moll@arm.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      ca54f8d1
    • Takashi Iwai's avatar
      ALSA: timer: Fix wrong instance passed to slave callbacks · 47eef347
      Takashi Iwai authored
      [ Upstream commit 117159f0 ]
      
      In snd_timer_notify1(), the wrong timer instance was passed for slave
      ccallback function.  This leads to the access to the wrong data when
      an incompatible master is handled (e.g. the master is the sequencer
      timer and the slave is a user timer), as spotted by syzkaller fuzzer.
      
      This patch fixes that wrong assignment.
      
      BugLink: http://lkml.kernel.org/r/CACT4Y+Y_Bm+7epAb=8Wi=AaWd+DYS7qawX52qxdCfOfY49vozQ@mail.gmail.comReported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      47eef347
  2. 28 Feb, 2016 6 commits