1. 08 Nov, 2013 1 commit
  2. 07 Nov, 2013 1 commit
    • Ben Widawsky's avatar
      drm/i915/bdw: Handle forcewake for writes on gen8 · ab2aa47e
      Ben Widawsky authored
      GEN8 removes the GT FIFO which we've all come to know and love. Instead
      it offers a wider range of optimized registers which always keep a
      shadowed copy, and are fed to the GPU when it wakes.
      
      How this is implemented in hardware is still somewhat of a mystery. As
      far as I can tell, the basic design is as follows:
      
      If the register is not optimized, you must use the old forcewake
      mechanism to bring the GT out of sleep. [1]
      
      If register is in the optimized list the write will signal that the
      GT should begin to come out of whatever sleep state it is in.
      
      While the GT is coming out of sleep, the requested write will be stored
      in an intermediate shadow register.
      
      Do to the fact that the implementation details are not clear, I see
      several risks:
      1. Order is not preserved as it is with GT FIFO. If we issue multiple
      writes to optimized registers, where order matters, we may need to
      serialize it with forcewake.
      2. The optimized registers have only 1 shadowed slot, meaning if we
      issue multiple writes to the same register, and those values need to
      reach the GPU in order, forcewake will be required.
      
      [1] We could implement a SW queue the way the GT FIFO used to work if
      desired.
      
      NOTE: Compile tested only until we get real silicon.
      
      v2:
      - Use a default case to make future platforms also work.
      - Get rid of IS_BROADWELL since that's not yet defined, but we want to
        MMIO as soon as possible.
      
      v3: Apply suggestions from Mika's review:
      - s/optimized/shadowed/
      - invert the logic of the helper so that it does what it says (the
        code itself was correct, just confusing to read).
      
      v4:
      - Squash in lost break.
      
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      ab2aa47e
  3. 05 Nov, 2013 1 commit
  4. 04 Nov, 2013 1 commit
    • Daniel Vetter's avatar
      Merge tag 'v3.12' into drm-intel-next · 7f16e5c1
      Daniel Vetter authored
      I want to merge in the new Broadwell support as a late hw enabling
      pull request. But since the internal branch was based upon our
      drm-intel-nightly integration branch I need to resolve all the
      oustanding conflicts in drm/i915 with a backmerge to make the 60+
      patches apply properly.
      
      We'll propably have some fun because Linus will come up with a
      slightly different merge solution.
      
      Conflicts:
      	drivers/gpu/drm/i915/i915_dma.c
      	drivers/gpu/drm/i915/i915_drv.c
      	drivers/gpu/drm/i915/intel_crt.c
      	drivers/gpu/drm/i915/intel_ddi.c
      	drivers/gpu/drm/i915/intel_display.c
      	drivers/gpu/drm/i915/intel_dp.c
      	drivers/gpu/drm/i915/intel_drv.h
      
      All rather simple adjacent lines changed or partial backports from
      -next to -fixes, with the exception of the thaw code in i915_dma.c.
      That one needed a bit of shuffling to restore the intent.
      
      Oh and the massive header file reordering in intel_drv.h is a bit
      trouble. But not much.
      
      v2: Also don't forget the fixup for the silent conflict that results
      in compile fail ...
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      7f16e5c1
  5. 03 Nov, 2013 3 commits
  6. 02 Nov, 2013 2 commits
  7. 01 Nov, 2013 25 commits
  8. 31 Oct, 2013 6 commits
    • Linus Torvalds's avatar
      Merge branch 'akpm' (fixes from Andrew Morton) · 4f794ee8
      Linus Torvalds authored
      Merge four more fixes from Andrew Morton.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
        mm: memcg: fix test for child groups
        mm: memcg: lockdep annotation for memcg OOM lock
        mm: memcg: use proper memcg in limit bypass
      4f794ee8
    • Ming Lei's avatar
      lib/scatterlist.c: don't flush_kernel_dcache_page on slab page · 3d77b50c
      Ming Lei authored
      Commit b1adaf65 ("[SCSI] block: add sg buffer copy helper
      functions") introduces two sg buffer copy helpers, and calls
      flush_kernel_dcache_page() on pages in SG list after these pages are
      written to.
      
      Unfortunately, the commit may introduce a potential bug:
      
       - Before sending some SCSI commands, kmalloc() buffer may be passed to
         block layper, so flush_kernel_dcache_page() can see a slab page
         finally
      
       - According to cachetlb.txt, flush_kernel_dcache_page() is only called
         on "a user page", which surely can't be a slab page.
      
       - ARCH's implementation of flush_kernel_dcache_page() may use page
         mapping information to do optimization so page_mapping() will see the
         slab page, then VM_BUG_ON() is triggered.
      
      Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
      and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
      before calling flush_kernel_dcache_page().
      Signed-off-by: default avatarMing Lei <ming.lei@canonical.com>
      Reported-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Tested-by: default avatarSimon Baatz <gmbnomis@gmail.com>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <stable@vger.kernel.org>	[3.2+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3d77b50c
    • Johannes Weiner's avatar
      mm: memcg: fix test for child groups · 696ac172
      Johannes Weiner authored
      When memcg code needs to know whether any given memcg has children, it
      uses the cgroup child iteration primitives and returns true/false
      depending on whether the iteration loop is executed at least once or
      not.
      
      Because a cgroup's list of children is RCU protected, these primitives
      require the RCU read-lock to be held, which is not the case for all
      memcg callers.  This results in the following splat when e.g.  enabling
      hierarchy mode:
      
        WARNING: CPU: 3 PID: 1 at kernel/cgroup.c:3043 css_next_child+0xa3/0x160()
        CPU: 3 PID: 1 Comm: systemd Not tainted 3.12.0-rc5-00117-g83f11a9c-dirty #18
        Hardware name: LENOVO 3680B56/3680B56, BIOS 6QET69WW (1.39 ) 04/26/2012
        Call Trace:
          dump_stack+0x54/0x74
          warn_slowpath_common+0x78/0xa0
          warn_slowpath_null+0x1a/0x20
          css_next_child+0xa3/0x160
          mem_cgroup_hierarchy_write+0x5b/0xa0
          cgroup_file_write+0x108/0x2a0
          vfs_write+0xbd/0x1e0
          SyS_write+0x4c/0xa0
          system_call_fastpath+0x16/0x1b
      
      In the memcg case, we only care about children when we are attempting to
      modify inheritable attributes interactively.  Racing with deletion could
      mean a spurious -EBUSY, no problem.  Racing with addition is handled
      just fine as well through the memcg_create_mutex: if the child group is
      not on the list after the mutex is acquired, it won't be initialized
      from the parent's attributes until after the unlock.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      696ac172
    • Johannes Weiner's avatar
      mm: memcg: lockdep annotation for memcg OOM lock · 0056f4e6
      Johannes Weiner authored
      The memcg OOM lock is a mutex-type lock that is open-coded due to
      memcg's special needs.  Add annotations for lockdep coverage.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0056f4e6
    • Johannes Weiner's avatar
      mm: memcg: use proper memcg in limit bypass · 3168ecbe
      Johannes Weiner authored
      Commit 84235de3 ("fs: buffer: move allocation failure loop into the
      allocator") allowed __GFP_NOFAIL allocations to bypass the limit if they
      fail to reclaim enough memory for the charge.  But because the main test
      case was on a 3.2-based system, the patch missed the fact that on newer
      kernels the charge function needs to return root_mem_cgroup when
      bypassing the limit, and not NULL.  This will corrupt whatever memory is
      at NULL + percpu pointer offset.  Fix this quickly before problems are
      reported.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3168ecbe
    • Linus Torvalds's avatar
      vfs: decrapify dput(), fix cache behavior under normal load · 358eec18
      Linus Torvalds authored
      We do not want to dirty the dentry->d_flags cacheline in dput() just to
      set the DCACHE_REFERENCED flag when it is already set in the common case
      anyway.  This way the first cacheline of the dentry (which contains the
      RCU lookup information etc) can stay shared among multiple CPU's.
      
      This finishes off some of the details of all the scalability patches
      merged during the merge window.
      
      Also don't mark dentry_kill() for inlining, since it's the uncommon path
      and inlining it just makes the common path slower due to extra function
      entry/exit overhead.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      358eec18