  30 Jan, 2015 4 commits
    • Linux 3.14.31 · 016ea480
      Greg Kroah-Hartman authored
    • md/raid5: fetch_block must fetch all the blocks handle_stripe_dirtying wants. · 50606f9a
      NeilBrown authored
      commit 108cef3a upstream.
      
      It is critical that fetch_block() and handle_stripe_dirtying()
      are consistent in their analysis of what needs to be loaded.
      Otherwise raid5 can wait forever for a block that won't be loaded.
      
      Currently when writing to a RAID5 that is resyncing, to a location
      beyond the resync offset, handle_stripe_dirtying chooses a
      reconstruct-write cycle, but fetch_block() assumes a
      read-modify-write, and a lockup can happen.
      
      So treat that case just like RAID6, just as we do in
      handle_stripe_dirtying.  RAID6 always does reconstruct-write.
      
      This bug was introduced when the behaviour of handle_stripe_dirtying
      was changed in 3.7, so the patch is suitable for any kernel since,
      though it will need careful merging for some versions.
      
      Cc: stable@vger.kernel.org (v3.7+)
      Fixes: a7854487
      Reported-by: Henry Cai <henryplusplus@gmail.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
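
      The consistency requirement described in this entry can be pictured
      with a tiny standalone C sketch.  This is not the raid5 code; every
      name below (wants_reconstruct_write, dirtying_choice, fetch_choice,
      resync_offset) is hypothetical and only illustrates why the write
      path and the fetch path must derive their decision from the same
      predicate.

      /*
       * Illustrative sketch only, not drivers/md/raid5.c: if the side
       * that decides how to write and the side that decides what to
       * read use different predicates, the fetch side can wait forever
       * for blocks the dirtying side never asked to load.
       */
      #include <stdbool.h>
      #include <stdio.h>

      /* Hypothetical rule: beyond the resync offset, force
       * reconstruct-write (the RAID6-style behaviour). */
      static bool wants_reconstruct_write(unsigned long long sector,
                                          unsigned long long resync_offset)
      {
              return sector >= resync_offset;
      }

      /* Both decision points call the same helper, so they cannot
       * disagree the way fetch_block and handle_stripe_dirtying did. */
      static const char *dirtying_choice(unsigned long long sector,
                                         unsigned long long resync_offset)
      {
              return wants_reconstruct_write(sector, resync_offset)
                      ? "reconstruct-write" : "read-modify-write";
      }

      static const char *fetch_choice(unsigned long long sector,
                                      unsigned long long resync_offset)
      {
              return wants_reconstruct_write(sector, resync_offset)
                      ? "fetch all blocks for reconstruct-write"
                      : "fetch old data and parity for read-modify-write";
      }

      int main(void)
      {
              unsigned long long resync_offset = 1024;

              printf("%s / %s\n",
                     dirtying_choice(2048, resync_offset),
                     fetch_choice(2048, resync_offset));
              return 0;
      }
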
    • mm: get rid of radix tree gfp mask for pagecache_get_page · 525f2d01
      Michal Hocko authored
      commit 45f87de5 upstream.
      
      Commit 2457aec6 ("mm: non-atomically mark page accessed during page
      cache allocation where possible") has added a separate parameter for
      specifying gfp mask for radix tree allocations.
      
      Not only is this less than optimal from an API point of view
      because it is error prone, it is also currently buggy:
      grab_cache_page_write_begin uses GFP_KERNEL for the radix tree, and
      if fgp_flags doesn't contain FGP_NOFS (mostly controlled by the fs
      via the AOP_FLAG_NOFS flag) but the mapping_gfp_mask has __GFP_FS
      cleared, then the radix tree allocation won't obey the restriction
      and might recurse into the filesystem and cause deadlocks.  This is
      unfortunately the case for most filesystems, because only ext4 and
      gfs2 use AOP_FLAG_NOFS.
      
      Let's simply remove the radix_gfp_mask parameter because the
      allocation context is the same for both the page cache and the
      radix tree.  Just make
      sure that the radix tree gets only the sane subset of the mask (e.g.  do
      not pass __GFP_WRITE).
      
      Longer term it would be preferable to convert the remaining users
      of AOP_FLAG_NOFS to mapping_gfp_mask and simplify this interface
      even further.
      Reported-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
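      As a rough illustration of the change described in this entry,
      deriving the radix-tree mask from the single page-cache mask
      instead of taking a second parameter, here is a minimal standalone
      C sketch.  The flag values and the helper radix_gfp_from are
      made up for illustration; they are not the kernel's gfp API.

      /*
       * Minimal sketch of the idea, with made-up flag values; this is
       * not the kernel's gfp interface.  The point: the radix-tree
       * allocation mask is derived from the one page-cache mask, so a
       * cleared "may enter the filesystem" bit is always honoured.
       */
      #include <stdio.h>

      typedef unsigned int gfp_t;

      #define FAKE_GFP_FS     0x01u  /* may recurse into the filesystem */
      #define FAKE_GFP_WRITE  0x02u  /* page-cache-only hint */
      #define FAKE_GFP_KERNEL FAKE_GFP_FS

      /* Strip hints that only make sense for page-cache pages, keep
       * restrictions such as a cleared FAKE_GFP_FS bit. */
      static gfp_t radix_gfp_from(gfp_t page_gfp)
      {
              return page_gfp & ~FAKE_GFP_WRITE;
      }

      int main(void)
      {
              gfp_t mapping_gfp = FAKE_GFP_WRITE;  /* FS bit cleared */

              /* Old scheme: hardcode FAKE_GFP_KERNEL for the tree,
               * which re-enables filesystem recursion.  New scheme:
               * derive the mask, so the restriction is preserved. */
              printf("old: %#x (fs allowed), new: %#x (fs not allowed)\n",
                     FAKE_GFP_KERNEL, radix_gfp_from(mapping_gfp));
              return 0;
      }
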
    • mm: page_alloc: reduce cost of the fair zone allocation policy · a0d134d3
      Mel Gorman authored
      commit 4ffeaf35 upstream.
      
      The fair zone allocation policy round-robins allocations between zones
      within a node to avoid age inversion problems during reclaim.  If the
      first allocation fails, the batch counts are reset and a second attempt
      made before entering the slow path.
      
      One assumption made with this scheme is that batches expire at roughly
      the same time and the resets each time are justified.  This assumption
      does not hold when zones reach their low watermark as the batches will
      be consumed at uneven rates.  Allocation failures due to watermark
      depletion result in additional zonelist scans for the reset and another
      watermark check before hitting the slowpath.
      
      On UMA, the benefit is negligible -- around 0.25%.  On a 4-socket
      NUMA machine it's variable due to the variability of measuring
      overhead with
      the vmstat changes.  The system CPU overhead comparison looks like
      
                3.16.0-rc3  3.16.0-rc3  3.16.0-rc3
                   vanilla   vmstat-v5 lowercost-v5
      User          746.94      774.56      802.00
      System      65336.22    32847.27    40852.33
      Elapsed     27553.52    27415.04    27368.46
      
      However it is worth noting that the overall benchmark still completed
      faster and intuitively it makes sense to take as few passes as possible
      through the zonelists.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
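
      To make the batch/reset behaviour discussed in this entry concrete,
      here is a toy, self-contained C model.  It is not the page
      allocator itself; struct zone, alloc_page_fair and reset_batches
      below are invented for illustration.  Each zone gets a batch of
      pages, allocations consume the batches in zonelist order, and when
      every batch is spent they are all reset in a single pass before
      one retry, rather than rescanning after each individual failure.

      /*
       * Toy model only; none of these names are the kernel's.  It shows
       * the batch bookkeeping: consume per-zone batches, and when every
       * batch is exhausted do one reset pass and a single retry instead
       * of repeated zonelist scans.
       */
      #include <stdio.h>

      #define NR_ZONES 3

      struct zone {
              int batch;      /* pages left in this zone's fair batch */
              int allocated;  /* pages handed out from this zone */
      };

      static struct zone zones[NR_ZONES];

      static void reset_batches(int batch)
      {
              for (int i = 0; i < NR_ZONES; i++)
                      zones[i].batch = batch;
      }

      /* Walk the zonelist once; return the zone used, or -1 if every
       * batch is already spent. */
      static int alloc_page_fair(void)
      {
              for (int i = 0; i < NR_ZONES; i++) {
                      if (zones[i].batch > 0) {
                              zones[i].batch--;
                              zones[i].allocated++;
                              return i;
                      }
              }
              return -1;
      }

      int main(void)
      {
              reset_batches(2);
              for (int n = 0; n < 8; n++) {
                      int z = alloc_page_fair();
                      if (z < 0) {
                              reset_batches(2);  /* one reset, then retry */
                              z = alloc_page_fair();
                      }
                      printf("page %d from zone %d\n", n, z);
              }
              for (int i = 0; i < NR_ZONES; i++)
                      printf("zone %d served %d pages\n",
                             i, zones[i].allocated);
              return 0;
      }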