1. 24 Jun, 2016 4 commits
  2. 08 Jun, 2016 36 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.13 · ba760d43
      Greg Kroah-Hartman authored
      ba760d43
    • Dave Chinner's avatar
      xfs: handle dquot buffer readahead in log recovery correctly · 55f6ddfc
      Dave Chinner authored
      commit 7d6a13f0 upstream.
      
      When we do dquot readahead in log recovery, we do not use a verifier
      as the underlying buffer may not have dquots in it. e.g. the
      allocation operation hasn't yet been replayed. Hence we do not want
      to fail recovery because we detect an operation to be replayed has
      not been run yet. This problem was addressed for inodes in commit
      d8914002 ("xfs: inode buffers may not be valid during recovery
      readahead") but the problem was not recognised to exist for dquots
      and their buffers as the dquot readahead did not have a verifier.
      
      The result of not using a verifier is that when the buffer is then
      next read to replay a dquot modification, the dquot buffer verifier
      will only be attached to the buffer if *readahead is not complete*.
      Hence we can read the buffer, replay the dquot changes and then add
      it to the delwri submission list without it having a verifier
      attached to it. This then generates warnings in xfs_buf_ioapply(),
      which catches and warns about this case.
      
      Fix this and make it handle the same readahead verifier error cases
      as for inode buffers by adding a new readahead verifier that has a
      write operation as well as a read operation that marks the buffer as
      not done if any corruption is detected.  Also make sure we don't run
      readahead if the dquot buffer has been marked as cancelled by
      recovery.
      
      This will result in readahead either succeeding and the buffer
      having a valid write verifier, or readahead failing and the buffer
      state requiring the subsequent read to resubmit the IO with the new
      verifier.  In either case, this will result in the buffer always
      ending up with a valid write verifier on it.
      
      Note: we also need to fix the inode buffer readahead error handling
      to mark the buffer with EIO. Brian noticed the code I copied from
      there wrong during review, so fix it at the same time. Add comments
      linking the two functions that handle readahead verifier errors
      together so we don't forget this behavioural link in future.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55f6ddfc
    • Eric Sandeen's avatar
      xfs: print name of verifier if it fails · 063b0dc8
      Eric Sandeen authored
      commit 233135b7 upstream.
      
      This adds a name to each buf_ops structure, so that if
      a verifier fails we can print the type of verifier that
      failed it.  Should be a slight debugging aid, I hope.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Cc: Holger Hoffstätte <holger@applied-asynchrony.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      063b0dc8
    • Dave Chinner's avatar
      xfs: skip stale inodes in xfs_iflush_cluster · 21cfd6cc
      Dave Chinner authored
      commit 7d3aa7fe upstream.
      
      We don't write back stale inodes so we should skip them in
      xfs_iflush_cluster, too.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21cfd6cc
    • Dave Chinner's avatar
      xfs: fix inode validity check in xfs_iflush_cluster · baa7a74d
      Dave Chinner authored
      commit 51b07f30 upstream.
      
      Some careless idiot(*) wrote crap code in commit 1a3e8f3d ("xfs:
      convert inode cache lookups to use RCU locking") back in late 2010,
      and so xfs_iflush_cluster checks the wrong inode for whether it is
      still valid under RCU protection. Fix it to lock and check the
      correct inode.
      
      (*) Careless-idiot: Dave Chinner <dchinner@redhat.com>
      Discovered-by: default avatarBrain Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      baa7a74d
    • Dave Chinner's avatar
      xfs: xfs_iflush_cluster fails to abort on error · 7dc8f21b
      Dave Chinner authored
      commit b1438f47 upstream.
      
      When a failure due to an inode buffer occurs, the error handling
      fails to abort the inode writeback correctly. This can result in the
      inode being reclaimed whilst still in the AIL, leading to
      use-after-free situations as well as filesystems that cannot be
      unmounted as the inode log items left in the AIL never get removed.
      
      Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
      the inode flush being aborted correctly.
      Reported-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Diagnosed-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Tested-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7dc8f21b
    • Dave Chinner's avatar
      xfs: Don't wrap growfs AGFL indexes · d7d92ca7
      Dave Chinner authored
      commit ad747e3b upstream.
      
      Commit 96f859d5 ("libxfs: pack the agfl header structure so
      XFS_AGFL_SIZE is correct") allowed the freelist to use the empty
      slot at the end of the freelist on 64 bit systems that was not
      being used due to sizeof() rounding up the structure size.
      
      This has caused versions of xfs_repair prior to 4.5.0 (which also
      has the fix) to report this as a corruption once the filesystem has
      been grown. Older kernels can also have problems (seen from a whacky
      container/vm management environment) mounting filesystems grown on a
      system with a newer kernel than the vm/container it is deployed on.
      
      To avoid this problem, change the initial free list indexes not to
      wrap across the end of the AGFL, hence avoiding the initialisation
      of agf_fllast to the last index in the AGFL.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarCarlos Maiolino <cmaiolino@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7d92ca7
    • Eric Sandeen's avatar
      xfs: disallow rw remount on fs with unknown ro-compat features · ec86bfec
      Eric Sandeen authored
      commit d0a58e83 upstream.
      
      Today, a kernel which refuses to mount a filesystem read-write
      due to unknown ro-compat features can still transition to read-write
      via the remount path.  The old kernel is most likely none the wiser,
      because it's unaware of the new feature, and isn't using it.  However,
      writing to the filesystem may well corrupt metadata related to that
      new feature, and moving to a newer kernel which understand the feature
      will have problems.
      
      Right now the only ro-compat feature we have is the free inode btree,
      which showed up in v3.16.  It would be good to push this back to
      all the active stable kernels, I think, so that if anyone is using
      newer mkfs (which enables the finobt feature) with older kernel
      releases, they'll be protected.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reviewed-by: default avatarBill O'Donnell <billodo@redhat.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ec86bfec
    • Arnd Bergmann's avatar
      gcov: disable tree-loop-im to reduce stack usage · 8edc7f04
      Arnd Bergmann authored
      commit c87bf431 upstream.
      
      Enabling CONFIG_GCOV_PROFILE_ALL produces us a lot of warnings like
      
      lib/lz4/lz4hc_compress.c: In function 'lz4_compresshcctx':
      lib/lz4/lz4hc_compress.c:514:1: warning: the frame size of 1504 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      After some investigation, I found that this behavior started with gcc-4.9,
      and opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702.
      A suggested workaround for it is to use the -fno-tree-loop-im
      flag that turns off one of the optimization stages in gcc, so the
      code runs a little slower but does not use excessive amounts
      of stack.
      
      We could make this conditional on the gcc version, but I could not
      find an easy way to do this in Kbuild and the benefit would be
      fairly small, given that most of the gcc version in production are
      affected now.
      
      I'm marking this for 'stable' backports because it addresses a bug
      with code generation in gcc that exists in all kernel versions
      with the affected gcc releases.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarPeter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8edc7f04
    • Srinivas Pandruvada's avatar
      scripts/package/Makefile: rpmbuild add support of RPMOPTS · 4b2fb176
      Srinivas Pandruvada authored
      commit 65a9f31c upstream.
      
      After commit 21a59991 ("scripts/package/Makefile: rpmbuild is needed
      for rpm targets"), it is no longer possible to specify RPMOPTS.
      For example, we can no longer able to control _topdir using the following
      make command.
      make RPMOPTS="--define '_topdir /home/xyz/workspace/'" binrpm-pkg
      
      Fixes: 21a59991 ("scripts/package/Makefile: rpmbuild is needed for rpm targets")
      Signed-off-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b2fb176
    • Ville Syrjälä's avatar
      dma-debug: avoid spinlock recursion when disabling dma-debug · 7d0b4945
      Ville Syrjälä authored
      commit 3017cd63 upstream.
      
      With netconsole (at least) the pr_err("...  disablingn") call can
      recurse back into the dma-debug code, where it'll try to grab
      free_entries_lock again.  Avoid the problem by doing the printk after
      dropping the lock.
      
      Link: http://lkml.kernel.org/r/1463678421-18683-1-git-send-email-ville.syrjala@linux.intel.comSigned-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d0b4945
    • Rafael J. Wysocki's avatar
      PM / sleep: Handle failures in device_suspend_late() consistently · 98c28450
      Rafael J. Wysocki authored
      commit 3a17fb32 upstream.
      
      Grygorii Strashko reports:
      
       The PM runtime will be left disabled for the device if its
       .suspend_late() callback fails and async suspend is not allowed
       for this device. In this case device will not be added in
       dpm_late_early_list and dpm_resume_early() will ignore this
       device, as result PM runtime will be disabled for it forever
       (side effect: after 8 subsequent failures for the same device
       the PM runtime will be reenabled due to disable_depth overflow).
      
      To fix this problem, add devices to dpm_late_early_list regardless
      of whether or not device_suspend_late() returns errors for them.
      
      That will ensure failures in there to be handled consistently for
      all devices regardless of their async suspend/resume status.
      Reported-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Tested-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98c28450
    • Nicolai Stange's avatar
      ext4: silence UBSAN in ext4_mb_init() · 8b8de1c9
      Nicolai Stange authored
      commit 935244cd upstream.
      
      Currently, in ext4_mb_init(), there's a loop like the following:
      
        do {
          ...
          offset += 1 << (sb->s_blocksize_bits - i);
          i++;
        } while (i <= sb->s_blocksize_bits + 1);
      
      Note that the updated offset is used in the loop's next iteration only.
      
      However, at the last iteration, that is at i == sb->s_blocksize_bits + 1,
      the shift count becomes equal to (unsigned)-1 > 31 (c.f. C99 6.5.7(3))
      and UBSAN reports
      
        UBSAN: Undefined behaviour in fs/ext4/mballoc.c:2621:15
        shift exponent 4294967295 is too large for 32-bit type 'int'
        [...]
        Call Trace:
         [<ffffffff818c4d25>] dump_stack+0xbc/0x117
         [<ffffffff818c4c69>] ? _atomic_dec_and_lock+0x169/0x169
         [<ffffffff819411ab>] ubsan_epilogue+0xd/0x4e
         [<ffffffff81941cac>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
         [<ffffffff81941ab1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
         [<ffffffff814b6dc1>] ? kmem_cache_alloc+0x101/0x390
         [<ffffffff816fc13b>] ? ext4_mb_init+0x13b/0xfd0
         [<ffffffff814293c7>] ? create_cache+0x57/0x1f0
         [<ffffffff8142948a>] ? create_cache+0x11a/0x1f0
         [<ffffffff821c2168>] ? mutex_lock+0x38/0x60
         [<ffffffff821c23ab>] ? mutex_unlock+0x1b/0x50
         [<ffffffff814c26ab>] ? put_online_mems+0x5b/0xc0
         [<ffffffff81429677>] ? kmem_cache_create+0x117/0x2c0
         [<ffffffff816fcc49>] ext4_mb_init+0xc49/0xfd0
         [...]
      
      Observe that the mentioned shift exponent, 4294967295, equals (unsigned)-1.
      
      Unless compilers start to do some fancy transformations (which at least
      GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
      such calculated value of offset is never used again.
      
      Silence UBSAN by introducing another variable, offset_incr, holding the
      next increment to apply to offset and adjust that one by right shifting it
      by one position per loop iteration.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b8de1c9
    • Nicolai Stange's avatar
      ext4: address UBSAN warning in mb_find_order_for_block() · 12aa7d95
      Nicolai Stange authored
      commit b5cb316c upstream.
      
      Currently, in mb_find_order_for_block(), there's a loop like the following:
      
        while (order <= e4b->bd_blkbits + 1) {
          ...
          bb += 1 << (e4b->bd_blkbits - order);
        }
      
      Note that the updated bb is used in the loop's next iteration only.
      
      However, at the last iteration, that is at order == e4b->bd_blkbits + 1,
      the shift count becomes negative (c.f. C99 6.5.7(3)) and UBSAN reports
      
        UBSAN: Undefined behaviour in fs/ext4/mballoc.c:1281:11
        shift exponent -1 is negative
        [...]
        Call Trace:
         [<ffffffff818c4d35>] dump_stack+0xbc/0x117
         [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
         [<ffffffff819411bb>] ubsan_epilogue+0xd/0x4e
         [<ffffffff81941cbc>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
         [<ffffffff81941ac1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
         [<ffffffff816e93a0>] ? ext4_mb_generate_from_pa+0x590/0x590
         [<ffffffff816502c8>] ? ext4_read_block_bitmap_nowait+0x598/0xe80
         [<ffffffff816e7b7e>] mb_find_order_for_block+0x1ce/0x240
         [...]
      
      Unless compilers start to do some fancy transformations (which at least
      GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
      such calculated value of bb is never used again.
      
      Silence UBSAN by introducing another variable, bb_incr, holding the next
      increment to apply to bb and adjust that one by right shifting it by one
      position per loop iteration.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12aa7d95
    • Jan Kara's avatar
      ext4: fix oops on corrupted filesystem · b2601bb0
      Jan Kara authored
      commit 74177f55 upstream.
      
      When filesystem is corrupted in the right way, it can happen
      ext4_mark_iloc_dirty() in ext4_orphan_add() returns error and we
      subsequently remove inode from the in-memory orphan list. However this
      deletion is done with list_del(&EXT4_I(inode)->i_orphan) and thus we
      leave i_orphan list_head with a stale content. Later we can look at this
      content causing list corruption, oops, or other issues. The reported
      trace looked like:
      
      WARNING: CPU: 0 PID: 46 at lib/list_debug.c:53 __list_del_entry+0x6b/0x100()
      list_del corruption, 0000000061c1d6e0->next is LIST_POISON1
      0000000000100100)
      CPU: 0 PID: 46 Comm: ext4.exe Not tainted 4.1.0-rc4+ #250
      Stack:
       60462947 62219960 602ede24 62219960
       602ede24 603ca293 622198f0 602f02eb
       62219950 6002c12c 62219900 601b4d6b
      Call Trace:
       [<6005769c>] ? vprintk_emit+0x2dc/0x5c0
       [<602ede24>] ? printk+0x0/0x94
       [<600190bc>] show_stack+0xdc/0x1a0
       [<602ede24>] ? printk+0x0/0x94
       [<602ede24>] ? printk+0x0/0x94
       [<602f02eb>] dump_stack+0x2a/0x2c
       [<6002c12c>] warn_slowpath_common+0x9c/0xf0
       [<601b4d6b>] ? __list_del_entry+0x6b/0x100
       [<6002c254>] warn_slowpath_fmt+0x94/0xa0
       [<602f4d09>] ? __mutex_lock_slowpath+0x239/0x3a0
       [<6002c1c0>] ? warn_slowpath_fmt+0x0/0xa0
       [<60023ebf>] ? set_signals+0x3f/0x50
       [<600a205a>] ? kmem_cache_free+0x10a/0x180
       [<602f4e88>] ? mutex_lock+0x18/0x30
       [<601b4d6b>] __list_del_entry+0x6b/0x100
       [<601177ec>] ext4_orphan_del+0x22c/0x2f0
       [<6012f27c>] ? __ext4_journal_start_sb+0x2c/0xa0
       [<6010b973>] ? ext4_truncate+0x383/0x390
       [<6010bc8b>] ext4_write_begin+0x30b/0x4b0
       [<6001bb50>] ? copy_from_user+0x0/0xb0
       [<601aa840>] ? iov_iter_fault_in_readable+0xa0/0xc0
       [<60072c4f>] generic_perform_write+0xaf/0x1e0
       [<600c4166>] ? file_update_time+0x46/0x110
       [<60072f0f>] __generic_file_write_iter+0x18f/0x1b0
       [<6010030f>] ext4_file_write_iter+0x15f/0x470
       [<60094e10>] ? unlink_file_vma+0x0/0x70
       [<6009b180>] ? unlink_anon_vmas+0x0/0x260
       [<6008f169>] ? free_pgtables+0xb9/0x100
       [<600a6030>] __vfs_write+0xb0/0x130
       [<600a61d5>] vfs_write+0xa5/0x170
       [<600a63d6>] SyS_write+0x56/0xe0
       [<6029fcb0>] ? __libc_waitpid+0x0/0xa0
       [<6001b698>] handle_syscall+0x68/0x90
       [<6002633d>] userspace+0x4fd/0x600
       [<6002274f>] ? save_registers+0x1f/0x40
       [<60028bd7>] ? arch_prctl+0x177/0x1b0
       [<60017bd5>] fork_handler+0x85/0x90
      
      Fix the problem by using list_del_init() as we always should with
      i_orphan list.
      Reported-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2601bb0
    • Theodore Ts'o's avatar
      ext4: clean up error handling when orphan list is corrupted · b2044c3f
      Theodore Ts'o authored
      commit 7827a7f6 upstream.
      
      Instead of just printing warning messages, if the orphan list is
      corrupted, declare the file system is corrupted.  If there are any
      reserved inodes in the orphaned inode list, declare the file system
      corrupted and stop right away to avoid doing more potential damage to
      the file system.
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2044c3f
    • Theodore Ts'o's avatar
      ext4: fix hang when processing corrupted orphaned inode list · c5ce3898
      Theodore Ts'o authored
      commit c9eb13a9 upstream.
      
      If the orphaned inode list contains inode #5, ext4_iget() returns a
      bad inode (since the bootloader inode should never be referenced
      directly).  Because of the bad inode, we end up processing the inode
      repeatedly and this hangs the machine.
      
      This can be reproduced via:
      
         mke2fs -t ext4 /tmp/foo.img 100
         debugfs -w -R "ssv last_orphan 5" /tmp/foo.img
         mount -o loop /tmp/foo.img /mnt
      
      (But don't do this if you are using an unpatched kernel if you care
      about the system staying functional.  :-)
      
      This bug was found by the port of American Fuzzy Lop into the kernel
      to find file system problems[1].  (Since it *only* happens if inode #5
      shows up on the orphan list --- 3, 7, 8, etc. won't do it, it's not
      surprising that AFL needed two hours before it found it.)
      
      [1] http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf
      
      Reported by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5ce3898
    • Philipp Zabel's avatar
      drm/imx: Match imx-ipuv3-crtc components using device node in platform data · 137bd124
      Philipp Zabel authored
      commit 310944d1 upstream.
      
      The component master driver imx-drm-core matches component devices using
      their of_node. Since commit 950b410dd1ab ("gpu: ipu-v3: Fix imx-ipuv3-crtc
      module autoloading"), the imx-ipuv3-crtc dev->of_node is not set during
      probing. Before that, of_node was set and caused an of: modalias to be
      used instead of the platform: modalias, which broke module autoloading.
      
      On the other hand, if dev->of_node is not set yet when the imx-ipuv3-crtc
      probe function calls component_add, component matching in imx-drm-core
      fails. While dev->of_node will be set once the next component tries to
      bring up the component master, imx-drm-core component binding will never
      succeed if one of the crtc devices is probed last.
      
      Add of_node to the component platform data and match against the
      pdata->of_node instead of dev->of_node in imx-drm-core to work around
      this problem.
      
      Fixes: 950b410dd1ab ("gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading")
      Signed-off-by: default avatarPhilipp Zabel <p.zabel@pengutronix.de>
      Tested-by: default avatarFabio Estevam <fabio.estevam@nxp.com>
      Tested-by: default avatarLothar Waßmann <LW@KARO-electronics.de>
      Tested-by: default avatarHeiko Schocher <hs@denx.de>
      Tested-by: default avatarChris Ruehl <chris.ruehl@gtsys.com.hk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      137bd124
    • Ville Syrjälä's avatar
      drm/i915: Don't leave old junk in ilk active watermarks on readout · d7d5e9be
      Ville Syrjälä authored
      commit 7045c368 upstream.
      
      When we read out the watermark state from the hardware we're supposed to
      transfer that into the active watermarks, but currently we fail to any
      part of the active watermarks that isn't explicitly written. Let's clear
      it all upfront.
      
      Looks like this has been like this since the beginning, when I added the
      readout. No idea why I didn't clear it up.
      
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Fixes: 243e6a44 ("drm/i915: Init HSW watermark tracking in intel_modeset_setup_hw_state()")
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: default avatarMatt Roper <matthew.d.roper@intel.com>
      Signed-off-by: default avatarMatt Roper <matthew.d.roper@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1463151318-14719-2-git-send-email-ville.syrjala@linux.intel.com
      (cherry picked from commit 15606534)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7d5e9be
    • Lyude's avatar
      drm/atomic: Verify connector->funcs != NULL when clearing states · 8453324b
      Lyude authored
      Unfortunately since we don't have Dave's connector refcounting patch
      here yet, it's very possible that drm_atomic_state_default_clear() could
      get called by intel_display_resume() when
      intel_dp_mst_destroy_connector() isn't completely finished destroying an
      mst connector, but has already finished setting connector->funcs to
      NULL. As such, we need to treat the connector like it's already been
      destroyed and just skip it, otherwise we'll end up dereferencing a NULL
      pointer.
      
      This fix is only required for 4.6 and below. David Airlie's patchseries
      for 4.7 to add connector reference counting provides a more proper fix
      for this.
      
      Changes since v1:
       - Fix leftover whitespace
      
      Upstream fix: 0552f765 ("drm/i915/mst: use reference counted
      connectors. (v3)")
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarLyude <cpaul@redhat.com>
      8453324b
    • Lyude's avatar
      drm/fb_helper: Fix references to dev->mode_config.num_connector · c5b424e7
      Lyude authored
      commit 255f0e7c upstream.
      
      During boot, MST hotplugs are generally expected (even if no physical
      hotplugging occurs) and result in DRM's connector topology changing.
      This means that using num_connector from the current mode configuration
      can lead to the number of connectors changing under us. This can lead to
      some nasty scenarios in fbcon:
      
      - We allocate an array to the size of dev->mode_config.num_connectors.
      - MST hotplug occurs, dev->mode_config.num_connectors gets incremented.
      - We try to loop through each element in the array using the new value
        of dev->mode_config.num_connectors, and end up going out of bounds
        since dev->mode_config.num_connectors is now larger then the array we
        allocated.
      
      fb_helper->connector_count however, will always remain consistent while
      we do a modeset in fb_helper.
      
      Note: This is just polish for 4.7, Dave Airlie's drm_connector
      refcounting fixed these bugs for real. But it's good enough duct-tape
      for stable kernel backporting, since backporting the refcounting
      changes is way too invasive.
      Signed-off-by: default avatarLyude <cpaul@redhat.com>
      [danvet: Clarify why we need this. Also remove the now unused "dev"
      local variable to appease gcc.]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-3-git-send-email-cpaul@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5b424e7
    • Lyude's avatar
      drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() · c0217001
      Lyude authored
      commit 14a3842a upstream.
      
      During boot time, MST devices usually send a ton of hotplug events
      irregardless of whether or not any physical hotplugs actually occurred.
      Hotplugs mean connectors being created/destroyed, and the number of DRM
      connectors changing under us. This isn't a problem if we use
      fb_helper->connector_count since we only set it once in the code,
      however if we use num_connector from struct drm_mode_config we risk it's
      value changing under us. On top of that, there's even a chance that
      dev->mode_config.num_connector != fb_helper->connector_count. If the
      number of connectors happens to increase under us, we'll end up using
      the wrong array size for memcpy and start writing beyond the actual
      length of the array, occasionally resulting in kernel panics.
      
      Note: This is just polish for 4.7, Dave Airlie's drm_connector
      refcounting fixed these bugs for real. But it's good enough duct-tape
      for stable kernel backporting, since backporting the refcounting
      changes is way too invasive.
      Signed-off-by: default avatarLyude <cpaul@redhat.com>
      [danvet: Clarify why we need this.]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-2-git-send-email-cpaul@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c0217001
    • Mario Kleiner's avatar
      drm/amdgpu: Fix hdmi deep color support. · 4630a1d7
      Mario Kleiner authored
      commit 9d746ab6 upstream.
      
      When porting the hdmi deep color detection code from
      radeon-kms to amdgpu-kms apparently some kind of
      copy and paste error happened, attaching an else
      branch to the wrong if statement.
      
      The result is that hdmi deep color mode is always
      disabled, regardless of gpu and display capabilities and
      user wishes, as the code mistakenly thinks that the display
      doesn't provide the required max_tmds_clock limit and falls
      back to 8 bpc.
      
      This patch fixes deep color support, as tested on a
      R9 380 Tonga Pro + suitable display, and should be
      backported to all kernels with amdgpu-kms support.
      Signed-off-by: default avatarMario Kleiner <mario.kleiner.de@gmail.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4630a1d7
    • Alex Deucher's avatar
      drm/amdgpu: use drm_mode_vrefresh() rather than mode->vrefresh · bf9be904
      Alex Deucher authored
      commit 6b8812eb upstream.
      
      This is a port of radeon commit:
      3d2d98ee
      drm/radeon: use drm_mode_vrefresh() rather than mode->vrefresh
      to amdgpu.
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bf9be904
    • Sinclair Yeh's avatar
      drm/vmwgfx: Fix order of operation · 55d851a9
      Sinclair Yeh authored
      commit 7851496a upstream.
      
      mode->hdisplay * (var->bits_per_pixel + 7) gets evaluated before
      the division, potentially making the pitch larger than it should
      be.
      
      Since the original intention is to do a div-round-up, just use
      the macro instead.
      Signed-off-by: default avatarSinclair Yeh <syeh@vmware.com>
      Reviewed-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55d851a9
    • Charmaine Lee's avatar
      drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands. · c1708334
      Charmaine Lee authored
      commit e02e5884 upstream.
      
      Instead of calling vmw_cmd_ok, call vmw_cmd_dx_cid_check to
      validate the context id for query commands.
      Signed-off-by: default avatarCharmaine Lee <charmainel@vmware.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1708334
    • Charmaine Lee's avatar
      drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION · 267706b9
      Charmaine Lee authored
      commit 1883598d upstream.
      
      Fixes piglit tests nv_conditional_render-* crashes.
      Signed-off-by: default avatarCharmaine Lee <charmainel@vmware.com>
      Reviewed-by: default avatarBrian Paul <brianp@vmware.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      267706b9
    • Itai Handler's avatar
      drm/gma500: Fix possible out of bounds read · 50dd02e7
      Itai Handler authored
      commit 7ccca1d5 upstream.
      
      Fix possible out of bounds read, by adding missing comma.
      The code may read pass the end of the dsi_errors array
      when the most significant bit (bit #31) in the intr_stat register
      is set.
      This bug has been detected using CppCheck (static analysis tool).
      Signed-off-by: default avatarItai Handler <itai_handler@hotmail.com>
      Signed-off-by: default avatarPatrik Jakobsson <patrik.r.jakobsson@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50dd02e7
    • Tomáš Trnka's avatar
      sunrpc: fix stripping of padded MIC tokens · 6c1e441c
      Tomáš Trnka authored
      commit c0cb8bf3 upstream.
      
      The length of the GSS MIC token need not be a multiple of four bytes.
      It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data()
      would previously only trim mic.len + 4 B. The remaining up to three
      bytes would then trigger a check in nfs4svc_decode_compoundargs(),
      leading to a "garbage args" error and mount failure:
      
      nfs4svc_decode_compoundargs: compound not properly padded!
      nfsd: failed to decode arguments!
      
      This would prevent older clients using the pre-RFC 4121 MIC format
      (37-byte MIC including a 9-byte OID) from mounting exports from v3.9+
      servers using krb5i.
      
      The trimming was introduced by commit 4c190e2f ("sunrpc: trim off
      trailing checksum before returning decrypted or integrity authenticated
      buffer").
      
      Fixes: 4c190e2f "unrpc: trim off trailing checksum..."
      Signed-off-by: default avatarTomáš Trnka <ttrnka@mail.muni.cz>
      Acked-by: default avatarJeff Layton <jlayton@poochiereds.net>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6c1e441c
    • Juergen Gross's avatar
      xen: use same main loop for counting and remapping pages · aa1cc4d4
      Juergen Gross authored
      commit dd14be92 upstream.
      
      Instead of having two functions for cycling through the E820 map in
      order to count to be remapped pages and remap them later, just use one
      function with a caller supplied sub-function called for each region to
      be processed. This eliminates the possibility of a mismatch between
      both loops which showed up in certain configurations.
      Suggested-by: default avatarEd Swierk <eswierk@skyportsystems.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa1cc4d4
    • Ross Lagerwall's avatar
      xen/events: Don't move disabled irqs · 6232876e
      Ross Lagerwall authored
      commit f0f39387 upstream.
      
      Commit ff1e22e7 ("xen/events: Mask a moving irq") open-coded
      irq_move_irq() but left out checking if the IRQ is disabled. This broke
      resuming from suspend since it tries to move a (disabled) irq without
      holding the IRQ's desc->lock. Fix it by adding in a check for disabled
      IRQs.
      
      The resulting stacktrace was:
      kernel BUG at /build/linux-UbQGH5/linux-4.4.0/kernel/irq/migration.c:31!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: xenfs xen_privcmd ...
      CPU: 0 PID: 9 Comm: migration/0 Not tainted 4.4.0-22-generic #39-Ubuntu
      Hardware name: Xen HVM domU, BIOS 4.6.1-xs125180 05/04/2016
      task: ffff88003d75ee00 ti: ffff88003d7bc000 task.ti: ffff88003d7bc000
      RIP: 0010:[<ffffffff810e26e2>]  [<ffffffff810e26e2>] irq_move_masked_irq+0xd2/0xe0
      RSP: 0018:ffff88003d7bfc50  EFLAGS: 00010046
      RAX: 0000000000000000 RBX: ffff88003d40ba00 RCX: 0000000000000001
      RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff88003d40bad8
      RBP: ffff88003d7bfc68 R08: 0000000000000000 R09: ffff88003d000000
      R10: 0000000000000000 R11: 000000000000023c R12: ffff88003d40bad0
      R13: ffffffff81f3a4a0 R14: 0000000000000010 R15: 00000000ffffffff
      FS:  0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fd4264de624 CR3: 0000000037922000 CR4: 00000000003406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Stack:
       ffff88003d40ba38 0000000000000024 0000000000000000 ffff88003d7bfca0
       ffffffff814c8d92 00000010813ef89d 00000000805ea732 0000000000000009
       0000000000000024 ffff88003cc39b80 ffff88003d7bfce0 ffffffff814c8f66
      Call Trace:
       [<ffffffff814c8d92>] eoi_pirq+0xb2/0xf0
       [<ffffffff814c8f66>] __startup_pirq+0xe6/0x150
       [<ffffffff814ca659>] xen_irq_resume+0x319/0x360
       [<ffffffff814c7e75>] xen_suspend+0xb5/0x180
       [<ffffffff81120155>] multi_cpu_stop+0xb5/0xe0
       [<ffffffff811200a0>] ? cpu_stop_queue_work+0x80/0x80
       [<ffffffff811203d0>] cpu_stopper_thread+0xb0/0x140
       [<ffffffff810a94e6>] ? finish_task_switch+0x76/0x220
       [<ffffffff810ca731>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
       [<ffffffff810a3935>] smpboot_thread_fn+0x105/0x160
       [<ffffffff810a3830>] ? sort_range+0x30/0x30
       [<ffffffff810a0588>] kthread+0xd8/0xf0
       [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
       [<ffffffff8182568f>] ret_from_fork+0x3f/0x70
       [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
      Signed-off-by: default avatarRoss Lagerwall <ross.lagerwall@citrix.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6232876e
    • Gavin Shan's avatar
      powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() · 0118086d
      Gavin Shan authored
      commit 5a0cdbfd upstream.
      
      The function eeh_pe_reset_and_recover() is used to recover EEH
      error when the passthrou device are transferred to guest and
      backwards. The content in the device's config space will be lost
      on PE reset issued in the middle of the recovery. The function
      saves/restores it before/after the reset. However, config access
      to some adapters like Broadcom BCM5719 at this point will causes
      fenced PHB. The config space is always blocked and we save 0xFF's
      that are restored at late point. The memory BARs are totally
      corrupted, causing another EEH error upon access to one of the
      memory BARs.
      
      This restores the config space on those adapters like BCM5719
      from the content saved to the EEH device when it's populated,
      to resolve above issue.
      
      Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: default avatarRussell Currey <ruscur@russell.cc>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0118086d
    • Guilherme G. Piccoli's avatar
      Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" · af64f74e
      Guilherme G. Piccoli authored
      commit c2078d9e upstream.
      
      This reverts commit 89a51df5.
      
      The function eeh_add_device_early() is used to perform EEH
      initialization in devices added later on the system, like in
      hotplug/DLPAR scenarios. Since the commit 89a51df5 ("powerpc/eeh:
      Fix crash in eeh_add_device_early() on Cell") a new check was introduced
      in this function - Cell has no EEH capabilities which led to kernel oops
      if hotplug was performed, so checking for eeh_enabled() was introduced
      to avoid the issue.
      
      However, in architectures that EEH is present like pSeries or PowerNV,
      we might reach a case in which no PCI devices are present on boot time
      and so EEH is not initialized. Then, if a device is added via DLPAR for
      example, eeh_add_device_early() fails because eeh_enabled() is false,
      and EEH end up not being enabled at all.
      
      This reverts the aforementioned patch since a new verification was
      introduced by the commit d91dafc0 ("powerpc/eeh: Delay probing EEH
      device during hotplug") and so the original Cell issue does not happen
      anymore.
      Reviewed-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      af64f74e
    • Gavin Shan's avatar
      powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() · d140d142
      Gavin Shan authored
      commit affeb0f2 upstream.
      
      The function eeh_pe_reset_and_recover() is used to recover EEH
      error when the passthrough device are transferred to guest and
      backwards, meaning the device's driver is vfio-pci or none.
      When the driver is vfio-pci that provides error_detected() error
      handler only, the handler simply stops the guest and it's not
      expected behaviour. On the other hand, no error handlers will
      be called if we don't have a bound driver.
      
      This ignores the error handler in eeh_pe_reset_and_recover()
      that reports the error to device driver to avoid the exceptional
      behaviour.
      
      Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: default avatarRussell Currey <ruscur@russell.cc>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d140d142
    • Hari Bathini's avatar
      powerpc/book3s64: Fix branching to OOL handlers in relocatable kernel · 5d3bb5e6
      Hari Bathini authored
      commit 8ed8ab40 upstream.
      
      Some of the interrupt vectors on 64-bit POWER server processors are only
      32 bytes long (8 instructions), which is not enough for the full
      first-level interrupt handler. For these we need to branch to an
      out-of-line (OOL) handler. But when we are running a relocatable kernel,
      interrupt vectors till __end_interrupts marker are copied down to real
      address 0x100. So, branching to labels (ie. OOL handlers) outside this
      section must be handled differently (see LOAD_HANDLER()), considering
      relocatable kernel, which would need at least 4 instructions.
      
      However, branching from interrupt vector means that we corrupt the
      CFAR (come-from address register) on POWER7 and later processors as
      mentioned in commit 1707dd16. So, EXCEPTION_PROLOG_0 (6 instructions)
      that contains the part up to the point where the CFAR is saved in the
      PACA should be part of the short interrupt vectors before we branch out
      to OOL handlers.
      
      But as mentioned already, there are interrupt vectors on 64-bit POWER
      server processors that are only 32 bytes long (like vectors 0x4f00,
      0x4f20, etc.), which cannot accomodate the above two cases at the same
      time owing to space constraint. Currently, in these interrupt vectors,
      we simply branch out to OOL handlers, without using LOAD_HANDLER(),
      which leaves us vulnerable when running a relocatable kernel (eg. kdump
      case). While this has been the case for sometime now and kdump is used
      widely, we were fortunate not to see any problems so far, for three
      reasons:
      
        1. In almost all cases, production kernel (relocatable) is used for
           kdump as well, which would mean that crashed kernel's OOL handler
           would be at the same place where we end up branching to, from short
           interrupt vector of kdump kernel.
        2. Also, OOL handler was unlikely the reason for crash in almost all
           the kdump scenarios, which meant we had a sane OOL handler from
           crashed kernel that we branched to.
        3. On most 64-bit POWER server processors, page size is large enough
           that marking interrupt vector code as executable (see commit
           429d2e83) leads to marking OOL handler code from crashed kernel,
           that sits right below interrupt vector code from kdump kernel, as
           executable as well.
      
      Let us fix this by moving the __end_interrupts marker down past OOL
      handlers to make sure that we also copy OOL handlers to real address
      0x100 when running a relocatable kernel.
      
      This fix has been tested successfully in kdump scenario, on an LPAR with
      4K page size by using different default/production kernel and kdump
      kernel.
      
      Also tested by manually corrupting the OOL handlers in the first kernel
      and then kdump'ing, and then causing the OOL handlers to fire - mpe.
      
      Fixes: c1fb6816 ("powerpc: Add relocation on exception vector handlers")
      Signed-off-by: default avatarHari Bathini <hbathini@linux.vnet.ibm.com>
      Signed-off-by: default avatarMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d3bb5e6
    • Willy Tarreau's avatar
      pipe: limit the per-user amount of pages allocated in pipes · fa6d0ba1
      Willy Tarreau authored
      commit 759c0114 upstream.
      
      On no-so-small systems, it is possible for a single process to cause an
      OOM condition by filling large pipes with data that are never read. A
      typical process filling 4000 pipes with 1 MB of data will use 4 GB of
      memory. On small systems it may be tricky to set the pipe max size to
      prevent this from happening.
      
      This patch makes it possible to enforce a per-user soft limit above
      which new pipes will be limited to a single page, effectively limiting
      them to 4 kB each, as well as a hard limit above which no new pipes may
      be created for this user. This has the effect of protecting the system
      against memory abuse without hurting other users, and still allowing
      pipes to work correctly though with less data at once.
      
      The limit are controlled by two new sysctls : pipe-user-pages-soft, and
      pipe-user-pages-hard. Both may be disabled by setting them to zero. The
      default soft limit allows the default number of FDs per process (1024)
      to create pipes of the default size (64kB), thus reaching a limit of 64MB
      before starting to create only smaller pipes. With 256 processes limited
      to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
      1084 MB of memory allocated for a user. The hard limit is disabled by
      default to avoid breaking existing applications that make intensive use
      of pipes (eg: for splicing).
      
      Reported-by: socketpair@gmail.com
      Reported-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Mitigates: CVE-2013-4312 (Linux 2.0+)
      Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: Moritz Muehlenhoff <moritz@wikimedia.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa6d0ba1