1. 06 Sep, 2019 25 commits
  2. 29 Aug, 2019 15 commits
    • Linux 4.19.69 · 97ab07e1
      Greg Kroah-Hartman authored
    • rxrpc: Fix local refcounting · 6d471741
      David Howells authored
      [ Upstream commit 68553f1a ]
      
      Fix rxrpc_unuse_local() to handle a NULL local pointer as it can be called
      on an unbound socket on which rx->local is not yet set.
      
      The following reproducer (includes omitted):
      
      	int main(void)
      	{
      		socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
      		return 0;
      	}
      
      causes the following oops to occur:
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000010
      	...
      	RIP: 0010:rxrpc_unuse_local+0x8/0x1b
      	...
      	Call Trace:
      	 rxrpc_release+0x2b5/0x338
      	 __sock_release+0x37/0xa1
      	 sock_close+0x14/0x17
      	 __fput+0x115/0x1e9
      	 task_work_run+0x72/0x98
      	 do_exit+0x51b/0xa7a
      	 ? __context_tracking_exit+0x4e/0x10e
      	 do_group_exit+0xab/0xab
      	 __x64_sys_exit_group+0x14/0x17
      	 do_syscall_64+0x89/0x1d4
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
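
The fix described above can be sketched in userspace as a release function that tolerates a NULL endpoint. This is only an illustration: `local_sketch` and `unuse_local_sketch` are hypothetical stand-ins, not the kernel's definitions, and the counter is a plain int rather than an atomic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rxrpc_local. */
struct local_sketch {
	int active_users;
};

/*
 * Drop one active use. An unbound socket has no endpoint yet, so a NULL
 * pointer is treated as "nothing to release" instead of being dereferenced.
 * Returns the remaining active users, or -1 if local was NULL.
 */
static int unuse_local_sketch(struct local_sketch *local)
{
	if (!local)
		return -1;
	return --local->active_users;
}

/* Demo helper: an endpoint with two users loses one of them. */
static int demo_release_one_of_two(void)
{
	struct local_sketch local = { .active_users = 2 };

	return unuse_local_sketch(&local);
}
```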
      
      Reported-by: syzbot+20dee719a2e090427b5f@syzkaller.appspotmail.com
      Fixes: 730c5fd4 ("rxrpc: Fix local endpoint refcounting")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeffrey Altman <jaltman@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rxrpc: Fix local endpoint replacement · ce3f9e19
      David Howells authored
      [ Upstream commit b00df840 ]
      
      When a local endpoint (struct rxrpc_local) ceases to be in use by any
      AF_RXRPC sockets, it starts the process of being destroyed, but this
      doesn't cause it to be removed from the namespace endpoint list immediately
      as tearing it down isn't trivial and can't be done in softirq context, so
      it gets deferred.
      
      If a new socket comes along that wants to bind to the same endpoint, a new
      rxrpc_local object will be allocated and rxrpc_lookup_local() will use
      list_replace() to substitute the new one for the old.
      
      Then, when the dying object gets to rxrpc_local_destroyer(), it is removed
      unconditionally from whatever list it is on by calling list_del_init().
      
      However, list_replace() doesn't reset the pointers in the replaced
      list_head and so the list_del_init() will likely corrupt the local
      endpoints list.
      
      Fix this by using list_replace_init() instead.
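
The difference between the two list helpers can be reproduced in userspace. The sketch below reimplements a minimal doubly linked list in the style of the kernel's `struct list_head`; the trailing-underscore function names are hypothetical stand-ins written for this illustration, not the kernel's own code.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head_(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail_(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Swap 'new' in for 'old'; the pointers inside 'old' are left stale. */
static void list_replace_(struct list_head *old, struct list_head *new)
{
	new->next = old->next;
	new->next->prev = new;
	new->prev = old->prev;
	new->prev->next = new;
}

/* Same, but point 'old' back at itself so a later delete is harmless. */
static void list_replace_init_(struct list_head *old, struct list_head *new)
{
	list_replace_(old, new);
	init_list_head_(old);
}

static void list_del_init_(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	init_list_head_(entry);
}

/* Returns true if 'item' is still reachable from 'head'. */
static bool list_contains_(struct list_head *head, struct list_head *item)
{
	for (struct list_head *p = head->next; p != head; p = p->next)
		if (p == item)
			return true;
	return false;
}

/* Replace a dying entry, then let its deferred destroyer delete it. */
static bool replacement_survives(bool use_init_variant)
{
	struct list_head head, dying, fresh;

	init_list_head_(&head);
	list_add_tail_(&dying, &head);
	if (use_init_variant)
		list_replace_init_(&dying, &fresh);
	else
		list_replace_(&dying, &fresh);
	list_del_init_(&dying);	/* what the destroyer does unconditionally */
	return list_contains_(&head, &fresh);
}
```

With the plain replace, the stale pointers in the dying entry make the later delete unlink the new entry from the list; with the init variant the delete only touches the dying entry itself.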
      
      Fixes: 730c5fd4 ("rxrpc: Fix local endpoint refcounting")
      Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rxrpc: Fix read-after-free in rxrpc_queue_local() · a05354cb
      David Howells authored
      commit 06d9532f upstream.
      
      rxrpc_queue_local() attempts to queue the local endpoint it is given and
      then, if successful, prints a trace line.  The trace line includes the
      current usage count - but we're not allowed to look at the local endpoint
      at this point as we passed our ref on it to the workqueue.
      
      Fix this by reading the usage count before queuing the work item.
      
      Also fix the reading of local->debug_id for trace lines, which must be done
      with the same consideration as reading the usage count.
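
The ordering rule above (read what the trace needs first, then hand off the reference) can be sketched in userspace as follows. All names here are hypothetical; `queue_work_sketch()` stands in for queuing a work item and models the worst case by freeing the object immediately.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rxrpc_local. */
struct endpoint_sketch {
	int debug_id;
	int usage;
};

static int traced_id, traced_usage;

/* Stand-in for a trace line: just record what was reported. */
static void trace_sketch(int id, int usage)
{
	traced_id = id;
	traced_usage = usage;
}

/*
 * Stand-in for queuing a work item. It consumes the caller's reference;
 * here the "worker" runs at once and frees the object, which is the worst
 * case the caller has to assume.
 */
static int queue_work_sketch(struct endpoint_sketch *ep)
{
	free(ep);
	return 1;	/* queued */
}

static void queue_endpoint_sketch(struct endpoint_sketch *ep)
{
	/* Snapshot everything the trace needs while we still hold the ref. */
	int id = ep->debug_id;
	int usage = ep->usage;

	if (queue_work_sketch(ep))
		trace_sketch(id, usage);	/* ep must not be touched here */
}

/* Demo helper: returns 1 if the trace saw the values set before queuing. */
static int demo_trace_after_queue(void)
{
	struct endpoint_sketch *ep = malloc(sizeof(*ep));

	if (!ep)
		return 0;
	ep->debug_id = 7;
	ep->usage = 2;
	queue_endpoint_sketch(ep);
	return traced_id == 7 && traced_usage == 2;
}
```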
      
      Fixes: 09d2bf59 ("rxrpc: Add a tracepoint to track rxrpc_local refcounting")
      Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rxrpc: Fix local endpoint refcounting · f28023c4
      David Howells authored
      commit 730c5fd4 upstream.
      
      The object lifetime management on the rxrpc_local struct is broken in that
      the rxrpc_local_processor() function is expected to clean up and remove an
      object - but it may get requeued by packets coming in on the backing UDP
      socket once it starts running.
      
      This may result in the assertion in rxrpc_local_rcu() firing because the
      memory has been scheduled for RCU destruction whilst still queued:
      
      	rxrpc: Assertion failed
      	------------[ cut here ]------------
      	kernel BUG at net/rxrpc/local_object.c:468!
      
      Note that if the processor comes around before the RCU free function, it
      will just do nothing because ->dead is true.
      
      Fix this by adding a separate refcount to count active users of the
      endpoint that causes the endpoint to be destroyed when it reaches 0.
      
      The original refcount can then be used to refcount objects through the work
      processor and cause the memory to be rcu freed when that reaches 0.
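
The split into two counters can be illustrated with a small single-threaded model. This is a sketch under simplifying assumptions: the names are hypothetical, the counters are plain ints rather than atomics, and "destroyed" and "freed" are flags instead of real teardown and RCU freeing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an endpoint with two separate counters. */
struct endpoint_model {
	int usage;		/* keeps the memory alive (sockets, work items) */
	int active_users;	/* sockets actually using the endpoint */
	bool destroyed;		/* teardown has been started */
	bool freed;		/* memory would be handed off for freeing */
};

/* Drop a memory reference; the last one releases the object. */
static void put_ref(struct endpoint_model *ep)
{
	if (--ep->usage == 0)
		ep->freed = true;
}

/* A socket stops using the endpoint; the last active user starts teardown. */
static void unuse(struct endpoint_model *ep)
{
	if (--ep->active_users == 0)
		ep->destroyed = true;
	put_ref(ep);
}

/*
 * One socket plus one queued work item, released in that order.
 * Result bits: 1 = destroyed after unuse, 2 = freed after unuse,
 * 4 = freed after the work item drops its reference.
 */
static int demo_two_counters(void)
{
	struct endpoint_model ep = { .usage = 2, .active_users = 1 };
	int state = 0;

	unuse(&ep);
	state |= ep.destroyed ? 1 : 0;
	state |= ep.freed ? 2 : 0;
	put_ref(&ep);
	state |= ep.freed ? 4 : 0;
	return state;
}
```

In this model the endpoint is torn down as soon as the last socket lets go, but its memory stays valid until the still-queued work item drops its own reference.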
      
      Fixes: 4f95dd78 ("rxrpc: Rework local endpoint management")
      Reported-by: syzbot+1e0edc4b8b7494c28450@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB · 32df8a30
      Alastair D'Silva authored
      The upstream commit:
      22e9c88d ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
      has a similar effect, but since it is a rewrite of the assembler to C, is
      too invasive for stable. This patch is a minimal fix to address the issue in
      assembler.
      
      This patch applies cleanly to v5.2, v4.19 & v4.14.
      
      When calling flush_(inval_)dcache_range with a size >4GB, we were masking
      off the upper 32 bits, so we would incorrectly flush a range smaller
      than intended.
      
      This patch replaces the 32 bit shifts with 64 bit ones, so that
      the full size is accounted for.
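
The effect of the truncation can be shown with plain C arithmetic. This sketch only models the line-count calculation, not the real assembler routine; the function names and the 128-byte cache line used in the usage note are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: the shift only sees the low 32 bits of the size. */
static uint64_t cache_lines_32bit(uint64_t size, unsigned int log2_line)
{
	return (uint32_t)size >> log2_line;
}

/* Fixed behaviour: the shift operates on the full 64-bit size. */
static uint64_t cache_lines_64bit(uint64_t size, unsigned int log2_line)
{
	return size >> log2_line;
}
```

For a range of 4 GiB plus 128 bytes with 128-byte lines (`log2_line` of 7), the 32-bit variant yields a single line, while the 64-bit variant yields 33554433 lines.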
      Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm zoned: fix potential NULL dereference in dmz_do_reclaim() · 0d5e34c1
      Dan Carpenter authored
      [ Upstream commit e0702d90 ]
      
      This function is supposed to return error pointers so it matches the
      dmz_get_rnd_zone_for_reclaim() function.  The current code could lead to
      a NULL dereference in dmz_do_reclaim().
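
Why a NULL return is dangerous when the caller only checks for error pointers can be sketched in userspace. The helpers below mimic the kernel's error-pointer convention but are hypothetical reimplementations, and the choice of -EBUSY as the error code is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Userspace imitations of the error-pointer helpers. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct zone_sketch {
	int id;
};

static struct zone_sketch the_zone = { 42 };

/* Selector that reports failure as an error pointer, never as NULL. */
static struct zone_sketch *pick_zone(int have_candidate)
{
	if (!have_candidate)
		return err_ptr(-EBUSY);	/* assumed error code */
	return &the_zone;
}

/* Caller that only tests is_err(); a NULL would slip past this check. */
static int reclaim(int have_candidate)
{
	struct zone_sketch *zone = pick_zone(have_candidate);

	if (is_err(zone))
		return (int)ptr_err(zone);
	return zone->id;
}
```

Because `is_err(NULL)` is false, a selector that returned NULL on failure would let the caller fall through to the dereference.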
      
      Fixes: b234c6d7 ("dm zoned: improve error handling in reclaim")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: always rejoin held resources during defer roll · 655bb2c4
      Darrick J. Wong authored
      commit 710d707d upstream.
      
      During testing of xfs/141 on a V4 filesystem, I observed some
      inconsistent behavior with regard to resources that are held (i.e.
      remain locked) across a defer roll.  The transaction roll always gives
      the defer roll function a new transaction, even if committing the old
      transaction fails.  However, the defer roll function only rejoins the
      held resources if the transaction commit succeeded.  This means that
      callers of defer roll have to figure out whether the held resources are
      attached to the transaction being passed back.
      
      Worse yet, if the defer roll was part of a defer finish call, we have a
      third possibility: the defer finish could pass back a dirty transaction
      with dirty held resources and an error code.
      
      The only sane way to handle all of these scenarios is to require that
      the code that held the resource either cancel the transaction before
      unlocking and releasing the resources, or use functions that detach
      resources from a transaction properly (e.g.  xfs_trans_brelse) if they
      need to drop the reference before committing or cancelling the
      transaction.
      
      In order to make this so, change the defer roll code to join held
      resources to the new transaction unconditionally and fix all the bhold
      callers to release the held buffers correctly.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      [mcgrof: fixes kz#204223 ]
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: Add attribute remove and helper functions · 83a8e6b2
      Allison Henderson authored
      commit 068f985a upstream.
      
      This patch adds xfs_attr_remove_args. These sub-routines remove
      the attributes specified in @args. We will use this later for setting
      parent pointers as a deferred attribute operation.
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: Add attribute set and helper functions · b21ff6cf
      Allison Henderson authored
      commit 2f3cd809 upstream.
      
      This patch adds xfs_attr_set_args and xfs_bmap_set_attrforkoff.
      These sub-routines set the attributes specified in @args.
      We will use this later for setting parent pointers as a deferred
      attribute operation.
      
      [dgc: remove attr fork init code from xfs_attr_set_args().]
      [dgc: xfs_attr_try_sf_addname() NULLs args.trans after commit.]
      [dgc: correct sf add error handling.]
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: Add helper function xfs_attr_try_sf_addname · b3a248f2
      Allison Henderson authored
      commit 4c74a56b upstream.
      
      This patch adds a subroutine xfs_attr_try_sf_addname
      used by xfs_attr_set.  This subroutine will attempt to
      add the attribute name specified in args in shortform,
      as well as perform error handling previously done in
      xfs_attr_set.
      
      This patch helps to pre-simplify xfs_attr_set for reviewing
      purposes and reduce indentation.  New function will be added
      in the next patch.
      
      [dgc: moved commit to helper function, too.]
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h · a9912f34
      Allison Henderson authored
      commit e2421f0b upstream.
      
      This patch moves fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
      since xfs_attr.c is in libxfs.  We will need these later in
      xfsprogs.
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: don't trip over uninitialized buffer on extent read of corrupted inode · 17c2b7af
      Brian Foster authored
      commit 6958d11f upstream.
      
      We've had rather rare reports of bmap btree block corruption where
      the bmap root block has a level count of zero. The root cause of the
      corruption is so far unknown. We do have verifier checks to detect
      this form of on-disk corruption, but this doesn't cover a memory
      corruption variant of the problem. The latter is a reasonable
      possibility because the root block is part of the inode fork and can
      reside in-core for some time before inode extents are read.
      
      If this occurs, it leads to a system crash such as the following:
      
       BUG: unable to handle kernel paging request at ffffffff00000221
       PF error: [normal kernel read fault]
       ...
       RIP: 0010:xfs_trans_brelse+0xf/0x200 [xfs]
       ...
       Call Trace:
        xfs_iread_extents+0x379/0x540 [xfs]
        xfs_file_iomap_begin_delay+0x11a/0xb40 [xfs]
        ? xfs_attr_get+0xd1/0x120 [xfs]
        ? iomap_write_begin.constprop.40+0x2d0/0x2d0
        xfs_file_iomap_begin+0x4c4/0x6d0 [xfs]
        ? __vfs_getxattr+0x53/0x70
        ? iomap_write_begin.constprop.40+0x2d0/0x2d0
        iomap_apply+0x63/0x130
        ? iomap_write_begin.constprop.40+0x2d0/0x2d0
        iomap_file_buffered_write+0x62/0x90
        ? iomap_write_begin.constprop.40+0x2d0/0x2d0
        xfs_file_buffered_aio_write+0xe4/0x3b0 [xfs]
        __vfs_write+0x150/0x1b0
        vfs_write+0xba/0x1c0
        ksys_pwrite64+0x64/0xa0
        do_syscall_64+0x5a/0x1d0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The crash occurs because xfs_iread_extents() attempts to release an
      uninitialized buffer pointer as the level == 0 value prevented the
      buffer from ever being allocated or read. Change the level > 0
      assert to an explicit error check in xfs_iread_extents() to avoid
      crashing the kernel in the event of localized, in-core inode
      corruption.
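
The change from an assert to an explicit error check can be sketched as below. This is a simplified userspace model: the names are hypothetical, the tree walk is collapsed into a single step, and the numeric value of `EFSCORRUPTED_SKETCH` is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define EFSCORRUPTED_SKETCH 117	/* assumed value for the sketch */

/* Hypothetical stand-in for a buffer that must be released after use. */
struct buf_sketch {
	int released;
};

static void release_buf(struct buf_sketch *bp)
{
	bp->released = 1;
}

static int read_extents_sketch(int root_level, struct buf_sketch *leaf)
{
	struct buf_sketch *bp = NULL;

	/*
	 * Formerly an assert. An explicit check turns bad in-core state
	 * into an error return instead of a crash on the release below.
	 */
	if (root_level == 0)
		return -EFSCORRUPTED_SKETCH;

	/* Walking down root_level levels is what sets bp; one step here. */
	bp = leaf;

	release_buf(bp);
	return 0;
}

/* Demo helper: a valid root level reads and releases the leaf buffer. */
static int demo_valid_root(void)
{
	struct buf_sketch leaf = { 0 };

	return read_extents_sketch(1, &leaf) == 0 && leaf.released;
}
```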
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT · 11f85d4d
      Darrick J. Wong authored
      commit 1fb254aa upstream.
      
      Benjamin Moody reported to Debian that XFS partially wedges when a chgrp
      fails on account of being out of disk quota.  I ran his reproducer
      script:
      
      # adduser dummy
      # adduser dummy plugdev
      
      # dd if=/dev/zero bs=1M count=100 of=test.img
      # mkfs.xfs test.img
      # mount -t xfs -o gquota test.img /mnt
      # mkdir -p /mnt/dummy
      # chown -c dummy /mnt/dummy
      # xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt
      
      (and then as user dummy)
      
      $ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo
      $ chgrp plugdev /mnt/dummy/foo
      
      and saw:
      
      ================================================
      WARNING: lock held when returning to user space!
      5.3.0-rc5 #rc5 Tainted: G        W
      ------------------------------------------------
      chgrp/47006 is leaving the kernel with locks still held!
      1 lock held by chgrp/47006:
       #0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]
      
      ...which is clearly caused by xfs_setattr_nonsize failing to unlock the
      ILOCK after the xfs_qm_vop_chown_reserve call fails.  Add the missing
      unlock.
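
The shape of the bug and its fix can be modelled with a flag in place of the real lock. Everything here is a hypothetical sketch: the names are invented, and unlike the real function, which hands the lock over to the transaction on success, this model simply unlocks on every path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical inode with a flag standing in for the ILOCK. */
struct inode_sketch {
	int ilock_held;
};

static void ilock_sketch(struct inode_sketch *ip)
{
	ip->ilock_held = 1;
}

static void iunlock_sketch(struct inode_sketch *ip)
{
	ip->ilock_held = 0;
}

/* Stand-in for the quota reservation that can fail with EDQUOT. */
static int reserve_quota_sketch(int over_quota)
{
	return over_quota ? -EDQUOT : 0;
}

static int setattr_nonsize_sketch(struct inode_sketch *ip, int over_quota)
{
	int error;

	ilock_sketch(ip);
	error = reserve_quota_sketch(over_quota);
	if (error)
		goto out_unlock;	/* never return with the lock held */

	/* ... the ownership change would be applied here ... */

out_unlock:
	iunlock_sketch(ip);
	return error;
}

/* Demo helpers: lock state and return code after one call. */
static int demo_lock_held_after(int over_quota)
{
	struct inode_sketch ip = { 0 };

	setattr_nonsize_sketch(&ip, over_quota);
	return ip.ilock_held;
}

static int demo_result(int over_quota)
{
	struct inode_sketch ip = { 0 };

	return setattr_nonsize_sketch(&ip, over_quota);
}
```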
      
      Reported-by: benjamin.moody@gmail.com
      Fixes: 253f4911 ("xfs: better xfs_trans_alloc interface")
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Salvatore Bonaccorso <carnil@debian.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/zsmalloc.c: fix race condition in zs_destroy_pool · ed11e600
      Henry Burns authored
      commit 701d6785 upstream.
      
      In zs_destroy_pool() we call flush_work(&pool->free_work).  However, we
      have no guarantee that migration isn't happening in the background at
      that time.
      
      Since migration can't directly free pages, it relies on free_work being
      scheduled to free the pages.  But there's nothing preventing an
      in-progress migrate from queuing the work *after*
      zs_unregister_migration() has called flush_work(), which would leave
      pages still pointing at the inode when we free it.
      
      Since we know at destroy time all objects should be free, no new
      migrations can come in (since zs_page_isolate() fails for fully-free
      zspages).  This means it is sufficient to track a "# isolated zspages"
      count by class, and have the destroy logic ensure all such pages have
      drained before proceeding.  Keeping that state under the class spinlock
      keeps the logic straightforward.
      
      In this case a memory leak could lead to an eventual crash if compaction
      hits the leaked page.  This crash would only occur if people are
      changing their zswap backend at runtime (which eventually starts
      destruction).
      
      Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com
      Fixes: 48b4800a ("zsmalloc: page migration support")
      Signed-off-by: Henry Burns <henryburns@google.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Henry Burns <henrywolfeburns@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Jonathan Adams <jwadams@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>