1. 30 May, 2018 2 commits
  2. 28 May, 2018 1 commit
  3. 26 May, 2018 33 commits
  4. 24 May, 2018 1 commit
    • fix io_destroy()/aio_complete() race · 4faa9996
      Al Viro authored
      If io_destroy() gets to cancelling everything that can be cancelled and
      gets to kiocb_cancel() calling the function the driver has left in
      ->ki_cancel, it becomes vulnerable to a race with IO completion.  At that
      point req is already taken off the list and aio_complete() does *NOT* spin
      until we (in free_ioctx_users()) release ->ctx_lock.  As a result, it
      proceeds to kiocb_free(), freeing req just as it gets passed to ->ki_cancel().
      
      The fix is simple - remove from the list after the call of kiocb_cancel().
      All instances of ->ki_cancel() already have to cope with being called with
      the iocb still on the list - that's what happens in io_cancel(2).
      
      Cc: stable@kernel.org
      Fixes: 0460fef2 "aio: use cancellation list lazily"
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
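The ordering problem described in the commit above can be sketched as a small single-threaded userspace model. Everything here is an assumption made for illustration: struct model_req, model_complete(), model_ki_cancel() and the two destroy_*() helpers are hypothetical names, not kernel code, and the "racing" completion is simply invoked at the point where the race window would be.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_req {
    bool on_list;   /* still linked on the context's active list */
    bool freed;     /* stands in for kiocb_free() having run */
};

/*
 * Models a completion arriving while the destroyer holds the lock.
 * A request still on the list makes completion wait for the lock, so it
 * cannot free anything yet; a request already off the list is freed at once.
 * Returns true if the completion had to wait.
 */
bool model_complete(struct model_req *req)
{
    if (req->on_list)
        return true;    /* would spin on the lock held by the destroyer */
    req->freed = true;
    return false;
}

/* Models ->ki_cancel(); returns false if it was handed a freed request. */
bool model_ki_cancel(const struct model_req *req)
{
    return !req->freed;
}

/* Old order: unlink first, then cancel; completion races in between. */
bool destroy_unlink_then_cancel(struct model_req *req)
{
    req->on_list = false;
    model_complete(req);            /* racing completion frees req */
    return model_ki_cancel(req);    /* cancel is handed a freed request */
}

/* Fixed order: cancel while still linked, unlink afterwards. */
bool destroy_cancel_then_unlink(struct model_req *req)
{
    bool ok;

    model_complete(req);            /* racing completion has to wait */
    ok = model_ki_cancel(req);      /* request is still live here */
    req->on_list = false;
    return ok;
}
```

In this model only the second ordering hands a live request to the cancel callback, which is the property the commit restores.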
  5. 21 May, 2018 3 commits
    • aio: fix io_destroy(2) vs. lookup_ioctx() race · baf10564
      Al Viro authored
      kill_ioctx() used to have an explicit RCU delay between removing the
      reference from ->ioctx_table and percpu_ref_kill() dropping the refcount.
      At some point that delay had been removed, on the theory that
      percpu_ref_kill() itself contained an RCU delay.  Unfortunately, that was
      the wrong kind of RCU delay and it didn't care about rcu_read_lock() used
      by lookup_ioctx().  As the result, we could get ctx freed right under
      lookup_ioctx().  Tejun has fixed that in a6d7cff4 ("fs/aio: Add explicit
      RCU grace period when freeing kioctx"); however, that fix is not enough.
      
      Suppose io_destroy() from one thread races with e.g. io_setup() from another;
      CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2
      has picked it (under rcu_read_lock()).  Then CPU1 proceeds to drop the
      refcount, getting it to 0 and triggering a call of free_ioctx_users(),
      which proceeds to drop the secondary refcount and once that reaches zero
      calls free_ioctx_reqs().  That does
              INIT_RCU_WORK(&ctx->free_rwork, free_ioctx);
              queue_rcu_work(system_wq, &ctx->free_rwork);
      and schedules freeing the whole thing after RCU delay.
      
      In the meantime CPU2 has gotten around to percpu_ref_get(), bumping the
      refcount from 0 to 1 and returning the reference to io_setup().
      
      Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get
      freed until after percpu_ref_get().  Sure, we'd increment the counter before
      ctx can be freed.  Now we are out of rcu_read_lock() and there's nothing to
      stop freeing of the whole thing.  Unfortunately, CPU2 assumes that since it
      has grabbed the reference, ctx is *NOT* going away until it gets around to
      dropping that reference.
      
      The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss.
      It's not costlier than what we currently do in normal case, it's safe to
      call since freeing *is* delayed and it closes the race window - either
      lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users
      won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx()
      fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see
      the object in question at all.
      
      Cc: stable@kernel.org
      Fixes: a6d7cff4 "fs/aio: Add explicit RCU grace period when freeing kioctx"
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
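The difference between an unconditional get and a "tryget live" in the commit above can be sketched with C11 atomics. This is a minimal userspace model under stated assumptions, not the kernel's percpu_ref implementation: struct model_ref, MODEL_DEAD and all model_ref_*() functions are hypothetical names, and the dead flag is packed into the same word as the count so that the check and the increment happen in one compare-and-swap.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_DEAD (1LL << 62)  /* flag bit stored alongside the count */

/* Hypothetical stand-in for a killable reference count. */
struct model_ref {
    atomic_llong state;         /* low bits: count, MODEL_DEAD: killed */
};

void model_ref_init(struct model_ref *ref)
{
    atomic_init(&ref->state, 1);    /* the initial reference */
}

long long model_ref_count(struct model_ref *ref)
{
    return atomic_load(&ref->state) & ~MODEL_DEAD;
}

/* Unconditional get: bumps the count even after the ref was killed. */
void model_ref_get(struct model_ref *ref)
{
    atomic_fetch_add(&ref->state, 1);
}

/* Conditional get: fails once the ref has been killed. */
bool model_ref_tryget_live(struct model_ref *ref)
{
    long long old = atomic_load(&ref->state);

    do {
        if (old & MODEL_DEAD)
            return false;
    } while (!atomic_compare_exchange_weak(&ref->state, &old, old + 1));
    return true;
}

/* Drops one reference; returns true if the count reached zero. */
bool model_ref_put(struct model_ref *ref)
{
    long long now = atomic_fetch_sub(&ref->state, 1) - 1;

    return (now & ~MODEL_DEAD) == 0;
}

/* Marks the ref dead and drops the initial reference. */
bool model_ref_kill(struct model_ref *ref)
{
    atomic_fetch_or(&ref->state, MODEL_DEAD);
    return model_ref_put(ref);
}
```

In this model, a lookup that uses model_ref_get() after model_ref_kill() has returned true takes the count from 0 back to 1 on an object whose release is already scheduled, while model_ref_tryget_live() reports a miss and leaves the count untouched.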
    • ext2: fix a block leak · 5aa1437d
      Al Viro authored
      Open a file, unlink it, then use ioctl(2) to make it immutable or
      append-only.  Now close it and watch the blocks *not* freed...
      
      Immutable/append-only checks belong in ->setattr().
      Note: the bug is old, and a backport to anything prior to 737f2e93
      ("ext2: convert to use the new truncate convention") will need
      these checks lifted into ext2_setattr().
      
      Cc: stable@kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
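The sequence from the ext2 commit above (open, unlink, set a flag via ioctl(2), close) might look roughly like the following from userspace. This is a sketch under assumptions: unlink_then_set_flag() is a hypothetical helper name, the 4096-byte write is only there so the file owns at least one block, and setting FS_IMMUTABLE_FL or FS_APPEND_FL normally requires CAP_LINUX_IMMUTABLE, so the ioctl is expected to fail for unprivileged callers. Checking whether blocks were actually leaked (for example by comparing free-block counts before and after) is not shown.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_*_FL */

/*
 * Hypothetical helper: create a file at path, give it some data, unlink it
 * while it is still open, then try to set the given inode flag
 * (FS_IMMUTABLE_FL or FS_APPEND_FL) before closing it.
 * Returns 0 if every step succeeded, -1 otherwise.
 */
int unlink_then_set_flag(const char *path, int flag)
{
    char buf[4096];
    int flags = 0;
    int ret = -1;
    bool wrote, unlinked;
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (fd < 0)
        return -1;

    memset(buf, 'x', sizeof(buf));
    wrote = write(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf);
    unlinked = unlink(path) == 0;   /* the name goes away, fd stays open */

    if (wrote && unlinked && ioctl(fd, FS_IOC_GETFLAGS, &flags) == 0) {
        flags |= flag;
        if (ioctl(fd, FS_IOC_SETFLAGS, &flags) == 0)
            ret = 0;
    }

    close(fd);  /* the final close is where the blocks should be freed */
    return ret;
}
```

A call such as unlink_then_set_flag("/mnt/ext2/victim", FS_IMMUTABLE_FL), with the path being a made-up example on an ext2 mount, would walk through the steps the commit message lists.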
    • nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed · 3819bb0d
      Al Viro authored
      That can (and does, on some filesystems) happen - ->mkdir() (and thus
      vfs_mkdir()) can legitimately leave its argument negative and just
      unhash it, counting upon the lookup to pick the object we'd created
      next time we try to look at that name.
      
      Some vfs_mkdir() callers forget about that possibility...
      Acked-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>