1. 24 Mar, 2016 1 commit
    • Yang Shi's avatar
      arm64: replace read_lock to rcu lock in call_break_hook · 1a138f3e
      Yang Shi authored
      [ Upstream commit 62c6c61a ]
      
      BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
      in_atomic(): 0, irqs_disabled(): 128, pid: 342, name: perf
      1 lock held by perf/342:
       #0:  (break_hook_lock){+.+...}, at: [<ffffffc0000851ac>] call_break_hook+0x34/0xd0
      irq event stamp: 62224
      hardirqs last  enabled at (62223): [<ffffffc00010b7bc>] __call_rcu.constprop.59+0x104/0x270
      hardirqs last disabled at (62224): [<ffffffc0000fbe20>] vprintk_emit+0x68/0x640
      softirqs last  enabled at (0): [<ffffffc000097928>] copy_process.part.8+0x428/0x17f8
      softirqs last disabled at (0): [<          (null)>]           (null)
      CPU: 0 PID: 342 Comm: perf Not tainted 4.1.6-rt5 #4
      Hardware name: linux,dummy-virt (DT)
      Call trace:
      [<ffffffc000089968>] dump_backtrace+0x0/0x128
      [<ffffffc000089ab0>] show_stack+0x20/0x30
      [<ffffffc0007030d0>] dump_stack+0x7c/0xa0
      [<ffffffc0000c878c>] ___might_sleep+0x174/0x260
      [<ffffffc000708ac8>] __rt_spin_lock+0x28/0x40
      [<ffffffc000708db0>] rt_read_lock+0x60/0x80
      [<ffffffc0000851a8>] call_break_hook+0x30/0xd0
      [<ffffffc000085a70>] brk_handler+0x30/0x98
      [<ffffffc000082248>] do_debug_exception+0x50/0xb8
      Exception stack(0xffffffc00514fe30 to 0xffffffc00514ff50)
      fe20:                                     00000000 00000000 c1594680 0000007f
      fe40: ffffffff ffffffff 92063940 0000007f 0550dcd8 ffffffc0 00000000 00000000
      fe60: 0514fe70 ffffffc0 000be1f8 ffffffc0 0514feb0 ffffffc0 0008948c ffffffc0
      fe80: 00000004 00000000 0514fed0 ffffffc0 ffffffff ffffffff 9282a948 0000007f
      fea0: 00000000 00000000 9282b708 0000007f c1592820 0000007f 00083914 ffffffc0
      fec0: 00000000 00000000 00000010 00000000 00000064 00000000 00000001 00000000
      fee0: 005101e0 00000000 c1594680 0000007f c1594740 0000007f ffffffd8 ffffff80
      ff00: 00000000 00000000 00000000 00000000 c1594770 0000007f c1594770 0000007f
      ff20: 00665e10 00000000 7f7f7f7f 7f7f7f7f 01010101 01010101 00000000 00000000
      ff40: 928e4cc0 0000007f 91ff11e8 0000007f
      
      call_break_hook is called in atomic context (hard irq disabled), so replace
      the sleepable lock to rcu lock, replace relevant list operations to rcu
      version and call synchronize_rcu() in unregister_break_hook().
      
      And, replace write lock to spinlock in {un}register_break_hook.
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarHe Kuang <hekuang@huawei.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      1a138f3e
  2. 23 Mar, 2016 4 commits
    • Jan Kara's avatar
      ext4: fix races of writeback with punch hole and zero range · f2b13259
      Jan Kara authored
      When doing delayed allocation, update of on-disk inode size is postponed
      until IO submission time. However hole punch or zero range fallocate
      calls can end up discarding the tail page cache page and thus on-disk
      inode size would never be properly updated.
      
      Make sure the on-disk inode size is updated before truncating page
      cache.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      f2b13259
    • Jan Kara's avatar
      ext4: fix races between buffered IO and collapse / insert range · 181aaebd
      Jan Kara authored
      Current code implementing FALLOC_FL_COLLAPSE_RANGE and
      FALLOC_FL_INSERT_RANGE is prone to races with buffered writes and page
      faults. If buffered write or write via mmap manages to squeeze between
      filemap_write_and_wait_range() and truncate_pagecache() in the fallocate
      implementations, the written data is simply discarded by
      truncate_pagecache() although it should have been shifted.
      
      Fix the problem by moving filemap_write_and_wait_range() call inside
      i_mutex and i_mmap_sem. That way we are protected against races with
      both buffered writes and page faults.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      181aaebd
    • Jan Kara's avatar
      ext4: move unlocked dio protection from ext4_alloc_file_blocks() · 9621787d
      Jan Kara authored
      Currently ext4_alloc_file_blocks() was handling protection against
      unlocked DIO. However we now need to sometimes call it under i_mmap_sem
      and sometimes not and DIO protection ranks above it (although strictly
      speaking this cannot currently create any deadlocks). Also
      ext4_zero_range() was actually getting & releasing unlocked DIO
      protection twice in some cases. Luckily it didn't introduce any real bug
      but it was a land mine waiting to be stepped on.  So move DIO protection
      out from ext4_alloc_file_blocks() into the two callsites.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9621787d
    • Jan Kara's avatar
      ext4: fix races between page faults and hole punching · 248766f0
      Jan Kara authored
      Currently, page faults and hole punching are completely unsynchronized.
      This can result in page fault faulting in a page into a range that we
      are punching after truncate_pagecache_range() has been called and thus
      we can end up with a page mapped to disk blocks that will be shortly
      freed. Filesystem corruption will shortly follow. Note that the same
      race is avoided for truncate by checking page fault offset against
      i_size but there isn't similar mechanism available for punching holes.
      
      Fix the problem by creating new rw semaphore i_mmap_sem in inode and
      grab it for writing over truncate, hole punching, and other functions
      removing blocks from extent tree and for read over page faults. We
      cannot easily use i_data_sem for this since that ranks below transaction
      start and we need something ranking above it so that it can be held over
      the whole truncate / hole punching operation. Also remove various
      workarounds we had in the code to reduce race window when page fault
      could have created pages with stale mapping information.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      248766f0
  3. 22 Mar, 2016 26 commits
  4. 18 Mar, 2016 9 commits