    dm bufio: avoid sleeping while holding the dm_bufio lock · 9ea61cac
    Douglas Anderson authored
    We've seen in-field reports showing _lots_ (18 in one case, 41 in
    another) of tasks all sitting there blocked on:
    
      mutex_lock+0x4c/0x68
      dm_bufio_shrink_count+0x38/0x78
      shrink_slab.part.54.constprop.65+0x100/0x464
      shrink_zone+0xa8/0x198
    
    In the two cases analyzed, we see one task that looks like this:
    
      Workqueue: kverityd verity_prefetch_io
    
      __switch_to+0x9c/0xa8
      __schedule+0x440/0x6d8
      schedule+0x94/0xb4
      schedule_timeout+0x204/0x27c
      schedule_timeout_uninterruptible+0x44/0x50
      wait_iff_congested+0x9c/0x1f0
      shrink_inactive_list+0x3a0/0x4cc
      shrink_lruvec+0x418/0x5cc
      shrink_zone+0x88/0x198
      try_to_free_pages+0x51c/0x588
      __alloc_pages_nodemask+0x648/0xa88
      __get_free_pages+0x34/0x7c
      alloc_buffer+0xa4/0x144
      __bufio_new+0x84/0x278
      dm_bufio_prefetch+0x9c/0x154
      verity_prefetch_io+0xe8/0x10c
      process_one_work+0x240/0x424
      worker_thread+0x2fc/0x424
      kthread+0x10c/0x114
    
    ...and that looks to be the one holding the mutex.
    
    The problem can be reproduced fairly easily:
    0. Be running Chrome OS w/ verity enabled on the root filesystem
    1. Pick test patch: http://crosreview.com/412360
    2. Install launchBalloons.sh and balloon.arm from
         http://crbug.com/468342
       ...that's just a memory stress test app.
    3. On a 4GB rk3399 machine, run
         nice ./launchBalloons.sh 4 900 100000
       ...that tries to eat 4 * 900 MB of memory and keeps accessing it.
    4. Login to the Chrome web browser and restore many tabs
    
    With that, I've seen printouts like:
      DOUG: long bufio 90758 ms
    ...and the stack trace always shows that we're in dm_bufio_prefetch().
    
    The problem is that we try to allocate memory with GFP_NOIO while
    we're holding the dm_bufio lock.  GFP_NOIO still allows the
    allocation to sleep for direct reclaim, so we can end up sleeping
    while holding the lock, and that causes the above problems.  Instead
    we should be using GFP_NOWAIT, which never sleeps.
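
    The shape of the fix, as a rough sketch (names are illustrative; the
    real code lives in drivers/md/dm-bufio.c and also has to handle the
    prefetch and retry cases):

      #include <linux/mutex.h>
      #include <linux/slab.h>

      struct client {
              struct mutex lock;              /* the dm_bufio lock */
              unsigned long buffer_size;
      };

      static void *new_buffer(struct client *c)
      {
              gfp_t extra = __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN;
              void *b;

              /*
               * GFP_NOWAIT never sleeps, so this attempt is safe while
               * c->lock is held.
               */
              b = kmalloc(c->buffer_size, GFP_NOWAIT | extra);
              if (b)
                      return b;

              /*
               * If we must sleep for memory, drop the lock first so
               * shrinkers and kswapd are not stuck waiting behind us.
               */
              mutex_unlock(&c->lock);
              b = kmalloc(c->buffer_size, GFP_NOIO | extra);
              mutex_lock(&c->lock);
              return b;
      }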
    
    The current behavior, as explained by David Rientjes:
    
      It will still try reclaim initially because __GFP_WAIT (or
      __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO.  This is the cause of
      contention on dm_bufio_lock() that the thread holds.  You want to
      pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
      mutex that can be contended by a concurrent slab shrinker (if
      count_objects didn't use a trylock, this pattern would trivially
      deadlock).
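
    For reference, the count_objects side looks roughly like this
    (simplified from dm_bufio_shrink_count(); the struct and field names
    are illustrative):

      #include <linux/mutex.h>
      #include <linux/shrinker.h>

      struct client {
              struct mutex lock;
              struct shrinker shrinker;
              unsigned long n_buffers;
      };

      static unsigned long bufio_shrink_count(struct shrinker *shrink,
                                              struct shrink_control *sc)
      {
              struct client *c = container_of(shrink, struct client,
                                              shrinker);
              unsigned long count;

              if (sc->gfp_mask & __GFP_FS) {
                      /*
                       * GFP_KERNEL reclaim (e.g. kswapd) may block here;
                       * this is where the hung tasks above are sitting.
                       */
                      mutex_lock(&c->lock);
              } else if (!mutex_trylock(&c->lock)) {
                      /*
                       * GFP_NOIO reclaim: the allocating thread may
                       * itself hold c->lock, so blocking here would
                       * self-deadlock; report nothing reclaimable.
                       */
                      return 0;
              }

              count = c->n_buffers;
              mutex_unlock(&c->lock);
              return count;
      }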
    
    This change significantly increases responsiveness of the system while
    in this state.  It makes a real difference because it unblocks kswapd.
    In the bug report analyzed, kswapd was hung:
    
       kswapd0         D ffffffc000204fd8     0    72      2 0x00000000
       Call trace:
       [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
       [<ffffffc00090b794>] __schedule+0x440/0x6d8
       [<ffffffc00090bac0>] schedule+0x94/0xb4
       [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
       [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
       [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
       [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
       [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
       [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
       [<ffffffc00030e578>] balance_pgdat+0x328/0x508
       [<ffffffc00030eb7c>] kswapd+0x424/0x51c
       [<ffffffc00023f06c>] kthread+0x10c/0x114
       [<ffffffc000203dd0>] ret_from_fork+0x10/0x40
    
    By unblocking kswapd, memory pressure should be reduced.
    Suggested-by: David Rientjes <rientjes@google.com>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>