• Dave Chinner's avatar
    xfs: bufferhead chains are invalid after end_page_writeback · 28b783e4
    Dave Chinner authored
    In xfs_finish_page_writeback(), we have a loop that looks like this:
    
            do {
                    if (off < bvec->bv_offset)
                            goto next_bh;
                    if (off > end)
                            break;
                    bh->b_end_io(bh, !error);
    next_bh:
                    off += bh->b_size;
            } while ((bh = bh->b_this_page) != head);
    
    The b_end_io function is end_buffer_async_write(), which will call
    end_page_writeback() once all the buffers have marked as no longer
    under IO.  This issue here is that the only thing currently
    protecting both the bufferhead chain and the page from being
    reclaimed is the PageWriteback state held on the page.
    
    While we attempt to limit the loop to just the buffers covered by
    the IO, we still read from the buffer size and follow the next
    pointer in the bufferhead chain. There is no guarantee that either
    of these are valid after the PageWriteback flag has been cleared.
    Hence, loops like this are completely unsafe, and result in
    use-after-free issues. One such problem was caught by Calvin Owens
    with KASAN:
    
    .....
     INFO: Freed in 0x103fc80ec age=18446651500051355200 cpu=2165122683 pid=-1
      free_buffer_head+0x41/0x90
      __slab_free+0x1ed/0x340
      kmem_cache_free+0x270/0x300
      free_buffer_head+0x41/0x90
      try_to_free_buffers+0x171/0x240
      xfs_vm_releasepage+0xcb/0x3b0
      try_to_release_page+0x106/0x190
      shrink_page_list+0x118e/0x1a10
      shrink_inactive_list+0x42c/0xdf0
      shrink_zone_memcg+0xa09/0xfa0
      shrink_zone+0x2c3/0xbc0
    .....
     Call Trace:
      <IRQ>  [<ffffffff81e8b8e4>] dump_stack+0x68/0x94
      [<ffffffff8153a995>] print_trailer+0x115/0x1a0
      [<ffffffff81541174>] object_err+0x34/0x40
      [<ffffffff815436e7>] kasan_report_error+0x217/0x530
      [<ffffffff81543b33>] __asan_report_load8_noabort+0x43/0x50
      [<ffffffff819d651f>] xfs_destroy_ioend+0x3bf/0x4c0
      [<ffffffff819d69d4>] xfs_end_bio+0x154/0x220
      [<ffffffff81de0c58>] bio_endio+0x158/0x1b0
      [<ffffffff81dff61b>] blk_update_request+0x18b/0xb80
      [<ffffffff821baf57>] scsi_end_request+0x97/0x5a0
      [<ffffffff821c5558>] scsi_io_completion+0x438/0x1690
      [<ffffffff821a8d95>] scsi_finish_command+0x375/0x4e0
      [<ffffffff821c3940>] scsi_softirq_done+0x280/0x340
    
    
    Where the access is occuring during IO completion after the buffer
    had been freed from direct memory reclaim.
    
    Prevent use-after-free accidents in this end_io processing loop by
    pre-calculating the loop conditionals before calling bh->b_end_io().
    The loop is already limited to just the bufferheads covered by the
    IO in progress, so the offset checks are sufficient to prevent
    accessing buffers in the chain after end_page_writeback() has been
    called by the the bh->b_end_io() callout.
    
    Yet another example of why Bufferheads Must Die.
    
    cc: <stable@vger.kernel.org> # 4.7
    Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
    Reported-and-Tested-by: default avatarCalvin Owens <calvinowens@fb.com>
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
    Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
    
    28b783e4
xfs_aops.c 48.2 KB