• Christoph Hellwig's avatar
    xfs: fix error handling for synchronous writes · bfc60177
    Christoph Hellwig authored
    If we get an IO error on a synchronous superblock write, we attach an
    error release function to it so that when the last reference goes away
    the release function is called and the buffer is invalidated and
    unlocked. The buffer is left locked until the release function is
    called so that other concurrent users of the buffer will be locked out
    until the buffer error is fully processed.
    
    Unfortunately, for the superblock buffer the filesyetm itself holds a
    reference to the buffer which prevents the reference count from
    dropping to zero and the release function being called. As a result,
    once an IO error occurs on a sync write, the buffer will never be
    unlocked and all future attempts to lock the buffer will hang.
    
    To make matters worse, this problems is not unique to such buffers;
    if there is a concurrent _xfs_buf_find() running, the lookup will grab
    a reference to the buffer and then wait on the buffer lock, preventing
    the reference count from ever falling to zero and hence unlocking the
    buffer.
    
    As such, the whole b_relse function implementation is broken because it
    cannot rely on the buffer reference count falling to zero to unlock the
    errored buffer. The synchronous write error path is the only path that
    uses this callback - it is used to ensure that the synchronous waiter
    gets the buffer error before the error state is cleared from the buffer
    by the release function.
    
    Given that the only sychronous buffer writes now go through xfs_bwrite
    and the error path in question can only occur for a write of a dirty,
    logged buffer, we can move most of the b_relse processing to happen
    inline in xfs_buf_iodone_callbacks, just like a normal I/O completion.
    In addition to that we make sure the error is not cleared in
    xfs_buf_iodone_callbacks, so that xfs_bwrite can reliably check it.
    Given that xfs_bwrite keeps the buffer locked until it has waited for
    it and checked the error this allows to reliably propagate the error
    to the caller, and make sure that the buffer is reliably unlocked.
    
    Given that xfs_buf_iodone_callbacks was the only instance of the
    b_relse callback we can remove it entirely.
    
    Based on earlier patches by Dave Chinner and Ajeet Yadav.
    Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
    Reported-by: default avatarAjeet Yadav <ajeet.yadav.77@gmail.com>
    Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
    Signed-off-by: default avatarAlex Elder <aelder@sgi.com>
    bfc60177
xfs_buf_item.c 29.5 KB