    xfs: use dedicated log worker wq to avoid deadlock with cil wq · 696a5620
    Brian Foster authored
    The log covering background task used to be part of the xfssyncd
    workqueue. That workqueue was removed as of commit 5889608d ("xfs:
    syncd workqueue is no more") and the associated work item scheduled
    to the xfs-log wq. The latter is used for log buffer I/O completion.
    
    Since xfs_log_worker() can invoke a log flush, a deadlock is
    possible between the xfs-log and xfs-cil workqueues. Consider the
    following codepath from xfs_log_worker():
    
    xfs_log_worker()
      xfs_log_force()
        _xfs_log_force()
          xlog_cil_force()
            xlog_cil_force_lsn()
              xlog_cil_push_now()
                flush_work()
    
    The above is in xfs-log wq context and blocked waiting on the
    completion of an xfs-cil work item. Concurrently, the cil push in
    progress can end up blocked here:
    
    xlog_cil_push_work()
      xlog_cil_push()
        xlog_write()
          xlog_state_get_iclog_space()
            xlog_wait(&log->l_flush_wait, ...)
    
    The above is in xfs-cil context waiting on log buffer I/O
    completion, which executes in xfs-log wq context. In this scenario
    both workqueues are deadlocked waiting on each other.
    
    Add a new workqueue specifically for the high level log covering and
    ail pushing worker, as was the case prior to commit 5889608d.
    
    Diagnosed-by: David Jeffery <djeffery@redhat.com>
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>