• Kenny Yu's avatar
    tools: add tool to detect potential deadlocks in running programs · 66fb4d29
    Kenny Yu authored
    `deadlock_detector` is a new tool to detect potential deadlocks (lock order
    inversions) in a running process. The program attaches uprobes on
    `pthread_mutex_lock` and `pthread_mutex_unlock` to build a mutex wait directed
    graph, and then looks for a cycle in this graph. This graph has the following
    properties:
    
    - Nodes in the graph represent mutexes.
    - Edge (A, B) exists if there exists some thread T where lock(A) was called
      and lock(B) was called before unlock(A) was called.
    
    If there is a cycle in this graph, this indicates that there is a lock order
    inversion (potential deadlock). If the program finds a lock order inversion, the
    program will dump the cycle of mutexes, dump the stack traces where each mutex
    was acquired, and then exit.
    
    The format of the output uses a similar output as ThreadSanitizer (See example:
    https://github.com/google/sanitizers/wiki/ThreadSanitizerDeadlockDetector)
    
    This program can only find potential deadlocks that occur while the program is
    tracing the process. It cannot find deadlocks that may have occurred before the
    program was attached to the process.
    
    If the traced process has many mutexes and threads, this program will add a
    very large overhead because every mutex lock/unlock and clone call will be
    traced. This tool is meant for debugging only, and you should run this tool
    only on programs where the slowdown is acceptable.
    
    Note: This tool adds a dependency on `networkx` for the graph libraries
    (building a directed graph and cycle detection).
    
    Note: This tool does not work for shared mutexes or recursive mutexes.
    
    For shared (read-write) mutexes, a deadlock requires a cycle in the wait
    graph where at least one of the mutexes in the cycle is acquiring exclusive
    (write) ownership.
    
    For recursive mutexes, lock() is called multiple times on the same mutex.
    However, there is no way to determine if a mutex is a recursive mutex
    after the mutex has been created. As a result, this tool will not find
    potential deadlocks that involve only one mutex.
    66fb4d29
control 1.13 KB