    habanalabs: fix race between hl_get_compute_ctx() and hl_ctx_put() · 41021f72
    Tomer Tayar authored
    hl_get_compute_ctx() is used to get the pointer to the compute context
    from the hpriv object.
    The function is called in code paths that are not necessarily initiated
    by the user, so a context release can happen in parallel.
    This can lead to a race condition in which hl_get_compute_ctx()
    retrieves the context pointer and, just before it increments the context
    refcount, the context object is released, so freed memory is accessed.
    
    To avoid this race, add a mutex to protect the context pointer in hpriv.
    With this lock, hl_get_compute_ctx() will be able to detect if the
    context has been released or is about to be released.
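
The locking scheme described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the driver's code: `struct ctx`, `struct fpriv`, `fpriv_open()`, `ref_get_unless_zero()`, `get_compute_ctx()` and `ctx_put()` are hypothetical stand-ins, a pthread mutex replaces the kernel mutex, and a C11 atomic counter replaces the kernel refcount.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the context object. */
struct ctx {
    atomic_int refcount;
};

/* Hypothetical stand-in for the per-open private data (hpriv). */
struct fpriv {
    pthread_mutex_t ctx_lock; /* protects the ctx pointer below */
    struct ctx *ctx;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool ref_get_unless_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}

/* Create the context with one reference held by the owner. */
static int fpriv_open(struct fpriv *p)
{
    struct ctx *c = malloc(sizeof(*c));

    if (!c)
        return -1;
    atomic_init(&c->refcount, 1);
    pthread_mutex_init(&p->ctx_lock, NULL);
    p->ctx = c;
    return 0;
}

/*
 * Getter: the pointer is read and the reference is taken under the lock.
 * A non-NULL pointer seen here cannot have been freed yet, because the
 * release path must take the same lock to clear it before freeing.
 */
static struct ctx *get_compute_ctx(struct fpriv *p)
{
    struct ctx *c;

    pthread_mutex_lock(&p->ctx_lock);
    c = p->ctx;
    if (c && !ref_get_unless_zero(&c->refcount))
        c = NULL; /* release already in progress */
    pthread_mutex_unlock(&p->ctx_lock);
    return c;
}

/* Drop a reference; the last put clears the pointer, then frees. */
static void ctx_put(struct fpriv *p, struct ctx *c)
{
    int old = atomic_fetch_sub(&c->refcount, 1);

    assert(old > 0);
    if (old != 1)
        return;

    pthread_mutex_lock(&p->ctx_lock);
    p->ctx = NULL;
    pthread_mutex_unlock(&p->ctx_lock);
    free(c);
}
```

In this sketch a caller that receives NULL from `get_compute_ctx()` treats the context as gone, which covers both the "already released" case (pointer cleared) and the "about to be released" case (refcount already zero).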
    
    struct hl_ctx_mgr has a mutex for the contexts IDR with a similar
    "ctx_lock" name, so rename it to just "lock" to avoid confusion with
    the new lock.

    Signed-off-by: Tomer Tayar <ttayar@habana.ai>
    Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
    Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
habanalabs.h 126 KB