• Ioanna Alifieraki's avatar
    Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()" · edf28f40
    Ioanna Alifieraki authored
    This reverts commit a9795584.
    
    Commit a9795584 ("ipc,sem: remove uneeded sem_undo_list lock usage
    in exit_sem()") removes a lock that is needed.  This leads to a process
    looping infinitely in exit_sem() and can also lead to a crash.  There is
    a reproducer available in [1] and with the commit reverted the issue
    does not reproduce anymore.
    
    Using the reproducer found in [1] is fairly easy to reach a point where
    one of the child processes is looping infinitely in exit_sem between
    for(;;) and if (semid == -1) block, while it's trying to free its last
    sem_undo structure which has already been freed by freeary().
    
    Each sem_undo struct is on two lists: one per semaphore set (list_id)
    and one per process (list_proc).  The list_id list tracks undos by
    semaphore set, and the list_proc by process.
    
    Undo structures are removed either by freeary() or by exit_sem().  The
    freeary function is invoked when the user invokes a syscall to remove a
    semaphore set.  During this operation freeary() traverses the list_id
    associated with the semaphore set and removes the undo structures from
    both the list_id and list_proc lists.
    
    For this case, exit_sem() is called at process exit.  Each process
    contains a struct sem_undo_list (referred to as "ulp") which contains
    the head for the list_proc list.  When the process exits, exit_sem()
    traverses this list to remove each sem_undo struct.  As in freeary(),
    whenever a sem_undo struct is removed from list_proc, it is also removed
    from the list_id list.
    
    Removing elements from list_id is safe for both exit_sem() and freeary()
    due to sem_lock().  Removing elements from list_proc is not safe;
    freeary() locks &un->ulp->lock when it performs
    list_del_rcu(&un->list_proc) but exit_sem() does not (locking was
    removed by commit a9795584 ("ipc,sem: remove uneeded sem_undo_list
    lock usage in exit_sem()").
    
    This can result in the following situation while executing the
    reproducer [1] : Consider a child process in exit_sem() and the parent
    in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)).
    
     - The list_proc for the child contains the last two undo structs A and
       B (the rest have been removed either by exit_sem() or freeary()).
    
     - The semid for A is 1 and semid for B is 2.
    
     - exit_sem() removes A and at the same time freeary() removes B.
    
     - Since A and B have different semid sem_lock() will acquire different
       locks for each process and both can proceed.
    
    The bug is that they remove A and B from the same list_proc at the same
    time because only freeary() acquires the ulp lock. When exit_sem()
    removes A it makes ulp->list_proc.next to point at B and at the same
    time freeary() removes B setting B->semid=-1.
    
    At the next iteration of for(;;) loop exit_sem() will try to remove B.
    
    The only way to break from for(;;) is for (&un->list_proc ==
    &ulp->list_proc) to be true which is not. Then exit_sem() will check if
    B->semid=-1 which is and will continue looping in for(;;) until the
    memory for B is reallocated and the value at B->semid is changed.
    
    At that point, exit_sem() will crash attempting to unlink B from the
    lists (this can be easily triggered by running the reproducer [1] a
    second time).
    
    To prove this scenario instrumentation was added to keep information
    about each sem_undo (un) struct that is removed per process and per
    semaphore set (sma).
    
              CPU0                                CPU1
      [caller holds sem_lock(sma for A)]      ...
      freeary()                               exit_sem()
      ...                                     ...
      ...                                     sem_lock(sma for B)
      spin_lock(A->ulp->lock)                 ...
      list_del_rcu(un_A->list_proc)           list_del_rcu(un_B->list_proc)
    
    Undo structures A and B have different semid and sem_lock() operations
    proceed.  However they belong to the same list_proc list and they are
    removed at the same time.  This results into ulp->list_proc.next
    pointing to the address of B which is already removed.
    
    After reverting commit a9795584 ("ipc,sem: remove uneeded
    sem_undo_list lock usage in exit_sem()") the issue was no longer
    reproducible.
    
    [1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779
    
    Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com
    Fixes: a9795584 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()")
    Signed-off-by: default avatarIoanna Alifieraki <ioanna-maria.alifieraki@canonical.com>
    Acked-by: default avatarManfred Spraul <manfred@colorfullife.com>
    Acked-by: default avatarHerton R. Krzesinski <herton@redhat.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: <malat@debian.org>
    Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    edf28f40
sem.c 62.7 KB