• Robbie Ko's avatar
    Btrfs: send, fix infinite loop due to directory rename dependencies · a4390aee
    Robbie Ko authored
    When doing an incremental send, due to the need of delaying directory move
    (rename) operations we can end up in infinite loop at
    apply_children_dir_moves().
    
    An example scenario that triggers this problem is described below, where
    directory names correspond to the numbers of their respective inodes.
    
    Parent snapshot:
    
     .
     |--- 261/
           |--- 271/
                 |--- 266/
                       |--- 259/
                       |--- 260/
                       |     |--- 267
                       |
                       |--- 264/
                       |     |--- 258/
                       |           |--- 257/
                       |
                       |--- 265/
                       |--- 268/
                       |--- 269/
                       |     |--- 262/
                       |
                       |--- 270/
                       |--- 272/
                       |     |--- 263/
                       |     |--- 275/
                       |
                       |--- 274/
                             |--- 273/
    
    Send snapshot:
    
     .
     |-- 275/
          |-- 274/
               |-- 273/
                    |-- 262/
                         |-- 269/
                              |-- 258/
                                   |-- 271/
                                        |-- 268/
                                             |-- 267/
                                                  |-- 270/
                                                       |-- 259/
                                                       |    |-- 265/
                                                       |
                                                       |-- 272/
                                                            |-- 257/
                                                                 |-- 260/
                                                                 |-- 264/
                                                                      |-- 263/
                                                                           |-- 261/
                                                                                |-- 266/
    
    When processing inode 257 we delay its move (rename) operation because its
    new parent in the send snapshot, inode 272, was not yet processed. Then
    when processing inode 272, we delay the move operation for that inode
    because inode 274 is its ancestor in the send snapshot. Finally we delay
    the move operation for inode 274 when processing it because inode 275 is
    its new parent in the send snapshot and was not yet moved.
    
    When finishing processing inode 275, we start to do the move operations
    that were previously delayed (at apply_children_dir_moves()), resulting in
    the following iterations:
    
    1) We issue the move operation for inode 274;
    
    2) Because inode 262 depended on the move operation of inode 274 (it was
       delayed because 274 is its ancestor in the send snapshot), we issue the
       move operation for inode 262;
    
    3) We issue the move operation for inode 272, because it was delayed by
       inode 274 too (ancestor of 272 in the send snapshot);
    
    4) We issue the move operation for inode 269 (it was delayed by 262);
    
    5) We issue the move operation for inode 257 (it was delayed by 272);
    
    6) We issue the move operation for inode 260 (it was delayed by 272);
    
    7) We issue the move operation for inode 258 (it was delayed by 269);
    
    8) We issue the move operation for inode 264 (it was delayed by 257);
    
    9) We issue the move operation for inode 271 (it was delayed by 258);
    
    10) We issue the move operation for inode 263 (it was delayed by 264);
    
    11) We issue the move operation for inode 268 (it was delayed by 271);
    
    12) We verify if we can issue the move operation for inode 270 (it was
        delayed by 271). We detect a path loop in the current state, because
        inode 267 needs to be moved first before we can issue the move
        operation for inode 270. So we delay again the move operation for
        inode 270, this time we will attempt to do it after inode 267 is
        moved;
    
    13) We issue the move operation for inode 261 (it was delayed by 263);
    
    14) We verify if we can issue the move operation for inode 266 (it was
        delayed by 263). We detect a path loop in the current state, because
        inode 270 needs to be moved first before we can issue the move
        operation for inode 266. So we delay again the move operation for
        inode 266, this time we will attempt to do it after inode 270 is
        moved (its move operation was delayed in step 12);
    
    15) We issue the move operation for inode 267 (it was delayed by 268);
    
    16) We verify if we can issue the move operation for inode 266 (it was
        delayed by 270). We detect a path loop in the current state, because
        inode 270 needs to be moved first before we can issue the move
        operation for inode 266. So we delay again the move operation for
        inode 266, this time we will attempt to do it after inode 270 is
        moved (its move operation was delayed in step 12). So here we added
        again the same delayed move operation that we added in step 14;
    
    17) We attempt again to see if we can issue the move operation for inode
        266, and as in step 16, we realize we can not due to a path loop in
        the current state due to a dependency on inode 270. Again we delay
        inode's 266 rename to happen after inode's 270 move operation, adding
        the same dependency to the empty stack that we did in steps 14 and 16.
        The next iteration will pick the same move dependency on the stack
        (the only entry) and realize again there is still a path loop and then
        again the same dependency to the stack, over and over, resulting in
        an infinite loop.
    
    So fix this by preventing adding the same move dependency entries to the
    stack by removing each pending move record from the red black tree of
    pending moves. This way the next call to get_pending_dir_moves() will
    not return anything for the current parent inode.
    
    A test case for fstests, with this reproducer, follows soon.
    Signed-off-by: default avatarRobbie Ko <robbieko@synology.com>
    Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
    [Wrote changelog with example and more clear explanation]
    Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    a4390aee
send.c 164 KB