• Stephen Brennan's avatar
    assoc_array: Fix BUG_ON during garbage collect · d1dc8776
    Stephen Brennan authored
    A rare BUG_ON triggered in assoc_array_gc:
    
        [3430308.818153] kernel BUG at lib/assoc_array.c:1609!
    
    Which corresponded to the statement currently at line 1593 upstream:
    
        BUG_ON(assoc_array_ptr_is_meta(p));
    
    Using the data from the core dump, I was able to generate a userspace
    reproducer[1] and determine the cause of the bug.
    
    [1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc
    
    After running the iterator on the entire branch, an internal tree node
    looked like the following:
    
        NODE (nr_leaves_on_branch: 3)
          SLOT [0] NODE (2 leaves)
          SLOT [1] NODE (1 leaf)
          SLOT [2..f] NODE (empty)
    
    In the userspace reproducer, the pr_devel output when compressing this
    node was:
    
        -- compress node 0x5607cc089380 --
        free=0, leaves=0
        [0] retain node 2/1 [nx 0]
        [1] fold node 1/1 [nx 0]
        [2] fold node 0/1 [nx 2]
        [3] fold node 0/2 [nx 2]
        [4] fold node 0/3 [nx 2]
        [5] fold node 0/4 [nx 2]
        [6] fold node 0/5 [nx 2]
        [7] fold node 0/6 [nx 2]
        [8] fold node 0/7 [nx 2]
        [9] fold node 0/8 [nx 2]
        [10] fold node 0/9 [nx 2]
        [11] fold node 0/10 [nx 2]
        [12] fold node 0/11 [nx 2]
        [13] fold node 0/12 [nx 2]
        [14] fold node 0/13 [nx 2]
        [15] fold node 0/14 [nx 2]
        after: 3
    
    At slot 0, an internal node with 2 leaves could not be folded into the
    node, because there was only one available slot (slot 0). Thus, the
    internal node was retained. At slot 1, the node had one leaf, and was
    able to be folded in successfully. The remaining nodes had no leaves,
    and so were removed. By the end of the compression stage, there were 14
    free slots, and only 3 leaf nodes. The tree was ascended and then its
    parent node was compressed. When this node was seen, it could not be
    folded, due to the internal node it contained.
    
    The invariant for compression in this function is: whenever
    nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all
    leaf nodes. The compression step currently cannot guarantee this, given
    the corner case shown above.
    
    To fix this issue, retry compression whenever we have retained a node,
    and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second
    compression will then allow the node in slot 1 to be folded in,
    satisfying the invariant. Below is the output of the reproducer once the
    fix is applied:
    
        -- compress node 0x560e9c562380 --
        free=0, leaves=0
        [0] retain node 2/1 [nx 0]
        [1] fold node 1/1 [nx 0]
        [2] fold node 0/1 [nx 2]
        [3] fold node 0/2 [nx 2]
        [4] fold node 0/3 [nx 2]
        [5] fold node 0/4 [nx 2]
        [6] fold node 0/5 [nx 2]
        [7] fold node 0/6 [nx 2]
        [8] fold node 0/7 [nx 2]
        [9] fold node 0/8 [nx 2]
        [10] fold node 0/9 [nx 2]
        [11] fold node 0/10 [nx 2]
        [12] fold node 0/11 [nx 2]
        [13] fold node 0/12 [nx 2]
        [14] fold node 0/13 [nx 2]
        [15] fold node 0/14 [nx 2]
        internal nodes remain despite enough space, retrying
        -- compress node 0x560e9c562380 --
        free=14, leaves=1
        [0] fold node 2/15 [nx 0]
        after: 3
    
    Changes
    =======
    DH:
     - Use false instead of 0.
     - Reorder the inserted lines in a couple of places to put retained before
       next_slot.
    
    ver #2)
     - Fix typo in pr_devel, correct comparison to "<="
    
    Fixes: 3cb98950
    
     ("Add a generic associative array implementation.")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarStephen Brennan <stephen.s.brennan@oracle.com>
    Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
    cc: Andrew Morton <akpm@linux-foundation.org>
    cc: keyrings@vger.kernel.org
    Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.com/ # v1
    Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.com/
    
     # v2
    Reviewed-by: default avatarJarkko Sakkinen <jarkko@kernel.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    d1dc8776
assoc_array.c 51.9 KB