    btrfs: avoid iterating over all indexes when logging directory · fa4b8cb1
    Filipe Manana authored
    When logging a directory, after copying all directory index items from the
    subvolume tree to the log tree, we iterate over the subvolume tree to find
    all dir index items that are located in leaves COWed (or created) in the
    current transaction. If we keep logging a directory several times during
    the same transaction, we end up iterating over the same dir index items
    every time we log the directory, wasting time and adding extra lock
    contention on the subvolume tree.
    
    So just keep track of the last logged dir index offset in order to start
    the search for that index (+1) the next time the directory is logged, as
    dir index values (key offsets) come from a monotonically increasing
    counter.
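
    As a rough illustration of the idea, the sketch below models a directory
    as a sorted array of dir index offsets in user space. It is not the
    kernel code: struct dir_model, its fields and log_new_dir_indexes() are
    hypothetical names, and a binary search stands in for the b+tree search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a directory; not the kernel's inode struct. */
struct dir_model {
	const uint64_t *index_offsets;	/* dir index key offsets, ascending */
	size_t nr_items;		/* number of valid entries in index_offsets */
	uint64_t last_logged_index;	/* 0 in this model: nothing logged yet */
};

/*
 * Visit only the index items with an offset greater than the last one
 * logged, then remember the new high-water mark. Returns the number of
 * items visited, which is the work a repeated log of the directory costs.
 */
static size_t log_new_dir_indexes(struct dir_model *dir)
{
	uint64_t min_offset = dir->last_logged_index + 1;
	size_t lo = 0, hi = dir->nr_items;
	size_t visited = 0;

	/* Find the first item with offset >= min_offset. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (dir->index_offsets[mid] < min_offset)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Offsets only grow, so everything from here on is new. */
	for (size_t i = lo; i < dir->nr_items; i++) {
		dir->last_logged_index = dir->index_offsets[i];
		visited++;
	}

	return visited;
}
```

    In this model, logging a directory a second time after two more entries
    were added visits only those two entries, instead of walking every
    index item again.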
    
    The following test measures the difference before and after this change:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/nullb0
      MNT=/mnt/nullb0
    
      umount $DEV &> /dev/null
      mkfs.btrfs -f $DEV
      mount -o ssd $DEV $MNT
    
      # Time values in milliseconds.
      declare -a fsync_times
      # Total number of files added to the test directory.
      num_files=1000000
      # Fsync directory after every N files are added.
      fsync_period=100
    
      mkdir $MNT/testdir
    
      fsync_total_time=0
      for ((i = 1; i <= $num_files; i++)); do
            echo -n > $MNT/testdir/file_$i
    
            if [ $((i % fsync_period)) -eq 0 ]; then
                    start=$(date +%s%N)
                    xfs_io -c "fsync" $MNT/testdir
                    end=$(date +%s%N)
                    fsync_total_time=$((fsync_total_time + (end - start)))
                    fsync_times[i]=$(( (end - start) / 1000000 ))
                    echo -n -e "Progress $i / $num_files\r"
            fi
      done
    
      echo -e "\nHistogram of directory fsync duration in ms:\n"
    
      printf '%s\n' "${fsync_times[@]}" | \
         perl -MStatistics::Histogram -e '@d = <>; print get_histogram(\@d);'
    
      fsync_total_time=$((fsync_total_time / 1000000))
      echo -e "\nTotal time spent in fsync: $fsync_total_time ms\n"
      echo
    
      umount $MNT
    
    The test was run on a non-debug kernel (Debian's default kernel config)
    against a 15G null block device.
    
    Result before this change:
    
       Histogram of directory fsync duration in ms:
    
       Count: 10000
       Range:  3.000 - 362.000; Mean: 34.556; Median: 31.000; Stddev: 25.751
       Percentiles:  90th: 71.000; 95th: 77.000; 99th: 81.000
          3.000 -    5.278:  1423 #################################
          5.278 -    8.854:  1173 ###########################
          8.854 -   14.467:   591 ##############
         14.467 -   23.277:  1025 #######################
         23.277 -   37.105:  1422 #################################
         37.105 -   58.809:  2036 ###############################################
         58.809 -   92.876:  2316 #####################################################
         92.876 -  146.346:     6 |
        146.346 -  230.271:     6 |
        230.271 -  362.000:     2 |
    
       Total time spent in fsync: 350527 ms
    
    Result after this change:
    
       Histogram of directory fsync duration in ms:
    
       Count: 10000
       Range:  3.000 - 1088.000; Mean:  8.704; Median:  8.000; Stddev: 12.576
       Percentiles:  90th: 12.000; 95th: 14.000; 99th: 17.000
          3.000 -    6.007:  3222 #################################
          6.007 -   11.276:  5197 #####################################################
         11.276 -   20.506:  1551 ################
         20.506 -   36.674:    24 |
         36.674 -  201.552:     1 |
        201.552 -  353.841:     4 |
        353.841 - 1088.000:     1 |
    
       Total time spent in fsync: 92114 ms
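
    To put the two totals side by side, the small helpers below (hypothetical
    names, plain arithmetic on the numbers reported above) compute the ratio
    and the relative reduction. With 350527 ms before and 92114 ms after,
    that is roughly a 3.8x reduction in total fsync time, or about 74% less.

```c
#include <assert.h>

/* Ratio of the old total to the new total (higher means a bigger win). */
static double speedup(double before_ms, double after_ms)
{
	return before_ms / after_ms;
}

/* Percentage of the old total that was eliminated. */
static double reduction_percent(double before_ms, double after_ms)
{
	return 100.0 * (before_ms - after_ms) / before_ms;
}
```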

    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
btrfs_inode.h 18.6 KB