• Filipe Manana's avatar
    btrfs: do unsigned integer division in the extent buffer binary search loop · a724f313
    Filipe Manana authored
    In the search loop of the binary search function, we are doing a division
    by 2 of the sum of the high and low slots. Because the slots are integers,
    the generated assembly code for it is the following on x86_64:
    
       0x00000000000141f1 <+145>:	mov    %eax,%ebx
       0x00000000000141f3 <+147>:	shr    $0x1f,%ebx
       0x00000000000141f6 <+150>:	add    %eax,%ebx
       0x00000000000141f8 <+152>:	sar    %ebx
    
    It's a few more instructions than a simple right shift, because signed
    integer division needs to round towards zero. However we know that slots
    can never be negative (btrfs_header_nritems() returns an u32), so we
    can instead use unsigned types for the low and high slots and therefore
    use unsigned integer division, which results in a single instruction on
    x86_64:
    
       0x00000000000141f0 <+144>:	shr    %ebx
    
    So use unsigned types for the slots and therefore unsigned division.
    
    This is part of a small patchset comprised of the following two patches:
    
      btrfs: eliminate extra call when doing binary search on extent buffer
      btrfs: do unsigned integer division in the extent buffer binary search loop
    
    The following fs_mark test was run on a non-debug kernel (Debian's default
    kernel config) before and after applying the patchset:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/sdi
      MNT=/mnt/sdi
      MOUNT_OPTIONS="-o ssd"
      MKFS_OPTIONS="-O no-holes -R free-space-tree"
      FILES=100000
      THREADS=$(nproc --all)
      FILE_SIZE=0
    
      umount $DEV &> /dev/null
      mkfs.btrfs -f $MKFS_OPTIONS $DEV
      mount $MOUNT_OPTIONS $DEV $MNT
    
      OPTS="-S 0 -L 6 -n $FILES -s $FILE_SIZE -t $THREADS -k"
      for ((i = 1; i <= $THREADS; i++)); do
          OPTS="$OPTS -d $MNT/d$i"
      done
    
      fs_mark $OPTS
    
      umount $MNT
    
    Results before applying patchset:
    
      FSUse%        Count         Size    Files/sec     App Overhead
           2      1200000            0     174472.0         11549868
           4      2400000            0     253503.0         11694618
           4      3600000            0     257833.1         11611508
           6      4800000            0     247089.5         11665983
           6      6000000            0     211296.1         12121244
          10      7200000            0     187330.6         12548565
    
    Results after applying patchset:
    
      FSUse%        Count         Size    Files/sec     App Overhead
           2      1200000            0     207556.0         11393252
           4      2400000            0     266751.1         11347909
           4      3600000            0     274397.5         11270058
           6      4800000            0     259608.4         11442250
           6      6000000            0     238895.8         11635921
           8      7200000            0     211942.2         11873825
    Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
    Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
    Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    a724f313
ctree.h 22.9 KB