    Btrfs: add support for inode properties · 63541927
    Filipe David Borba Manana authored
    This change adds infrastructure to allow for generic properties for
    inodes. Properties are name/value pairs that can be associated with
    inodes for different purposes. They are stored as xattrs with the
    prefix "btrfs."
    
    Properties can be inherited - this means when a directory inode has
    inheritable properties set, these are added to new inodes created
    under that directory. Further, subvolumes can also have properties
    associated with them, and they can be inherited from their parent
    subvolume. Naturally, directory properties have priority over subvolume
    properties (in practice a subvolume property is just a regular
    property associated with the root inode, objectid 256, of the
    subvolume's fs tree).
    
    This change also adds one specific property implementation, named
    "compression". Its value can be "lzo" or "zlib", and it is an
    inheritable property.
    
    The corresponding changes to btrfs-progs were also implemented.
    A patch with xfstests for this feature will follow once there's
    agreement on this change/feature.
    
    Further, the script at the bottom of this commit message was used to
    do some benchmarks to measure any performance penalties of this feature.
    
    Basically the tests correspond to:
    
    Test 1 - create a filesystem and mount it with compress-force=lzo,
    then sequentially create N files of 64KiB each and measure how long it
    took to create them. Then unmount the filesystem, mount it again,
    perform an 'ls -lha' against the test directory holding the N files, and
    report the time the command took.
    
    Test 2 - create a filesystem and don't use any compression option when
    mounting it - instead set the compression property of the subvolume's
    root to 'lzo'. Then create N files of 64KiB, and report the time it took.
    Then unmount the filesystem, mount it again and perform an 'ls -lha' like
    in the former test. This means every single file ends up with a property
    (xattr) associated with it.
    
    Test 3 - same as test 2, but uses 4 properties - 3 are duplicates of the
    compression property and have no real effect other than adding more work
    when inheriting properties and taking more btree leaf space.
    
    Test 4 - same as test 3 but with 10 properties per file.
    
    Results (in seconds, averaged over 5 runs each) for different numbers
    of files N follow.
    
    * Without properties (test 1)
    
                        file creation time        ls -lha time
    10 000 files              3.49                   0.76
    100 000 files            47.19                   8.37
    1 000 000 files         518.51                 107.06
    
    * With 1 property (compression property set to lzo - test 2)
    
                        file creation time        ls -lha time
    10 000 files              3.63                    0.93
    100 000 files            48.56                    9.74
    1 000 000 files         537.72                  125.11
    
    * With 4 properties (test 3)
    
                        file creation time        ls -lha time
    10 000 files              3.94                    1.20
    100 000 files            52.14                   11.48
    1 000 000 files         572.70                  142.13
    
    * With 10 properties (test 4)
    
                        file creation time        ls -lha time
    10 000 files              4.61                    1.35
    100 000 files            58.86                   13.83
    1 000 000 files         656.01                  177.61
    
    The increased latencies with properties are essentially because of:

    *) When creating an inode, we now synchronously write 1 more item
       (an xattr item) for each property inherited from the parent dir
       (or subvolume). This could be done asynchronously, as we do for
       dir index items (delayed-inode.c), which could help reduce the
       file creation latency;
    
    *) With properties, we now have larger fs trees. For this particular
       test each xattr item uses 75 bytes of leaf space in the fs tree.
       This could be less by using a new item for xattr items, instead of
       the current btrfs_dir_item, since we could cut the 'location' and
       'type' fields (saving 18 bytes) and maybe 'transid' too (saving a
       total of 26 bytes per xattr item) from the btrfs_dir_item type.
    
    I also tried batching the xattr insertions (ignoring proper hash
    collision handling, since it didn't exist) when creating files that
    inherit properties from their parent inode/subvolume, but the end
    results were (surprisingly) essentially the same.
    
    Test script:
    
    $ cat test.pl
      #!/usr/bin/perl -w
    
      use strict;
      use Time::HiRes qw(time);
      use IO::Handle;  # provides autoflush() on lexical file handles (needed before perl 5.14)
      use constant NUM_FILES => 10_000;
      use constant FILE_SIZES => (64 * 1024);
      use constant DEV => '/dev/sdb4';
      use constant MNT_POINT => '/home/fdmanana/btrfs-tests/dev';
      use constant TEST_DIR => (MNT_POINT . '/testdir');
    
      system("mkfs.btrfs", "-l", "16384", "-f", DEV) == 0 or die "mkfs.btrfs failed!";
    
      # following line for testing without properties
      #system("mount", "-o", "compress-force=lzo", DEV, MNT_POINT) == 0 or die "mount failed!";
    
      # following 2 lines for testing with properties
      system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";
      system("btrfs", "prop", "set", MNT_POINT, "compression", "lzo") == 0 or die "set prop failed!";
    
      system("mkdir", TEST_DIR) == 0 or die "mkdir failed!";
      my ($t1, $t2);
    
      $t1 = time();
      for (my $i = 1; $i <= NUM_FILES; $i++) {
          my $p = TEST_DIR . '/file_' . $i;
          open(my $f, '>', $p) or die "Error opening file!";
          $f->autoflush(1);
          for (my $j = 0; $j < FILE_SIZES; $j += 4096) {
              print $f ('A' x 4096) or die "Error writing to file!";
          }
          close($f) or die "Error closing file!";
      }
      $t2 = time();
      print "Time to create " . NUM_FILES . " files: " . ($t2 - $t1) . " seconds.\n";
      system("umount", DEV) == 0 or die "umount failed!";
      system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";
    
      $t1 = time();
      system("bash -c 'ls -lha " . TEST_DIR . " > /dev/null'") == 0 or die "ls failed!";
      $t2 = time();
      print "Time to ls -lha all files: " . ($t2 - $t1) . " seconds.\n";
      system("umount", DEV) == 0 or die "umount failed!";

    Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: Chris Mason <clm@fb.com>