• Eric Biggers's avatar
    fs-verity: implement readahead of Merkle tree pages · fd39073d
    Eric Biggers authored
    When fs-verity verifies data pages, currently it reads each Merkle tree
    page synchronously using read_mapping_page().
    
    Therefore, when the Merkle tree pages aren't already cached, fs-verity
    causes an extra 4 KiB I/O request for every 512 KiB of data (assuming
    that the Merkle tree uses SHA-256 and 4 KiB blocks).  This results in
    more I/O requests and performance loss than is strictly necessary.
    
    Therefore, implement readahead of the Merkle tree pages.
    
    For simplicity, we take advantage of the fact that the kernel already
    does readahead of the file's *data*, just like it does for any other
    file.  Due to this, we don't really need a separate readahead state
    (struct file_ra_state) just for the Merkle tree, but rather we just need
    to piggy-back on the existing data readahead requests.
    
    We also only really need to bother with the first level of the Merkle
    tree, since the usual fan-out factor is 128, so normally over 99% of
    Merkle tree I/O requests are for the first level.
    
    Therefore, make fsverity_verify_bio() enable readahead of the first
    Merkle tree level, for up to 1/4 the number of pages in the bio, when it
    sees that the REQ_RAHEAD flag is set on the bio.  The readahead size is
    then passed down to ->read_merkle_tree_page() for the filesystem to
    (optionally) implement if it sees that the requested page is uncached.
    
    While we're at it, also make build_merkle_tree_level() set the Merkle
    tree readahead size, since it's easy to do there.
    
    However, for now don't set the readahead size in fsverity_verify_page(),
    since currently it's only used to verify holes on ext4 and f2fs, and it
    would need parameters added to know how much to read ahead.
    
    This patch significantly improves fs-verity sequential read performance.
    Some quick benchmarks with 'cat'-ing a 250MB file after dropping caches:
    
        On an ARM64 phone (using sha256-ce):
            Before: 217 MB/s
            After: 263 MB/s
            (compare to sha256sum of non-verity file: 357 MB/s)
    
        In an x86_64 VM (using sha256-avx2):
            Before: 173 MB/s
            After: 215 MB/s
            (compare to sha256sum of non-verity file: 223 MB/s)
    
    Link: https://lore.kernel.org/r/20200106205533.137005-1-ebiggers@kernel.orgReviewed-by: default avatarTheodore Ts'o <tytso@mit.edu>
    Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
    fd39073d
verify.c 9.45 KB