Commit 18888cf0 authored by Andrey Sidorov's avatar Andrey Sidorov Committed by Theodore Ts'o

ext4: speed up truncate/unlink by not using bforget() unless needed

Do not iterate over data blocks scanning for bh's to forget as they're
never exist. This improves time taken by unlink / truncate syscall.
Tested by continuously truncating file that is being written by dd.
Another test is rm -rf of linux tree while tar unpacks it. With
ordered data mode condition unlikely(!tbh) was always met in
ext4_free_blocks. With journal data mode tbh was found only few times,
so optimisation is also possible.

Unlinking fallocated 60G file after doing sync && echo 3 >
/proc/sys/vm/drop_caches && time rm --help

X86 before (linux 3.6-rc4):
# time rm -f test1
real    0m2.710s
user    0m0.000s
sys     0m1.530s

X86 after:
# time rm -f test1
real    0m0.644s
user    0m0.003s
sys     0m0.060s

MIPS before (linux 2.6.37):
# time rm -f test1
real    0m 4.93s
user    0m 0.00s
sys     0m 4.61s

MIPS after:
# time rm -f test1
real    0m 0.16s
user    0m 0.00s
sys     0m 0.06s
Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: default avatarAndrey Sidorov <qrxd43@motorola.com>
parent 59e31c15
...@@ -2318,10 +2318,13 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode, ...@@ -2318,10 +2318,13 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
unsigned short ee_len = ext4_ext_get_actual_len(ex); unsigned short ee_len = ext4_ext_get_actual_len(ex);
ext4_fsblk_t pblk; ext4_fsblk_t pblk;
int flags = EXT4_FREE_BLOCKS_FORGET; int flags = 0;
if (S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode)) if (S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode))
flags |= EXT4_FREE_BLOCKS_METADATA; flags |= EXT4_FREE_BLOCKS_METADATA | EXT4_FREE_BLOCKS_FORGET;
else if (ext4_should_journal_data(inode))
flags |= EXT4_FREE_BLOCKS_FORGET;
/* /*
* For bigalloc file systems, we never free a partial cluster * For bigalloc file systems, we never free a partial cluster
* at the beginning of the extent. Instead, we make a note * at the beginning of the extent. Instead, we make a note
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment