Commit 46b11ba9, authored by Andrew Morton, committed by Linus Torvalds

[PATCH] fix i_sem contention in sys_unlink()

Truncates can take a very long time, especially if there is a lot of
writeout happening, because truncate must wait on in-progress I/O.

And sys_unlink() is performing that truncate while holding the parent
directory's i_sem.  This basically shuts down new accesses to the entire
directory until the synchronous I/O completes.

In the testing I've been doing, that directory is /tmp, and this hurts.

So change sys_unlink() to perform the actual truncate outside i_sem.

When there is a continuous streaming write to the same disk, this patch
reduces the time for `make -j4 bzImage' from 370 seconds to 220.
parent b345e6d2
@@ -1659,12 +1659,19 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
 	return error;
 }
+/*
+ * Make sure that the actual truncation of the file will occur outside its
+ * directory's i_sem. Truncate can take a long time if there is a lot of
+ * writeout happening, and we don't want to prevent access to the directory
+ * while waiting on the I/O.
+ */
 asmlinkage long sys_unlink(const char * pathname)
 {
 	int error = 0;
 	char * name;
 	struct dentry *dentry;
 	struct nameidata nd;
+	struct inode *inode = NULL;
 	name = getname(pathname);
 	if(IS_ERR(name))
@@ -1683,6 +1690,9 @@ asmlinkage long sys_unlink(const char * pathname)
 	/* Why not before? Because we want correct error value */
 	if (nd.last.name[nd.last.len])
 		goto slashes;
+	inode = dentry->d_inode;
+	if (inode)
+		inode = igrab(inode);
 	error = vfs_unlink(nd.dentry->d_inode, dentry);
 exit2:
 	dput(dentry);
@@ -1693,6 +1703,8 @@ asmlinkage long sys_unlink(const char * pathname)
 exit:
 	putname(name);
+	if (inode)
+		iput(inode);	/* truncate the inode here */
 	return error;
 slashes:
......
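
The pattern in this patch (pin the object while the lock is held, drop the
final reference after the lock is released) can be illustrated with a small
userspace sketch. This is not kernel code: toy_lock, toy_inode, toy_grab(),
toy_put() and toy_unlink() are hypothetical names invented for illustration,
the "lock" is only a flag, and the slow truncate is reduced to recording
whether it ran while the directory lock was held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the directory's i_sem: only records "held". */
struct toy_lock {
	bool held;
};

/* Hypothetical stand-in for an inode: a refcount plus bookkeeping. */
struct toy_inode {
	int refcount;
	bool truncated;            /* the slow teardown has run */
	bool truncated_under_lock; /* ... and the directory lock was held then */
	struct toy_lock *dir_lock;
};

void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = false; }

/* Loose analogue of igrab(): take an extra reference. */
struct toy_inode *toy_grab(struct toy_inode *ino)
{
	ino->refcount++;
	return ino;
}

/* Loose analogue of iput(): the last reference triggers the slow truncate. */
void toy_put(struct toy_inode *ino)
{
	assert(ino->refcount > 0);
	if (--ino->refcount == 0) {
		ino->truncated = true;
		ino->truncated_under_lock = ino->dir_lock->held;
	}
}

/*
 * Remove the name under the directory lock. With defer_truncate set, an
 * extra reference keeps the inode alive until after the lock is dropped.
 * Returns true if the truncate ran while the directory lock was held.
 */
bool toy_unlink(struct toy_inode *ino, bool defer_truncate)
{
	struct toy_inode *pinned = NULL;

	toy_lock_acquire(ino->dir_lock);
	if (defer_truncate)
		pinned = toy_grab(ino);
	toy_put(ino);			/* drop the directory entry's reference */
	toy_lock_release(ino->dir_lock);

	if (pinned)
		toy_put(pinned);	/* final reference: truncate happens here */
	return ino->truncated_under_lock;
}
```

Calling toy_unlink() with defer_truncate false models the old behaviour,
where the last reference is dropped with the directory lock held. With
defer_truncate true, the final toy_put() runs only after the lock has been
released, which is what the igrab()/iput() pair in the diff arranges.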