• Andrea Arcangeli's avatar
    [PATCH] fix nr_unused accounting, and avoid recursing in iput with I_WILL_FREE set · 7f04c26d
    Andrea Arcangeli authored
     			list_move(&inode->i_list, &inode_in_use);
     		} else {
     			list_move(&inode->i_list, &inode_unused);
    +			inodes_stat.nr_unused++;
     		}
     	}
     	wake_up_inode(inode);
    
    Are you sure the above diff is correct? It was added somewhere between
    2.6.5 and 2.6.8. I think it's wrong.
    
    The only way I can imagine the i_count to be zero in the above path, is
    that I_WILL_FREE is set.  And if I_WILL_FREE is set, then we must not
    increase nr_unused.  So I believe the above change is buggy and it will
    definitely overstate the number of unused inodes and it should be backed
    out.
    
    Note that __writeback_single_inode before calling __sync_single_inode, can
    drop the spinlock and we can have both the dirty and locked bitflags clear
    here:
    
    		spin_unlock(&inode_lock);
    		__wait_on_inode(inode);
    		iput(inode);
    XXXXXXX
    		spin_lock(&inode_lock);
    	}
    	use inode again here
    
    a construct like the above makes zero sense from a reference counting
    standpoint.
    
    Either we don't ever use the inode again after the iput, or the
    inode_lock should be taken _before_ executing the iput (i.e. a __iput
    would be required). Taking the inode_lock after iput means the iget was
    useless if we keep using the inode after the iput.
    
    So the only chance the 2.6 was safe to call __writeback_single_inode
    with the i_count == 0, is that I_WILL_FREE is set (I_WILL_FREE will
    prevent the VM to free the inode in XXXXX).
    
    Potentially calling the above iput with I_WILL_FREE was also wrong
    because it would recurse in iput_final (the second mainline bug).
    
    The below (untested) patch fixes the nr_unused accounting, avoids recursing
    in iput when I_WILL_FREE is set and makes sure (with the BUG_ON) that we
    don't corrupt memory and that all holders that don't set I_WILL_FREE, keeps
    a reference on the inode!
    Signed-off-by: default avatarAndrea Arcangeli <andrea@suse.de>
    Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
    7f04c26d
inode.c 35.8 KB