    NFS: read-modify-write page updating · 38c73044
    Peter Staubach authored
    Hi.
    
    I have a proposal for possibly resolving this issue.
    
    I believe that this situation occurs due to the way that the
    Linux NFS client handles writes which modify partial pages.
    
    The Linux NFS client handles partial page modifications by
    allocating a page from the page cache, copying the data from
    the user level into the page, and then keeping track of the
    offset and length of the modified portions of the page.  The
    page is not marked as up to date because there are portions
    of the page which do not contain valid file contents.
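
    As a rough illustration of the bookkeeping described above, the
    userspace sketch below models a single write into a freshly
    allocated page.  The names sketch_page, sketch_partial_write and
    SKETCH_PAGE_SIZE are hypothetical and are not the kernel's own
    structures; a 4096 byte page size is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/* Hypothetical stand-in for a cached page; not the kernel's struct page. */
struct sketch_page {
    unsigned char data[SKETCH_PAGE_SIZE];
    size_t dirty_off;   /* start of the modified region */
    size_t dirty_len;   /* length of the modified region, 0 if none */
    bool uptodate;      /* true only if every byte holds valid file contents */
};

/*
 * Copy user data into the page and record which bytes were modified.
 * Returns false if the region is empty or does not fit in the page.
 */
static bool sketch_partial_write(struct sketch_page *pg, size_t off,
                                 const void *buf, size_t len)
{
    if (len == 0 || off >= SKETCH_PAGE_SIZE || len > SKETCH_PAGE_SIZE - off)
        return false;

    memcpy(pg->data + off, buf, len);
    pg->dirty_off = off;
    pg->dirty_len = len;

    /* Only a write covering the whole page leaves no stale bytes behind. */
    if (off == 0 && len == SKETCH_PAGE_SIZE)
        pg->uptodate = true;
    return true;
}
```

    After a 3 byte write at offset 100, the sketch records the
    region (100, 3) and leaves uptodate false, which is the state
    that later forces the write-before-read sequence.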
    
    When a read call comes in for a portion of the page, the
    contents of the page must be read in from the server.
    However, since the page may already contain some modified
    data, that modified data must be written to the server
    before the file contents can be read back in from the server.
    And, since the writing and reading cannot be done atomically,
    the data must be written and committed to stable storage on
    the server for safety purposes.  This means either a
    FILE_SYNC WRITE or an UNSTABLE WRITE followed by a COMMIT.
    This has been discussed at length previously.
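
    The ordering described above can be sketched in userspace as
    follows.  This is only a model under stated assumptions: the
    server is an in-memory buffer, the stable WRITE is a memcpy()
    plus a counter, and sketch_file and sketch_fill_page_for_read
    are hypothetical names, not kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/* Hypothetical model: one page of a file, on the server and in the cache. */
struct sketch_file {
    unsigned char server[SKETCH_PAGE_SIZE]; /* contents held by the server */
    unsigned char page[SKETCH_PAGE_SIZE];   /* client's cached page */
    size_t dirty_off;                       /* start of the modified region */
    size_t dirty_len;                       /* 0 when nothing is modified */
    bool uptodate;                          /* cached page fully valid? */
    unsigned stable_writes;                 /* stable WRITEs issued so far */
};

/* Make the cached page valid so that a read can be satisfied from it. */
static void sketch_fill_page_for_read(struct sketch_file *f)
{
    if (f->uptodate)
        return;

    if (f->dirty_len > 0) {
        /* Step 1: push the modified bytes to the server (stable WRITE). */
        memcpy(f->server + f->dirty_off, f->page + f->dirty_off,
               f->dirty_len);
        f->stable_writes++;
        f->dirty_len = 0;
    }

    /* Step 2: read the whole page back in from the server. */
    memcpy(f->page, f->server, SKETCH_PAGE_SIZE);
    f->uptodate = true;
}
```

    Each read of a partially modified page therefore costs one
    extra stable write in this model before the read itself.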
    
    This algorithm could be described as modify-write-read.  It
    is most efficient when the application only updates pages
    and does not read them.
    
    My proposed solution is to add a heuristic to decide whether
    to do this modify-write-read algorithm or switch to a
    read-modify-write algorithm when initially allocating the page
    in the write system call path.  The heuristic uses the modes
    that the file was opened with, the offset in the page at which
    the write starts, and the size of the region being written.
    
    If the file was opened for reading in addition to writing
    and the page would not be filled completely with data from
    the user level, then read in the old contents of the page
    and mark it as Uptodate before copying in the new data.  If
    the page would be completely filled with data from the user
    level, then there would be no reason to read in the old
    contents because they would just be copied over.
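
    A minimal sketch of this decision, assuming a 4096 byte page
    and using only the two conditions stated above, might look as
    follows.  The function name and parameters are hypothetical
    and are not taken from the patch itself.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/*
 * Decide whether to read the old page contents before copying in new data.
 *   opened_for_read: the file was opened for reading as well as writing
 *   off:             offset within the page where the write starts
 *   len:             number of bytes being written into this page
 */
static bool sketch_want_read_modify_write(bool opened_for_read,
                                          size_t off, size_t len)
{
    /* A write covering the whole page would overwrite anything read in. */
    bool fills_page = (off == 0 && len >= SKETCH_PAGE_SIZE);

    /* A write-only open gives no hint that the data will be read back. */
    if (!opened_for_read)
        return false;

    return !fills_page;
}
```

    For example, a 200 byte write at offset 100 to a file opened
    read-write selects read-modify-write, while the same write to
    a file opened write-only keeps the modify-write-read behaviour.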
    
    This would optimize for applications which randomly access
    and update portions of files.  The linkage editor for the
    C compiler is an example of such a thing.
    
    I tested the attached patch by using rpmbuild to build the
    current Fedora rawhide kernel.  The kernel without the
    patch generated about 269,500 WRITE requests.  The modified
    kernel containing the patch generated about 261,000 WRITE
    requests.  Thus, about 8,500 fewer WRITE requests were
    generated.  I suspect that many of these additional
    WRITE requests were probably FILE_SYNC requests to WRITE
    a single page, but I didn't test this theory.
    
    The difference between this patch and the previous one was
    to remove the unneeded PageDirty() test.  I then retested to
    ensure that the resulting system continued to behave as
    desired.
    
    	Thanx...
    
    		ps
    Signed-off-by: Peter Staubach <staubach@redhat.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
file.c 22.8 KB