Commit 693c241a authored by Joel Becker

ocfs2: No need to zero pages past i_size.

When ocfs2 fills a hole, it does so by allocating clusters.  When a
cluster is larger than the write, ocfs2 must zero the portions of the
cluster outside of the write.  If the clustersize is smaller than a
pagecache page, this is handled by the normal pagecache mechanisms, but
when the clustersize is larger than a page, ocfs2's write code will zero
the pages adjacent to the write.  This makes sure the entire cluster is
zeroed correctly.
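
As a rough illustration of the cluster-to-page relationship described above, here is a minimal userspace sketch. It is not the ocfs2 code: the names `sketch_pages_per_cluster` and `sketch_cluster_start_page` are hypothetical stand-ins, and the sketch assumes 4 KiB pages and a cluster at least as large as a page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pagecache pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Number of pagecache pages backing one cluster.
 * cluster_bits is log2 of the cluster size in bytes; the sketch
 * only handles clusters that are at least one page in size.
 */
static unsigned sketch_pages_per_cluster(unsigned cluster_bits)
{
	assert(cluster_bits >= SKETCH_PAGE_SHIFT);
	return 1u << (cluster_bits - SKETCH_PAGE_SHIFT);
}

/* Index of the first pagecache page backing cluster number cpos. */
static uint64_t sketch_cluster_start_page(unsigned cluster_bits, uint32_t cpos)
{
	assert(cluster_bits >= SKETCH_PAGE_SHIFT);
	return (uint64_t)cpos << (cluster_bits - SKETCH_PAGE_SHIFT);
}
```

Under these assumptions, a 64 KiB cluster (`cluster_bits == 16`) spans 16 pages, and cluster 3 starts at page index 48. A small write into that cluster would therefore leave up to 15 neighbouring pages for the write code to zero.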

Currently ocfs2 behaves exactly the same when writing past i_size.
However, this means ocfs2 is writing zeroed pages for portions of a new
cluster that are beyond i_size.  The page writeback code isn't expecting
this.  It treats all pages past the one containing i_size as left behind
due to a previous truncate operation.

Thankfully, ocfs2 calculates the number of pages it will be working on
up front.  The rest of the write code merely honors the original
calculation.  We can simply trim the number of pages to only cover the
actual file data.
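
The trimming step can be sketched in userspace C as follows. This is an illustrative model of the arithmetic only, assuming 4 KiB pages; `sketch_trim_num_pages` and its parameters are hypothetical names, and the `assert` stands in for the kernel's `BUG_ON`. It also assumes the write lies inside the cluster that begins at page `start`, so the end index is always greater than `start`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pagecache pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Return how many pages to work on for an allocating write.
 * start is the first page index of the cluster, pages_per_cluster
 * the untrimmed page count, and i_size the current file size.
 */
static unsigned long sketch_trim_num_pages(unsigned long start,
					   unsigned long pages_per_cluster,
					   int64_t user_pos, unsigned user_len,
					   int64_t i_size)
{
	int64_t write_end = user_pos + (int64_t)user_len;
	/* The last byte we could touch: end of write or i_size. */
	int64_t last_byte = write_end > i_size ? write_end : i_size;
	unsigned long end_index;

	assert(last_byte >= 1);
	/* Index one past the page that holds byte (last_byte - 1). */
	end_index = (unsigned long)((last_byte - 1) >> SKETCH_PAGE_SHIFT) + 1;

	if (start + pages_per_cluster > end_index)
		return end_index - start;
	return pages_per_cluster;
}
```

For example, with a 16-page cluster starting at page 48, a 100-byte write at offset 200000 into a file whose i_size is 100000 needs only one page under this model. The same write into a file whose i_size is already 1000000 keeps all 16 pages.
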

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Cc: stable@kernel.org
parent 5693486b
@@ -1100,23 +1100,37 @@ static int ocfs2_prepare_page_for_write(struct inode *inode, u64 *p_blkno,
  */
 static int ocfs2_grab_pages_for_write(struct address_space *mapping,
 				      struct ocfs2_write_ctxt *wc,
-				      u32 cpos, loff_t user_pos, int new,
+				      u32 cpos, loff_t user_pos,
+				      unsigned user_len, int new,
 				      struct page *mmap_page)
 {
 	int ret = 0, i;
-	unsigned long start, target_index, index;
+	unsigned long start, target_index, end_index, index;
 	struct inode *inode = mapping->host;
+	loff_t last_byte;
 
 	target_index = user_pos >> PAGE_CACHE_SHIFT;
 
 	/*
 	 * Figure out how many pages we'll be manipulating here. For
 	 * non allocating write, we just change the one
-	 * page. Otherwise, we'll need a whole clusters worth.
+	 * page. Otherwise, we'll need a whole clusters worth. If we're
+	 * writing past i_size, we only need enough pages to cover the
+	 * last page of the write.
 	 */
 	if (new) {
 		wc->w_num_pages = ocfs2_pages_per_cluster(inode->i_sb);
 		start = ocfs2_align_clusters_to_page_index(inode->i_sb, cpos);
+		/*
+		 * We need the index *past* the last page we could possibly
+		 * touch. This is the page past the end of the write or
+		 * i_size, whichever is greater.
+		 */
+		last_byte = max(user_pos + user_len, i_size_read(inode));
+		BUG_ON(last_byte < 1);
+		end_index = ((last_byte - 1) >> PAGE_CACHE_SHIFT) + 1;
+		if ((start + wc->w_num_pages) > end_index)
+			wc->w_num_pages = end_index - start;
 	} else {
 		wc->w_num_pages = 1;
 		start = target_index;
@@ -1773,7 +1787,7 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
 	 * that we can zero and flush if we error after adding the
 	 * extent.
 	 */
-	ret = ocfs2_grab_pages_for_write(mapping, wc, wc->w_cpos, pos,
+	ret = ocfs2_grab_pages_for_write(mapping, wc, wc->w_cpos, pos, len,
 					 cluster_of_pages, mmap_page);
 	if (ret) {
 		mlog_errno(ret);