- 22 Oct, 2011 1 commit
-
-
Dmitry Monakhov authored
Currently code make an impression what grow procedure is very complicated and some mythical paths, blocks are involved. But in fact grow in depth it relatively simple procedure: 1) Just create new meta block and copy root data to that block. 2) Convert root from extent to index if old depth == 0 3) Update root block pointer This patch does: - Reorganize code to make it more self explanatory - Do not pass path parameter to new_meta_block() in order to provoke allocation from inode's group because top-level block should site closer to it's inode, but not to leaf data block. [ This happens anyway, due to logic in mballoc; we should drop the path parameter from new_meta_block() entirely. -- tytso ] Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 21 Oct, 2011 1 commit
-
-
Dmitry Monakhov authored
Quota file is fs's metadata, so it is reasonable to permit use root resevation if necessary. This patch fix 265'th xfstest failure Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 20 Oct, 2011 2 commits
-
-
Kazuya Mio authored
If ext4_jbd2_file_inode() in mpage_da_map_and_submit() fails due to journal abort, this function returns to caller without unlocking the page. It leads to the deadlock, and the patch fixes this issue by calling mpage_da_submit_io(). Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Akira Fujita authored
If ext4_jbd2_file_inode() in ext4_ordered_write_end() fails for some reasons, this function returns to caller without unlocking the page. It leads to the deadlock, and the patch fixes this issue. Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 18 Oct, 2011 7 commits
-
-
H Hartley Sweeten authored
The third parameter to ext4_free_blocks is a struct buffer_head *. This parameter should be NULL not 0. This quiets the sparse noise: warning: Using plain integer as NULL pointer Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
H Hartley Sweeten authored
This quiets the sparse noise: warning: incorrect type in argument 2 (different address spaces) expected void const [noderef] <asn:1>*from got struct fstrim_range *<noident> warning: incorrect type in argument 1 (different address spaces) expected void [noderef] <asn:1>*to got struct fstrim_range *<noident> Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
H Hartley Sweeten authored
The function declarations in ext4.h are already marked extern, so it's not necessary to do so in the .c files. This quiets the sparse noise: warning: function 'ext4_flush_completed_IO' with external linkage has definition warning: function 'ext4_init_inode_table' with external linkage has definition Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Shaohua Li authored
Add block plug for ext4 .writepages. Though ext4 .writepages already handles request merge very well, block plug is still helpful to reduce block lock contention. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Darrick J. Wong authored
As part of startup, the MMP initialization code does this: mmp->mmp_seq = seq = cpu_to_le32(mmp_new_seq()); Next, mmp->mmp_seq is written out to disk, a delay happens, and then the MMP block is read back in and the sequence value is tested: if (seq != le32_to_cpu(mmp->mmp_seq)) { /* fail the mount */ On a LE system such as x86, the *le32* functions do nothing and this works. Unfortunately, on a BE system such as ppc64, this comparison becomes: if (cpu_to_le32(new_seq) != le32_to_cpu(cpu_to_le32(new_seq)) { /* fail the mount */ Except for a few palindromic sequence numbers, this test always causes the mount to fail, which makes MMP filesystems generally unmountable on ppc64. The attached patch fixes this situation. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Nikitas Angelinas authored
Current logic would print an error message only once, and then 'failed_writes' would stay at 1. Rework the loop to increment 'failed_writes' and print the error message every s_mmp_update_interval * 60 seconds, as intended according to the comment. Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com> Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Andreas Dilger <adilger@dilger.ca>
-
Nikitas Angelinas authored
sysname holds "Linux" by default, i.e. what appears when doing a "uname -s"; nodename should be used to print the machine's hostname, i.e. what is returned when doing a "uname -n" or "hostname", and what gethostname(2)/sethostname(2) manipulate, in order to notify the administrator of the node which is contending to mount the filesystem. Acked-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com> Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 17 Oct, 2011 1 commit
-
-
Tao Ma authored
Add a sanity check to make sure ix hasn't gone beyond the valid bounds of the extent block. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 08 Oct, 2011 6 commits
-
-
Fabrice Jouhaud authored
This fixes a bug which was introduced in dd68314c. The problem came from the test of the return value of proc_mkdir which is always false without procfs, and this would initialization of ext4. Signed-off-by: Fabrice Jouhaud <yargil@free.fr> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
ext4_extent_idx.e_block is __le32, so use le32_to_cpu() in ext4_ext_search_left(). Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
There are no users of the EXT4_IOC_WAIT_FOR_READONLY ioctl, and it is also broken. No one sets the set_ro_timer, no one wakes up us and our state is set to TASK_INTERRUPTIBLE not RUNNING. So remove it. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
The comment describing what ext4_ext_search_right() does is incorrect. We return 0 in *phys when *logical is the 'largest' allocated block, not smallest. Fix a few other typos while we're at it. Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Tao Ma <boyu.mt@taobao.com>
-
Lukas Czerner authored
For a long time now orlov is the default block allocator in the ext4. It performs better than the old one and no one seems to claim otherwise so we can safely drop it and make oldalloc and orlov mount option deprecated. This is a part of the effort to reduce number of ext4 options hence the test matrix. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Acl and user_xattr mount options are no longer needed since those features are enabled by default if configured in (seee commit ea663336). We can not easily deprecate mount options itself (since it is probably too early), but we can remove it from documentation first. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 06 Oct, 2011 2 commits
-
-
Tao Ma authored
Some of the error path in ext4_fill_super don't release the resouces properly. So this patch just try to release them in the right way. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
In commit 79a77c5a, we move ext4_mb_init_backend after the allocation of s_locality_group to avoid memory leak in error path, but there are still some other error paths in ext4_mb_init that need to do the same work. So this patch adds all the error patch for ext4_mb_init. And all the pointers are reset to NULL in case the caller may double free them. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Sep, 2011 20 commits
-
-
Aditya Kali authored
Currently, there exists a race between delayed allocated writes and the writeback when bigalloc feature is in use. The race was because we wanted to determine what blocks in a cluster are under delayed allocation and we were using buffer_delayed(bh) check for it. But, the writeback codepath clears this bit without any synchronization which resulted in a race and an ext4 warning similar to: EXT4-fs (ram1): ext4_da_update_reserve_space: ino 13, used 1 with only 0 reserved data blocks The race existed in two places. (1) between ext4_find_delalloc_range() and ext4_map_blocks() when called from writeback code path. (2) between ext4_find_delalloc_range() and ext4_da_get_block_prep() (where buffer_delayed(bh) is set. To fix (1), this patch introduces a new buffer_head state bit - BH_Da_Mapped. This bit is set under the protection of EXT4_I(inode)->i_data_sem when we have actually mapped the delayed allocated blocks during the writeout time. We can now reliably check for this bit inside ext4_find_delalloc_range() to determine whether the reservation for the blocks have already been claimed or not. To fix (2), it was necessary to set buffer_delay(bh) under the protection of i_data_sem. So, I extracted the very beginning of ext4_map_blocks into a new function - ext4_da_map_blocks() - and performed the required setting of bh_delay bit and the quota reservation under the protection of i_data_sem. These two fixes makes the checking of buffer_delay(bh) and buffer_da_mapped(bh) consistent, thus removing the race. Tested: I was able to reproduce the problem by running 'dd' and 'fsync' in parallel. Also, xfstests sometimes used to reproduce this race. After the fix both my test and xfstests were successful and no race (warning message) was observed. Google-Bug-Id: 4997027 Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Aditya Kali authored
This patch adds some tracepoints in ext4/extents.c and updates a tracepoint in ext4/inode.c. Tested: Built and ran the kernel and verified that these tracepoints work. Also ran xfstests. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Rename the function so it is more clear what is going on. Also rename the various variables so it's clearer what's happening. Also fix a missing blocks to cluster conversion when reading the number of reserved blocks for root. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
This function really claims a number of free clusters, not blocks, so rename it so it's clearer what's going on. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
This function really returns the number of clusters after initializing an uninitalized block bitmap has been initialized. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
This function really counts the free clusters reported in the block group descriptors, so rename it to reduce confusion. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
The field bg_free_blocks_count_{lo,high} in the block group descriptor has been repurposed to hold the number of free clusters for bigalloc functions. So rename the functions so it makes it easier to read and audit the block allocation and block freeing code. Note: at this point in bigalloc development we doesn't support online resize, so this also makes it really obvious all of the places we need to fix up to add support for online resize. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Now that we have implemented all of the changes needed for bigalloc, we can finally enable it! Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Aditya Kali authored
With bigalloc changes, the i_blocks value was not correctly set (it was still set to number of blocks being used, but in case of bigalloc, we want i_blocks to represent the number of clusters being used). Since the quota subsystem sets the i_blocks value, this patch fixes the quota accounting and makes sure that the i_blocks value is set correctly. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
The default group preallocation size had been previously set to 512 blocks/clusters, regardless of the block/cluster size. This is probably too big for large cluster sizes. So adjust the default so that it is 2 megabytes or 32 clusters, whichever is larger. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Convert the free_blocks to be free_clusters to make the final revised bigalloc changes easier to read/understand. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Convert the percpu counters s_dirtyblocks_counter and s_freeblocks_counter in struct ext4_super_info to be s_dirtyclusters_counter and s_freeclusters_counter. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
When we are truncating (as opposed unlinking) a file, we need to worry about partial truncates of a file, especially in the light of sparse files. The changes here make sure that arbitrary truncates of sparse files works correctly. Yeah, it's messy. Note that these functions will need to be revisted when the punch ioctl is integrated --- in fact this commit will probably have merge conflicts with the punch changes which Allison Henders and the IBM LTC have been working on. I will need to fix this up when either patch hits mainline. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
If we need to allocate a new block in ext4_ext_map_blocks(), the function needs to see if the cluster has already been allocated. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
The ext4_free_blocks() function now has two new flags that indicate whether a partial cluster at the beginning or the end of the block extents should be freed or not. That will be up the caller (i.e., truncate), who can figure out whether partial clusters at the beginning or the end of a block range can be freed. We also have to update the ext4_mb_free_metadata() and release_blocks_on_commit() machinery to be cluster-based, since it is used by ext4_free_blocks(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
In most of mballoc.c, we do everything in units of clusters, since the block allocation bitmaps and buddy bitmaps are all denominated in clusters. The one place where we do deal with absolute block numbers is in the code that handles the preallocation regions, since in the case of inode-based preallocation regions, the start of the preallocation region can't be relative to the beginning of the group. So this adds a bit of complexity, where pa_pstart and pa_lstart are block numbers, while pa_free, pa_len, and fe_len are denominated in units of clusters. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Certain parts of the ext4 code base, primarily in mballoc.c, use a block group number and offset from the beginning of the block group. This offset is invariably used to index into the allocation bitmap, so change the offset to be denominated in units of clusters. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Add bigalloc support to ext4_init_block_bitmap() and ext4_free_blocks_after_init(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
The function ext4_free_blocks_after_init() used to be a #define of ext4_init_block_bitmap(). This actually made it difficult to understand how the function worked, and made it hard make changes to support clusters. So as an initial cleanup, I've separated out the functionality of initializing block bitmap from calculating the number of free blocks in the new block group. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-