  1. 29 Sep, 2012 1 commit
  2. 05 Sep, 2012 2 commits
    • ext4: grow the s_group_info array as needed · 28623c2f
      Theodore Ts'o authored
      Previously we allocated the s_group_info array with enough space for
      any future possible growth of the file system via online resize.  This
      is unfortunate because it wastes memory, and it doesn't work for the
      meta_bg scheme, since there is no limit based on the number of
      reserved gdt blocks.  So add the code to grow the s_group_info array
      as needed.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: grow the s_flex_groups array as needed when resizing · 117fff10
      Theodore Ts'o authored
      Previously, we allocated the s_flex_groups array to the maximum size
      that the file system could be resized.  There were two problems with
      this approach.  First, it wasted memory in the common case where the
      file system was not resized.  Secondly, once we start allowing online
      resizing using the meta_bg scheme, there is no maximum size that the
      file system can be resized.  So instead, we need to grow the
      s_flex_groups array at online resize time.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
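The grow-as-needed approach described in the two commits above can be sketched in userspace C. This is only an illustration, not the kernel code: `struct flex_group`, `struct flex_table`, and `flex_table_grow()` are hypothetical names, and rounding the new size up to a power of two is an assumption made here to limit how often the array is reallocated.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-flex-group counters. */
struct flex_group {
    long free_clusters;
    long free_inodes;
    long used_dirs;
};

/* Hypothetical container; groups is NULL until the first allocation. */
struct flex_table {
    struct flex_group *groups;
    size_t size; /* number of allocated entries */
};

/* Round n up to the next power of two (returns 1 for n <= 1). */
static size_t roundup_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Make sure the table can hold at least `needed` entries.
 * Returns 0 on success, -1 on allocation failure (table left unchanged).
 */
int flex_table_grow(struct flex_table *t, size_t needed)
{
    size_t new_size;
    struct flex_group *new_groups;

    if (needed <= t->size)
        return 0; /* already large enough, nothing to do */

    new_size = roundup_pow2(needed);
    new_groups = calloc(new_size, sizeof(*new_groups));
    if (!new_groups)
        return -1;

    /* Copy the existing entries; the new tail is already zeroed. */
    if (t->groups)
        memcpy(new_groups, t->groups, t->size * sizeof(*new_groups));
    free(t->groups);
    t->groups = new_groups;
    t->size = new_size;
    return 0;
}
```

A caller would invoke `flex_table_grow()` with the new group count at resize time, so a file system that is never resized only pays for the entries it actually uses.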
  3. 17 Aug, 2012 2 commits
    • ext4: make the zero-out chunk size tunable · 67a5da56
      Zheng Liu authored
      Currently in ext4 the length of the zero-out chunk is set to 7 file
      system blocks.  But if an inode has uninitialized extents from using
      fallocate to preallocate space, and the workload issues many random
      writes, this can fragment the extents and cause the extent tree to
      grow unnecessarily.
      
      So create a new sysfs tunable, extent_max_zeroout_kb, which controls
      the maximum size where blocks will be zeroed out instead of creating a
      new uninitialized extent.  The default of this has been set to 32kb.
      
      CC: Zach Brown <zab@zabbo.net>
      CC: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
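The threshold check this commit describes might look roughly like the following. It is a sketch under stated assumptions: `max_zeroout_blocks()` and `should_zero_out()` are hypothetical helper names, and the block size is assumed to be at least 1 KiB (`blocksize_bits >= 10`).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert a limit given in KiB into file system blocks.
 * blocksize_bits is log2 of the block size, e.g. 12 for 4 KiB blocks.
 * Assumes blocksize_bits >= 10.
 */
uint32_t max_zeroout_blocks(uint32_t max_zeroout_kb, unsigned blocksize_bits)
{
    return max_zeroout_kb >> (blocksize_bits - 10);
}

/*
 * Decide whether a range of extent_len_blocks should be zeroed out
 * rather than being split off as a new uninitialized extent.
 */
bool should_zero_out(uint32_t extent_len_blocks, uint32_t max_zeroout_kb,
                     unsigned blocksize_bits)
{
    return extent_len_blocks <=
           max_zeroout_blocks(max_zeroout_kb, blocksize_bits);
}
```

With the 32kb default and 4 KiB blocks the limit works out to 8 blocks, so a 9-block range would not be zeroed out under this sketch.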
    • ext4: add max_dir_size_kb mount option · df981d03
      Theodore Ts'o authored
      Very large directories can cause significant performance problems, or
      perhaps even invoke the OOM killer, if the process is running in a
      highly constrained memory environment (whether it is VMs with a small
      amount of memory or a small memory cgroup).
      
      So it is useful, in cloud server/data center environments, to be able
      to set a filesystem-wide cap on the maximum size of a directory, to
      ensure that directories never get larger than a sane size.  We do this
      via a new mount option, max_dir_size_kb.  If there is an attempt to
      grow the directory larger than max_dir_size_kb, the system call will
      return ENOSPC instead.
      
      Google-Bug-Id: 6863013
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
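A minimal sketch of the size cap described above, in userspace C. The function name `check_dir_growth()` is hypothetical, and treating a limit of 0 as "no cap" is an assumption made for this illustration; the commit message does not say how an unset option is represented.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Check whether a directory may grow to new_size_bytes.
 * Returns 0 if the growth is allowed, -ENOSPC if it would exceed the cap.
 * Assumption: max_dir_size_kb == 0 means no cap is configured.
 */
int check_dir_growth(uint64_t new_size_bytes, uint32_t max_dir_size_kb)
{
    if (max_dir_size_kb != 0 && (new_size_bytes >> 10) > max_dir_size_kb)
        return -ENOSPC;
    return 0;
}
```

Returning ENOSPC means existing applications see the failure as an ordinary "no space" condition and need no new error handling.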
  4. 23 Jul, 2012 3 commits
    • ext4: convert last user of ext4_mark_super_dirty() to ext4_handle_dirty_super() · 044ce47f
      Jan Kara authored
      The last user of ext4_mark_super_dirty() in ext4_file_open() is so
      rarely hit that it may as well modify the superblock properly by
      journalling the change.  Change it and get rid of
      ext4_mark_super_dirty() as it's not needed anymore.
      
      Artem: small amendments.
      Artem: tested using xfstests for both journalled and non-journalled ext4.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    • ext4: remove dynamic array size in ext4_chksum() · 3108b54b
      Theodore Ts'o authored
      The ext4_chksum() inline function was using a dynamic array size,
      which is not legal C.  (It is a gcc extension.)
      
      Remove it.
      
      Cc: "Darrick J. Wong" <djwong@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: make quota as first class supported feature · 7c319d32
      Aditya Kali authored
      This patch adds support for quotas as a first class feature in ext4;
      which is to say, the quota files are stored in hidden inodes as file
      system metadata, instead of as separate files visible in the file system
      directory hierarchy.
      
      It is based on the proposal at:
      https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4
      
      This patch introduces a new feature - EXT4_FEATURE_RO_COMPAT_QUOTA
      which, when turned on, enables quota accounting at mount time
      itself. Also, the quota inodes are stored in two additional superblock
      fields.  Some changes introduced by this patch that should be pointed
      out are:
      
      1) Two new ext4-superblock fields - s_usr_quota_inum and
         s_grp_quota_inum for storing the quota inodes in use.
      2) Default quota inodes are: inode#3 for tracking user quota and inode#4
         for tracking group quota. The superblock fields can be set to use
         other inodes as well.
      3) If the QUOTA feature and corresponding quota inodes are set in
         superblock, the quota usage tracking is turned on at mount time. On
         'quotaon' ioctl, the quota limits enforcement is turned
         on. 'quotaoff' ioctl turns off only the limits enforcement in this
         case.
      4) When QUOTA feature is in use, the quota mount options 'quota',
         'usrquota', 'grpquota' are ignored by the kernel.
      5) mke2fs or tune2fs can be used to set the QUOTA feature and initialize
         quota inodes. The default reserved inodes will not be visible to user
         as regular files.
      6) The quota-tools will need to be modified to support hidden quota
         files on ext4. E2fsprogs will also include support for creating and
         fixing quota files.
      7) Support is only for the new V2 quota file format.
      Tested-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Johann Lombardi <johann@whamcloud.com>
      Signed-off-by: Aditya Kali <adityakali@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  5. 09 Jul, 2012 2 commits
    • ext4: add a new nolock flag in ext4_map_blocks · 729f52c6
      Zheng Liu authored
      The EXT4_GET_BLOCKS_NO_LOCK flag is added to indicate that we don't
      need to acquire the i_data_sem lock in ext4_map_blocks.  Meanwhile,
      this patch changes ext4_get_block() to not start a new journal,
      because when we do an overwrite dio there is no metadata that needs
      to be modified.

      We define a new function called ext4_get_block_write_nolock, which is
      used in dio overwrite nolock.  This function doesn't try to acquire
      the i_data_sem lock and doesn't start a new journal, as it only does
      a lookup.
      
      CC: Tao Ma <tm@tao.ma>
      CC: Eric Sandeen <sandeen@redhat.com>
      CC: Robin Dong <hao.bigrat@gmail.com>
      Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: fix overhead calculation used by ext4_statfs() · 952fc18e
      Theodore Ts'o authored
      Commit f975d6bc introduced a bug which caused ext4_statfs() to
      miscalculate the number of file system overhead blocks.  This causes
      the f_blocks field in the statfs structure to be larger than it should
      be.  This would in turn cause the "df" output to show the number of
      data blocks in the file system and the number of data blocks used to
      be larger than they should be.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
  6. 30 Jun, 2012 1 commit
  7. 31 May, 2012 1 commit
  8. 28 May, 2012 1 commit
  9. 27 May, 2012 1 commit
  10. 15 May, 2012 1 commit
  11. 29 Apr, 2012 9 commits
  12. 16 Apr, 2012 1 commit
  13. 20 Mar, 2012 1 commit
  14. 19 Mar, 2012 1 commit
  15. 05 Mar, 2012 3 commits
    • ext4: add comments to definition of ext4_io_end_t · 4188188b
      Curt Wohlgemuth authored
      This should make it clearer what this structure is used
      for, and how some of the (mutually exclusive) fields are
      used to keep page cache references.
      Signed-off-by: Curt Wohlgemuth <curtw@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: fix race between sync and completed io work · 491caa43
      Jeff Moyer authored
      The following command line will leave the aio-stress process unkillable
      on an ext4 file system (in my case, mounted on /mnt/test):
      
      aio-stress -t 20 -s 10 -O -S -o 2 -I 1000 /mnt/test/aiostress.3561.4 /mnt/test/aiostress.3561.4.20 /mnt/test/aiostress.3561.4.19 /mnt/test/aiostress.3561.4.18 /mnt/test/aiostress.3561.4.17 /mnt/test/aiostress.3561.4.16 /mnt/test/aiostress.3561.4.15 /mnt/test/aiostress.3561.4.14 /mnt/test/aiostress.3561.4.13 /mnt/test/aiostress.3561.4.12 /mnt/test/aiostress.3561.4.11 /mnt/test/aiostress.3561.4.10 /mnt/test/aiostress.3561.4.9 /mnt/test/aiostress.3561.4.8 /mnt/test/aiostress.3561.4.7 /mnt/test/aiostress.3561.4.6 /mnt/test/aiostress.3561.4.5 /mnt/test/aiostress.3561.4.4 /mnt/test/aiostress.3561.4.3 /mnt/test/aiostress.3561.4.2
      
      This is using the aio-stress program from the xfstests test suite.
      That particular command line tells aio-stress to do random writes to
      20 files from 20 threads (one thread per file).  The files are NOT
      preallocated, so you will get writes to random offsets within the
      file, thus creating holes and extending i_size.  It also opens the
      file with O_DIRECT and O_SYNC.
      
      On to the problem.  When an I/O requires unwritten extent conversion,
      it is queued onto the completed_io_list for the ext4 inode.  Two code
      paths will pull work items from this list.  The first is the
      ext4_end_io_work routine, and the second is ext4_flush_completed_IO,
      which is called via the fsync path (and O_SYNC handling, as well).
      There are two issues I've found in these code paths.  First, if the
      fsync path beats the work routine to a particular I/O, the work
      routine will free the io_end structure!  It does not take into account
      the fact that the io_end may still be in use by the fsync path.  I've
      fixed this issue by adding yet another IO_END flag, indicating that
      the io_end is being processed by the fsync path.
      
      The second problem is that the work routine will make an assignment to
      io->flag outside of the lock.  I have witnessed this result in a hang
      at umount.  Moving the flag setting inside the lock resolved that
      problem.
      
      The problem was introduced by commit b82e384c ("ext4: optimize
      locking for end_io extent conversion"), which first appeared in 3.2.
      As such, the fix should be backported to that release (probably along
      with the unwritten extent conversion race fix).
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      CC: stable@kernel.org
    • ext4: make ext4_show_options() be table-driven · 5a916be1
      Theodore Ts'o authored
      Consistently show the mount options which are non-default, so that
      /proc/mounts accurately shows the mount options that would be
      necessary to mount the file system in its current mode of operation.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
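One way a table-driven "show only non-default options" routine can be structured is sketched below. This is not the kernel implementation: the `OPT_*` flag values, `struct opt_entry`, `opt_table`, and `show_options()` are all hypothetical names chosen for the example, and the three options listed are just a sample.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical option bits, for illustration only. */
#define OPT_BARRIER    0x1u
#define OPT_DISCARD    0x2u
#define OPT_USER_XATTR 0x4u

/* One table row per option: its bit and the names for the set/clear state. */
struct opt_entry {
    unsigned flag;
    const char *on_name;
    const char *off_name;
};

static const struct opt_entry opt_table[] = {
    { OPT_BARRIER,    "barrier",    "nobarrier"    },
    { OPT_DISCARD,    "discard",    "nodiscard"    },
    { OPT_USER_XATTR, "user_xattr", "nouser_xattr" },
};

/*
 * Write ",name" for every option whose current state differs from the
 * default into buf (always NUL-terminated when len > 0).
 * Returns the number of characters written, excluding the terminator.
 */
size_t show_options(unsigned current, unsigned defaults, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len > 0)
        buf[0] = '\0';

    for (i = 0; i < sizeof(opt_table) / sizeof(opt_table[0]); i++) {
        const struct opt_entry *e = &opt_table[i];
        int n;

        if (((current ^ defaults) & e->flag) == 0)
            continue; /* matches the default, so do not print it */
        if (used >= len)
            break;

        n = snprintf(buf + used, len - used, ",%s",
                     (current & e->flag) ? e->on_name : e->off_name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0'; /* drop a partially written option */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

Because each option is a table row, adding a new one does not require touching the loop, and options left at their defaults are never printed.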
  16. 03 Mar, 2012 1 commit
  17. 02 Mar, 2012 1 commit
  18. 20 Feb, 2012 3 commits
    • ext4: fix race between unwritten extent conversion and truncate · 266991b1
      Jeff Moyer authored
      The following comment in ext4_end_io_dio caught my attention:
      
              /* XXX: probably should move into the real I/O completion handler */
              inode_dio_done(inode);
      
      The truncate code takes i_mutex, then calls inode_dio_wait.  Because the
      ext4 code path above will end up dropping the mutex before it is
      reacquired by the worker thread that does the extent conversion, it
      seems to me that the truncate can happen out of order.  Jan Kara
      mentioned that this might result in error messages in the system logs,
      but that should be the extent of the "damage."
      
      The fix is pretty straight-forward: don't call inode_dio_done until the
      extent conversion is complete.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: fix INCOMPAT feature codepoint reservation for INLINEDATA · 856cbcf9
      Theodore Ts'o authored
      In commit 9b90e5e0 I incorrectly reserved the wrong bit for
      EXT4_FEATURE_INCOMPAT_INLINEDATA per the discussion on the linux-ext4
      list on December 7, 2011.  The codepoint 0x2000 should be used for
      EXT4_FEATURE_INCOMPAT_USE_META_CSUM, so INLINEDATA will be assigned
      the value 0x8000.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
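The codepoint assignment from this commit can be written out as bit masks. The two values are the ones stated in the commit message; the helper `has_incompat_feature()` is a hypothetical name used only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Codepoints as stated in the commit message above. */
#define EXT4_FEATURE_INCOMPAT_USE_META_CSUM 0x2000u
#define EXT4_FEATURE_INCOMPAT_INLINEDATA    0x8000u

/*
 * Hypothetical helper: test whether a feature bit is set in the
 * superblock's incompat feature word.
 */
bool has_incompat_feature(uint32_t s_feature_incompat, uint32_t mask)
{
    return (s_feature_incompat & mask) != 0;
}
```

Since the two masks do not share any bits, a file system with one feature set is never mistaken for having the other.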
    • ext4: fix race when setting bitmap_uptodate flag · 813e5727
      Theodore Ts'o authored
      In ext4_read_{inode,block}_bitmap() we were setting bitmap_uptodate()
      before submitting the buffer for read.  This is bad, since we check
      bitmap_uptodate() without locking the buffer, and so if another
      process is racing with us, it's possible that they will think the
      bitmap is uptodate even though the read has not completed yet,
      resulting in inodes and blocks potentially getting allocated more than
      once if we get really unlucky.
      
      Addresses-Google-Bug: 2828254
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  19. 05 Jan, 2012 2 commits
  20. 04 Jan, 2012 3 commits