  1. 19 Oct, 2004 1 commit
  2. 16 Oct, 2004 1 commit
  3. 22 Sep, 2004 1 commit
  4. 17 Sep, 2004 1 commit
  5. 27 Aug, 2004 1 commit
  6. 30 Jun, 2004 1 commit
  7. 27 Jun, 2004 1 commit
    • [PATCH] ext3: direct-io transaction extending fix · 375f73f9
      Andrew Morton authored
      ext3_direct_io_get_blocks() is misinterpreting the return value from
      ext3_journal_extend(), and is consequently running out of buffer credits and
      going BUG on tremendously large direct-io writes.  Fix that up.
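
A minimal sketch of the return-value handling at issue, assuming a three-way convention in which a negative value is an error, zero means the handle was extended, and a positive value means the transaction is full and the handle must be restarted. All `toy_` names and the credit bookkeeping are hypothetical stand-ins, not the ext3/JBD code.

```c
#include <assert.h>

/* Hypothetical stand-in for a journal handle; not the kernel's handle_t. */
struct toy_handle {
    int credits;    /* buffer credits currently held by this handle */
    int txn_free;   /* credits the running transaction can still grant */
};

/* Assumed convention: < 0 error, 0 extended, > 0 transaction full. */
static int toy_extend(struct toy_handle *h, int nblocks)
{
    if (nblocks < 0)
        return -1;
    if (nblocks > h->txn_free)
        return 1;
    h->txn_free -= nblocks;
    h->credits += nblocks;
    return 0;
}

/* Restart: give up the old transaction and take credits from a fresh one. */
static void toy_restart(struct toy_handle *h, int nblocks, int txn_size)
{
    assert(nblocks <= txn_size);
    h->credits = nblocks;
    h->txn_free = txn_size - nblocks;
}

/* Make sure the handle holds at least `needed` credits. Returns 0 or < 0. */
static int toy_ensure_credits(struct toy_handle *h, int needed, int txn_size)
{
    int ret;

    if (h->credits >= needed)
        return 0;
    ret = toy_extend(h, needed - h->credits);
    if (ret < 0)
        return ret;                         /* real error: propagate it */
    if (ret > 0)
        toy_restart(h, needed, txn_size);   /* full: restart, do not carry on */
    return 0;
}
```

Treating the positive "transaction full" result as success is the kind of mistake that leaves a handle short of credits; the sketch keeps the three cases separate.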
      
      Also, I note that the really large direct-io writes can hold a transaction
      open for the entire duration, which can be minutes.  This violates ext3's
      attempt to commit data at regular intervals.  Fix that up by looking at the
      transaction state: if it's T_LOCKED, shut off the current handle so the
      pending commit can complete.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      375f73f9
  8. 18 Jun, 2004 1 commit
    • [PATCH] Ext3: Retry allocation after transaction commit (v2) · 5c4ad014
      Theodore Y. Ts'o authored
      Here is a reworked version of my patch to ext3 to retry certain filesystem
      operations after an ENOSPC error.  The ext3_should_retry_alloc() function will
      not wait on the currently running transaction if there is a currently active
      handle; hence this should avoid deadlocks in the Lustre use case.  The patch
      is versus BK-recent.
      
      I've also included a simple, reliable test case which demonstrates the problem
      this patch is intended to fix.  (Note that BK-recent is not sufficient to
      address this test case, and waiting on the committing transaction in
      ext3_new_block is also not sufficient.  Been there, tried that, didn't work.
      We need to do the full-bore retry from the top level.  The
      ext3_should_retry_alloc() will only wait on the committing transaction if
      there is no active handle; hence Lustre will probably also need to use
      ext3_should_retry_alloc() if it wants to reliably avoid this particular
      problem.)
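
A rough sketch of the top-level retry pattern described above, under stated assumptions: the filesystem model, the retry limit of three, and every `toy_` name are hypothetical and are not the ext3 implementation. It shows the two properties the message calls out: the retry happens from the top of the operation, and nothing waits on a commit while a handle is still active.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a small filesystem; not kernel data structures. */
struct toy_fs {
    int free_blocks;    /* allocatable right now */
    int pending_free;   /* freed in the running transaction, not yet reusable */
    int handle_active;  /* non-zero while the caller holds a journal handle */
};

/* A commit makes blocks freed in that transaction allocatable again. */
static void toy_force_commit(struct toy_fs *fs)
{
    fs->free_blocks += fs->pending_free;
    fs->pending_free = 0;
}

static int toy_alloc(struct toy_fs *fs)
{
    if (fs->free_blocks == 0)
        return -ENOSPC;
    fs->free_blocks--;
    return 0;
}

/* Retry only if a commit could help, a bounded number of times, and
 * never wait for a commit while a handle is still open (assumption). */
static int toy_should_retry(struct toy_fs *fs, int *retries)
{
    assert(retries);
    if (fs->pending_free == 0 || (*retries)++ >= 3)
        return 0;
    if (fs->handle_active)
        return 0;
    toy_force_commit(fs);
    return 1;
}

/* Top-level operation: the whole thing is retried, handle and all. */
static int toy_create(struct toy_fs *fs)
{
    int retries = 0;
    int err;

    do {
        fs->handle_active = 1;   /* start the handle */
        err = toy_alloc(fs);
        fs->handle_active = 0;   /* stop the handle before deciding to retry */
    } while (err == -ENOSPC && toy_should_retry(fs, &retries));
    return err;
}
```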
      
      #!/bin/sh
      #
      #
      TEST_DIR=/tmp
      IMAGE=$TEST_DIR/retry.img
      MNTPT=$TEST_DIR/retry.mnt
      TEST_SRC=/usr/projects/e2fsprogs/e2fsprogs/build
      MKE2FS_OPTS=""
      IMAGE_SIZE=8192
      
      umount $MNTPT
      dd if=/dev/zero of=$IMAGE bs=4k count=$IMAGE_SIZE
      mke2fs -j -F $MKE2FS_OPTS $IMAGE 
      
      test_log ()
      {
      	echo "$*"
      	logger -p local4.notice "$*"
      }
      
      mkdir -p $MNTPT
      mount -o loop -t ext3 $IMAGE $MNTPT
      test_log Retry test: BEGIN
      for i in `seq 1 3`
      do
      	test_log "Retry test: Loop $i"
      	echo 2 > /proc/sys/fs/jbd-debug
      	while ! mkdir -p $MNTPT/foo/bar
      	do
      		test_log "Retry test: mkdir failed"
      		sleep 1
      	done
      	echo 0 > /proc/sys/fs/jbd-debug
      	cp -r $TEST_SRC $MNTPT/foo/bar 2> /dev/null
      	rm -rf $MNTPT/*
      done
      umount $MNTPT
      test_log "Retry test: END"
      
      
      akpm@osdl.org
      
        Rework the code to make it a formal JBD API entry point.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      5c4ad014
  9. 25 May, 2004 1 commit
  10. 19 May, 2004 1 commit
    • [PATCH] use-before-uninitialized value in ext3(2)_find_goal · 83ee50f5
      Andrew Morton authored
      From: Mingming Cao <cmm@us.ibm.com>
      
      There is an uninitialized goal value being referenced in both the ext3 and
      ext2 find-goal functions (ext3_find_goal() and ext2_find_goal()).
      
      In the non-sequential write case, these functions check whether the goal
      value is non-zero before calling ext3(2)_find_near() to find the goal block
      to allocate.
      
      Since the goal value is uninitialized (and so typically non-zero),
      ext3(2)_find_near() is never called in the non-sequential write case, so
      ext3(2)_find_goal() fails to provide a goal block for random writes.
      
      ext3(2)_new_block() takes the junk goal value and will turn it into goal 0
      since it's normally beyond the filesystem block number limit.  The fix is
      trivial.
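
A simplified sketch of the pattern, assuming the fix amounts to starting from a known "no goal" value. The function shape and the `toy_` helpers are hypothetical and much reduced from the real find-goal functions.

```c
#include <assert.h>

/* Hypothetical stand-in for the "find a block near the inode" helper. */
static unsigned long toy_find_near(void)
{
    return 4242;
}

/* Pick an allocation goal. `sequential` and `next_alloc_goal` are
 * illustrative parameters, not the kernel's arguments. */
static unsigned long toy_find_goal(int sequential, unsigned long next_alloc_goal)
{
    unsigned long goal = 0;   /* initialized, so the check below is meaningful */

    if (sequential)
        goal = next_alloc_goal;
    if (!goal)                /* with a junk value this branch was never taken */
        goal = toy_find_near();
    assert(goal != 0);
    return goal;
}
```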
      83ee50f5
  11. 10 May, 2004 1 commit
    • [PATCH] Fix ext3 bogus ENOSPC · e736428d
      Andrew Morton authored
      With strange workloads which do a lot of quick truncation on small filesystems
      it is possible to get into a situation where there are free blocks on the
      disk, but they are not allocatable at this time due to their having been freed
      up in the current JBD transaction.  Applications get unexpected ENOSPC errors.
      
      We can fix that with this patch, originally by Andreas Dilger, which forces a
      single commit+retry when an ENOSPC is encountered.
      e736428d
  12. 22 Apr, 2004 1 commit
    • [PATCH] writeback livelock fix · 1ed73535
      Andrew Morton authored
      If a filesystem's ->writepage implementation repeatedly refuses to write the
      page and keeps redirtying it instead (reiserfs seems to do this), then the
      writeback logic can get stuck repeatedly trying to write the same page.
      
      Fix that up by correctly setting wbc->pages_skipped, to tell the writeback
      logic that things aren't working out.
      1ed73535
  13. 19 Apr, 2004 1 commit
    • [PATCH] direct-IO return type fixes · 59fed502
      Andrew Morton authored
      From: me, Badari Pulavarty <pbadari@us.ibm.com>
      
      Currently a direct-IO read or write of more than 2G on 64-bit machines is
      broken.  Replace int with ssize_t in various places to fix that up.
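
A small illustration of why the narrower type breaks large transfers. Here `int64_t` stands in for `ssize_t` on a 64-bit machine, and both `toy_` functions are hypothetical, not the direct-IO code paths.

```c
#include <assert.h>
#include <stdint.h>

/* Byte count carried in a 64-bit type: large transfers survive intact. */
static int64_t toy_dio_wide(int64_t nbytes)
{
    assert(nbytes >= 0);
    return nbytes;
}

/* Byte count squeezed through an int: anything above INT_MAX is mangled
 * by the out-of-range conversion. */
static int toy_dio_narrow(int64_t nbytes)
{
    assert(nbytes >= 0);
    return (int)nbytes;
}
```

With a 3 GiB request, the wide version returns the full count while the narrow version returns something else, which a caller would read as a short transfer or an error.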
      59fed502
  14. 17 Apr, 2004 1 commit
  15. 15 Apr, 2004 1 commit
    • [PATCH] ext3: journalled quotas · 2df2c24a
      Andrew Morton authored
      From: Jan Kara <jack@ucw.cz>
      
      Journalled quota support for ext3: The patch consists of two parts: ext3
      changes and changes in generic quota code.  The main idea of the changes is
      that a transaction is always started before any operation which changes the
      quota file, and that dirtying the quota causes it to be written to disk.
      These two changes ensure that a quota change is journalled into the same
      transaction as the file change, and hence after journal replay the quota is
      consistent with the filesystem state.  As inodes from the orphan list are
      deleted/truncated during journal replay, we have to do quota_on before the
      replay of the orphan list.  This problem is solved by additional ext3 mount
      options giving the quota file names and format.
      
      Some changes in generic code were also needed to ensure that the quota
      structure in the file is always allocated, so that ordinary quota operations
      (like adding/deleting a block/inode) need only a few blocks from the
      transaction.
      2df2c24a
  16. 20 Jan, 2004 1 commit
  17. 19 Jan, 2004 1 commit
    • [PATCH] bdev: switch to f_mapping · 32d66678
      Andrew Morton authored
      From: viro@parcelfarce.linux.theplanet.co.uk <viro@parcelfarce.linux.theplanet.co.uk>
      
      Many places used ->f_dentry->d_inode->i_mapping.  These are replaced with
      ->f_mapping.  For now, this covers just the places where we could literally
      do search-and-replace.
      32d66678
  18. 16 Oct, 2003 1 commit
    • [PATCH] ext3: i_disksize locking fix · 87e628f7
      Andrew Morton authored
      From: Alex Tomas <alex@clusterfs.com>
      
      The setting of i_disksize can race against concurrent invocations of
      ext3_get_block().  Moving this inside i_truncate_sem fixes it up.
      87e628f7
  19. 05 Oct, 2003 1 commit
    • [PATCH] ext3 block allocator locking fix · 3101501b
      Andrew Morton authored
      When the BKL was removed from ext3 we lost locking coverage for
      get_block()-versus-get_block().  Nobody seems to have hit the race because
      get_block() almost always runs under i_sem: only memory pressure-based
      writeout over a file hole runs outside i_sem.
      
      ext2 uses the dedicated i_meta_lock spinlock in the inode to provide the
      needed locking.  But ext3 already has an rwsem around all the get_block()
      activity to protect it from truncate-related races.
      
      So this patch just converts that rwsem into a semaphore, so concurrent
      get_block() can never occur.  This will be more efficient than adding the new
      spinlock.
      
      We lose the ability to have two threads run get_block() against the same file
      at the same time but again, that only happens during pageout over a hole
      anyway.
      
      (Kudos Alex Tomas for noticing the bug)
      3101501b
  20. 01 Oct, 2003 1 commit
    • [PATCH] dev_t forward compatibility fix · 1885b3f1
      Andrew Morton authored
      From: Andries.Brouwer@cwi.nl
      
      ext2 used a 32-bit field for dev_t, with possibly undefined storage
      following; thus, no action was required to go to 32-bit dev_t, but going to
      64-bit dev_t required some subtlety: 0 was written in the first word and
      the 64 bits in the following two.  Al truncated my 64-bit stuff to 32 bits
      but did not understand why there was this split, and wrote 0 followed by a
      single word.  We should at least zero the word following to have
      well-defined storage later.
      1885b3f1
  21. 23 Sep, 2003 1 commit
    • [PATCH] 32-bit dev_t: switch-over · 1c2c2a8f
      Alexander Viro authored
      Real conversion to 32bit dev_t.  Expansion to:
      	* mknod() - 32
      	* newstat() - 32 on 64bit platforms
      	* stat64() - 32 on mips, 64 on everything else (mips has weird struct
      stat64 and can't get more than 32 bits).  Note that right now the difference
      is purely theoretical - we don't have internal values above 32 bits, so
      huge_... vs. new_... only marks the places where 64bit conversion will need
      extra work.
      	* arch-dependent stat variants - depending on width available.
      	* ustat et al. - 32
      	* filesystems that can handle 32 bits right now - 32
      	* ext2 and ext3 - 32, with large dev_t inodes having 0 in the first
      element of i_data[] (where we store dev_t value for small device numbers) and
      keeping the value in the second element.
      	* nfsd - 32; it can be driven to 64, but we'll get several issues with
      NFSv2 support.
      	* RAID - 32
      	* devmapper - with v1 it's still 16 (nothing to do here), with v4 it's
      64.
      	* loop - 64
      	* initramfs - 32
      	* do_mounts code - 32.  Parts that scan devfs tree are using newstat()
      on 64bit platforms and stat64() on the rest (IOW, the latest stat variant on
      given platform).
      	* old_valid_dev()/new_valid_dev() added where needed (stat variants,
      mostly - we fail with -EOVERFLOW if values do not fit).
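
A sketch of the ext2/ext3 storage scheme described in the list above: small device numbers go in the first element, while large ones leave a zero in the first element and put the value in the second. The bit layout of the 32-bit encoding (12-bit major, 20-bit minor) and the zeroing of the third word are assumptions modeled on my understanding of the kernel helpers; the `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fits the old 16-bit encoding: 8-bit major, 8-bit minor. */
static int toy_old_valid(unsigned major, unsigned minor)
{
    return major < 256 && minor < 256;
}

/* Store a device number into the first words of an inode's data array. */
static void toy_store_dev(uint32_t slot[3], unsigned major, unsigned minor)
{
    assert(major < 4096 && minor < (1u << 20));
    if (toy_old_valid(major, minor)) {
        slot[0] = (major << 8) | minor;
        slot[1] = 0;
    } else {
        slot[0] = 0;    /* marks "large dev_t follows" */
        slot[1] = (minor & 0xff) | (major << 8) | ((minor & ~0xffu) << 12);
        slot[2] = 0;    /* keep the following word well-defined */
    }
}

/* Read it back, choosing the encoding by whether the first word is zero. */
static void toy_load_dev(const uint32_t slot[3], unsigned *major, unsigned *minor)
{
    if (slot[0]) {
        *major = (slot[0] >> 8) & 0xff;
        *minor = slot[0] & 0xff;
    } else {
        *major = (slot[1] & 0xfff00) >> 8;
        *minor = (slot[1] & 0xff) | ((slot[1] >> 12) & 0xfff00);
    }
}
```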
      1c2c2a8f
  22. 05 Sep, 2003 2 commits
  23. 19 Aug, 2003 1 commit
    • [PATCH] async write errors: report truncate and io errors on · fe7e689f
      Andrew Morton authored
      From: Oliver Xymoron <oxymoron@waste.org>
      
      These patches add the infrastructure for reporting asynchronous write errors
      to block devices to userspace.  Errors which are detected due to pdflush or VM
      writeout are reported at the next fsync, fdatasync, or msync on the given
      file, and on close if the error occurs in time.
      
      We do this by propagating any errors into page->mapping->error when they are
      detected.  In fsync(), msync(), fdatasync() and close() we return that error
      and zero it out.
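
The record-then-report idea can be sketched as follows. The structure and function names are hypothetical and only model the behaviour described above, where an asynchronous failure is remembered on the mapping and handed to the next synchronising call exactly once.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the per-file mapping state. */
struct toy_mapping {
    int error;   /* 0, or a negative errno recorded by background writeout */
};

/* Called from the asynchronous writeout path when an I/O fails. */
static void toy_writeout_failed(struct toy_mapping *m, int err)
{
    assert(err < 0);
    m->error = err;
}

/* Called from fsync-like paths: report the stored error and clear it. */
static int toy_fsync(struct toy_mapping *m)
{
    int err = m->error;

    m->error = 0;
    return err;
}
```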
      
      
      The Open Group say close() _may_ fail if an I/O error occurred while reading
      from or writing to the file system.  Well, in this implementation close() can
      return -EIO or -ENOSPC.  And in that case it will succeed, not fail - perhaps
      that is what they meant.
      
      
      There are three patches in this series and testing has only been performed
      with all three applied.
      fe7e689f
  24. 01 Aug, 2003 4 commits
    • [PATCH] don't init statics to 0 (fs/) · 9cf89014
      Randy Dunlap authored
      From: Leann Ogasawara <ogasawara@osdl.org>
      
      Remove the explicit zero initializers from static variables so they are
      placed in .bss instead of .data.
      9cf89014
    • [PATCH] direct-io support for XFS unwritten extents · 359a5de1
      Andrew Morton authored
      From: Nathan Scott <nathans@sgi.com>
      
      This patch adds a mechanism by which a filesystem can register an interest in
      the completion of direct I/O.  The completion routine will be given the
      inode, an offset and a length, and an optional filesystem-private field.
      
      We have extended the use of the buffer_head-based interface (i.e.
      get_block_t) for direct I/O such that the b_private field is now utilised.
      It is defined to be initially zero at the start of I/O, and will be passed
      into the filesystem unmodified by the VFS with each map request, while
      setting up the direct I/O.  Once I/O has completed the final value of this
      pointer will be passed into the filesystem's I/O completion handler.  This
      mechanism can be used to keep track of all of the mapping requests which
      encompass an individual direct I/O request.
      
      This has been implemented specifically for XFS, but is done so as to be as
      generic as possible.  XFS uses this mechanism to provide support for
      unwritten extents - these are file extents which have been pre-allocated
      on-disk, but not yet written to (once written, these become regular file
      extents, but only once I/O is complete).
      359a5de1
    • [PATCH] Fix race in ext3_getblk · 77b070cb
      Andrew Morton authored
      From: Alex Tomas <bzzz@tmi.comex.ru>
      
      ext3_getblk() memsets a newly allocated buffer, but forgets to check
      whether a different thread brought it uptodate while we waited for the
      buffer lock.
      
      It's OK normally because we're serialised by the page lock.  But lustre
      apparently is doing something different with getblk and hits this race.
      
      Plus I suspect it's racy with competing O_DIRECT writes.
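
A single-threaded model of the race, assuming the fix is a re-check of the uptodate state after the lock is obtained. The callback stands in for whatever another thread does while we sleep waiting for the buffer lock; all names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a buffer_head. */
struct toy_buffer {
    int uptodate;
    char data[8];
};

/* Initialise a newly allocated buffer. `while_waiting` models another
 * thread running while we wait for the buffer lock; it may be NULL. */
static void toy_init_new_buffer(struct toy_buffer *bh,
                                void (*while_waiting)(struct toy_buffer *))
{
    assert(bh != NULL);
    if (while_waiting)
        while_waiting(bh);      /* we were asleep; the world may have changed */
    if (!bh->uptodate) {        /* re-check before wiping the contents */
        memset(bh->data, 0, sizeof(bh->data));
        bh->uptodate = 1;
    }
}

/* A competing thread that fills the buffer and marks it uptodate. */
static void toy_other_writer(struct toy_buffer *bh)
{
    memcpy(bh->data, "payload", 8);
    bh->uptodate = 1;
}
```

Without the re-check, the memset would wipe out the data the other thread had just written.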
      77b070cb
    • [PATCH] ext3: avoid reading empty inode blocks · bca17d03
      Andrew Morton authored
      From: Alex Tomas <bzzz@tmi.comex.ru>
      
      ext3_get_inode_loc() now reads the inode's block only if:
      
        1) this inode has no copy in memory
        2) the inode's block contains other valid inode(s)
      
      This optimization avoids needless I/O in two cases:
      
      1) a just-allocated inode is the first valid inode in its block
      
      2) the kernel wants to write an inode, but the buffer to which the inode
         belongs has been freed by the VM
      bca17d03
  25. 25 Jul, 2003 1 commit
  26. 10 Jul, 2003 1 commit
    • [PATCH] i_size atomic access · eafe5916
      Andrew Morton authored
      From: Daniel McNeil <daniel@osdl.org>
      
      This adds i_seqcount to the inode structure and then uses i_size_read() and
      i_size_write() to provide atomic access to i_size.  This is a port of
      Andrea Arcangeli's i_size atomic access patch from 2.4.  This only uses the
      generic reader/writer consistent mechanism.
      
      Before:
      mnm:/usr/src/25> size vmlinux
         text    data     bss     dec     hex filename
      2229582 1027683  162436 3419701  342e35 vmlinux
      
      After:
      mnm:/usr/src/25> size vmlinux
         text    data     bss     dec     hex filename
      2225642 1027655  162436 3415733  341eb5 vmlinux
      
      3.9k more text, a lot of it fastpath :(
      
      It's a very minor bug, and the fix has a fairly non-minor cost.  The most
      compelling reason for fixing this is that writepage() checks i_size.  If it
      sees a transient value it may decide that page is outside i_size and will
      refuse to write it.  Lost user data.
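
A single-threaded model of the sequence-counter protocol behind this kind of atomic access, assuming a 64-bit size stored as two 32-bit halves. It deliberately omits the memory barriers and atomic accesses a real implementation needs, and every name is hypothetical. A reader accepts a value only if the counter was even and unchanged across the read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inode fragment: the size is split so a torn read is possible. */
struct toy_inode {
    unsigned seq;        /* even: stable, odd: a write is in progress */
    uint32_t size_lo;
    uint32_t size_hi;
};

static void toy_write_begin(struct toy_inode *i)
{
    assert((i->seq & 1) == 0);
    i->seq++;
}

static void toy_write_end(struct toy_inode *i)
{
    assert(i->seq & 1);
    i->seq++;
}

/* One read attempt. Returns 1 and stores the size if the snapshot is
 * consistent, 0 if the caller has to try again. */
static int toy_size_try_read(const struct toy_inode *i, uint64_t *out)
{
    unsigned start = i->seq;
    uint64_t v = ((uint64_t)i->size_hi << 32) | i->size_lo;

    if ((start & 1) || start != i->seq)
        return 0;
    *out = v;
    return 1;
}
```

A real reader would loop on the try-read until it succeeds; the single attempt is exposed here so the half-written state can be observed without threads.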
      eafe5916
  27. 03 Jul, 2003 1 commit
  28. 25 Jun, 2003 1 commit
    • [PATCH] ext3: fix page lock vs journal_start ranking bug · 30276fd6
      Andrew Morton authored
      ext3_block_truncate_page() is calling grab_cache_page() inside a JBD
      transaction.  This is wrong, because transactions nest inside lock_page().
      
      The deadlock is against shrink_list->ext3_journalled_writepage->journal_start.
      
      This was not noticed before because we never used to journal writepage() data
      in journalled-data mode.  And because the deadlock against
      generic_file_write() is covered up by i_sem.
      
      Rework things so that we lock the page prior to starting a transaction.
      30276fd6
  29. 20 Jun, 2003 1 commit
  30. 18 Jun, 2003 6 commits
    • [PATCH] ext3: disable O_DIRECT in journalled-data mode · 45c22f8f
      Andrew Morton authored
      We cannot sensibly support O_DIRECT reads or writes when all writes are
      journalled.
      
      This is because the VFS explicitly avoids syncing the file metadata during
      O_DIRECT reads and writes.  ext3 with journalled data will leave pending
      changes in memory and they will overwrite the results of O_DIRECT writes, and
      O_DIRECT reads will not return the latest data.
      
      Setting the a_op to null will cause opens and fcntl(F_SETFL) to return
      -EINVAL if O_DIRECT is requested.
      45c22f8f
    • [PATCH] ext3: fix data=journal for small blocksize · 319a1ad4
      Andrew Morton authored
      Fix various problems which cropped up due to MAP_SHARED traffic on
      data=journal with blocksize < PAGE_CACHE_SIZE.
      
      All relate to handling the "pending truncate" buffers outside i_size.
      319a1ad4
    • [PATCH] ext3: add a dump_stack() · 4308a50e
      Andrew Morton authored
      Add a dump_stack() to a can't-happen path which happened during development.
      4308a50e
    • [PATCH] ext3: fix data=journal mode · de285c52
      Andrew Morton authored
      ext3's fully data-journalled mode has been broken for a year.  This patch
      fixes it up.
      
      The prepare_write/commit_write/writepage implementations have been split up.
      Instead of having each function handle all three journalling modes we now have
      three separate sets of address_space_operations.
      
      The problematic part of data=journal is MAP_SHARED writepage traffic: pages
      which don't have buffers.  In 2.4 these were cheatingly treated as
      data-ordered buffers and that caused several nasty problems.
      
      Here we do it properly: writepage traffic is fully journalled.  This means
      that the various workarounds for the 2.4 scheme can be removed, when I
      remember where they all are.
      
      The PG_checked flag has been borrowed: it is set in the atomic set_page_dirty
      a_op to tell the subsequent writepage() that this page needs to have buffers
      attached, dirtied and journalled.
      
      This rather defines PG_checked as "fs-private info in page->flags" and it
      should be renamed sometime.
      de285c52
    • [PATCH] ext3: ext3_writepage race fix · dd71e33f
      Andrew Morton authored
      After ext3_writepage() has called block_write_full_page() it will walk the
      page's buffer ring dropping the buffer_head refcounts.
      
      It does this wrong - on the final loop it will dereference the buffer_head
      which it just dropped the refcount on.  Poisoned oopses have been seen
      against bh->b_this_page.
      
      Change it to take a local copy of b_this_page prior to dropping the bh's
      refcount.
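
The fix pattern can be sketched with a toy circular ring. The structure and helpers are hypothetical; dropping the last reference is modelled by poisoning the link rather than freeing memory, so that touching the buffer afterwards would visibly go wrong.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer_head on a page's circular ring. */
struct toy_bh {
    int refcount;
    struct toy_bh *b_this_page;   /* next buffer in the ring */
};

/* Drop a reference. Once the count hits zero the buffer may go away at
 * any time, modelled here by poisoning its link. */
static void toy_put_bh(struct toy_bh *bh)
{
    assert(bh->refcount > 0);
    if (--bh->refcount == 0)
        bh->b_this_page = NULL;
}

/* Walk the ring dropping one reference per buffer. Returns the number
 * of buffers visited. */
static int toy_drop_ring_refs(struct toy_bh *head)
{
    struct toy_bh *bh = head;
    int dropped = 0;

    do {
        struct toy_bh *next = bh->b_this_page;  /* local copy before the put */

        toy_put_bh(bh);
        dropped++;
        bh = next;              /* never read bh->b_this_page after the put */
    } while (bh != head);
    return dropped;
}
```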
      dd71e33f
    • [PATCH] ext3: move lock_kernel() down into the JBD layer. · 3307fbd1
      Andrew Morton authored
      This is the start of the ext3 scalability rework.  It basically comes in two
      halves:
      
      - ext3 BKL/lock_super removal and scalable inode/block allocators
      
      - JBD locking rework.
      
      The ext3 scalability work was completed a couple of months ago.
      
      The JBD rework has been stable for a couple of weeks now.  My gut feeling is
      that there should be one, maybe two bugs left in it, but no problems have
      been discovered...
      
      
      Performance-wise, throughput is increased by up to 2x on dual CPU.  10x on
      16-way has been measured.  Given that current ext3 is able to chew two whole
      CPUs spinning on locks on a 4-way, that wasn't especially surprising.
      
      These patches were prepared by Alex Tomas <bzzz@tmi.comex.ru> and myself.
      
      
      First patch: ext3 lock_kernel() removal.
      
      The only reason why ext3 takes lock_kernel() is because it is required by the
      JBD API.
      
      The patch removes the lock_kernel() calls from ext3 and pushes them down into
      JBD itself.
      3307fbd1
  31. 02 Jun, 2003 1 commit