  1. 07 Dec, 2004 1 commit
    • JFS: flush new iag from bd_inode's mapping · c4ede418
      Dave Kleikamp authored
      This is a fix to help jfs work with grub.  A new IAG is created in
      the bd_inode's mapping, but subsequently modified in a different
      mapping.  We should invalidate the former page to keep grub from
      using that cached page.  It isn't useful to have it cached anyway,
      since jfs will never access it again through that mapping.
      Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
  2. 26 Oct, 2004 1 commit
  3. 18 Aug, 2004 1 commit
  4. 11 Jul, 2004 1 commit
  5. 07 Jul, 2004 1 commit
  6. 24 Mar, 2004 1 commit
    • JFS: Prevent hang in __lock_metapage · 135e21a5
      Dave Kleikamp authored
      Remove the hold_metapage call from txLog to prevent a hang.
      While investigating this one, I audited all functions that held
      metapage locks and found several error paths that did not release
      them correctly.  These are fixed as well.
  7. 26 Feb, 2004 1 commit
  8. 07 Jan, 2004 1 commit
    • [PATCH] don't clear i_sb · 3434a16e
      Dave Kleikamp authored
      From: Christoph Hellwig <hch@lst.de>
      
      JFS currently clears i_sb in some error paths which can make the
      core kernel OOPS because it may never be NULL.  Noticed because some
      IBM people try to "fix" the core kernel for it now..
  9. 08 Oct, 2003 1 commit
    • JFS: Improved error handling · 6349fc5a
      Dave Kleikamp authored
      This patch replaces many assert statements, which caused a BUG(), with
      improved code to mark the superblock dirty and then proceed as specified by
      the errors= mount flag (as ext2 and ext3 do).  JFS's default for the errors
      option is "remount-ro" in order to prevent additional data corruption when a
      problem is found.
      
      These asserts are usually triggered by on-disk data corruption.  By marking
      the superblock dirty, fsck will perform a complete check on the file system
      and correct the problems, rather than simply replaying the journal, inviting
      later trouble.
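
The policy described above can be sketched in a few lines of C. This is a minimal illustration, not the JFS code: the `demo_` names, the struct fields, and the return convention are all hypothetical, and the three option values are assumed to mirror the continue / remount-ro / panic choices that ext2 and ext3 offer.

```c
/* Hypothetical stand-ins for the errors= mount option values. */
enum demo_errors_opt {
	DEMO_ERRORS_CONTINUE,
	DEMO_ERRORS_REMOUNT_RO,   /* the default described in the message */
	DEMO_ERRORS_PANIC
};

/* Hypothetical, heavily reduced superblock state. */
struct demo_sb {
	int dirty;       /* set so that fsck does a full check */
	int read_only;   /* set when the file system is remounted read-only */
	enum demo_errors_opt errors;
};

/*
 * React to detected on-disk corruption instead of asserting.
 * Returns 1 if the caller should halt the system, 0 otherwise.
 */
static int demo_handle_error(struct demo_sb *sb)
{
	sb->dirty = 1;   /* always mark the superblock dirty first */

	switch (sb->errors) {
	case DEMO_ERRORS_PANIC:
		return 1;
	case DEMO_ERRORS_REMOUNT_RO:
		sb->read_only = 1;
		return 0;
	case DEMO_ERRORS_CONTINUE:
	default:
		return 0;
	}
}
```

Marking the superblock dirty happens before the option is consulted, so a later fsck performs a complete check whichever policy is chosen.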
      
      Submitted by Karl Rister & Dave Kleikamp
  10. 27 Sep, 2003 1 commit
  11. 23 Sep, 2003 1 commit
    • [PATCH] 32-bit dev_t: switch-over · 1c2c2a8f
      Alexander Viro authored
      Real conversion to 32bit dev_t.  Expansion to:
      	* mknod() - 32
      	* newstat() - 32 on 64bit platforms
      	* stat64() - 32 on mips, 64 on everything else (mips has weird struct
      stat64 and can't get more than 32 bits).  Note that right now the difference
      is purely theoretical - we don't have internal values above 32 bits, so
      huge_... vs. new_... only marks the places where 64bit conversion will need
      extra work.
      	* arch-dependent stat variants - depending on width available.
      	* ustat et.al. - 32
      	* filesystems that can handle 32 bits right now - 32
      	* ext2 and ext3 - 32, with large dev_t inodes having 0 in the first
      element of i_data[] (where we store dev_t value for small device numbers) and
      keeping the value in the second element.
      	* nfsd - 32; it can be driven to 64, but we'll get several issues with
      NFSv2 support.
      	* RAID - 32
      	* devmapper - with v1 it's still 16 (nothing to do here), with v4 it's
      64.
      	* loop - 64
      	* initramfs - 32
      	* do_mounts code - 32.  Parts that scan devfs tree are using newstat()
      on 64bit platforms and stat64() on the rest (IOW, the latest stat variant on
      given platform).
      	* old_valid_dev()/new_valid_dev() added where needed (stat variants,
      mostly - we fail with -EOVERFLOW if values do not fit).
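
The "fail with -EOVERFLOW if values do not fit" check mentioned in the last item might look roughly like the sketch below. It is an assumption-laden illustration: the `DEMO_`/`demo_` names are hypothetical, and the 12-bit major / 20-bit minor internal split is assumed here for the example rather than taken from this commit.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical internal device number: 12-bit major, 20-bit minor. */
#define DEMO_MINORBITS 20
#define DEMO_MAJOR(dev) ((uint32_t)(dev) >> DEMO_MINORBITS)
#define DEMO_MINOR(dev) ((uint32_t)(dev) & ((1u << DEMO_MINORBITS) - 1))
#define DEMO_MKDEV(ma, mi) \
	(((uint32_t)(ma) << DEMO_MINORBITS) | (uint32_t)(mi))

/* True if the device number fits the old 8:8 (16-bit) encoding. */
static int demo_old_valid_dev(uint32_t dev)
{
	return DEMO_MAJOR(dev) < 256 && DEMO_MINOR(dev) < 256;
}

/*
 * Fill a 16-bit stat-style field.  Returns 0 on success, or -EOVERFLOW
 * (leaving *out untouched) when the value cannot be represented.
 */
static int demo_fill_old_dev(uint32_t dev, uint16_t *out)
{
	if (!demo_old_valid_dev(dev))
		return -EOVERFLOW;
	*out = (uint16_t)((DEMO_MAJOR(dev) << 8) | DEMO_MINOR(dev));
	return 0;
}
```

A device such as major 8, minor 1 fits and encodes to 0x0801, while minor 300 is rejected instead of being silently truncated.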
  12. 16 Sep, 2003 1 commit
    • JFS: Fix rampant data corruption · bb51beca
      Dave Kleikamp authored
      A recent change causes pervasive data corruption by over-writing inode
      metadata with a word of garbage.  The field, di_rdev, should only be set
      for a device inode.
  13. 05 Sep, 2003 2 commits
    • [PATCH] large dev_t - second series (11/15) · ec55b83d
      Alexander Viro authored
      	Fix for JFS handling of device nodes; it has 32bit on-disk device
      numbers, shoves them into 16bit (->i_rdev) when inode is read and writes
      them back truncated when inode is written to disk.  For now (and 2.4 will
      have to do the same permanently) we store the original value in private
      part of inode and use it instead of ->i_rdev in ->write_inode(); mknod()
      sets it at the same time as ->i_rdev.  It will become unnecessary when
      dev_t becomes wider than 16 bits, but for now we need it.
    • [PATCH] large dev_t - second series (7/15) · ad1da81a
      Alexander Viro authored
      	the last kdev_t object is gone; ->i_rdev switched to dev_t.
  14. 19 Aug, 2003 1 commit
    • [PATCH] async write errors: use flags in address space · fcad2b42
      Andrew Morton authored
      From: Oliver Xymoron <oxymoron@waste.org>
      
      This patch just saves a few bytes in the inode by turning mapping->gfp_mask
      into an unsigned long mapping->flags.
      
      The mapping's gfp mask is placed in the 16 high bits of mapping->flags and
      two of the remaining 16 bits are used for tracking EIO and ENOSPC errors.
      
      This leaves 14 bits in the mapping for future use.  They should be accessed
      with the atomic bitops.
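
The packing described above can be illustrated with a small user-space sketch. It follows the layout as the message states it (mask in the high 16 bits, two low bits for errors); the `demo_` names and the fixed 32-bit word are assumptions for the example, and the real kernel definitions may be laid out differently.

```c
#include <stdint.h>

/* Hypothetical layout, following the description above:
 * bits 16..31 hold the gfp mask, bits 0 and 1 record async write errors. */
#define DEMO_GFP_SHIFT   16
#define DEMO_ERR_EIO     (1u << 0)
#define DEMO_ERR_ENOSPC  (1u << 1)

struct demo_mapping {
	uint32_t flags;
};

/* Store the mask in the high half without disturbing the low 16 bits. */
static void demo_set_gfp_mask(struct demo_mapping *m, uint16_t mask)
{
	m->flags = (m->flags & 0xffffu) | ((uint32_t)mask << DEMO_GFP_SHIFT);
}

static uint16_t demo_gfp_mask(const struct demo_mapping *m)
{
	return (uint16_t)(m->flags >> DEMO_GFP_SHIFT);
}

/* Plain OR for illustration; the kernel would use atomic bitops here. */
static void demo_record_error(struct demo_mapping *m, uint32_t err_bit)
{
	m->flags |= err_bit;
}

/* Returns nonzero if the error had been recorded, and clears it. */
static int demo_test_and_clear_error(struct demo_mapping *m, uint32_t err_bit)
{
	int was_set = (m->flags & err_bit) != 0;

	m->flags &= ~err_bit;
	return was_set;
}
```

Because the error bits and the mask occupy disjoint bit ranges, recording or clearing an error never changes the value returned by `demo_gfp_mask()`.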
  15. 03 Jul, 2003 1 commit
  16. 13 Mar, 2003 1 commit
  17. 10 Feb, 2003 1 commit
    • [PATCH] Fix synchronous writers to wait properly for the result · 8d49bf3f
      Andrew Morton authored
      Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> points out a bug in
      ll_rw_block() usage.
      
      Typical usage is:
      
      	mark_buffer_dirty(bh);
      	ll_rw_block(WRITE, 1, &bh);
      	wait_on_buffer(bh);
      
      The problem is that if the buffer was locked on entry to this code sequence
      (due to in-progress I/O), ll_rw_block() will neither wait nor start new I/O.
      So this code will wait on the _old_ I/O, and will then continue execution,
      leaving the buffer dirty.
      
      It turns out that all callers were only writing one buffer, and they were all
      waiting on that writeout.  So I added a new sync_dirty_buffer() function:
      
      	void sync_dirty_buffer(struct buffer_head *bh)
      	{
      		lock_buffer(bh);
      		if (test_clear_buffer_dirty(bh)) {
      			get_bh(bh);
      			bh->b_end_io = end_buffer_io_sync;
      			submit_bh(WRITE, bh);
      		} else {
      			unlock_buffer(bh);
      		}
      	}
      
      which allowed a fair amount of code to be removed, while adding the desired
      data-integrity guarantees.
      
      UFS has its own wrappers around ll_rw_block() which got in the way, so this
      operation was open-coded in that case.
  18. 17 Jan, 2003 1 commit
  19. 20 Nov, 2002 1 commit
    • JFS: Move index table out of directory inode's address space · cf9e638b
      Dave Kleikamp authored
      The metadata representing the directory entries' persistent index has been
      mapped to the directory inode's address space.  This was the cause of much
      ugliness in the code to avoid the inode being released from the inode cache
      while there was still dirty metadata mapped to the inode.
      
      This patch moves this metadata to the block device inode's address space,
      which allows us to clean up the code somewhat.
  20. 17 Nov, 2002 1 commit
    • [PATCH] nanosecond stat timefields · 5d62665d
      Andi Kleen authored
      stat64 has been changed to return jiffies granularity as nsec in previously
      unused fields. This allows make to make better decisions on when
      to recompile a file. Loosely follows the Solaris API.
      
      CURRENT_TIME has been redefined to return struct timespec.  The users
      who don't use it in an inode/attr context have been changed to use a new
      get_seconds() function.  CURRENT_TIME is implemented by an out-of-line
      function.
      
      There is a small performance penalty in this patch.  The previous
      filemap code had an optimization to flush atime only once a second.
      This is currently gone, which will increase flushes a bit.  I believe
      the correct solution if it should be a problem is to have per super
      block fields that give an arbitrary atime flush granularity - so that you
      can set it to be only flushed once an hour if you prefer that.  I will
      work on that later in separate patches if the need should arise.
      
      struct inode and the attr struct have been changed to store struct
      timespec instead of time_t for [cma]time.  Not all file systems support
      this granularity, but some like XFS, NFSv3, CIFS, JFS do.  The others will
      currently truncate the nsec part on flushing to disk.  There was some
      discussion on this rounding on l-k previously.  I went for simple
      truncation because there is not much evidence IMHO that the more
      complicated roundings have any advantages.  In practice applications will
      be rather unlikely to notice the rounding anyway - they can only see a
      difference when an inode is flushed from memory and reloaded in less than
      a second, which is rather unlikely.
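
The "simple truncation" chosen above amounts to dropping the nanosecond part when a timestamp is written out. A minimal sketch, assuming a file system that keeps only whole seconds on disk (the `demo_` helper is hypothetical and not a kernel function):

```c
#include <time.h>

/*
 * What a file system with one-second on-disk timestamps effectively
 * stores when an inode is flushed: the seconds are kept and the
 * sub-second part is discarded, with no rounding up.
 */
static struct timespec demo_truncate_to_seconds(struct timespec ts)
{
	ts.tv_nsec = 0;
	return ts;
}
```

A timestamp with `tv_nsec` of 999999999 therefore comes back from disk with the same `tv_sec` and a zero `tv_nsec`, rather than being rounded to the next second.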
  21. 25 Sep, 2002 1 commit
    • JFS: Fix problems with NFS · b920ee56
      Dave Kleikamp authored
      readdir: Don't hold metadata page while calling filldir().  NFS's
      filldir may call lookup() which could result in a hang.
  22. 18 Sep, 2002 1 commit
    • JFS: Avoid parallel allocations within the same allocation group · 2f86142b
      Dave Kleikamp authored
      When large files are written in parallel, allocating the space for
      these files within the same allocation group can cause severe
      fragmentation of the files.  By keeping track of open, growing files
      within an allocation group, we can force other new allocations into
      a different allocation group to avoid this.
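
One way to picture the idea is a per-group counter of open, growing files that steers new allocations elsewhere. The sketch below is only an illustration under that assumption: the `demo_` names, the fixed group count, and the "next idle group" search order are hypothetical and are not taken from the JFS allocator.

```c
#include <stddef.h>

#define DEMO_NUM_AGS 4

/* Hypothetical state: number of open, growing files per allocation group. */
struct demo_ag_state {
	unsigned int active[DEMO_NUM_AGS];
};

/*
 * Pick an allocation group for a new allocation.  Keep the preferred
 * group if no growing file is using it; otherwise take the next idle
 * group; fall back to the preferred group when every group is busy.
 */
static size_t demo_choose_ag(const struct demo_ag_state *st, size_t preferred)
{
	for (size_t i = 0; i < DEMO_NUM_AGS; i++) {
		size_t ag = (preferred + i) % DEMO_NUM_AGS;

		if (st->active[ag] == 0)
			return ag;
	}
	return preferred;
}
```

With groups 0 and 2 busy, a request that prefers group 0 is redirected to group 1, so two large files growing at once do not interleave their extents in one group.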
  23. 12 Sep, 2002 1 commit
  24. 02 Sep, 2002 2 commits
  25. 30 Aug, 2002 2 commits
    • [PATCH] writeback correctness and efficiency changes · ec12ac49
      Andrew Morton authored
      This is a performance and correctness fix against the writeback paths.
      
      The writeback code has competing requirements.  Sometimes it is used
      for "memory cleansing": kupdate, bdflush, writer throttling, page
      allocator writeback, etc.  And sometimes this same code is used for
      data integrity purposes: fsync, msync, fdatasync, sync, umount, various
      other kernel-internal uses.
      
      The problem is: how to handle a dirty buffer or page which is currently
      under writeback.
      
      For memory cleansing, we just want to skip that buffer/page and go onto
      the next one.  But for sync, we must wait on the old writeback and then
      start new writeback.
      
      mpage_writepages() is currently correct for cleansing, but incorrect for
      sync.  block_write_full_page() is currently correct for sync, but
      inefficient for cleansing.
      
      The fix is fairly simple.
      
      - In mpage_writepages(), don't skip the page if it's a sync
      operation.
      
      - In block_write_full_page(), skip the buffer if it is not a sync
      operation.  And return -EAGAIN to tell the caller that the writeout
      didn't work out.  The caller must then set the page dirty again and
      move it onto mapping->dirty_pages.
      
      This is an extension of the writepage API: writepage can now return
      EAGAIN.  There are only three callers, and they have been updated.
      
      fail_writepage() and ext3_writepage() were actually doing this by
      hand.  They have been changed to return -EAGAIN.  NTFS will want to
      be able to return -EAGAIN from its writepage as well.
      
      - A sticky question is: how to tell the writeout code which mode it
      is operating in?  Cleansing or sync?
      
      It's such a tiny code change that I didn't have the heart to go and
      propagate a `mode' argument down every instance of writepages() and
      writepage() in the kernel.  So I passed it in via current->flags.
      
      Incidentally, the occurrence of a locked-and-dirty buffer in
      block_write_full_page() is fairly rare: normally the collision avoidance
      happens at the address_space level, via PageWriteback.  But some
      mappings (blockdevs, ext3 files, etc) have their dirty buffers written
      out via submit_bh().  It is these buffers which can stall
      block_write_full_page().
      
      This wart will be pretty intrusive to fix.  ext3 needs to become fully
      page-based (ugh.  It's a block-based journalling filesystem, and pages
      are unnatural).  blockdev mappings are still written out by buffers
      because that's how filesystems use them.  Putting _all_ metadata
      (indirects, inodes, superblocks, etc) into standalone address_spaces
      would fix that up.
      
      - filemap_fdatawrite() sets PF_SYNC.  So filemap_fdatawrite() is the
      kernel function which will start writeback against a mapping for
      "data integrity" purposes, whereas the unexported, internal-only
      do_writepages() is the writeback function which is used for memory
      cleansing.  This difference is the reason why I didn't consolidate
      those functions ages ago...
      
      - Lots of code paths had a bogus extra call to filemap_fdatawait(),
      which I previously added in a moment of weak-headedness.  They have
      all been removed.
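
The cleansing-versus-sync rule stated near the top of this message (skip a busy page when cleansing, wait and rewrite when syncing) can be sketched as a small decision function. This is a minimal illustration only: the `demo_` names and enums are hypothetical and do not exist in the kernel, which at this point carries the mode in current->flags instead of passing an argument.

```c
/* Hypothetical writeback modes, matching the two uses described above. */
enum demo_wb_mode {
	DEMO_WB_CLEANSING,   /* kupdate, bdflush, throttling, ... */
	DEMO_WB_SYNC         /* fsync, msync, sync, umount, ... */
};

enum demo_wb_action {
	DEMO_WB_WRITE,            /* nothing in flight: start writeback */
	DEMO_WB_SKIP,             /* leave it for a later pass */
	DEMO_WB_WAIT_THEN_WRITE   /* wait for old I/O, then write again */
};

/* Decide what to do with a dirty page or buffer. */
static enum demo_wb_action demo_wb_decide(enum demo_wb_mode mode,
					  int under_writeback)
{
	if (!under_writeback)
		return DEMO_WB_WRITE;
	return mode == DEMO_WB_SYNC ? DEMO_WB_WAIT_THEN_WRITE : DEMO_WB_SKIP;
}
```

Only the sync mode pays the cost of waiting, which is what keeps memory cleansing cheap while still giving data-integrity callers the guarantee they need.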
    • JFS extended attributes · 90041109
      Dave Kleikamp authored
  26. 29 Aug, 2002 1 commit
    • JFS: rework extent invalidation · 407a52fa
      Dave Kleikamp authored
      All callers of invalidate_metapages() actually have a dxd_t or pxd_t to
      invalidate, so add invalidate_pxd_metapages() and invalidate_dxd_metapages()
      routines with a common __invalidate_metapages() backend instead.
      
      Start to invalidate the EA/ACL extents, we'll need that soon.
      
      Submitted by Christoph Hellwig
  27. 05 Aug, 2002 1 commit
    • Add resize function to JFS · 8c8da5ae
      Dave Kleikamp authored
      This is invoked by mount -o remount,resize=<blocks>.
      See Documentation/filesystems/jfs.txt for more information.
  28. 19 Jun, 2002 1 commit
  29. 18 Jun, 2002 1 commit
  30. 20 May, 2002 1 commit
    • [PATCH] get rid of <linux/locks.h> · bd2b0c85
      Christoph Hellwig authored
      The locks.h header contained some hand-crafted locking routines from
      the pre-SMP days.  In 2.5 only lock_super/unlock_super are left,
      guarded by a number of completely unrelated (!) includes.
      
      This patch moves lock_super/unlock_super to fs.h, which defines
      struct super_block that they operate on, removes locks.h, updates
      all callers to not include it, and adds the missing, previously
      nested includes where needed.
  31. 30 Apr, 2002 2 commits
    • [PATCH] page writeback locking update · a2bcb3a0
      Andrew Morton authored
      - Fixes a performance problem - callers of
        prepare_write/commit_write, etc are locking pages, which synchronises
        them behind writeback, which also locks these pages.  Significant
        slowdowns for some workloads.
      
      - So pages are no longer locked while under writeout.  Introduce a
        new PG_writeback and associated infrastructure to support this design
        change.
      
      - Pages which are under read I/O still use PageLocked.  Pages which
        are under write I/O have PageWriteback() true.
      
        I considered creating Page_IO instead of PageWriteback, and marking
        both readin and writeout pages as PageIO().  So pages are unlocked
        during both read and write.  There just doesn't seem a need to do
        this - nobody ever needs unblocking access to a page which is under
        read I/O.
      
      - Pages under swapout (brw_page) are PageLocked, not PageWriteback.
        So their treatment is unchanged.
      
        It's not obvious that pages which are under swapout actually need
        the more asynchronous behaviour of PageWriteback.
      
        I was setting the swapout pages PageWriteback and unlocking them
        prior to submitting the buffers in brw_page().  This led to deadlocks
        on the exit_mmap->zap_page_range->free_swap_and_cache path.  These
        functions call block_flushpage under spinlock.  If the page is
        unlocked but has locked buffers, block_flushpage->discard_buffer()
        sleeps.  Under spinlock.  So that will need fixing if for some reason
        we want swapout to use PageWriteback.
      
        Kernel has called block_flushpage() under spinlock for a long time.
         It is assuming that a locked page will never have locked buffers.
        This appears to be true, but it's ugly.
      
      - Adds new function wait_on_page_writeback().  Renames wait_on_page()
        to wait_on_page_locked() to remind people that they need to call the
        appropriate one.
      
      - Renames filemap_fdatasync() to filemap_fdatawrite().  It's more
        accurate - "sync" implies, if anything, writeout and wait.  (fsync,
        msync) Or writeout.  It's not clear.
      
      - Subtly changes the filemap_fdatawrite() internals - this function
        used to do a lock_page() - it waited for any other user of the page
        to let go before submitting new I/O against a page.  It has been
        changed to simply skip over any pages which are currently under
        writeback.
      
        This is the right thing to do for memory-cleansing reasons.
      
        But it's the wrong thing to do for data consistency operations (eg,
        fsync()).  For those operations we must ensure that all data which
        was dirty *at the time of the system call* are tight on disk before
        the call returns.
      
        So all places which care about this have been converted to do:
      
      	filemap_fdatawait(mapping);	/* Wait for current writeback */
      	filemap_fdatawrite(mapping);	/* Write all dirty pages */
      	filemap_fdatawait(mapping);	/* Wait for I/O to complete */
      
      - Fixes a truncate_inode_pages problem - truncate currently will
        block when it hits a locked page, so it ends up getting into lockstep
        behind writeback and all of the file is pointlessly written back.
      
        One fix for this is for truncate to simply walk the page list in the
        opposite direction from writeback.
      
        I chose to use a separate cleansing pass.  It is more
        CPU-intensive, but it is surer and clearer.  This is because there is
        no reason why the per-address_space ->vm_writeback and
        ->writeback_mapping functions *have* to perform writeout in
        ->dirty_pages order.  They may choose to do something totally
        different.
      
        (set_page_dirty() is an a_op now, so address_spaces could almost
        privatise the whole dirty-page handling thing.  Except
        truncate_inode_pages and invalidate_inode_pages assume that the pages
        are on the address_space lists.  hmm.  So making truncate_inode_pages
        and invalidate_inode_pages a_ops would make some sense).
    • [PATCH] remove i_dirty_data_buffers · 7d513234
      Andrew Morton authored
      Removes inode.i_dirty_data_buffers.  It's no longer used - all dirty
      buffers have their pages marked dirty and filemap_fdatasync() /
      filemap_fdatawait() catches it all.
      
      Updates all callers.
      
      This required a change in JFS - it has "metapages" which
      are a container around a page which holds metadata.  They
      were holding these pages locked and were relying on fsync_inode_data_buffers
      for writing them out.  So fdatasync() deadlocked.
      
      I've changed JFS to not lock those pages.  Change was acked
      by Dave Kleikamp <shaggy@austin.ibm.com> as the right
      thing to do, but may not be complete.  Probably igrab()
      against ->host is needed to pin the address_space down.
  32. 04 Apr, 2002 2 commits
  33. 22 Feb, 2002 1 commit