1. 16 Nov, 2002 18 commits
    • Merge bk://linux.bkbits.net/linux-2.5 · f22f6deb
      James Simmons authored
      into maxwell.earthlink.net:/usr/src/linus-2.5
    • [PATCH] better inode reclaim balancing · 9c716856
      Andrew Morton authored
      The inode reclaim is too aggressive at present - it is causing the
      shootdown of lots of recently-used pagecache.  Simple testcase: run a
      huge `dd' while running a concurrent `watch -n1 cat /proc/meminfo'.
      The program text for `cat' gets loaded from disk once per second.
      
      This is in fact because the dentry_unused reclaim is too aggressive.
      
      (The general approach to inode reclaim is that it _not_ happen at the
      inode level.  All the aging and lru activity happens at the dcache
      level.)
      
      The problem is partly due to a bug: shrink_dcache_memory() is returning
      the *total* number of dentries to the VM, rather than the number of
      unused dentries.
      
      This patch fixes that, and goes a little further.
      
      We do want to keep some unused dentries around.  Reclaiming the last
      few thousand dentries is pretty pointless, and will allow reclaim of
      the last few thousand inodes and their attached pagecache.
      
      So the algorithm I have used is to not allow the number of unused
      dentries to fall below the number of used ones.  This keeps a
      reasonable number of dentries in cache while providing a level of
      scaling to the system size and the current system activity.
      
      (Magic number alert: why not pin nr_unused to seven times nr_used,
      rather than one times??)
      
      shrink_dcache_memory() has been changed to tell the VM that the number
      of shrinkable dentries is:
      
      	zero if (nr_unused < nr_used)
      	otherwise (nr_unused - nr_used)
      
      so when there is memory pressure the VM will prune the unused dentry
      cache down to the size of the used dentry cache, but not below that.
      
      The patch also arranges (awkwardly) for all modifications of
      dentry_stat.nr_dentry to occur inside dcache_lock - it was racy.
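
      The shrinkable-count rule described above can be sketched in plain C.
      This is a minimal userspace illustration, not the kernel code: the
      struct dentry_counts type and the shrinkable_dentries() name are
      hypothetical, and it assumes that the number of used dentries is the
      total minus the unused count.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's dentry statistics. */
struct dentry_counts {
    long nr_dentry;  /* total number of dentries */
    long nr_unused;  /* dentries currently on the unused list */
};

/*
 * Number of dentries reported to the VM as shrinkable:
 * zero if nr_unused < nr_used, otherwise nr_unused - nr_used.
 * Assumption: nr_used is derived as nr_dentry - nr_unused.
 */
long shrinkable_dentries(const struct dentry_counts *c)
{
    long nr_used;

    assert(c->nr_unused >= 0 && c->nr_unused <= c->nr_dentry);
    nr_used = c->nr_dentry - c->nr_unused;
    if (c->nr_unused < nr_used)
        return 0;
    return c->nr_unused - nr_used;
}
```

      Under this rule, pruning stops once the unused list has shrunk to the
      size of the used set, so the reported count drops to zero at exactly
      that point.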
    • [PATCH] kmap->kmap_atomic in mpage.c · b084fe4b
      Andrew Morton authored
      Replace some kmaps in mpage.c with kmap_atomic.
    • [PATCH] try to remove buffer_heads from to-be-reaped inodes · deda1b5e
      Andrew Morton authored
      Stephen Tweedie reports a 2.4.7 problem in which kswapd is chewing lots
      of CPU trying to reclaim inodes which are pinned by buffer_heads at
      i_dirty_buffers.
      
      This can only happen when there's memory pressure on ZONE_HIGHMEM - the
      2.4 kernel runs shrink_icache_memory in that case as well.  But there's
      no reclaim pressure on ZONE_NORMAL so the VM is never running
      try_to_free_buffers() against the ZONE_NORMAL buffers which are pinning
      the inodes.
      
      The 2.5 kernel also runs the slab shrinkers in response to ZONE_HIGHMEM
      pressure.  This may be wrong - still thinking about that.
      
      This patch arranges for prune_icache to try to remove the inode's buffers
      when the inode is to be reclaimed.
      
      It also changes inode_has_buffers() and the other inode-buffer-list
      functions to look at inode->i_data, not inode->i_mapping.  The latter
      was wrong.
    • [PATCH] improved slab error diagnostics · db054df8
      Andrew Morton authored
      slab does various consistency checks during `cat /proc/slabinfo'
      processing.  If it finds an inconsistency it stupidly goes BUG just
      before displaying the info which is required to diagnose the bug.
      
      Change it to not go BUG, but to emit some useful printks and continue.
      
      The patch also removes an uninteresting printk from the boot process.
    • [PATCH] more hugetlb fixes · 8838edfb
      Andrew Morton authored
      Patch from Rohit Seth, changelog from Bill Irwin:
      
      (1) fixes failure to clear key->busy (yes, it's always under a lock)
      (2) fixes key->count being unconditionally set to 1 in alloc_key()
      (3) reduces search to key->size >> HPAGE_SHIFT from key->size
      (4) actually uses vma->vm_private_data to release the key as intended
      
      plus the cleanup:
      (5) removes the int *new argument to alloc_key()
    • [PATCH] hugetlb cleanups · 6c5ceacc
      Andrew Morton authored
      A rollup of Bill's 11-patch series which replaces hugetlb's custom
      inode with a bare radix tree.  Reviewed and acked by Rohit.
      
      - revert doublefreeing patch
      
      - Put set_new_inode() and the inode allocation loop into an
        alloc_key(), and introduce a new opaque type "struct hugetlb_key".
      
      - Wrap the release path in alloc_shared_hugetlb_pages() with a
        release_key() function that handles the release.
      
      - Wrap hugetlb_prefault() with prefault_key() in order to isolate the
        dependency on inodes for prefaulting the hugetlb vma.
      
      - Replaces the usage of inode->i_writecount as a flag marking keys
        busy with a boolean flag field in struct htlbpagekey, and removes the
        last dependency of alloc_shared_hugetlb_pages() on struct inode.
      
      - Remove the last direct usage of struct inode within the hugetlb
        functions.
      
      - Removes many direct usages of struct inode within the key
        manipulation API in exchange for passing references to the key
        structure itself.
      
      - Expand out prefault_key() into its hugetlb_prefault() component,
        but substitute stubs to abstract out inode access.
      
      - Move uid/gid/mode/size fields used in struct inode and the checks
        on them into key management code and structures.
      
      - Substitute direct usage of radix trees for inodes, and removes the
        custom-allocated inodes.
      
      - Wrap up the release path by adding proper refcounting of keys.
    • [PATCH] misc fixes · d6afb9ef
      Andrew Morton authored
      - add init_timer in bttv driver
      
      - remove duplicated init_timer() in ncpfs.
      
      - remove noisy printk's from 3c59x.c
      
      - sparc64 compile fix with CONFIG_HUGETLBPAGE=y - htlbpage_max is now an
        int
    • [PATCH] handle pages which alter their ->mapping · 820ef9df
      Andrew Morton authored
      Patch from Hugh Dickins <hugh@veritas.com>
      
      tmpfs failed fsx+swapout tests after many hours, a page found zeroed.
      Not a truncate problem, but mirror image of earlier truncate problems:
      swap goes through mpage_writepages, which must therefore allow for a
      sudden swizzle back to file identity.
      
      Second time this caught us, so I've audited the tree for other places
      which might be surprised by such swizzling.  The only others I found
      were (perhaps) in the parisc and sparc64 flush_dcache_page called
      from do_generic_mapping_read on a looped tmpfs file which is also
      mmapped; but that's a very marginal case, I wanted to understand it
      better before making any edit, and now realize that hch's sendfile
      in loop eliminates it (now go through do_shmem_file_read instead:
      similar but crucially this locks the page when raising its count,
      which is enough to keep vmscan from interfering).
    • [PATCH] unlock_page when get_swap_bio fails · 77fe2d67
      Andrew Morton authored
      Patch from Hugh Dickins <hugh@veritas.com>
      
      swap_readpage and swap_writepage forgot
      to unlock_page if get_swap_bio failed.
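
      The error path being fixed can be illustrated with a small userspace
      sketch.  Everything here is a hypothetical stand-in (struct
      sketch_page, sketch_get_bio(), sketch_writepage(), and the -ENOMEM
      return value are assumptions for illustration, not the kernel's
      definitions); it only shows the pattern of dropping the page lock
      when the bio allocation fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's page and bio. */
struct sketch_page { bool locked; };
struct sketch_bio  { struct sketch_page *page; };

static struct sketch_bio the_bio;

/* Returns NULL to simulate an allocation failure. */
struct sketch_bio *sketch_get_bio(struct sketch_page *page, bool fail)
{
    if (fail)
        return NULL;
    the_bio.page = page;
    return &the_bio;
}

/* The caller hands in a locked page, as a writepage-style routine expects. */
int sketch_writepage(struct sketch_page *page, bool fail)
{
    struct sketch_bio *bio;

    assert(page->locked);
    bio = sketch_get_bio(page, fail);
    if (bio == NULL) {
        page->locked = false;  /* the unlock that was forgotten */
        return -ENOMEM;        /* assumed error code */
    }
    /* On success the I/O completion path would unlock the page later. */
    return 0;
}
```

      Without the unlock in the failure branch, the page would stay locked
      forever and any later waiter on it would hang.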
    • [PATCH] run flush_cache_page while pte is valid · 3d3f3c51
      Andrew Morton authored
      Patch from Hugh Dickins <hugh@veritas.com>
      
      On some architectures (cachetlb.txt gives HyperSparc as an example)
      it is essential to flush_cache_page while pte is still valid: the
      rmap VM diverged from the base 2.4 VM before that fix was made,
      so this error has crept back into 2.5.
      
      Patch below applies to 2.5.47 or 2.5.47-mm1 - needs more work over
      shared pagetables, but they've silently fallen out of 2.5.47-mm1:
      oversight?  I'll send Alan the equivalent for 2.4-ac shortly.
      
      (I wonder, what happens if userspace now modifies the page
      after the flush_cache_page, before the pte is invalidated?)
    • [PATCH] mbcache: add gfp_mask parameter to free() callback · ea6d6fc7
      Andrew Morton authored
      Patch from Andreas Gruenbacher <agruen@suse.de>
      
      Add a gfp_mask parameter to the free() callback so that the callback can
      safely do I/O, etc. The free callback can now also fail.  This will be
      needed by reiserfs.
      
      The order of entries on the cache entry lru is reversed so that
      list_for_each_safe() can be used. Several helper functions that don't
      make the code any better are removed. Finally, a couple of cosmetic
      things.
    • [PATCH] direct-io bio_add_page fix · 0828e38f
      Andrew Morton authored
      From Badari.
      
      bio_add_page returns zero on failure - we need to propagate that
      to the dio_bio_add_page() caller.
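
      A rough userspace sketch of this kind of propagation follows.  The
      types and functions are hypothetical stand-ins (sketch_bio,
      sketch_bio_add_page(), sketch_dio_bio_add_page()), and the wrapper's
      return convention of 0 for success and 1 for "did not fit" is an
      assumption chosen for illustration; only the "zero means failure"
      convention of the inner call comes from the changelog.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio with a fixed amount of room. */
struct sketch_bio {
    size_t used;
    size_t capacity;
};

/* Follows the convention in the changelog: returns zero on failure. */
size_t sketch_bio_add_page(struct sketch_bio *bio, size_t len)
{
    if (len == 0 || bio->used + len > bio->capacity)
        return 0;
    bio->used += len;
    return len;
}

/*
 * Wrapper that propagates the failure instead of dropping it.
 * Assumed convention: 0 on success, 1 if the page could not be added.
 */
int sketch_dio_bio_add_page(struct sketch_bio *bio, size_t len)
{
    assert(bio != NULL);
    return sketch_bio_add_page(bio, len) == 0 ? 1 : 0;
}
```

      A caller seeing a nonzero result can then submit the full bio and
      retry with a fresh one rather than silently losing the page.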
    • Merge penguin:v2.5/linux · eb2b0b9a
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • [PATCH] Move wait queue handling from sched.h to wait.h · 2b2cb8a0
      Matthew Wilcox authored
      This patch removes all the wait_queue handling code from sched.h and puts
      it in wait.h with the rest of the wait_queue handling code.  Note that
      sched.h must continue to include wait.h for the wait_queue_head_t embedded
      in struct task.  However there may be files which only need wait.h now.
    • [PATCH] Move request_irq & free_irq to interrupt.h · 8f57bc89
      Matthew Wilcox authored
      It really makes no sense to have request_irq and free_irq in sched.h.
      Let's move them to interrupt.h instead.  Note that I also remove sched.h
      from interrupt.h since it's not needed.
    • [PATCH] Move fd-related functions from sched.h to file.h · 8e9b611d
      Matthew Wilcox authored
      A minor removal of 6 function definitions from sched.h.  They clearly
      fit better in file.h.  All users of these functions already include file.h.
      And none of them included sched.h directly...
    • [PATCH] Remove d_path from sched.h · cd574b74
      Matthew Wilcox authored
      This patch from William Lee Irwin III privatizes __d_path() to dcache.c,
      uninlines d_path(), moves its declaration to dcache.h, moves it to
      dcache.c, and exports d_path() instead of __d_path().
  2. 15 Nov, 2002 14 commits
    • [PATCH] late-boot fixes · 40fa9470
      Alexander Viro authored
      Grrr...  Two bugs in a patch that had moved md setup to late boot:
      
      a) we need md_run_setup() run before parsing root name.
      b) it's create_dev("/dev/md0",...), not create_dev("md0",...) ;-/
    • [PATCH] paride protocols switched to ->owner · 739659cc
      Alexander Viro authored
      Still not safe (we use __MOD_INC_USE_COUNT in paride.c; old code has
      MOD_INC_USE_COUNT in protocol drivers), but that takes crap in one
      place.
      
      	->owner added
      	paride.c grabs/releases it if present
      	->proto_init() became empty for almost everything
      	->proto_release() likewise
      	->proto_init() returns int now (the only case where we do have a
      	  non-empty ->proto_init() needed that all along).  paride.c
      	  taught to deal with that.
    • [PATCH] gratuitous MOD_INC_USE_COUNT · f7efec4a
      Alexander Viro authored
      dasd_proc.c : should be using ->owner instead of MOD_..._USE_COUNT in
      ->open()/->release().
      
      s390/char/tape.c, s390/char/tapechar.c, usb/image/scanner.c,
      intermezzo/psdev.c: ditto
      
      intermezzo/super.c: they forgot to remove MOD_INC_USE_COUNT from the
      ..._fill_super()
      
      binfmt_som.c: ->load_binary() and ->load_library() don't need
      MOD_INC_USE_COUNT, since ->module is correctly set.
    • Merge bk://linux-input.bkbits.net/linux-input · 32735425
      Linus Torvalds authored
      into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
    • Vojtech Pavlik's avatar
    • [PATCH] devfs_register_tape() cleanup · 3197f480
      Alexander Viro authored
      devfs_register_tape() returns the number it had assigned to tape.
      
      new helper: devfs_unregister_tape(number) - removes symlink created by
      devfs_register_tape()
      
      devfs_register_tape() doesn't use devfs_auto_unregister() anymore.
      
      devfs_register_tape() gets devfs entry of directory, instead of that of
      a random file in that directory.
      
      users updated
    • Alexander Viro's avatar
      81780a58
    • Alexander Viro's avatar
    • [PATCH] paride.c fed through Lindent · 8e96b6e0
      Alexander Viro authored
    • [PATCH] dm use of devfs · 82a5acd5
      Alexander Viro authored
      dm-ioctl.c does, er, interesting things to figure out the name of devfs
      node it had just created.  Cleaned up.
    • [PATCH] dv1394 devfs use · 8d1ab570
      Alexander Viro authored
      dv1394.c piles amazing amounts of crap around its devfs entries.
      Probably a result of times before devfs_find_and_unregister()...
      
      In any case, code switched to use of devfs_find_and_unregister(),
      crapectomy performed...
    • [PATCH] epoll bit 0.47 · eeab5fdc
      Davide Libenzi authored
      - Improved file cleanup code
    • [PATCH] epoll bits 0.46 ... · 424980a8
      Davide Libenzi authored
      - A more uniform poll queueing interface with tips from Manfred
      
      - The f_op->poll() is done outside the irqlock to maintain compatibility
      	with existing drivers that assume they are called with irqs enabled
      
      - Moved event mask setting inside ep_modify() with tips from John
      
      - Fixed locking to fit the new "poll() outside the lock" approach
      
      - Buffered userspace event delivery to reduce irq_lock/irq_unlock switching
      	rate and to reduce the number of __copy_to_user()
      
      - Comments added
    • [PATCH] incorrect block layer segment accounting · 6e941592
      Jens Axboe authored
      There's a long-standing bug in blk_recount_segments(). Clustering means
      physical segment coalescing, not hardware segment coalescing. This
      basically means that we are mapping more segments here than the bio
      + request contain, and this causes a bug in the SCSI layer for host
      adapters that have CLUSTERING enabled.
      
      This patch makes sure that we are clustering physical segments
      correctly, and correctly accounting hardware segments. Please apply.
  3. 14 Nov, 2002 8 commits