1. 23 Sep, 2002 5 commits
  2. 22 Sep, 2002 32 commits
    • Linus Torvalds's avatar
      Merge master.kernel.org:/home/davem/BK/sparc-2.5 · f20bf018
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
      f20bf018
    • Linus Torvalds's avatar
      Merge master.kernel.org:/home/davem/BK/net-2.5 · e7144e64
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
      e7144e64
    • Andrew Morton's avatar
      [PATCH] low-latency page reclaim · 407ee6c8
      Andrew Morton authored
      Convert the VM to not wait on other people's dirty data.
      
       - If we find a dirty page and its queue is not congested, do some writeback.
      
       - If we find a dirty page and its queue _is_ congested then just
         refile the page.
      
       - If we find a PageWriteback page then just refile the page.
      
       - There is additional throttling for write(2) callers.  Within
         generic_file_write(), record their backing queue in ->current.
         Within page reclaim, if this task encounters a page which is dirty
         or under writeback on this queue, block on it.  This gives some more
         writer throttling and reduces the page refiling frequency.
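
      The four rules above can be read as a per-page decision.  The sketch
      below is only an illustration of that reading: the struct layouts, the
      enum, the function name reclaim_decide() and the order of the checks
      are all assumptions made for this example, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct queue {
	bool write_congested;
};

struct page_state {
	bool dirty;
	bool under_writeback;
	struct queue *queue;	/* queue backing this page */
};

enum reclaim_action {
	RECLAIM_FREE,		/* clean page: reclaim it */
	RECLAIM_WRITEBACK,	/* start some writeback, do not wait */
	RECLAIM_REFILE,		/* put the page back on the list and move on */
	RECLAIM_BLOCK		/* throttle: wait on this page */
};

/*
 * writer_queue models the queue recorded for a write(2) caller; pass NULL
 * for a task that is not inside a write.  The check order is an assumption.
 */
enum reclaim_action reclaim_decide(const struct page_state *page,
				   const struct queue *writer_queue)
{
	assert(page != NULL);

	if (!page->dirty && !page->under_writeback)
		return RECLAIM_FREE;

	/* Extra throttling: a writer blocks on pages of its own queue. */
	if (writer_queue != NULL && page->queue == writer_queue)
		return RECLAIM_BLOCK;

	/* Never wait on somebody else's writeback. */
	if (page->under_writeback)
		return RECLAIM_REFILE;

	/* Dirty page: write it only if the queue has room. */
	if (page->queue != NULL && page->queue->write_congested)
		return RECLAIM_REFILE;

	return RECLAIM_WRITEBACK;
}
```

      In this model the only path that can sleep is RECLAIM_BLOCK, which is
      reached solely by a task throttling on its own queue.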
      
      It's somewhat CPU expensive - under really heavy load we only get a 50%
      reclaim rate in pages coming off the tail of the LRU.  This can be
      fixed by splitting the inactive list into reclaimable and
      non-reclaimable lists.  But the CPU load isn't too bad, and latency is
      much, much more important in these situations.
      
      Example: with `mem=512m', running 4 instances of `dbench 100', 2.5.34
      took 35 minutes to compile a kernel.  With this patch, it took three
      minutes, 45 seconds.
      
      I haven't done swapcache or MAP_SHARED pages yet.  If there's tons of
      dirty swapcache or mmap data around we still stall heavily in page
      reclaim.  That's less important.
      
      This patch also has a tweak for swapless machines: don't even bother
      bringing anon pages onto the inactive list if there is no swap online.
      407ee6c8
    • Andrew Morton's avatar
      [PATCH] use the congestion APIs in pdflush · c9b22619
      Andrew Morton authored
      The key concept here is that pdflush does not block on request queues
      any more.  Instead, it circulates across the queues, keeping any
      non-congested queues full of write data.  When all queues are full,
      pdflush takes a nap, to be woken when *any* queue exits write
      congestion.
      
      This code can keep sixty spindles saturated - we've never been able to
      do that before.
      
       - Add the `nonblocking' flag to struct writeback_control, and teach
         the writeback paths to honour it.
      
       - Add the `encountered_congestion' flag to struct writeback_control
         and teach the writeback paths to set it.
      
      So as soon as a mapping's backing_dev_info indicates that it is getting
      congested, bail out of writeback.  And don't even start writeback
      against filesystems whose queues are congested.
      
       - Convert pdflush's background_writeback() function to use
         nonblocking writeback.
      
      This way, a single pdflush thread will circulate around all the
      dirty queues, keeping them filled.
      
       - Convert the pdflush `kupdate' function to do the same thing.
      
      This solves the problem of pdflush thread pool exhaustion.
      
      It solves the problem of pdflush startup latency.
      
      It solves the (minor) problem wherein `kupdate' writeback only writes
      back a single disk at a time (it was getting blocked on each queue in
      turn).
      
      It probably means that we only ever need a single pdflush thread.
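
      A rough model of one nonblocking pass may help to picture the
      circulation described above.  Everything here is hypothetical: struct
      wq, wq_congested() and writeback_pass() are names invented for this
      sketch, and "no free slots" is a simplified stand-in for the real
      congestion test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a request queue; not the kernel's struct. */
struct wq {
	int free_slots;		/* write requests still available */
	int dirty_pages;	/* pages waiting to be written to this queue */
};

/* Assumed rule for this sketch: a queue with no free slots is congested. */
static bool wq_congested(const struct wq *q)
{
	return q->free_slots == 0;
}

/*
 * One nonblocking pass over all queues.  Returns the number of pages
 * submitted.  Sets *encountered_congestion if a queue that still had
 * dirty pages was found congested.
 */
int writeback_pass(struct wq *queues, size_t nr, bool *encountered_congestion)
{
	int submitted = 0;

	assert(encountered_congestion != NULL);
	*encountered_congestion = false;

	for (size_t i = 0; i < nr; i++) {
		struct wq *q = &queues[i];

		while (q->dirty_pages > 0) {
			if (wq_congested(q)) {
				/* Bail out rather than block on this queue. */
				*encountered_congestion = true;
				break;
			}
			q->free_slots--;
			q->dirty_pages--;
			submitted++;
		}
	}
	return submitted;
}
```

      A caller in this model would repeat the pass and take a nap only when
      a pass submits nothing and reports congestion, which mirrors the
      "wake when any queue exits write congestion" behaviour in spirit.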
      c9b22619
    • Andrew Morton's avatar
      [PATCH] use the queue congestion API in ext2_preread_inode() · f3332384
      Andrew Morton authored
      Use the new queue congestion detector in ext2_preread_inode().  Don't
      try the speculative read if the read queue is congested.
      
      Also, don't try it if the disk is write-congested.  Presumably it is
      more important to get the dirty memory cleaned out.
      f3332384
    • Andrew Morton's avatar
      [PATCH] infrastructure for monitoring queue congestion state · 4cef1b04
      Andrew Morton authored
      The patch provides a means for the VM to be able to determine whether a
      request queue is in a "congested" state.  If it is congested, then a
      write to (or read from) the queue may cause blockage in
      get_request_wait().
      
      So the VM can do:
      
      	if (!bdi_write_congested(page->mapping->backing_dev_info))
      		writepage(page);
      
      This is not exact.  The code assumes that if the request queue still
      has 1/4 of its capacity (queue_nr_requests) available then a request
      will be non-blocking.  There is a small chance that another CPU could
      zoom in and consume those requests.  But on the rare occasions where
      that may happen the result will merely be some unexpected latency -
      it's not worth doing anything elaborate to prevent this.
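
      As an illustration of the 1/4 heuristic, the check could look roughly
      like the sketch below.  The struct request_pool and pool_congested()
      are hypothetical names for this example, and the exact boundary
      (strictly fewer than a quarter free counts as congested) is an
      assumption rather than something the patch description states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-direction request pool; not the kernel's struct. */
struct request_pool {
	int capacity;	/* plays the role of queue_nr_requests */
	int free;	/* requests currently available */
};

/* Congested when fewer than a quarter of the requests remain free. */
bool pool_congested(const struct request_pool *p)
{
	assert(p->capacity > 0 && p->free >= 0 && p->free <= p->capacity);

	/* Multiply instead of dividing to avoid integer truncation. */
	return p->free * 4 < p->capacity;
}
```

      The test is deliberately cheap and unlocked in spirit: a racing CPU
      can still consume the remaining requests, which costs latency only.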
      
      The patch decreases the size of `batch_requests'.  batch_requests is
      positively harmful - when a "heavy" writer and a "light" writer are
      both writing to the same queue, batch_requests provides a means for the
      heavy writer to massively stall the light writer.  Instead of waiting
      for one or two requests to come free, the light writer has to wait for
      32 requests to complete.
      
      Plus batch_requests generally makes things harder to tune, understand
      and predict.  I wanted to kill it altogether, but Jens says that it is
      important for some hardware - it allows decent size requests to be
      submitted.
      
      The VM changes which go along with this code cause batch_requests to be
      not so painful anyway - the only processes which sleep in
      get_request_wait() are the ones which we elect, by design, to wait in
      there - typically heavy writers.
      
      
      The patch changes the meaning of `queue_nr_requests'.  It used to mean
      "total number of requests per queue".  Half of these are for reads, and
      half are for writes.  This always confused the heck out of me, and the
      code needs to divide queue_nr_requests by two all over the place.
      
      So queue_nr_requests now means "the number of write requests per queue"
      and "the number of read requests per queue".  ie: I halved it.
      
      Also, queue_nr_requests was converted to static scope.  Nothing else
      uses it.
      
      
      The accuracy of bdi_read_congested() and bdi_write_congested() depends
      upon the accuracy of mapping->backing_dev_info.  With complex block
      stacking arrangements it is possible that ->backing_dev_info is
      pointing at the wrong queue.  I don't know.
      
      But the cost of getting this wrong is merely latency, and if it is a
      problem we can fix it up in the block layer, by getting stacking
      devices to communicate their congestion state upwards in some manner.
      4cef1b04
    • Andrew Morton's avatar
      [PATCH] don't hold mapping->private_lock while marking a page dirty · b5742733
      Andrew Morton authored
      __set_page_dirty_buffers() is calling __mark_inode_dirty under
      mapping->private_lock.
      
      We don't need to hold ->private_lock across that call.  It's only there
      to pin page->buffers.
      
      This simplifies the VM locking hierarchy.
      b5742733
    • Andrew Morton's avatar
      [PATCH] fix ext3 in data=writeback mode · c8b254cc
      Andrew Morton authored
      When I converted ext3 to use direct-to-BIO writeback for
      data=writeback mode I forgot that we need to hold a transaction open on
      behalf of MAP_SHARED pages.  The filesystem is BUGging in get_block()
      because there is no transaction open.
      
      So let's forget that idea for now and send data=writeback mode back to
      ext3_writepage.
      c8b254cc
    • David S. Miller's avatar
      Merge nuts.ninka.net:/home/davem/src/BK/sparcwork-2.5 · 2d35bd3f
      David S. Miller authored
      into nuts.ninka.net:/home/davem/src/BK/sparc-2.5
      2d35bd3f
    • David S. Miller's avatar
      Merge nuts.ninka.net:/home/davem/src/BK/network-2.5 · da29f6a8
      David S. Miller authored
      into nuts.ninka.net:/home/davem/src/BK/net-2.5
      da29f6a8
    • David S. Miller's avatar
      Merge master.kernel.org:/home/acme/BK/llc-2.5 · e1ec2e00
      David S. Miller authored
      into nuts.ninka.net:/home/davem/src/BK/net-2.5
      e1ec2e00
    • Arnaldo Carvalho de Melo's avatar
      [LLC] move reason to the {station,sap,conn}_ev structs · 1502caff
      Arnaldo Carvalho de Melo authored
      Slowly killing the ugly struct forest.
      1502caff
    • Arnaldo Carvalho de Melo's avatar
      [LLC] use the core lists to get info for /proc/net/llc · 5d8c0602
      Arnaldo Carvalho de Melo authored
      With this llc_ui_sockets is almost not needed anymore, next
      changesets will deal with the dataunit/xid/test primitives, that
      are still using it.
      5d8c0602
    • Arnaldo Carvalho de Melo's avatar
      [UDPv6] fix udp_v6_get_port introduced by the sock splitup · 2efc5e41
      Arnaldo Carvalho de Melo authored
      It is the same bug fixed some months ago in tcp_v6_get_port,
      i.e. we can't touch ipv6 private areas without checking if
      the socket is AF_INET6.
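
      The general shape of such a guard is sketched below.  The types
      sock_model and ipv6_priv and the function sock_is_v6only() are
      hypothetical and exist only for this example; they are not the
      kernel's socket structures or the code changed by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical IPv6-private area; not the kernel's definition. */
struct ipv6_priv {
	int ipv6only;
};

/* Hypothetical socket model shared by IPv4 and IPv6 sockets. */
struct sock_model {
	int family;
	struct ipv6_priv *pinet6;	/* meaningful only for AF_INET6 */
};

/* Check the family first, and only then look at the IPv6 private area. */
bool sock_is_v6only(const struct sock_model *sk)
{
	assert(sk != NULL);

	if (sk->family != AF_INET6)
		return false;

	return sk->pinet6 != NULL && sk->pinet6->ipv6only != 0;
}
```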
      2efc5e41
    • David S. Miller's avatar
      96447e51
    • David S. Miller's avatar
      Merge master.kernel.org:/home/acme/BK/llc-2.5 · ae7eb260
      David S. Miller authored
      into nuts.ninka.net:/home/davem/src/BK/net-2.5
      ae7eb260
    • Alexander Viro's avatar
      [PATCH] blk_size[] is gone · f5076217
      Alexander Viro authored
      it is an ex-parrot
      f5076217
    • Alexander Viro's avatar
      [PATCH] compile fixes for ftl · 981de136
      Alexander Viro authored
      assorted compile fixes
      981de136
    • Alexander Viro's avatar
      [PATCH] gendisk for mtdblock · b55a9a52
      Alexander Viro authored
      mtdblock switched to use of gendisks + compile fixes
      b55a9a52
    • Alexander Viro's avatar
      [PATCH] gendisk for z2ram · 6ebe755c
      Alexander Viro authored
      z2ram.c switched to use of gendisks
      6ebe755c
    • Alexander Viro's avatar
      [PATCH] gendisk for ataflop · 3f028def
      Alexander Viro authored
      ataflop.c switched to use of gendisks
      3f028def
    • Alexander Viro's avatar
      [PATCH] gendisk for amiflop · b7264cd3
      Alexander Viro authored
      amiflop.c switched to use of gendisks
      b7264cd3
    • Alexander Viro's avatar
      [PATCH] cleanup of pd.c · 8e273c4e
      Alexander Viro authored
      macroectomy a-la pf.c and pcd.c ones, ditto for passing pointers to
      structures instead of minors.
      8e273c4e
    • Alexander Viro's avatar
      [PATCH] Lindent pd.c · 91c42c4d
      Alexander Viro authored
      pd.c fed through Lindent
      91c42c4d
    • Alexander Viro's avatar
      [PATCH] kills CURRENT in floppy.c · f299adc6
      Alexander Viro authored
      dumb expansion of macro - it had #define CURRENT current_req
      f299adc6
    • Alexander Viro's avatar
      [PATCH] tapeblock blk_size removal · 0beb090c
      Alexander Viro authored
      tapeblock never assigns anything to its elements of blk_size[][]; we could
      not bother allocating it in the first place.
      0beb090c
    • Alexander Viro's avatar
      [PATCH] Re: Linux 2.5.38 · 1751d060
      Alexander Viro authored
      More trivial fixes: typos in partitions/check.c, block/floppy.c and
      acorn/block/fd1772.c + replacement of #define with inline in block/floppy.c
      (fd_eject()).
      1751d060
    • Adrian Bunk's avatar
      [PATCH] gendisk typo fixes · e2d496c5
      Adrian Bunk authored
      Some trivial fixes for some typos introduced by Al's gendisk changes.
      
       - missing comma in cdu31a
       - missing semicolon in cdu31a
       - comma instead of colon in gscd
       - semicolon instead of comma in mcd
       - missing closing bracket in sonycd535
      e2d496c5
    • Linus Torvalds's avatar
      b890a32b
  3. 21 Sep, 2002 3 commits