1. 10 May, 2004 40 commits
    •
      [PATCH] remove MOD_INC_USE_COUNT usage in arch/um/drivers/harddog_kern.c · 414f3455
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
      
      ->open already has a reference so use __module_get.  The file has no
      maintainer noted in it, all credits are from the driver it's copied from.
    •
      [PATCH] fix MOD_INC_USE_COUNT usage in mtd · d03500e8
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
      
      mtd drivers need to get another reference if ->probe succeeds (strange design
      if you ask me, but what the heck..), and while most drivers have been switched
      to __module_get already, two are still missing.
    •
      [PATCH] drivers/video/* MOD_INC_USE_COUNT fixes · 655d8183
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
      
      A bunch of framebuffer drivers use MOD_INC_USE_COUNT to prevent themselves
      from unloading completely - but we have a much easier way to do so, that is
      simply removing the module_exit/cleanup_module handler.
    •
      [PATCH] fix MOD_{INC,DEC}_USE_COUNT gunk in arch/um/drivers/net_kern.c · de664d0c
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
      
      Well, UML is pretty out of date in mainline, but I'd like to squash the last
      users of said beasts rather sooner than later.
    •
      [PATCH] kill MOD_{INC,DEC}_USE_COUNT gunk in arch/cris/arch-v10/drivers/pcf8563.c · 05347c79
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
      
      Driver already sets fops->owner so the open/close methods are entirely
      superfluous.
    •
      [PATCH] kill useless MOD_{INC,DEC}_USE_COUNT in sound/oss/msnd.c · 8e512539
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
      
      Callers are exported register/unregister handlers so the module is locked in
      core by users of said exports.
    •
      [PATCH] cpqarray update for 2.6 · 3dfae718
      Andrew Morton authored
      From: <mikem@beardog.cca.cpqcorp.net>
      
      This patch fixes 2 minor issues that break our Array Configuration utility.
       my_io was changed to a pointer so the & had to be removed when using it with
      copy_to_user().
      
      Sometime in 2.5 SG_MAX got changed to 31.  Maybe to copy cciss?  Now I'm
      changing it back to 32 so our app can work.
    •
      [PATCH] Add sysctl to define a hugetlb-capable group · cd053a94
      Andrew Morton authored
      From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>,
            "Seth, Rohit" <rohit.seth@intel.com>
      
      This patch addresses the longstanding problem wherein Oracle needs
      CAP_IPC_LOCK to allocate SHM_HUGETLB shm memory, but people don't want to run
      Oracle as root, and capabilities are busted.
      
      Various ideas with rlimits didn't work out, mainly because these objects live
      beyond the lifetime of the user processes which establish them.
      
      What we do is to create root-writeable /proc/sys/vm/hugetlb_shm_group which
      specifies a single group ID.  Users who belong to that group may allocate
      hugepages for SHM_HUGETLB shm segments.
      
      So the sysadmin will create a new group, say `hugepageusers', will add the
      oracle user to that group and will write that group's ID into
      /proc/sys/vm/hugetlb_shm_group.
    •
      [PATCH] hugepage: fix add_to_page_cache() error handling · 9008d35b
      Andrew Morton authored
      From: David Gibson <david@gibson.dropbear.id.au>
      
      add_to_page_cache() locks the given page if and only if it succeeds.  The
      hugepage code (every arch), however, does an unlock_page() after
      add_to_page_cache() before checking the return code, which could trip the
      BUG() in unlock_page() if add_to_page_cache() failed.
      
      In practice we've never hit this bug, because the only ways
      add_to_page_cache() can fail are when we fail to allocate a radix tree node
      (very rare), or when there is already a page at that offset in the radix
      tree, which never happens during prefault, obviously.  We should probably
      fix it anyway, though.
      
      The analogous bug in some of the patches floating about for
      demand-allocation of hugepages is more of a problem, because multiple
      processes can race to instantiate a particular page in the radix tree -
      that's been hit at least once (which is how I found this).
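
The ordering problem described above can be illustrated with a small userspace sketch. Everything here is a mock: `mock_page`, `mock_add_to_page_cache()`, `mock_unlock_page()` and `instantiate_page()` are hypothetical names that only imitate the contract stated in this message (the page is locked if and only if the add succeeds). It is not the hugepage code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; only the lock bit matters here. */
struct mock_page {
    bool locked;
};

/* Imitates the contract described above: the page is locked only on success. */
static int mock_add_to_page_cache(struct mock_page *page, bool fail)
{
    if (fail)
        return -1;          /* failure: the page is left unlocked */
    page->locked = true;    /* success: the page comes back locked */
    return 0;
}

/* Unlocking a page that is not locked is a bug; the assert plays the role of BUG(). */
static void mock_unlock_page(struct mock_page *page)
{
    assert(page->locked);
    page->locked = false;
}

/* Check the return code first and unlock only on the success path. */
static int instantiate_page(struct mock_page *page, bool fail)
{
    int ret = mock_add_to_page_cache(page, fail);

    if (ret != 0)
        return ret;         /* nothing to unlock on failure */
    mock_unlock_page(page);
    return 0;
}
```

Calling `mock_unlock_page()` before looking at `ret` would trip the assert on the failure path, which is the shape of the bug this patch removes.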
    •
      [PATCH] fix wrong var used in hotplug/shpchp_ctrl.c. · d7553443
      Andrew Morton authored
      From: "Luiz Fernando N. Capitulino" <lcapitulino@prefeitura.sp.gov.br>
      
      Zhenmin's checker tool <zli4@cs.uiuc.edu> detected this:
      
       9. /drivers/pci/hotplug/shpchp_ctrl.c, Line 1575:
       err("%s: Failed to disable slot, error code(%d)\n", __FUNCTION__, rc);
      
       Maybe change to:
       err("%s: Failed to disable slot, error code(%d)\n", __FUNCTION__,
       retval);
      
      I think it is right because at line 1564, the slot is turned off, and at
      this line (1575) the status is checked to see if we got an error; if so,
      the error number is shown.  This number is in 'retval', not in 'rc' ('rc'
      holds the return value of configure_new_device()).
    •
      [PATCH] Lindent arch/i386/kernel/cpuid.c · 0610d50c
      Andrew Morton authored
      From: Hanna Linder <hannal@us.ibm.com>
      
      Per Greg's request this is a patch of having run Lindent on cpuid.c.  The
      tabs were not the right number of spaces before.  I have verified it still
      compiles and boots with this "change".
    •
      [PATCH] pcmcia/tcic.c warning fix. · b5c64411
      Andrew Morton authored
      From: "Luiz Fernando N. Capitulino" <lcapitulino@prefeitura.sp.gov.br>
      
      drivers/pcmcia/tcic.c:63: warning: `version' defined but not used
    •
      [PATCH] as-iosched barrier fix · 7a49740a
      Andrew Morton authored
      From: Jens Axboe <axboe@suse.de>
      
      AS does not correctly account requests inserted with INSERT_FRONT or
      INSERT_BACK, barriers for example.  In other elevators, requeued requests also
      go through the insert path, but AS has its own requeue handler which means the
      code has never been tested.
      
      Also, make inserting a barrier with INSERT_SORT imply INSERT_BACK, which is
      the logical behaviour.  Previously such insertions weren't rigorously defined.
    •
      [PATCH] Fix race on tty close · e829d2e4
      Andrew Morton authored
      From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      
      ldisc close can race with the flush_to_ldisc workqueue.
      
      This patch fixes it by killing the workqueue first.
    •
      [PATCH] SElinux interface for reporting size of printk buffer · 19e12876
      Andrew Morton authored
      From: Olaf Dabrunz <od@suse.de>
      
      Add the necessary hooks so that a SELinux-enabled kernel will allow the new
      "report the size of the printk buffer" query to work.
    •
      [PATCH] blk: cache queue_congestion_on/off_threshold values · 110eecfb
      Andrew Morton authored
      From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
      
      It's kind of redundant that queue_congestion_on/off_threshold gets
      calculated on every I/O and they produce the same number over and over
      again unless q->nr_requests gets changed (which is probably a very rare
      event).  We can cache those values in the request_queue structure.
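
A minimal sketch of the caching idea, under stated assumptions: `mock_queue`, its field names, and the 7/8 and 3/4 ratios are invented for illustration and are not the block layer's actual thresholds. The point is only that the derived values are recomputed when `nr_requests` changes, and the per-I/O path reads the cached fields.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for the request queue. */
struct mock_queue {
    unsigned long nr_requests;
    unsigned long congestion_on;    /* cached threshold */
    unsigned long congestion_off;   /* cached threshold */
};

/* Slow path: runs only when nr_requests is changed (assumed to be rare). */
static void mock_set_nr_requests(struct mock_queue *q, unsigned long nr)
{
    assert(nr > 0);
    q->nr_requests = nr;
    q->congestion_on = nr - nr / 8;   /* illustrative ratio, not the kernel's */
    q->congestion_off = nr - nr / 4;  /* illustrative ratio, not the kernel's */
}

/* Fast path: called per I/O, so it only reads the cached values. */
static int mock_queue_congested(const struct mock_queue *q, unsigned long in_flight)
{
    return in_flight >= q->congestion_on;
}

static int mock_queue_uncongested(const struct mock_queue *q, unsigned long in_flight)
{
    return in_flight < q->congestion_off;
}
```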
    •
      [PATCH] swsusp documentation updates · bbfbb758
      Andrew Morton authored
      From: Pavel Machek <pavel@ucw.cz>
    •
      [PATCH] simplify mqueue_inode_info->messages allocation · b3f8802c
      Andrew Morton authored
      From: Chris Wright <chrisw@osdl.org>
      
      Currently, if a user creates an mqueue and passes an mq_attr, the
      info->messages will be created twice (and the extra one is properly freed).
      This patch simply delays the allocation so that it only ever happens once. 
      The relevant mq_attr data is passed to lower levels via the dentry->d_fsdata
      fs private data.  This also helps isolate the areas we'd need to touch to do
      rlimits on mqueues.
    •
      [PATCH] bfs filesystem read past the end of dir · e37a41af
      Andrew Morton authored
      From: Jakub Jermar <jermar@itbs.cz>
      
      I found out that BFS filesystem will eventually try to read and interpret
      garbage past the end of directory in bfs_add_entry().  If the garbage
      (interpreted as i-node number) is not set to zero (does it have to be?)
      bfs_add_entry() will consider it a regular directory entry. 
      
      This causes weird things like this:
      # touch a
      # rm a
      # ls
      # touch b
      # ls
      a
      
      My patch detects an attempt to read past the end of directory and explicitly
      clears the garbage that represents i-node number.  Thus the correct behaviour
      is achieved.
      
      (was unable to contact Tigran)
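
The behaviour described above can be sketched in userspace as follows. This is a mock, not the BFS code: `mock_dirent`, `SKETCH_BLOCK_SIZE` and `find_free_slot()` are hypothetical, and the layout (a 16-bit inode number followed by a 14-byte name) is assumed for illustration. The sketch treats any slot that lies past the recorded end of the directory as garbage and clears its inode number before testing whether the slot is free.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 64    /* made-up block size, kept small for the example */

/* Hypothetical on-disk directory entry: 16 bytes in total. */
struct mock_dirent {
    uint16_t ino;               /* 0 means "slot is free" */
    char name[14];
};

/*
 * Return the index of the first free slot in the block, or -1 if none.
 * dir_bytes is the number of valid directory bytes in this block; anything
 * at or beyond that offset was never written and may hold garbage.
 */
static int find_free_slot(unsigned char *block, size_t dir_bytes)
{
    size_t off;

    for (off = 0; off + sizeof(struct mock_dirent) <= SKETCH_BLOCK_SIZE;
         off += sizeof(struct mock_dirent)) {
        struct mock_dirent de;

        memcpy(&de, block + off, sizeof(de));
        if (off >= dir_bytes) {
            /* Past the end of the directory: clear the garbage inode number. */
            de.ino = 0;
            memcpy(block + off, &de, sizeof(de));
        }
        if (de.ino == 0)
            return (int)(off / sizeof(de));
    }
    return -1;
}
```

Without the `off >= dir_bytes` check, a non-zero garbage value in an unwritten slot would be taken for a live entry, which matches the stale `a` showing up in the `ls` output above.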
    •
      [PATCH] update Documentation/md.txt · 28d627fb
      Andrew Morton authored
      From: <spam@altium.nl> (Dick Streefland)
      
      The following patch documents the currently undocumented raid= kernel
      parameter.
    •
      [PATCH] es7000 subarch update for generic arch · 9828805c
      Andrew Morton authored
      From: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      
      This is an ES7000 sub architecture update.  It makes ES7000 a part of the
      generic architecture, so the single compiled kernel will be able to choose
      a correct set of parameters, routines ("genapic"), and a boot path.  It
      uses criteria provided by the subarch for platform identification.  In case
      of ES7000, it is a unique product/vendor string in the ACPI/MP OEM table,
      and server control registers.  The patch is confined to only es7000 subarch
      and generic subarch.  It was tested on ES7000 as well as a generic Intel 8x
      Xeon system.  Andi Kleen has reviewed the changes.
    •
      [PATCH] CLOCK_TICK_RATE: use CLOCK_TICK_RATE · 1dd5cc77
      Andrew Morton authored
      From: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>
      
      use CLOCK_TICK_RATE where 1193180 was used in general timing calculations. 
      (optional)
    •
      [PATCH] CLOCK_TICK_RATE: use PIT_TICK_RATE in *spkr.c · 52161621
      Andrew Morton authored
      From: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>
    •
      [PATCH] CLOCK_TICK_RATE: introduce asm-*/8253pit.h, #define PIT_TICK_RATE constant. · 80c44e42
      Andrew Morton authored
      From: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>
      
      The calculation of the counter values in drivers/input/misc/pcspkr.c is
      incorrectly based on CLOCK_TICK_RATE.  This goes unnoticed in i386 because
      there the system clock is driven by the same Programmable Interval Timer chip
      as the speaker.  But this doesn't hold true on other archs, e.g.  Alpha.
      
      To solve this problem I made these patches:
      
      1/3:    introduce asm-*/8253pit.h, #define PIT_TICK_RATE constant.
              It seems this is not always the same value.
      2/3:    use PIT_TICK_RATE in *spkr.c
      3/3:    use CLOCK_TICK_RATE where 1193180 was used in general timing
              calculations. (optional)
      
      There are still some places where the magic number is used instead of the
      #define (vt_ioctl.c, gameport.c) but I left them as-is.  I got some responses
      from arch maintainers to specifically not touch their respective architectures
      so changing these places would mean breakage for them.
      
      Tested on Alpha and i386, ack'ed by Ralf Baechle for MIPS.
      
      
      This patch:
      
      introduce asm-*/8253pit.h, #define PIT_TICK_RATE constant.
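
To see why the speaker divisor must come from the PIT input clock rather than from the system clock rate, here is a small arithmetic sketch. `PIT_TICK_RATE` uses the 1193180 figure quoted in this series; `HYPOTHETICAL_CLOCK_TICK_RATE` is a made-up value standing in for an architecture whose system clock is not driven by the PIT, and `divisor_for()` is an illustrative helper, not a kernel function.

```c
#include <assert.h>

#define PIT_TICK_RATE                 1193180UL  /* figure quoted in this series */
#define HYPOTHETICAL_CLOCK_TICK_RATE  32768UL    /* made-up non-PIT system clock */

/* Counter value to program for a tone: input clock divided by the tone frequency. */
static unsigned long divisor_for(unsigned long base_hz, unsigned long tone_hz)
{
    assert(tone_hz > 0);
    return base_hz / tone_hz;
}
```

For a 440 Hz tone, `divisor_for(PIT_TICK_RATE, 440)` gives 2711, while the hypothetical system clock rate gives 74. Programming 74 into a PIT that really runs at 1193180 Hz would produce a tone of roughly 16 kHz instead of 440 Hz.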
    •
      [PATCH] readahead: keep file->f_ra sane · 2a12ed0e
      Andrew Morton authored
      When two threads are simultaneously pread()ing from the same fd (which is a
      legitimate thing to do), the readahead code thinks that a huge amount of
      seeking is happening and shrinks the window, damaging performance a lot.
      
      I don't see a sane way to avoid this within the readahead code, so take a
      private copy of the readahead state and restore it prior to returning from the
      read.
    •
      [PATCH] jiffies-to-clockt fix · 60967810
      Andrew Morton authored
      From: john stultz <johnstul@us.ibm.com>
      
      This patch polishes up Tim Schmielau's (tim@physik3.uni-rostock.de) fix for
      jiffies_to_clock_t() and jiffies_64_to_clock_t().  The issue observed was
      /proc output not matching up to wall time due to accumulated error
      caused by HZ not being exactly 1000 on i386 systems.  The solution is to
      correct that error by using the more accurate TICK_NSEC in our calculation.
      
      Additionally, this patch corrects 3 warnings in the TCP layer uncovered by
      this change.
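
The accumulated error can be worked through with a short sketch. It assumes HZ = 1000, USER_HZ = 100 and the 1193180 Hz timer input quoted elsewhere in this log; `TICK_NSEC_APPROX` is a plain integer approximation of the real tick length and is not claimed to match the kernel's own TICK_NSEC computation. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define HZ               1000ULL
#define USER_HZ          100ULL
#define CLOCK_TICK_RATE  1193180ULL
#define NSEC_PER_SEC     1000000000ULL

/* Timer reload value: the input clock divided by HZ, rounded to nearest. */
#define LATCH            ((CLOCK_TICK_RATE + HZ / 2) / HZ)

/* Real length of one tick in nanoseconds (integer approximation). */
#define TICK_NSEC_APPROX (LATCH * NSEC_PER_SEC / CLOCK_TICK_RATE)

/* Naive conversion: pretends a tick is exactly 1/HZ seconds long. */
static uint64_t naive_jiffies_to_clock_t(uint64_t jiffies)
{
    return jiffies / (HZ / USER_HZ);
}

/* Conversion based on the real tick length. */
static uint64_t tick_nsec_jiffies_to_clock_t(uint64_t jiffies)
{
    assert(jiffies <= UINT64_MAX / TICK_NSEC_APPROX);   /* avoid overflow */
    return jiffies * TICK_NSEC_APPROX / (NSEC_PER_SEC / USER_HZ);
}
```

With these numbers a tick lasts about 999849 ns rather than 1000000 ns. After 100,000,000 jiffies the naive conversion reports 10,000,000 clock ticks while the tick-length-based one reports 9,998,490, a difference of about 15 seconds at USER_HZ = 100.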
    •
      [PATCH] cyclades cleanups · c2e48749
      Andrew Morton authored
      From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
      
      - cleanups for cyclades Kconfig entry	(Adrian Bunk/me)
      - janitors project: remove dead function	(Don Koch)
      
      From: aris@cathedrallabs.org (Aristeu Sergio Rozanski Filho)
      
      	Use the standard min/max macros
    •
      [PATCH] fix ramdisk size assembler warning · 15e5643c
      Andrew Morton authored
      From: Jorn Engel <joern@wohnheim.fh-wedel.de>
      
       AS	arch/i386/boot/setup.o
      /usr/src/linux-2.6.5/arch/i386/boot/setup.S: Assembler messages:
      /usr/src/linux-2.6.5/arch/i386/boot/setup.S:159: Warning: value 0x37ffffff truncated to 0x37ffffff
      
      The warning is correct, the calculated value for ramdisk_max would be
      0xb7ffffff instead of 0x37ffffff.  Truncating 0xb7ffffff to 0x37ffffff
      is desired behaviour, so we should do it explicitly.
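
The explicit truncation amounts to masking off the top bit. The sketch below only checks the arithmetic from the two values quoted above; the 0x7fffffff mask and the helper name `explicit_truncate()` are assumptions made for illustration, not a quote from setup.S.

```c
#include <stdint.h>

/* Keep the low 31 bits, dropping the top bit on purpose instead of by accident. */
static uint32_t explicit_truncate(uint32_t value)
{
    return value & UINT32_C(0x7fffffff);
}
```

Applied to 0xb7ffffff this yields 0x37ffffff, the value the assembler was silently producing.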
    •
      [PATCH] ppc64: use generic ipc syscall translation · 53fbd0b0
      Andrew Morton authored
      From: David Gibson <david@gibson.dropbear.id.au>
      
      Currently ppc64 has its own code to convert 32-bit ipc() syscalls to 64-bit,
      rather than using the common translation code from ipc/compat.c.  This patch,
      tweaked slightly from an earlier version of Anton Blanchard's, fixes that,
      replacing the ppc64 code with calls to the common code.
      
      I've run the LSB IPC tests, and as many of the LTP IPC tests as I could figure
      out how to run easily, and it seems to pass them all.
    •
      [PATCH] gcc-3.4.0 fixes for 2.6.6-rc3 x86_64 kernel · bf434bf2
      Andrew Morton authored
      From: Mikael Pettersson <mikpe@csd.uu.se>
      
      Here are some patches to fix compilation warnings from
      gcc-3.4.0 in the 2.6.6-rc3 x86_64 kernel.
      
      - puts() type conflict in boot/compressed/misc.c:
        rename to putstr(), just like i386 did
      - cast-as-lvalue in ia32_copy_siginfo_from_user():
        use temporary
      - code before declaration in io_apic.c:
        move decl up
      - code before declaration in ioremap.c:
        move existing #ifndef up
      - cast-as-lvalue (tons of them) from UP version of per_cpu():
        merged asm-generic's version
    •
      [PATCH] fixup 68360 module refcounting · 0b4e162c
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
    •
      [PATCH] Warn when smp_call_function() is called with interrupts disabled · 43653667
      Andrew Morton authored
      From: Keith Owens <kaos@sgi.com>
      
      Almost every architecture has a comment above smp_call_function()
      
       * You must not call this function with disabled interrupts or from a
       * hardware interrupt handler or from a bottom half handler.
      
      I have not seen any problems with calling smp_call_function() from a bottom
      half handler, but calling it with interrupts disabled can definitely
      deadlock.  This bug is hard to reproduce and even harder to debug.
      
      CPU A                               CPU B
      Disable interrupts
                                          smp_call_function()
                                          Take call_lock
                                          Send IPIs
                                          Wait for all cpus to acknowledge IPI
                                          CPU A has not responded, spin waiting
                                          for cpu A to respond, holding call_lock
      smp_call_function()
      Spin waiting for call_lock
      Deadlock                            Deadlock
      
      Change all smp_call_function() to WARN_ON(irqs_disabled()).  It should be
      BUG_ON() but some buggy code like SCSI sg will break with BUG_ON, so just
      warn for now.  Change it to BUG_ON after the buggy code has been fixed.
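
A userspace sketch of the check, with everything mocked: `mock_irqs_disabled_flag`, `MOCK_WARN_ON()` and `mock_smp_call_function()` are hypothetical stand-ins, and the mock simply runs the callback locally instead of sending IPIs. It shows only the shape of the change, which is to warn and carry on rather than stop.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the CPU's interrupt state. */
static bool mock_irqs_disabled_flag;

/* Number of warnings emitted so far, so callers can observe the check. */
static int warn_count;

/* Warn and continue, in the spirit of WARN_ON(); a BUG_ON() would stop here. */
#define MOCK_WARN_ON(cond)                                              \
    do {                                                                \
        if (cond) {                                                     \
            warn_count++;                                               \
            fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__);  \
        }                                                               \
    } while (0)

/* Mock cross-CPU call: complains if entered with interrupts disabled. */
static int mock_smp_call_function(void (*func)(void *), void *info)
{
    MOCK_WARN_ON(mock_irqs_disabled_flag);
    func(info);     /* the real function would IPI the other CPUs instead */
    return 0;
}

/* Trivial callback used to show that the call still goes through. */
static void bump(void *info)
{
    (*(int *)info)++;
}
```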
    •
      [PATCH] worker_thread race fix · 5805ad40
      Andrew Morton authored
      Fix a waitqueue-handling race in worker_thread().
    •
      [PATCH] pcmcia/i82365.c warning fix · df125ce9
      Andrew Morton authored
      From: "Luiz Fernando N. Capitulino" <lcapitulino@prefeitura.sp.gov.br>
      
      drivers/pcmcia/i82365.c: At top level:
      drivers/pcmcia/i82365.c:71: warning: `version' defined but not used
    •
      [PATCH] throttle P4 thermal warnings · d14c7e92
      Andrew Morton authored
      From: Zwane Mwaikambo <zwane@linuxpower.ca>
      
      In really bad conditions this can keep printing for a while, so throttle the
      output somewhat.  Also change the "CPU%d" formatting to better match the
      other boot output.
    •
      [PATCH] fix deadlock in create_workqueue() · b4ad84fc
      Andrew Morton authored
      Fix bug identified by Srivatsa Vaddagiri <vatsa@in.ibm.com>:
      
      There's a deadlock in __create_workqueue when CONFIG_HOTPLUG_CPU is set.  This
      can happen when create_workqueue_thread fails to create a worker thread.  In
      that case, we call destroy_workqueue with cpu hotplug lock held.
      destroy_workqueue however also attempts to take the same lock.
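
One way to avoid this kind of self-deadlock is to remember the failure, drop the lock, and only then call the cleanup routine. The sketch below models that ordering in userspace; `mock_wq`, `lock_hotplug()`, `unlock_hotplug()` and the `fail_at` parameter are hypothetical, and the "lock" is a flag whose assert fires where a real non-recursive lock would deadlock. It is not presented as the code of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical non-recursive lock: taking it twice would deadlock for real. */
static bool hotplug_lock_held;

static void lock_hotplug(void)
{
    assert(!hotplug_lock_held);     /* fires where a real lock would hang */
    hotplug_lock_held = true;
}

static void unlock_hotplug(void)
{
    assert(hotplug_lock_held);
    hotplug_lock_held = false;
}

/* Hypothetical workqueue: just counts the worker threads it "created". */
struct mock_wq {
    int threads;
};

/* Cleanup takes the same lock, so callers must not hold it. */
static void mock_destroy_workqueue(struct mock_wq *wq)
{
    lock_hotplug();
    free(wq);
    unlock_hotplug();
}

/* fail_at is the CPU index at which thread creation fails (-1 for never). */
static struct mock_wq *mock_create_workqueue(int ncpus, int fail_at)
{
    struct mock_wq *wq = calloc(1, sizeof(*wq));
    bool failed = false;

    if (!wq)
        return NULL;

    lock_hotplug();
    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (cpu == fail_at) {
            failed = true;          /* remember the failure, clean up later */
            break;
        }
        wq->threads++;
    }
    unlock_hotplug();               /* drop the lock first ... */

    if (failed) {
        mock_destroy_workqueue(wq); /* ... then destroy, which retakes it */
        return NULL;
    }
    return wq;
}
```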
    •
      [PATCH] remove blk_queue_bounce() printks · 7676bfa0
      Andrew Morton authored
      From: Matt Domsch <Matt_Domsch@dell.com>
      
      Jens Axboe wrote:
      It should just be deleted. As you note, it is a debug message. I
      originally added it so we would have some clues as to dma capability for
      bug reports. There never was any, the check can go :)
    •
      [PATCH] Fix MTD suspend/resume · b94ef24c
      Andrew Morton authored
      From: Russell King <rmk@arm.linux.org.uk>
      
      This patch carries forward the following bug fix from MTD CVS.  The bug causes a
      lot of noise after a suspend/resume cycle on ARM devices.
      
      revision 1.127
      date: 2003/07/02 20:29:38;  author: acurtis;  state: Exp;  lines: +2 -1
      Added FL_STATUS to the FL_READY case in put_chip(). (Eliminate noise)
    •
      [PATCH] dentry and inode cache hash algorithm performance changes. · 99effef9
      Andrew Morton authored
      From: "Jose R. Santos" <jrsantos@austin.ibm.com>
      
      It alleviates some issues seen with Linux when accessing millions of files on
      machines with large amounts of RAM (32GB+).  Both algorithms are based on some
      studies that Dominique Heger was doing on hash table efficiencies in Linux.
      The dentry hash table has been tested in small systems with one internal IDE
      hard disk as well as in large SMP systems with many Fibre Channel disks.
      Dominique claims that in all the testing done, they did not see one case where
      this hash function provided worse performance and that in most tests they were
      seeing better performance.
      
      The inode hash function was done by me based on Dominique's original work and
      has only been stress tested with SpecSFS.  It provided a 3% improvement over
      the default algorithm in the SpecSFS results and speedups in the response
      time of almost all filesystem operations the benchmark stresses.  With the
      better distribution it is also possible to reduce the number of inode buckets
      from 32 million to 16 million and still get slightly better results.
      
      Anton was nice enough to provide some graphs that show the distribution 
      before and after the patch at http://samba.org/~anton/linux/sfs/1/
      
      For the dentry hash function, some of my other coworkers had put this hash
      function through various testing and have concluded that the hash function was
      equal to or better than the default hash function.  These runs were done with a
      (hopefully to be Open Source soon) benchmark called FFSB which can simulate
      various I/O patterns across many filesystems and variable file sizes.
      
      The SpecSFS fileset is basically a lot of small files and varies depending on
      the size of the run.  For a not so big SMP system the number of files is in the
      20+ million range.  Of those 20 million files only 10% are accessed randomly
      by the client.  The purpose of this is that the benchmark tries to stress not
      only the NFS layer but the VM and filesystem layers as well.  The filesets are
      also hundreds of gigabytes in size in order to promote disk head movement by
      guaranteeing cache misses in memory.  In SFS, 27% of the workload is lookups,
      and __d_lookup has been showing high in my profiles.
      
      For the inode hash the problem that I see is that when running a benchmark
      with this huge fileset we end up trying to free a lot of inode entries during
      the run while trying to put new entries in cache.  We end up calling
      ifind_fast() which calls find_inodes_fast() held under inode_lock.  In order
      to avoid holding the inode_lock we needed to avoid having long chains in that
      hash function.
      
      When I took a look at the original hash function, I found it to be a bit too
      simple for any workload.  My solution (which took advantage of Dominique's
      work) was to create a hash function that could generate completely
      different hashes depending on the hashval and the superblock in order to have
      the hash scale as we added more filesystems to the machine.
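
As a rough illustration of mixing both inputs, the sketch below folds a per-superblock value and the hashval into one bucket index. It is an assumption-laden toy and not the function from this patch: `sb_id` stands in for whatever identifies the superblock, and `SKETCH_MULTIPLIER` and `SKETCH_HASH_BITS` are constants picked for the example.

```c
#include <stdint.h>

#define SKETCH_HASH_BITS   10u                           /* 1024 buckets, for the example */
#define SKETCH_HASH_MASK   ((UINT32_C(1) << SKETCH_HASH_BITS) - 1)
#define SKETCH_MULTIPLIER  UINT32_C(0x9e370001)          /* odd constant chosen for illustration */

/* Mix the superblock identity into the hash so filesystems spread differently. */
static uint32_t sketch_inode_hash(uint32_t sb_id, uint32_t hashval)
{
    uint32_t tmp = (hashval * sb_id) ^ (SKETCH_MULTIPLIER + hashval);

    /* Fold the high bits down before masking to the table size. */
    tmp ^= (tmp ^ SKETCH_MULTIPLIER) >> SKETCH_HASH_BITS;
    return tmp & SKETCH_HASH_MASK;
}
```

With these constants the same hashval of 1 lands in bucket 3 for `sb_id` 1 and in bucket 0 for `sb_id` 2, so identical inode numbers on different filesystems do not have to pile onto one chain.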
      
      Both of these problems can be somewhat tuned out by increasing the number of
      buckets of both d and i cache but it got to a point where I had 256MB of inode
      and 128MB in dentry hash buckets on a not so large SMP.  With the hash changes
      I have been able to reduce the number of buckets to 128MB for inode cache and
      to 32MB for dentry cache and still get better performance.
      
      If it helps my case...  I haven't been running this benchmark for long, so I
      haven't been able to find a way to cheat.  I need to come up with generic
      solutions until I can find a cheat for the benchmark.  :)
      
      
      SDET results:
      
      Steve Pratt seems to have a SDET setup already and he did me the favor of
      running SDET with a reduced dentry entry hash table size.  I believe that
      his table suggests that less than 3% change is acceptable variability, but
      overall he got a 5% better number using the new hash algorithm.
      
      A) x4408way1.sdet.2.6.5100000-8p.04-05-05_12.08.44 vs 
      B) x4408way1.sdet.2.6.5+hash-100000-8p.04-05-05_11.48.02
      
      
        Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
        Inode-cache hash table entries: 1048576 (order: 10, 4194304 bytes) 
      
      Results:Throughput
      
                                                tolerance = 0.00 + 3.00% of A
                            A            B
         Threads      Ops/sec      Ops/sec    %diff         diff    tolerance
      ----------- ------------ ------------ -------- ------------ ------------
               1    4341.9300    4401.9500     1.38        60.02       130.26 
               2    8242.2000    8165.1200    -0.94       -77.08       247.27 
               4   15274.4900   15257.1000    -0.11       -17.39       458.23 
               8   21326.9200   21320.7000    -0.03        -6.22       639.81 
              16   23056.2100   24282.8000     5.32      1226.59       691.69  * 
              32   23397.2500   24684.6100     5.50      1287.36       701.92  * 
              64   23372.7600   23632.6500     1.11       259.89       701.18 
             128   17009.3900   16651.9600    -2.10      -357.43       510.28 
      =========================================================================
    •
      [PATCH] cmpci OSS driver update · 9e315f49
      Andrew Morton authored
      From: C.L. Tien <cltien@cmedia.com.tw>
      
      Current version from cmedia.