1. 18 Oct, 2004 27 commits
    • [PATCH] show aggregate per-process counters in /proc/PID/stat 2 · 9a349eb7
      Lev Makhlis authored
      Add up resource usage counters for live and dead threads to show aggregate
      per-process usage in /proc/<pid>/stat.  This mirrors the new getrusage()
      semantics.  /proc/<pid>/task/<tid>/stat still has the per-thread usage.
      
      After moving the counter aggregation loop inside a task->sighand lock to
      avoid nasty race conditions, it has survived stress-testing with '(while
      true; do sleep 1 & done) & top -d 0.1'
      Signed-off-by: Lev Makhlis <mlev@despammed.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] distinct tgid/tid CPU usage · bf719d26
      Albert Cahalan authored
      This patch adjusts /proc/*/stat to have distinct per-process and per-thread
      CPU usage, faults, and wchan.
      Signed-off-by: Albert Cahalan <albert@users.sf.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] add missing linux/syscalls.h includes · 09b9135c
      Arnd Bergmann authored
      I found that the prototypes for sys_waitid and sys_fcntl in
      <linux/syscalls.h> don't match the implementation.  In order to keep all
      prototypes in sync in the future, now include the header from each file
      implementing any syscall.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] softirqs: fix latency of softirq processing · 40e39ce0
      Ingo Molnar authored
      The attached patch fixes a local_bh_enable() buglet: we first enabled
      softirqs and only then checked local_softirq_pending() - often this is preemptible
      code.  So this task could be preempted and there's no guarantee that
      softirq processing will occur (except the periodic timer tick).
      
      The race window is small but existent.  This could result in packet
      processing latencies or timer expiration latencies - hard to detect and
      annoying bugs.
      
      The fix is to invoke softirqs with softirqs enabled but preemption still
      disabled.  Patch is against 2.6.9-rc2-mm1.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fix PTRACE_ATTACH race with real parent's wait calls · cfc4f957
      Roland McGrath authored
      There is a race between PTRACE_ATTACH and the real parent calling wait. 
      For a moment, the task is put in PT_PTRACED but with its parent still
      pointing to its real_parent.  In this circumstance, if the real parent
      calls wait without the WUNTRACED flag, he can see a stopped child status,
      which wait should never return without WUNTRACED when the caller is not
      using ptrace.  Here it is not the caller that is using ptrace, but some
      third party.
      
      This patch avoids this race condition by adding the PT_ATTACHED flag to
      distinguish a real parent from a ptrace_attach parent when PT_PTRACED is
      set, and then having wait use this flag to confirm that things are in order
      and not consider the child ptraced when its ->ptrace flags are set but its
      parent links have not yet been switched.  (ptrace_check_attach also uses it
      similarly to rule out a possible race with a bogus ptrace call by the real
      parent during ptrace_attach.)
      
      While looking into this, I noticed that every arch's sys_execve has:
      
      		current->ptrace &= ~PT_DTRACE;
      
      with no locking at all.  So, if an exec happens in a race with
      PTRACE_ATTACH, you could wind up with ->ptrace not having PT_PTRACED set
      because this store clobbered it.  That will cause later BUG hits because
      the parent links indicate ptracedness but the flag is not set.  The patch
      corrects all the places I found to use task_lock around diddling ->ptrace
      when it's possible to be racing with ptrace_attach.  (The ptrace operation
      code itself doesn't have this issue because it already excludes anyone else
      being in ptrace_attach.)
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] add WCONTINUED support to wait4 syscall · 04bff088
      Roland McGrath authored
      POSIX specifies the new WCONTINUED flag for waitpid, not just for waitid.
      I overlooked this addition when I implemented waitid.  The real work was
      already done to support waitid, but waitpid needs to report the results
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] make rlimit settings per-process instead of per-thread · 31180071
      Roland McGrath authored
      POSIX specifies that the limit settings provided by getrlimit/setrlimit are
      shared by the whole process, not specific to individual threads.  This
      patch changes the behavior of those calls to comply with POSIX.
      
      I've moved the struct rlimit array from task_struct to signal_struct, as it
      has the correct sharing properties.  (This reduces kernel memory usage per
      thread in multithreaded processes by around 100/200 bytes for 32/64-bit
      machines respectively.)  I took a fairly minimal approach to the locking
      issues with the newly shared struct rlimit array.  It turns out that all
      the code that is checking limits really just needs to look at one word at a
      time (one rlim_cur field, usually).  It's only the few places like
      getrlimit itself (and fork), that require atomicity in accessing a whole
      struct rlimit, so I just used a spin lock for them and no locking for most
      of the checks.  If it turns out that readers of struct rlimit need more
      atomicity where they are now cheap, or less overhead where they are now
      atomic (e.g. fork), then seqcount is certainly the right thing to use for
      them instead of readers using the spin lock.  Though it's in signal_struct,
      I didn't use siglock since the access to rlimits never needs to disable
      irqs and doesn't overlap with other siglock uses.  Instead of adding
      something new, I overloaded task_lock(task->group_leader) for this; it is
      used for other things that are not likely to happen simultaneously with
      limit tweaking.  To me that seems preferable to adding a word, but it would
      be trivial (and arguably cleaner) to add a separate lock for these users
      (or e.g. just use seqlock, which adds two words but is optimal for readers).
      
      Most of the changes here are just the trivial s/->rlim/->signal->rlim/. 
      
      I stumbled across what must be a long-standing bug, in reparent_to_init.
      It does:
      	memcpy(current->rlim, init_task.rlim, sizeof(*(current->rlim)));
      when surely it was intended to be:
      	memcpy(current->rlim, init_task.rlim, sizeof(current->rlim));
      As rlim is an array, the * in the sizeof expression gets the size of the
      first element, so this just changes the first limit (RLIMIT_CPU).  This is
      for kernel threads, where it's clear that resetting all the rlimits is what
      you want.  With that fixed, the setting of RLIMIT_FSIZE in nfsd is
      superfluous since it will now already have been reset to RLIM_INFINITY.
      
      The other subtlety is removing:
      	tsk->rlim[RLIMIT_CPU].rlim_cur = RLIM_INFINITY;
      in exit_notify, which was to avoid a race signalling during self-reaping
      exit.  As the limit is now shared, a dying thread should not change it for
      others.  Instead, I avoid that race by checking current->state before the
      RLIMIT_CPU check.  (Adding one new conditional in that path is now required
      one way or another, since if not for this check there would also be a new
      race with self-reaping exit later on clearing current->signal that would
      have to be checked for.)
      
      The one loose end left by this patch is with process accounting.
      do_acct_process temporarily resets the RLIMIT_FSIZE limit while writing the
      accounting record.  I left this as it was, but it is now changing a limit
      that might be shared by other threads still running.  I left this in a
      dubious state because it seems to me that process accounting may already
      be more generally in a dubious state when it comes to NPTL threads.  I would
      think you would want one record per process, with aggregate data about all
      threads that ever lived in it, not a separate record for each thread.
      I don't use process accounting myself, but if anyone is interested in
      testing it out I could provide a patch to change it this way.
      
      One final note: this does not bring rlimits to 100% POSIX compliance.
      POSIX specifies that RLIMIT_CPU refers to a whole process in aggregate, not
      to each individual thread.  I will provide patches later on to achieve that
      change, assuming this patch goes in first.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386 entry.S cleanups · cc588ba9
      Ingo Molnar authored
      Remove the unused lcall7/lcall27 code.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] acpi proc: error handling · 678ab4ca
      Pavel Machek authored
      Propagate the software_suspend() return value.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] swsusp: progress in percent · fa7f7d64
      Pavel Machek authored
      swsusp currently has very poor progress indication.  Thanks to Erik Rigtorp
      <erik@rigtorp.com>, we have percentages there, so people know how long a
      wait to expect.  Please apply.
      
      From: Erik Rigtorp <erik@rigtorp.com>
      Signed-off-by: Pavel Machek <pavel@suse.cz>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] parport_pc superio chip fixes · b7478cdd
      Andrea Arcangeli authored
      This patch fixes some troubles that somebody reported to me with the superio
      chips.
      
      In short, rmmod parport_pc && cat /proc/iomem was good enough for crashing
      the box hard on some machines (and hwscan --printer was doing just that).
      The way the oops triggers is that iomem tries to vsprintf the p->name, but
      the p->name was a static string in the module address (now unloaded).
      
      The reason is that the superio chip scanning leaves up to two persistent
      ranges claimed.  But the second (legacy) pass has no way to notice the
      resources are already reclaimed.  Plus if the superio->io was different
      than the "io" variable (the range to scan for superio chips) the "io" range
      would generate a leak of the original "io" range too.
      
      I simply make sure to always release the requested space during the superio
      scan, and I make sure not to instantiate new ranges in the p->base that
      would cause the later parport scan to fail too (in addition to leaking
      resources).
      
      The previous code that was returning values and was leaving garbage in
      there made no sense to me.  My best guess (assuming I didn't misread it ;)
      is that probably somebody added the request_region without realizing
      they're pointing to the very same address that would be requested later
      (and nobody does accesses on those ranges until later, so it was very safe
      to claim it later).
      
      Disclaimer: I don't have the specs of the winbond and smsc at hand, I just
      guessed what they do from the code (nothing checks superio->io except
      get_superio_dma and get_superio_irq, which made the thing self-explanatory
      enough to fix it without specs).
      Signed-off-by: Andrea Arcangeli <andrea@novell.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] add sys_setaltroot() · 44c4fb89
      Seth Rohit authored
      Add a new system call setaltroot(2).
      
      Currently, using the altroot feature is accessible only via the
      set_personality() system call.  It is accessible to user space only if there
      is more than one exec domain in the system.  This patch allows using the
      altroot feature on systems where there is only one exec domain.
      
      It is possible to work around the issue by adding a dummy exec domain, but it
      was rejected for not being very elegant.
      
      If this feature is implemented in userspace, it adds a 16% overhead on a test
      case which greps for a single word in the kernel source tree.
      Signed-off-by: Zou Nanhai <nanhai.zou@intel.com>
      Signed-off-by: Gordon Jin <gordon.jin@intel.com>
      Signed-off-by: Arun Sharma <arun.sharma@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • Wrap <linux/compiler.h> inside '#ifndef __ASSEMBLY__' · f5aa089a
      Linus Torvalds authored
      None of the compatibility defines make sense for assembly
      files, and gcc has trouble with vararg macros when using
      "-traditional" (which is used for asm), to the point of
      ICE'ing.
    • Add copyright notice on ppc64 iomap files. · 0e0c5521
      Linus Torvalds authored
      Paul cares. I think there's something in the water at IBM
      that makes people sticklers ;)
    • [PATCH] ppc64: Fix iSeries build (ouch !) · 02dc1467
      Benjamin Herrenschmidt authored
      The move of iomap out of eeh inadvertently broke iSeries ...
      
      Fixed like this.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ppc32/64: FPU/vector register restore after signal · 4aab1539
      Benjamin Herrenschmidt authored
      This fixes some issues with restoring the altivec and/or FPU registers
      upon return from a signal or when setting a context.  It also adds a
      proper stack backlink to the signal frames created for 64-bit
      applications.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • Linus Torvalds's avatar
      a85f54d7
    • Merge bk://gkernel.bkbits.net/net-drivers-2.6 · 9266734a
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux
    • Merge bk://gkernel.bkbits.net/libata-2.6 · 6db492bc
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux
    • Merge pobox.com:/spare/repo/linux-2.6 · 49093583
      Jeff Garzik authored
      into pobox.com:/spare/repo/libata-2.6
    • Add fake '__builtin_warning()' for the gcc case. · 6df3af84
      Linus Torvalds authored
      Allows us to do compile-time sparse warnings of our own.
    • Merge bk://linux-scsi.bkbits.net/scsi-for-linus-2.6 · d78d2844
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux
    • Merge titanic.il.steeleye.com:/home/jejb/BK/scsi-target-2.6 · e270e1b2
      James Bottomley authored
      into titanic.il.steeleye.com:/home/jejb/BK/scsi-for-linus-2.6
    • aic7xxx and aic79xx: fix sleeping while holding a lock · c045ebb7
      James Bottomley authored
      From: Luben Tuikov <luben_tuikov@adaptec.com>
      
      Fix sleeping while holding a lock on host removal and on
      killing the DV thread.
      Signed-off-by: Luben Tuikov <luben_tuikov@adaptec.com>
      Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
    • SCSI: fix Suspend I/O block/unblock path · 4b8cbbf6
      James Bottomley authored
      From: James.Smart@Emulex.Com
      
      Further testing is showing that we are having some i/o threads
      prematurely die with the following message: "rejecting I/O to device
      being removed"
      Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
    • [PATCH] cciss: fixes for clustering · 2207252b
      Mike Miller authored
      This patch changes our open specifically for clustering software. We must
      allow root to access any volume or device with a LUN ID. We also modified
      our revalidate function for this reason.
      If a logical volume is reserved, we must register it with the OS with size=0. Then
      the backup system can call BLKRRPART after breaking the reservation to
      set the device to the correct size.
      We also must register a controller with no logical volumes for the online
      utilities to function. This is the way we've done it since the 2.2 kernel,
      which doesn't necessarily make it right, but we have legacy apps to consider.
      
      Signed-off-by: Mike Miller <mike.miller@hp.com>
      Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
    • Merge bk://bk.arm.linux.org.uk/linux-2.6-rmk · 0d377ebc
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux
  2. 19 Oct, 2004 1 commit
  3. 18 Oct, 2004 9 commits
  4. 17 Oct, 2004 3 commits
    • [PATCH] cciss: SCSI API updates · a6c0c127
      Mike Miller authored
      This patch updates our SCSI support to no longer use deprecated APIs.
      Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
    • [PATCH] ppc64: fix smp_startup_cpu for cpu hotplug · a0d194e3
      Nathan Lynch authored
      This change is needed in order to allow cpus to be onlined after
      boot.  This used to work but the declaration of
      pseries_secondary_smp_init in this file was changed in Ben's big
      cleanup patch a while back, so the cpu would start at a bad address.
      Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] kswapd lockup fix · 7ac62185
      Nick Piggin authored
      Fix some bugs in the kswapd logic which can cause kswapd lockups.
      
      The balance_pgdat() logic is supposed to cause kswapd to loop across all zones
      in the node until each zone either
      
      	a) has enough pages free or
      
      	b) is deemed to be in an "all pages unreclaimable" state.
      
      In the latter case, we just give the zone a light scan on each balance_pgdat()
      scan and wait for the zone to come back to life again.
      
      But the zone->all_unreclaimable logic is broken - if the zone has no pages on
      the LRU at all, we perform no scanning of that zone (of course).  So the
      zone->pages_scanned is not incremented and the expression
      
      		if (zone->pages_scanned > zone->present_pages * 2)
      			zone->all_unreclaimable = 1;
      
      never is satisfied.
      
      The patch changes that logic to
      
      		if (zone->pages_scanned >= (zone->nr_active +
      						zone->nr_inactive) * 4)
      			zone->all_unreclaimable = 1;
      
      so if the zone has no LRU pages it will still enter the all_unreclaimable
      state.
      
      
      Another problem is that if the zone has no LRU pages we will tell
      shrink_slab() that we scanned zero LRU pages.  This causes shrink_slab() to
      scan zero slab objects, which is obviously wrong.  So change shrink_slab() to
      perform a decent chunk of slab scanning in this situation.
      
      
      And put a cond_resched() into the balance_pgdat() outer loop.  Probably
      unnecessary, but that's what Jeff had in place when he confirmed that this
      patch fixed the lockup :(
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>