1. 09 Sep, 2002 1 commit
  2. 08 Sep, 2002 10 commits
    • Anton Altaparmakov's avatar
      Merge ssh://linux-ntfs@bkbits.net/ntfs-2.5 · fd16fd23
      Anton Altaparmakov authored
      into cantab.net:/usr/src/ntfs-2.5
      fd16fd23
    • Andrew Morton's avatar
      [PATCH] Use kmap_atomic() for generic_file_write() · 86ee4c5d
      Andrew Morton authored
      This patch uses the atomic copy_from_user() facility in
      generic_file_write().
      
      This required a change in the prepare_write/commit_write API
      definition.  It is no longer the case that these functions will kmap
      the page for you.
      
      If any part of the kernel wants to get at the page in the write path,
      it now has to kmap it for itself.  The best way to do this is with
      kmap_atomic(KM_USER0).
      
      This patch updates all callers.  It also converts several places which
      were unnecessarily using kmap() over to using kmap_atomic().
      
      The reiserfs changes here are Oleg Drokin's revised version.
      
      The patch has been tested with loop, ext2, ext3, reiserfs, jfs,
      minixfs, vfat, iso9660, nfs and the ramdisk driver.
      
      I haven't fixed the racy deadlock avoidance thing in
      generic_file_write() - the case where we take a fault when the source
      and dest of the copy are both the same pagecache page.
      
      There is a printk in there now which will trigger if the page was
      unexpectedly not present.  And guess what?  I get 50-100 of them when
      running `dbench 64' on mem=48m.   This deadlock can happen.
      86ee4c5d
    • Andrew Morton's avatar
      [PATCH] Use kmap_atomic() for generic_file_read() · 88a3b490
      Andrew Morton authored
      This patch allows the kernel to hold atomic kmaps in file_read_actor().
      
      We try to fault in the page, then take an atomic kmap.  If the atomic
      copy_to_user() then faults, drop a printk and fall back to kmap().
      88a3b490
    • Andrew Morton's avatar
      [PATCH] atomic copy_*_user infrastructure · 4b19c940
      Andrew Morton authored
      The patch implements the atomic copy_*_user() function.
      
      If the kernel takes a pagefault while running copy_*_user() in an
      atomic region, the copy_*_user() will fail (return a short value).
      
      And with this patch, holding an atomic kmap() puts the CPU into an
      atomic region.
      
      - Increment preempt_count() in kmap_atomic() regardless of the
        setting of CONFIG_PREEMPT.  The pagefault handler recognises this as
        an atomic region and refuses to service the fault.  copy_*_user will
        return a non-zero value.
      
      - Attempts to propagate the in_atomic() predicate to all the other
        highmem-capable architectures' pagefault handlers.  But the code is
        only tested on x86.
      
      - Fixed a PPC bug in kunmap_atomic(): it forgot to reenable
        preemption if HIGHMEM_DEBUG is turned on.
      
      - Fixed a sparc bug in kunmap_atomic(): it forgot to reenable
        preemption all the time, for non-fixmap pages.
      
      - Fix an error in <linux/highmem.h> - in the CONFIG_HIGHMEM=n case,
        kunmap_atomic() takes an address, not a page *.
      4b19c940
    • Andrew Morton's avatar
      [PATCH] refill the inactive list more quickly · 5f607d6e
      Andrew Morton authored
      Fix a problem noticed by Ed Tomlinson: under shifting workloads the
      shrink_zone() logic will refill the inactive load too slowly.
      
      Bale out of the zone scan when we've reclaimed enough pages.  Fixes a
      rarely-occurring problem wherein refill_inactive_zone() ends up
      shuffling 100,000 pages and generally goes silly.
      
      This needs to be revisited - we should go on and rebalance the lower
      zones even if we reclaimed enough pages from highmem.
      5f607d6e
    • Andrew Morton's avatar
      [PATCH] Back out the initial work for atomic copy_*_user() · 9fdbd959
      Andrew Morton authored
      Back out the use of preempt_count to signify atomicity wrt pagefaults.
      We won't do it that way - in_atomic() works fine.
      9fdbd959
    • Andrew Morton's avatar
      [PATCH] Fix the __block_write_full_page() error path. · 0e64a39d
      Andrew Morton authored
      Fix the ENOSPC recovery code in __block_write_full_page()
      
      - Don't write out clean buffers.
      
      - Set PG_writeback before submitting the IO.  Otherwise the completion
        handler will go BUG when it sees a non-PageWriteback page.  If the IO
        is very fast, or synchronous.
      0e64a39d
    • Andrew Morton's avatar
      [PATCH] Fix the boot-time reporting of each zone's available pages · d98b1feb
      Andrew Morton authored
      Patch from Bjorn Helgaas, via Rusty.
      
      Change:
      
        On node 0 totalpages: 61031         <--- not including holes
        zone(0): 65172 pages.               <--- including holes
        zone(1): 0 pages.                   ...
        zone(2): 0 pages.
      
      to:
      
        On node 0 totalpages: 61031         <--- not including holes
        DMA zone: 61031 pages               <--- not including holes
        Normal zone: 0 pages
        HighMem zone: 0 pages
      d98b1feb
    • Ingo Molnar's avatar
      [PATCH] shared thread signals · 6dfc8897
      Ingo Molnar authored
      Support POSIX compliant thread signals on a kernel level with usable
      debugging (broadcast SIGSTOP, SIGCONT) and thread group management
      (broadcast SIGKILL), plus to load-balance 'process' signals between
      threads for better signal performance. 
      
      Changes:
      
      - POSIX thread semantics for signals
      
      there are 7 'types' of actions a signal can take: specific, load-balance,
      kill-all, kill-all+core, stop-all, continue-all and ignore. Depending on
      the POSIX specifications each signal has one of the types defined for both
      the 'handler defined' and the 'handler not defined (kernel default)' case.  
      Here is the table:
      
       ----------------------------------------------------------
       |                    |  userspace       |  kernel        |
       ----------------------------------------------------------
       |  SIGHUP            |  load-balance    |  kill-all      |
       |  SIGINT            |  load-balance    |  kill-all      |
       |  SIGQUIT           |  load-balance    |  kill-all+core |
       |  SIGILL            |  specific        |  kill-all+core |
       |  SIGTRAP           |  specific        |  kill-all+core |
       |  SIGABRT/SIGIOT    |  specific        |  kill-all+core |
       |  SIGBUS            |  specific        |  kill-all+core |
       |  SIGFPE            |  specific        |  kill-all+core |
       |  SIGKILL           |  n/a             |  kill-all      |
       |  SIGUSR1           |  load-balance    |  kill-all      |
       |  SIGSEGV           |  specific        |  kill-all+core |
       |  SIGUSR2           |  load-balance    |  kill-all      |
       |  SIGPIPE           |  specific        |  kill-all      |
       |  SIGALRM           |  load-balance    |  kill-all      |
       |  SIGTERM           |  load-balance    |  kill-all      |
       |  SIGCHLD           |  load-balance    |  ignore        |
       |  SIGCONT           |  load-balance    |  continue-all  |
       |  SIGSTOP           |  n/a             |  stop-all      |
       |  SIGTSTP           |  load-balance    |  stop-all      |
       |  SIGTTIN           |  load-balancen   |  stop-all      |
       |  SIGTTOU           |  load-balancen   |  stop-all      |
       |  SIGURG            |  load-balance    |  ignore        |
       |  SIGXCPU           |  specific        |  kill-all+core |
       |  SIGXFSZ           |  specific        |  kill-all+core |
       |  SIGVTALRM         |  load-balance    |  kill-all      |
       |  SIGPROF           |  specific        |  kill-all      |
       |  SIGPOLL/SIGIO     |  load-balance    |  kill-all      |
       |  SIGSYS/SIGUNUSED  |  specific        |  kill-all+core |
       |  SIGSTKFLT         |  specific        |  kill-all      |
       |  SIGWINCH          |  load-balance    |  ignore        |
       |  SIGPWR            |  load-balance    |  kill-all      |
       |  SIGRTMIN-SIGRTMAX |  load-balance    |  kill-all      |
       ----------------------------------------------------------
      
      as you can see it from the list, signals that have handlers defined never 
      get broadcasted - they are either specific or load-balanced.
      
      - CLONE_THREAD implies CLONE_SIGHAND
      
      It does not make much sense to have a thread group that does not share
      signal handlers. In fact in the patch i'm using the signal spinlock to
      lock access to the thread group. I made the siglock IRQ-safe, thus we can
      load-balance signals from interrupt contexts as well. (we cannot take the
      tasklist lock in write mode from IRQ handlers.)
      
      this is not as clean as i'd like it to be, but it's the best i could come
      up with so far.
      
      - thread group list management reworked.
      
      threads are now removed from the group if the thread is unhashed from the
      PID table. This makes the most sense. This also helps with another feature 
      that relies on an intact thread group list: multithreaded coredumps.
      
      - child reparenting reworked.
      
      the O(N) algorithm in forget_original_parent() causes massive performance
      problems if a large number of threads exit from the group. Performance 
      improves more than 10-fold if the following simple rules are followed 
      instead:
      
       - reparent children to the *previous* thread [exiting or not]
       - if a thread is detached then reparent to init.
      
      - fast broadcasting of kernel-internal SIGSTOP, SIGCONT, SIGKILL, etc.
      
      kernel-internal broadcasted signals are a potential DoS problem, since
      they might generate massive amounts of GFP_ATOMIC allocations of siginfo
      structures. The important thing to note is that the siginfo structure does
      not actually have to be allocated and queued - the signal processing code
      has all the information it needs, neither of these signals carries any
      information in the siginfo structure. This makes a broadcast SIGKILL a
      very simple operation: all threads get the bit 9 set in their pending
      bitmask. The speedup due to this was significant - and the robustness win
      is invaluable.
      
      - sys_execve() should not kill off 'all other' threads.
      
      the 'exec kills all threads if the master thread does the exec()' is a
      POSIX(-ish) thing that should not be hardcoded in the kernel in this case.
      
      to handle POSIX exec() semantics, glibc uses a special syscall, which
      kills 'all but self' threads: sys_exit_allbutself().
      
      the straightforward exec() implementation just calls sys_exit_allbutself()  
      and then sys_execve().
      
      (this syscall is also be used internally if the thread group leader
      thread sys_exit()s or sys_exec()s, to ensure the integrity of the thread
      group.)
      6dfc8897
    • Ivan Kokshaysky's avatar
      [PATCH] pci bus resources, transparent bridges · 36780249
      Ivan Kokshaysky authored
      Added PCI_BUS_NUM_RESOURCES as Ben suggested. Default value is 4
      and can be overridden by arch (probably in asm/system.h).
      pci_read_bridge_bases() and pci_assign_bus_resource() changed
      accordingly. "for (i = 0 ; i < 4; i++)" in pci_add_new_bus() not
      changed, as it's used _only_ for pci-pci and cardbus bridges.
      36780249
  3. 07 Sep, 2002 29 commits