1. 08 Jan, 2005 40 commits
    • [PATCH] Lock initializer unifying: DRM · 2a113c8a
      Thomas Gleixner authored
      To make spinlock/rwlock initialization consistent all over the kernel,
      this patch converts explicit lock-initializers into spin_lock_init() and
      rwlock_init() calls.
      
      Currently, spinlocks and rwlocks are initialized in two different ways:
      
        lock = SPIN_LOCK_UNLOCKED
        spin_lock_init(&lock)
      
        rwlock = RW_LOCK_UNLOCKED
        rwlock_init(&rwlock)
      
      this patch converts all explicit lock initializations to
      spin_lock_init() or rwlock_init(). (Besides consistency this also helps
      automatic lock validators and debugging code.)
      
      The conversion was done with a script, it was verified manually and it
      was reviewed, compiled and tested as far as possible on x86, ARM, PPC.
      
      There is no runtime overhead or actual code change resulting out of this
      patch, because spin_lock_init() and rwlock_init() are macros and are
      thus equivalent to the explicit initialization method.
      
      That's the second batch of the unifying patches.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Dave Airlie <airlied@linux.ie>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
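The two initialization styles contrasted in the message above can be sketched in user space as follows. This is only an illustration: `toy_spinlock_t`, `TOY_SPIN_LOCK_UNLOCKED`, `toy_spin_lock_init()` and `struct toy_device` are hypothetical stand-ins, not the kernel's definitions. The sketch assumes, as the message states, that the init call is a macro expanding to the same assignment as the explicit initializer.

```c
#include <assert.h>

/* Hypothetical stand-in for spinlock_t; not the kernel's real layout. */
typedef struct {
    unsigned int locked;
} toy_spinlock_t;

/* Explicit initializer, analogous to SPIN_LOCK_UNLOCKED in the message. */
#define TOY_SPIN_LOCK_UNLOCKED ((toy_spinlock_t){ 0 })

/* Init macro, analogous to spin_lock_init(): same assignment underneath. */
#define toy_spin_lock_init(lock) \
    do { *(lock) = TOY_SPIN_LOCK_UNLOCKED; } while (0)

/* Hypothetical driver structure that embeds a lock. */
struct toy_device {
    toy_spinlock_t lock;
    int users;
};

/* Old style: assign the initializer directly. */
void toy_device_setup_explicit(struct toy_device *dev)
{
    assert(dev);
    dev->lock = TOY_SPIN_LOCK_UNLOCKED;
    dev->users = 0;
}

/* Unified style: go through the init macro. */
void toy_device_setup_unified(struct toy_device *dev)
{
    assert(dev);
    toy_spin_lock_init(&dev->lock);
    dev->users = 0;
}
```

Both setup functions leave the lock in the same state; routing every site through one macro gives a validator or debugging build a single place to hook.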
    • [PATCH] Lock initializer unifying: Block devices · 31dc5ed1
      Thomas Gleixner authored
      To make spinlock/rwlock initialization consistent all over the kernel,
      this patch converts explicit lock-initializers into spin_lock_init() and
      rwlock_init() calls.
      
      Currently, spinlocks and rwlocks are initialized in two different ways:
      
        lock = SPIN_LOCK_UNLOCKED
        spin_lock_init(&lock)
      
        rwlock = RW_LOCK_UNLOCKED
        rwlock_init(&rwlock)
      
      this patch converts all explicit lock initializations to
      spin_lock_init() or rwlock_init(). (Besides consistency this also helps
      automatic lock validators and debugging code.)
      
      The conversion was done with a script, it was verified manually and it
      was reviewed, compiled and tested as far as possible on x86, ARM, PPC.
      
      There is no runtime overhead or actual code change resulting out of this
      patch, because spin_lock_init() and rwlock_init() are macros and are
      thus equivalent to the explicit initialization method.
      
      That's the second batch of the unifying patches.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Lock initializer unifying: Misc drivers · 9ee22704
      Thomas Gleixner authored
      To make spinlock/rwlock initialization consistent all over the kernel,
      this patch converts explicit lock-initializers into spin_lock_init() and
      rwlock_init() calls.
      
      Currently, spinlocks and rwlocks are initialized in two different ways:
      
        lock = SPIN_LOCK_UNLOCKED
        spin_lock_init(&lock)
      
        rwlock = RW_LOCK_UNLOCKED
        rwlock_init(&rwlock)
      
      this patch converts all explicit lock initializations to
      spin_lock_init() or rwlock_init(). (Besides consistency this also helps
      automatic lock validators and debugging code.)
      
      The conversion was done with a script, it was verified manually and it
      was reviewed, compiled and tested as far as possible on x86, ARM, PPC.
      
      There is no runtime overhead or actual code change resulting out of this
      patch, because spin_lock_init() and rwlock_init() are macros and are
      thus equivalent to the explicit initialization method.
      
      That's the second batch of the unifying patches.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Lock initializer unifying: MIPS · 24e4a599
      Thomas Gleixner authored
      To make spinlock/rwlock initialization consistent all over the kernel,
      this patch converts explicit lock-initializers into spin_lock_init() and
      rwlock_init() calls.
      
      Currently, spinlocks and rwlocks are initialized in two different ways:
      
        lock = SPIN_LOCK_UNLOCKED
        spin_lock_init(&lock)
      
        rwlock = RW_LOCK_UNLOCKED
        rwlock_init(&rwlock)
      
      this patch converts all explicit lock initializations to
      spin_lock_init() or rwlock_init(). (Besides consistency this also helps
      automatic lock validators and debugging code.)
      
      The conversion was done with a script, it was verified manually and it
      was reviewed, compiled and tested as far as possible on x86, ARM, PPC.
      
      There is no runtime overhead or actual code change resulting out of this
      patch, because spin_lock_init() and rwlock_init() are macros and are
      thus equivalent to the explicit initialization method.
      
      That's the second batch of the unifying patches.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Lock initializer unifying: M32R · 4b0bf7b6
      Thomas Gleixner authored
      To make spinlock/rwlock initialization consistent all over the kernel,
      this patch converts explicit lock-initializers into spin_lock_init() and
      rwlock_init() calls.
      
      Currently, spinlocks and rwlocks are initialized in two different ways:
      
        lock = SPIN_LOCK_UNLOCKED
        spin_lock_init(&lock)
      
        rwlock = RW_LOCK_UNLOCKED
        rwlock_init(&rwlock)
      
      this patch converts all explicit lock initializations to
      spin_lock_init() or rwlock_init(). (Besides consistency this also helps
      automatic lock validators and debugging code.)
      
      The conversion was done with a script, it was verified manually and it
      was reviewed, compiled and tested as far as possible on x86, ARM, PPC.
      
      There is no runtime overhead or actual code change resulting out of this
      patch, because spin_lock_init() and rwlock_init() are macros and are
      thus equivalent to the explicit initialization method.
      
      That's the second batch of the unifying patches.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Lock initializer unifying: IA64 · ecdf7357
      Thomas Gleixner authored
      To make spinlock/rwlock initialization consistent all over the kernel,
      this patch converts explicit lock-initializers into spin_lock_init() and
      rwlock_init() calls.
      
      Currently, spinlocks and rwlocks are initialized in two different ways:
      
        lock = SPIN_LOCK_UNLOCKED
        spin_lock_init(&lock)
      
        rwlock = RW_LOCK_UNLOCKED
        rwlock_init(&rwlock)
      
      this patch converts all explicit lock initializations to
      spin_lock_init() or rwlock_init(). (Besides consistency this also helps
      automatic lock validators and debugging code.)
      
      The conversion was done with a script, it was verified manually and it
      was reviewed, compiled and tested as far as possible on x86, ARM, PPC.
      
      There is no runtime overhead or actual code change resulting out of this
      patch, because spin_lock_init() and rwlock_init() are macros and are
      thus equivalent to the explicit initialization method.
      
      That's the second batch of the unifying patches.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Lock initializer unifying: ALPHA · ac30d221
      Thomas Gleixner authored
      To make spinlock/rwlock initialization consistent all over the kernel,
      this patch converts explicit lock-initializers into spin_lock_init() and
      rwlock_init() calls.
      
      Currently, spinlocks and rwlocks are initialized in two different ways:
      
        lock = SPIN_LOCK_UNLOCKED
        spin_lock_init(&lock)
      
        rwlock = RW_LOCK_UNLOCKED
        rwlock_init(&rwlock)
      
      this patch converts all explicit lock initializations to
      spin_lock_init() or rwlock_init(). (Besides consistency this also helps
      automatic lock validators and debugging code.)
      
      The conversion was done with a script, it was verified manually and it
      was reviewed, compiled and tested as far as possible on x86, ARM, PPC.
      
      There is no runtime overhead or actual code change resulting out of this
      patch, because spin_lock_init() and rwlock_init() are macros and are
      thus equivalent to the explicit initialization method.
      
      That's the second batch of the unifying patches.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] invalidate_inode_pages2() mmap coherency fix · 918798e7
      Andrew Morton authored
      - When invalidating pages, take care to shoot down any ptes which map them
        as well.
      
        This ensures that the next mmap access to the page will generate a major
        fault, so NFS's server-side modifications are picked up.
      
        This also allows us to call invalidate_complete_page() on all pages, so
        filesystems such as ext3 get a chance to invalidate the buffer_heads.
      
      - Don't mark in-pagetable pages as non-uptodate any more.  That broke a
        previous guarantee that mapped-into-user-process pages are always uptodate.
      
      - Check the return value of invalidate_complete_page().  It can fail if
        someone redirties a page after generic_file_direct_IO() writes it back.
      
      But we still have a problem.  If invalidate_inode_pages2() calls
      unmap_mapping_range(), that can cause zap_pte_range() to dirty the pagecache
      pages.  That will redirty the page's buffers and will cause
      invalidate_complete_page() to fail.
      
      So, in generic_file_direct_IO() we do a complete pte shootdown on the file
      up-front, prior to writing back dirty pagecache.  This is only done for
      O_DIRECT writes.  It _could_ be done for O_DIRECT reads too, providing full
      mmap-vs-direct-IO coherency for both O_DIRECT reads and O_DIRECT writes, but
      permitting the pte shootdown on O_DIRECT reads trivially allows people to nuke
      other people's mapped pagecache.
      
      NFS also uses invalidate_inode_pages2() for handling server-side modification
      notifications.  But in the NFS case the clear_page_dirty() in
      invalidate_inode_pages2() is sufficient, because NFS doesn't have to worry
      about the "dirty buffers against a clean page" problem. (I think)
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] readpage-vs-invalidate fix · ba1f08f1
      Andrew Morton authored
      A while ago we merged a patch which tried to solve a problem wherein a
      concurrent read() and invalidate_inode_pages() would cause the read() to
      return -EIO because invalidate cleared PageUptodate() at the wrong time.
      
      That patch tests for (page_count(page) != 2) in invalidate_complete_page() and
      bails out if false.
      
      Problem is, the page may be in the per-cpu LRU front-ends over in
      lru_cache_add.  This elevates the refcount pending spillage of the page onto
      the LRU for real.  That causes a false positive in invalidate_complete_page(),
      causing the page to not get invalidated.  This screws up the logic in my new
      O_DIRECT-vs-buffered coherency fix.
      
      So let's solve the invalidate-vs-read in a different manner.  Over on the
      read() side, add an explicit check to see if the page was invalidated.  If so,
      just drop it on the floor and redo the read from scratch.
      
      Note that only do_generic_mapping_read() needs treatment.  filemap_nopage(),
      filemap_getpage() and read_cache_page() are already doing the
      oh-it-was-invalidated-so-try-again thing.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
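The read-side "was it invalidated? then redo the read" check described above might look roughly like this toy model. Everything here is hypothetical: `struct toy_page`, `struct toy_mapping` and the `toy_*` functions are illustrative, and the sketch assumes that an invalidated page can be recognised by a NULL `mapping` pointer. The message itself does not spell out how the check is made.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mapping;

/* Hypothetical page: 'mapping' goes NULL once the page is invalidated. */
struct toy_page {
    struct toy_mapping *mapping;
    int data;
};

/* Hypothetical tiny page cache with a knob to simulate racing invalidates. */
struct toy_mapping {
    struct toy_page slots[4];
    unsigned int next_slot;
    int invalidations_left;   /* how many upcoming reads will be invalidated */
    int backing_data;         /* what the backing store currently holds */
};

/* Bring a page into the cache and fill it from the backing store. */
static struct toy_page *toy_read_page(struct toy_mapping *m)
{
    struct toy_page *page = &m->slots[m->next_slot++ % 4];

    page->mapping = m;
    page->data = m->backing_data;
    /* Simulate a concurrent invalidate hitting the page we just read. */
    if (m->invalidations_left > 0) {
        m->invalidations_left--;
        page->mapping = NULL;
    }
    return page;
}

/* Read path with the explicit check: drop invalidated pages and retry. */
int toy_generic_read(struct toy_mapping *m, int *attempts)
{
    struct toy_page *page;

    assert(m && attempts);
    *attempts = 0;
    do {
        page = toy_read_page(m);
        (*attempts)++;
    } while (page->mapping == NULL);   /* invalidated: redo from scratch */
    return page->data;
}
```

The caller never sees an error for an invalidated page; the read is simply repeated until it lands on a page that is still attached.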
    • [PATCH] Remove EXPORT_SYMBOL_NOVERS · 7d2b8702
      Rusty Russell authored
      Vadim Lobanov points out that EXPORT_SYMBOL_NOVERS is no longer used; in
      fact, SH still uses it, but once we fix that, the kernel is clean.  Remove
      it.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] raid6: altivec support · d4d539a9
      H. Peter Anvin authored
      This patch adds Altivec support for RAID-6, if appropriately configured on
      the ppc or ppc64 architectures.  Note that it changes the compile flags for
      ppc64 in order to handle -maltivec correctly; this change was vetted on the
      ppc64 mailing list and OK'd by paulus.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fbdev: rivafb should recognize NF2/IGP · 71a9a3d2
      David Brownell authored
      I got tired of not seeing the boot time penguin on my Shuttle SN41G2, and
      not having a decently large text display when I bypass X11.  XFree86 says
      it's "Chipset GeForce4 MX Integrated GPU", and the kernel driver has hooks
      for this chip ID although it doesn't have a #define to match.
      Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] no buddy bitmap patch revisit: for ia64 · 926721e9
      Kamezawa Hiroyuki authored
      This patch is for ia64 kernel, and defines CONFIG_HOLES_IN_ZONE in
      arch/ia64/Kconfig.  IA64 has memory holes smaller than its MAX_ORDER and
      its virtual memmap allows holes in a zone's memmap.
      
      This patch makes vmemmap aligned with IA64_GRANULE_SIZE in
      arch/ia64/mm/init.c.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] no buddy bitmap patch revisit: for mm/page_alloc.c · 69fba2dd
      Kamezawa Hiroyuki authored
      This patch removes bitmaps from page allocator in mm/page_alloc.c.
      
      This buddy system uses page->private field to record free page's order
      instead of using bitmaps.
      
      The algorithm of the buddy system is unchanged. Only bitmaps are removed.
      
      In this buddy system, 2 pages, a page and "buddy", can be coalesced when
      
      (buddy->private & PG_private) &&
      (page_order(page)) == (page_order(buddy)) &&
      !PageReserved(buddy) &&
      page_count(buddy) == 0
      
      this also means "buddy" is a head of continuous free pages
      of length of (1 << page_order(buddy)).
      
      bad_range() is called from inner loop of __free_pages_bulk().
      In many archs, bad_range() is only a sanity check, it will always return 0.
      But if a zone's memmap has a hole, it sometimes returns 1.
      An architecture with memory holes in a zone has to define CONFIG_HOLES_IN_ZONE.
      When CONFIG_HOLES_IN_ZONE is defined, pfn_valid() is called for checking
      whether a buddy page is valid or not.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
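A simplified user-space sketch of bitmap-free coalescing, following the four conditions listed in the message above. `struct toy_page` and the `toy_*` helpers are hypothetical and much smaller than the kernel's `struct page`. The buddy index is computed with the usual XOR-with-block-size trick, which is an assumption here rather than something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down page descriptor. */
struct toy_page {
    bool has_private;             /* stands in for the PG_private flag */
    bool reserved;                /* stands in for PageReserved() */
    int count;                    /* stands in for page_count() */
    unsigned long private_order;  /* order of the free block (page->private) */
};

/* Index of the buddy of the block that starts at page_idx. */
unsigned long toy_buddy_index(unsigned long page_idx, unsigned int order)
{
    return page_idx ^ (1UL << order);
}

/* The coalescing test from the message, applied to the candidate buddy. */
bool toy_page_is_buddy(const struct toy_page *buddy, unsigned int order)
{
    return buddy->has_private &&
           buddy->private_order == order &&
           !buddy->reserved &&
           buddy->count == 0;
}

/*
 * Free the block at page_idx, merging upward while its buddy is free.
 * Returns the index of the merged block and stores its final order.
 */
unsigned long toy_free_block(struct toy_page *map, size_t npages,
                             unsigned long page_idx, unsigned int order,
                             unsigned int max_order, unsigned int *out_order)
{
    assert(page_idx < npages);
    while (order < max_order) {
        unsigned long buddy_idx = toy_buddy_index(page_idx, order);

        if (buddy_idx >= npages || !toy_page_is_buddy(&map[buddy_idx], order))
            break;
        /* The buddy is absorbed and no longer heads a free block. */
        map[buddy_idx].has_private = false;
        map[buddy_idx].private_order = 0;
        page_idx &= buddy_idx;    /* merged block starts at the lower index */
        order++;
    }
    map[page_idx].has_private = true;
    map[page_idx].private_order = order;
    map[page_idx].count = 0;
    *out_order = order;
    return page_idx;
}
```

Only the head page of each free block carries the order, so no side bitmap has to mirror the physical memory layout.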
    • [PATCH] no buddy bitmap patch revisit: intro and includes · 6951e82f
      Kamezawa Hiroyuki authored
      The following patches remove bitmaps from the buddy allocator.  This
      is beneficial to memory hot-plug, because it removes a data
      structure which must match the host's physical memory layout.
      
      This is one step toward managing physical memory in a nonlinear /
      discontiguous way, and will reduce the amount of code needed to implement
      memory hot-plug.
      
      This patch removes bitmaps from zone->free_area[] in include/linux/mmzone.h,
      and adds some comments on page->private field in include/linux/mm.h.
      
      Non-atomic ops for changing the PG_private bit are added in include/page-flags.h.
      zone->lock is always acquired when PG_private of "a free page" is changed.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] vm: for -mm only: remove remap_page_range() completely · ce1d9c8a
      William Lee Irwin III authored
      All in-tree references to remap_page_range() have been removed by prior
      patches in the series.  This patch, intended to be applied after some waiting
      period for people to adjust to the API change, notice __deprecated, etc., does
      the final removal of remap_page_range() as a function symbol declared within
      kernel headers and/or implemented in kernel sources.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] md: improve 'hash' code in linear.c · 23526361
      Neil Brown authored
      The hashtable that linear uses to find the right device stores
      two pointers for every entry.
      
      The second is always one of:
         The first plus 1
         NULL
      When NULL, it is never accessed, so any value can be stored.
      
      Thus it could always be "first plus 1", and so we don't need to store
      it as it is trivial to calculate.
      
      This patch halves the size of this table, which results in some simpler
      code as well.
      Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
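The halved table described above can be sketched like this. The types and names (`struct toy_dev`, `struct toy_linear`, `toy_which_dev()`) are hypothetical, and the sketch assumes that component devices sit in a contiguous array ordered by offset and that no hash slot spans more than two of them, which is what "first plus 1" implies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical component device covering blocks [offset, offset + size). */
struct toy_dev {
    unsigned long offset;
    unsigned long size;
};

/* One pointer per hash slot: the first device overlapping that slot. */
struct toy_linear {
    struct toy_dev **hash_table;
    size_t nr_slots;
    unsigned long slot_size;   /* number of blocks covered by each slot */
};

/*
 * Find the device that holds 'block'. Only the first candidate is stored;
 * the second candidate is computed as "first plus 1" when it is needed.
 */
struct toy_dev *toy_which_dev(const struct toy_linear *conf, unsigned long block)
{
    size_t slot = block / conf->slot_size;
    struct toy_dev *dev;

    assert(slot < conf->nr_slots);
    dev = conf->hash_table[slot];
    if (block >= dev->offset + dev->size)
        dev++;                 /* block lies past the first device */
    return dev;
}
```

The lookup costs one comparison more than a two-pointer entry would, in exchange for a table half the size.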
    • [PATCH] cpu_down() warning fix · e975169b
      Nathan Lynch authored
      Fix (harmless?) smp_processor_id() usage in preemptible section of
      cpu_down.
      Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] oprofile preempt warning fixes · dc852722
      Ingo Molnar authored
      From: Peter Zijlstra <peter@programming.kicks-ass.net>
      
      I have to use oprofile a lot but do want to enable preemption checks.
      This gives some noise; I think Andrew already mentioned fixing this.
      
      The following patch fixes about half of the warnings.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] remove the BKL by turning it into a semaphore · fb8f6499
      Ingo Molnar authored
      This is the current remove-BKL patch.  I test-booted it on x86 and x64, trying
      every conceivable combination of SMP, PREEMPT and PREEMPT_BKL.  All other
      architectures should compile as well.  (most of the testing was done with the
      zaphod patch undone but it applies cleanly on vanilla -mm3 as well and should
      work fine.)
      
      this is the debugging-enabled variant of the patch which has two main
      debugging features:
      
       - debug potentially illegal smp_processor_id() use. Has caught a number
         of real bugs - e.g. look at the printk.c fix in the patch.
      
       - make it possible to enable/disable the BKL via a .config. If this
         goes upstream we don't want this of course, but for now it gives
         people a chance to find out whether any particular problem was caused
         by this patch.
      
      This patch has one important fix over the previous BKL patch: on PREEMPT
      kernels if we preempted BKL-using code then the code still auto-dropped the
      BKL by mistake.  This caused a number of breakages for testers, which
      breakages went away once this bug was fixed.
      
      Also the debugging mechanism has been improved a lot relative to the previous
      BKL patch.
      
      Would be nice to test-drive this in -mm.  There will likely be some more
      smp_processor_id() false positives but they are 1) harmless 2) easy to fix up.
      We may well find more real smp_processor_id() related breakages too.
      
      The most noteworthy fact is that no BKL-using code was found yet that relied
      on smp_processor_id(), which is promising from a compatibility POV.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] vmtrunc: restart_addr in truncate_count · 8a1a48b7
      Hugh Dickins authored
      Despite its restart_pgoff pretensions, unmap_mapping_range_vma was fatally
      unable to distinguish a vma to be restarted from the case where that vma
      has been freed, and its vm_area_struct reused for the top part of a
      !new_below split of an isomorphic vma yet to be scanned.
      
      The obvious answer is to note restart_vma in the struct address_space, and
      cancel it when that vma is freed; but I'm reluctant to enlarge every struct
      inode just for this.  Another answer is to flag valid restart in the
      vm_area_struct; but vm_flags is protected by down_write of mmap_sem, which
      we cannot take within down_write of i_sem.  If we're going to need yet
      another field, better to record the restart_addr itself: restart_vma only
      recorded the last restart, but a busy tree could well use more.
      
      Actually, we don't need another field: we can neatly (though naughtily)
      keep restart_addr in vm_truncate_count, provided mapping->truncate_count
      leaps over those values which look like a page-aligned address.  Zero
      remains good for forcing a scan (though now interpreted as restart_addr 0),
      and it turns out no change is needed to any of the vm_truncate_count
      settings in dup_mmap, vma_link, vma_adjust, move_one_page.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
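A minimal sketch of a counter that "leaps over those values which look like a page-aligned address", as the message above puts it. The names and the 4 KiB page size are assumptions made for illustration, and the actual patch may treat wraparound differently (for example by resetting per-vma counts).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed page size for the sketch: 4 KiB. */
#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* A value "looks like" a restart address if it is page-aligned (0 included). */
bool toy_is_restart_addr(unsigned long value)
{
    return (value & (TOY_PAGE_SIZE - 1)) == 0;
}

/* Advance the generation counter, skipping page-aligned values. */
unsigned long toy_next_truncate_count(unsigned long count)
{
    count++;
    if (toy_is_restart_addr(count))
        count++;               /* aligned + 1 is never aligned */
    assert(!toy_is_restart_addr(count));
    return count;
}
```

With the counter confined to unaligned values, a single word in the vma can hold either a generation number or a restart address without ambiguity.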
    • [PATCH] vmtrunc: bug if page_mapped · d5c772ed
      Hugh Dickins authored
      If unmap_mapping_range (and mapping->truncate_count) are doing their jobs
      right, truncate_complete_page should never find the page mapped: add BUG_ON
      for our immediate testing, but this patch should probably not go to mainline -
      a mapped page here is not a catastrophe.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] vmtrunc: vm_truncate_count race caution · 0b5d6831
      Hugh Dickins authored
      Fix some unlikely races in respect of vm_truncate_count.
      
      Firstly, it's supposed to be guarded by i_mmap_lock, but some places copy a
      vma structure by *new_vma = *old_vma: if the compiler implements that with a
      bytewise copy, new_vma->vm_truncate_count could be munged, and new_vma later
      appear up-to-date when it's not; so set it properly once under lock.
      
      vma_link set vm_truncate_count to mapping->truncate_count when adding an empty
      vma: if new vmas are being added profusely while vmtruncate is in progress,
      this lets them be skipped without scanning.
      
      vma_adjust has a vm_truncate_count problem much like it had with anon_vma under
      mprotect merge: when merging be careful not to leave vma marked as up-to-date
      when it might not be, lest unmap_mapping_range in progress - set
      vm_truncate_count 0 when in doubt.  Similarly when mremap moving ptes from one
      vma to another.
      
      Cut a little code from __anon_vma_merge: now vma_adjust sets "importer" in the
      remove_next case (to get its vm_truncate_count right), its anon_vma is already
      linked by the time __anon_vma_merge is called.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] vmtrunc: unmap_mapping dropping i_mmap_lock · 3ee07371
      Hugh Dickins authored
      vmtruncate (or more generally, unmap_mapping_range) has been observed
      responsible for very high latencies: the lockbreak work in unmap_vmas is good
      for munmap or exit_mmap, but no use while mapping->i_mmap_lock is held, to
      keep our place in the prio_tree (or list) of a file's vmas.
      
      Extend the zap_details block with i_mmap_lock pointer, so unmap_vmas can
      detect if that needs lockbreak, and break_addr so it can notify where it left
      off.
      
      Add unmap_mapping_range_vma, used from both prio_tree and nonlinear list
      handlers.  This is what now calls zap_page_range (above unmap_vmas), but
      handles the lockbreak and restart issues: letting unmap_mapping_range_ tree or
      list know when they need to start over because lock was dropped.
      
      When restarting, of course there's a danger of never making progress.  Add
      vm_truncate_count field to vm_area_struct, update that to mapping->
      truncate_count once fully scanned, skip up-to-date vmas without a scan (and
      without dropping i_mmap_lock).
      
      Further danger of never making progress if a vma is very large: when breaking
      out, save restart_vma and restart_addr (and restart_pgoff to confirm, in case
      vma gets reused), to help continue where we left off.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
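The progress guarantee described above (stamp each fully scanned vma with the current truncate_count and skip stamped vmas after a restart) can be modelled in a few lines. This is a toy: `struct toy_vma` and `toy_unmap_mapping()` are hypothetical, a plain array stands in for the prio_tree, and every unmap is assumed to drop the lock and force a restart, which is the worst case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, pared-down vma: only the fields the sketch needs. */
struct toy_vma {
    unsigned long vm_truncate_count;  /* generation of the last full scan */
    int mapped_pages;                 /* file pages still mapped */
};

/*
 * Unmap all vmas of a file. After each unmap the walk restarts from the
 * head; vmas already stamped with the current generation are skipped,
 * so every restart makes progress. Returns the number of restarts.
 */
int toy_unmap_mapping(struct toy_vma *vmas, size_t nr,
                      unsigned long truncate_count)
{
    int restarts = 0;
    size_t i;

    assert(vmas != NULL || nr == 0);
restart:
    for (i = 0; i < nr; i++) {
        struct toy_vma *vma = &vmas[i];

        if (vma->vm_truncate_count == truncate_count)
            continue;                 /* up to date: skip without a scan */
        vma->mapped_pages = 0;        /* stands in for zap_page_range() */
        vma->vm_truncate_count = truncate_count;
        restarts++;
        goto restart;                 /* lock was dropped: start over */
    }
    return restarts;
}
```

The number of restarts is bounded by the number of stale vmas, so the walk cannot loop forever even though it keeps starting over.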
    • [PATCH] vmtrunc: unmap_mapping_range_tree · 25f5906c
      Hugh Dickins authored
      Move unmap_mapping_range's nonlinear vma handling out to its own inline,
      parallel to the prio_tree handling; unmap_mapping_range_list is a better name
      for the nonlinear list, rename the other unmap_mapping_range_tree.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] vmtrunc: restore unmap_vmas zap_bytes · 84c496cf
      Hugh Dickins authored
      The low-latency unmap_vmas patch silently moved the zap_bytes test after the
      TLB finish and lockbreak and regather: why?  That not only makes zap_bytes
      redundant (might as well use ZAP_BLOCK_SIZE), it makes the unmap_vmas level
      redundant too - it's all about saving TLB flushes when unmapping a series of
      small vmas.
      
      Move zap_bytes test back before the lockbreak, and delete the curious comment
      that a small zap block size doesn't matter: it's true need_flush prevents TLB
      flush when no page has been unmapped, but unmapping pages in small blocks
      involves many more TLB flushes than in large blocks.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] vmtrunc: truncate_count not atomic · b37e39b0
      Hugh Dickins authored
      Why is mapping->truncate_count atomic?  It's incremented inside i_mmap_lock
      (and i_sem), and the reads don't need it to be atomic.
      
      And why smp_rmb() before call to ->nopage?  The compiler cannot reorder the
      initial assignment of sequence after the call to ->nopage, and no cpu (yet!)
      can read from the future, which is all that matters there.
      
      And delete totally bogus reset of truncate_count from blkmtd add_device.
      truncate_count is all about detecting i_size changes: i_size does not change
      there; and if it did, the count should be incremented not reset.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] block2mtd: avoid touching truncate_count · de146a08
      Andrew Morton authored
      blockmtd doesn't need to initialise address_space.truncate_count:
      open_bdev_excl did that.
      
      Plus I have a patch queued up which removes ->truncate_count.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Replace 'numnodes' with 'node_online_map' - arch-independent · 042ac4ab
      Matthew Dobson authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      Without passing this parameter by reference, the changes to used_node_mask
      are meaningless and do not affect the caller's copy.
      
      This leads to boot-time failure. This proposed fix passes it by reference.
      Signed-off-by: William Irwin <wli@holomorphy.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
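The by-value bug described above is easy to reproduce in isolation. In this sketch `toy_nodemask_t` and both `toy_pick_node_*` functions are hypothetical; only the name `used_node_mask` comes from the message, and the real function signatures are not shown in the source.

```c
#include <assert.h>

#define TOY_MAX_NODES 8

/* Hypothetical stand-in for a node mask: one bit per node. */
typedef struct {
    unsigned long bits;
} toy_nodemask_t;

/* Lowest node whose bit is clear, or -1 if all nodes are used. */
static int toy_first_unused(unsigned long bits)
{
    int node;

    for (node = 0; node < TOY_MAX_NODES; node++)
        if (!(bits & (1UL << node)))
            return node;
    return -1;
}

/* Buggy shape: the mask is copied, so the caller never sees the update. */
int toy_pick_node_by_value(toy_nodemask_t used_node_mask)
{
    int node = toy_first_unused(used_node_mask.bits);

    if (node >= 0)
        used_node_mask.bits |= 1UL << node;   /* changes the local copy only */
    return node;
}

/* Fixed shape: the caller's mask is updated through the pointer. */
int toy_pick_node_by_ref(toy_nodemask_t *used_node_mask)
{
    int node;

    assert(used_node_mask);
    node = toy_first_unused(used_node_mask->bits);
    if (node >= 0)
        used_node_mask->bits |= 1UL << node;
    return node;
}
```

Calling the by-value variant repeatedly keeps returning the same node, which is the kind of silent failure that only shows up later, here at boot.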
    • [PATCH] Replace 'numnodes' with 'node_online_map' - ia64 · 9269293d
      Matthew Dobson authored
      From: Jesse Barnes <jbarnes@engr.sgi.com>
      
      Here are some compile fixes for this patch.  Looks like simple typos.
      Signed-off-by: Jesse Barnes <jbarnes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • Matthew Dobson's avatar
      b9462e80
    • [PATCH] debug sched domains before attach · 7f257c1b
      Nick Piggin authored
      Change the sched-domain debug routine to be called on a per-CPU basis, and
      executed before the domain is actually attached to the CPU.  Previously, all
      CPUs would have their new domains attached, and then the debug routine would
      loop over all of them.
      
      This has two advantages: First, there are no longer any theoretical races: we
      are running the debug routine on a domain that isn't yet active, and should
      have no racing access from another CPU.  Second, if there is a problem with a
      domain, the validator will have a better chance to catch the error and print a
      diagnostic _before_ the domain is attached, which may take down the system.
      
      Also, change reporting of detected error conditions to KERN_ERR instead of
      KERN_DEBUG, so they have a better chance of being seen in a hang on boot
      situation.
      
      The patch also does an unrelated (and harmless) cleanup in migration_thread().
      Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Fix smp_processor_id() warning in numa_node_id() · c919160e
      Ingo Molnar authored
      The patch below fixes smp_processor_id() warnings that are triggered by
      numa_node_id().
      
      All uses of numa_node_id() in mm/mempolicy.c seem to use it as a 'hint'
      only, not as a correctness number.  Once a node is established, it's used
      in a preemption-safe way.  So the simple fix is to disable the checking for
      numa_node_id().  But additional review would be more than welcome, because
      this patch turns off the preemption-checking of numa_node_id() permanently.
      Tested on amd64.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>