1. 14 Jan, 2011 10 commits
    • mm: page allocator: adjust the per-cpu counter threshold when memory is low · 88f5acf8
      Mel Gorman authored
      Commit aa454840 ("calculate a better estimate of NR_FREE_PAGES when memory
      is low") noted that watermarks were based on the vmstat NR_FREE_PAGES.  To
      avoid synchronization overhead, these counters are maintained on a per-cpu
      basis and drained both periodically and when a per-cpu delta exceeds a
      threshold.  On systems with many CPUs, the difference between the
      estimate and the real value of NR_FREE_PAGES can be very high.  The
      system can get into a case where pages are allocated far below the min
      watermark, potentially causing livelock issues.  The commit solved the
      problem by taking a better reading of NR_FREE_PAGES when memory was low.
      
      Unfortunately, as reported by Shaohua Li, this accurate reading can consume a
      large amount of CPU time on systems with many sockets due to cache line
      bouncing.  This patch takes a different approach.  For large machines
      where counter drift might be unsafe and while kswapd is awake, the per-cpu
      thresholds for the target pgdat are reduced to limit the level of drift to
      what should be a safe level.  This incurs a performance penalty in heavy
      memory pressure by a factor that depends on the workload and the machine
      but the machine should function correctly without accidentally exhausting
      all memory on a node.  There is an additional cost when kswapd wakes and
      sleeps but the event is not expected to be frequent - in Shaohua's test
      case, there was at least one recorded sleep and wake event.
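
      The drift described above can be illustrated with a minimal userspace
      sketch.  This is not the kernel code: NCPUS, struct zone_sim, mod_state(),
      read_fast(), read_snapshot() and drift_after_allocs() are hypothetical
      names chosen for the illustration, and the numbers are arbitrary.  It only
      shows why a smaller per-cpu threshold bounds how far the cheap reading can
      stray from the accurate one.

```c
#include <assert.h>

#define NCPUS 4                 /* hypothetical CPU count for the simulation */

struct zone_sim {
    long global;                /* stands in for the shared NR_FREE_PAGES value */
    long diff[NCPUS];           /* per-cpu deltas not yet folded into global */
    long threshold;             /* per-cpu drift limit */
};

/* Apply a delta on one CPU; fold it into the shared counter only once the
 * local delta exceeds the threshold in either direction. */
static void mod_state(struct zone_sim *z, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NCPUS);
    long x = z->diff[cpu] + delta;
    if (x > z->threshold || x < -z->threshold) {
        z->global += x;
        x = 0;
    }
    z->diff[cpu] = x;
}

/* Cheap read: looks at the shared counter only, so it can be stale. */
static long read_fast(const struct zone_sim *z)
{
    return z->global;
}

/* Accurate read: also sums every CPU's pending delta, which means touching
 * data owned by all CPUs. */
static long read_snapshot(const struct zone_sim *z)
{
    long v = z->global;
    for (int cpu = 0; cpu < NCPUS; cpu++)
        v += z->diff[cpu];
    return v < 0 ? 0 : v;
}

/* Each CPU allocates 100 pages from a zone that starts with 1000 free pages;
 * return by how much the cheap read overestimates the free page count. */
static long drift_after_allocs(long threshold)
{
    struct zone_sim z = { .global = 1000, .threshold = threshold };
    for (int cpu = 0; cpu < NCPUS; cpu++)
        for (int i = 0; i < 100; i++)
            mod_state(&z, cpu, -1);
    return read_fast(&z) - read_snapshot(&z);
}
```

      Under these assumptions, drift_after_allocs(125) returns 400 while
      drift_after_allocs(10) returns 4.  In this model the drift can never
      exceed NCPUS * threshold, which is why lowering the threshold while
      memory is low limits the error of the cheap reading.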
      
      To ensure that kswapd wakes up, a safe version of zone_watermark_ok() is
      introduced that takes a more accurate reading of NR_FREE_PAGES when called
      from wakeup_kswapd, when deciding whether it is really safe to go back to
      sleep in sleeping_prematurely() and when deciding if a zone is really
      balanced or not in balance_pgdat().  We are still using an expensive
      function but limiting how often it is called.
      
      When the test case is reproduced, the time spent in the watermark
      functions is reduced.  The following report is on the percentage of time
      cumulatively spent in the functions zone_nr_free_pages(),
      zone_watermark_ok(), __zone_watermark_ok(), zone_watermark_ok_safe(),
      zone_page_state_snapshot(), zone_page_state().
      
      vanilla                      11.6615%
      disable-threshold            0.2584%
      
      David said:
      
      : We had to pull aa454840 "mm: page allocator: calculate a better estimate
      : of NR_FREE_PAGES when memory is low and kswapd is awake" from 2.6.36
      : internally because tests showed that it would cause the machine to stall
      : as the result of heavy kswapd activity.  I merged it back with this fix as
      : it is pending in the -mm tree and it solves the issue we were seeing, so I
      : definitely think this should be pushed to -stable (and I would seriously
      : consider it for 2.6.37 inclusion even at this late date).
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Reported-by: Shaohua Li <shaohua.li@intel.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Tested-by: Nicolas Bareil <nico@chdir.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: <stable@kernel.org>		[2.6.37.1, 2.6.36.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sched: remove long deprecated CLONE_STOPPED flag · 43bb40c9
      Dave Jones authored
      This warning was added in commit bdff746a ("clone: prepare to recycle
      CLONE_STOPPED") three years ago.  2.6.26 came and went.  As far as I know,
      no-one is actually using CLONE_STOPPED.
      Signed-off-by: Dave Jones <davej@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • atmel_serial: fix RTS high after initialization in RS485 mode · 5dfbd1d7
      Claudio Scordino authored
      When working in RS485 mode, the atmel_serial driver keeps RTS high after
      the initialization of the serial port.  It goes low only after the first
      character has been sent.
      
      [akpm@linux-foundation.org: simplify code]
      Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
      Signed-off-by: Arkadiusz Bubala <arkadiusz.bubala@gmail.com>
      Tested-by: Arkadiusz Bubala <arkadiusz.bubala@gmail.com>
      Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • irq: use per_cpu kstat_irqs · 6c9ae009
      Eric Dumazet authored
      Use the modern per_cpu API to increment {soft|hard}irq counters, and use
      per_cpu allocation for (struct irq_desc)->kstat_irqs instead of an array.

      This gives better SMP/NUMA locality and saves a few instructions per irq.

      With small nr_cpu_ids values (8 for example), kstat_irqs was a small array
      (less than L1_CACHE_BYTES), potentially a source of false sharing.
      
      In the !CONFIG_SPARSE_IRQ case, remove the huge, NUMA/cache unfriendly
      kstat_irqs_all[NR_IRQS][NR_CPUS] array.
      
      Note: we still populate kstat_irqs for all possible irqs in
      early_irq_init().  We could probably use on-demand allocations.  (Code
      included in alloc_descs()).  The problem is that not all IRQs are used
      with a prior alloc_descs() call.
      
      kstat_irqs_this_cpu() is not used anymore, remove it.
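
      The false-sharing point can be sketched in userspace as follows.  This is
      an assumption-laden illustration, not the kernel's per_cpu machinery:
      SIM_NR_CPUS, SIM_CACHE_LINE, struct percpu_slot, struct irq_desc_sim and
      the helper functions are all hypothetical.  Padding each slot to a cache
      line merely mimics the property that every CPU gets its own copy of the
      counter, far away from the other CPUs' copies.

```c
#include <assert.h>
#include <stdlib.h>

#define SIM_NR_CPUS 4       /* hypothetical CPU count */
#define SIM_CACHE_LINE 64   /* assumed cache line size in bytes */

/* One counter per CPU, padded so that two CPUs' counters are at least a
 * cache line apart and therefore never share a line. */
struct percpu_slot {
    unsigned int count;
    char pad[SIM_CACHE_LINE - sizeof(unsigned int)];
};

struct irq_desc_sim {
    struct percpu_slot *kstat;  /* SIM_NR_CPUS slots, one per CPU */
};

/* Allocate zeroed per-CPU slots; returns 0 on success, -1 on failure. */
static int irq_desc_sim_init(struct irq_desc_sim *desc)
{
    desc->kstat = calloc(SIM_NR_CPUS, sizeof(*desc->kstat));
    return desc->kstat ? 0 : -1;
}

static void irq_desc_sim_free(struct irq_desc_sim *desc)
{
    free(desc->kstat);
    desc->kstat = NULL;
}

/* Interrupt path: a CPU only ever writes its own slot. */
static void count_irq(struct irq_desc_sim *desc, int cpu)
{
    assert(cpu >= 0 && cpu < SIM_NR_CPUS);
    desc->kstat[cpu].count++;
}

/* Reporting path: sum the per-CPU counters to get the total for this irq. */
static unsigned int total_irqs(const struct irq_desc_sim *desc)
{
    unsigned int sum = 0;
    for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
        sum += desc->kstat[cpu].count;
    return sum;
}
```

      By contrast, a plain array of SIM_NR_CPUS unsigned ints would fit inside
      a single cache line, so increments from different CPUs would keep
      invalidating each other's copy of that line.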
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: update entries affecting VIA Technologies · 558bbb2f
      Bruce Chang authored
      Since the original maintainer, Joseph Chan (josephchan@via.com.tw), no
      longer handles the Linux driver for VIA, I would like to request an
      update of the maintainer for the SD/MMC CARD CONTROLLER DRIVER and VIA
      UNICHROME(PRO)/CHROME9 FRAMEBUFFER DRIVER until we find a better one.
      Signed-off-by: Bruce Chang <brucechang@via.com.tw>
      Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
      Cc: Joseph Chan <JosephChan@via.com.tw>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Harald Welte <HaraldWelte@viatech.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm · f6bcfd94
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (32 commits)
        dm: raid456 basic support
        dm: per target unplug callback support
        dm: introduce target callbacks and congestion callback
        dm mpath: delay activate_path retry on SCSI_DH_RETRY
        dm: remove superfluous irq disablement in dm_request_fn
        dm log: use PTR_ERR value instead of ENOMEM
        dm snapshot: avoid storing private suspended state
        dm snapshot: persistent make metadata_wq multithreaded
        dm: use non reentrant workqueues if equivalent
        dm: convert workqueues to alloc_ordered
        dm stripe: switch from local workqueue to system_wq
        dm: dont use flush_scheduled_work
        dm snapshot: remove unused dm_snapshot queued_bios_work
        dm ioctl: suppress needless warning messages
        dm crypt: add loop aes iv generator
        dm crypt: add multi key capability
        dm crypt: add post iv call to iv generator
        dm crypt: use io thread for reads only if mempool exhausted
        dm crypt: scale to multiple cpus
        dm crypt: simplify compatible table output
        ...
    • Merge branch 'for-linus' of git://neil.brown.name/md · 509e4aef
      Linus Torvalds authored
      * 'for-linus' of git://neil.brown.name/md:
        md: Fix removal of extra drives when converting RAID6 to RAID5
        md: range check slot number when manually adding a spare.
        md/raid5: handle manually-added spares in start_reshape.
        md: fix sync_completed reporting for very large drives (>2TB)
        md: allow suspend_lo and suspend_hi to decrease as well as increase.
        md: Don't let implementation detail of curr_resync leak out through sysfs.
        md: separate meta and data devs
        md-new-param-to_sync_page_io
        md-new-param-to-calc_dev_sboffset
        md: Be more careful about clearing flags bit in ->recovery
        md: md_stop_writes requires mddev_lock.
        md/raid5: use sysfs_notify_dirent_safe to avoid NULL pointer
        md: Ensure no IO request to get md device before it is properly initialised.
        md: Fix single printks with multiple KERN_<level>s
        md: fix regression resulting in delays in clearing bits in a bitmap
        md: fix regression with re-adding devices to arrays with no metadata
    • Linus Torvalds's avatar
      375b6f5a
    • Revert "gpiolib: annotate gpio-intialization with __must_check" · d8a3515e
      Linus Torvalds authored
      This reverts commit 0fdae42d, which
      wasn't really supposed to go in, and causes lots of annoying warnings.
      
      Quoth Andrew:
        "Complete brainfart - I meant to drop that patch ages ago."
      
      Quoth Greg:
        "Ick, yeah, that patch isn't ok to go in as-is, all of the callers
         need to be fixed up first, which is what I thought we had agreed on..."
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Greg KH <greg@kroah.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ecryptfs: fix broken build · 6254b32b
      Linus Torvalds authored
      Stephen Rothwell reports that the vfs merge broke the build of ecryptfs.
      The breakage comes from commit 66cb7666 ("sanitize ecryptfs
      ->mount()") which was obviously not even build tested. Tssk, tssk, Al.
      
      This is the minimal build fixup for the situation, although I don't have
      a filesystem to actually test it with.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 13 Jan, 2011 30 commits