1. 09 Jul, 2012 1 commit
    • NeilBrown's avatar
      md/raid1: fix use-after-free bug in RAID1 data-check code. · 2d4f4f33
      NeilBrown authored
      This bug has been present ever since data-check was introduce
      in 2.6.16.  However it would only fire if a data-check were
      done on a degraded array, which was only possible if the array
      has 3 or more devices.  This is certainly possible, but is quite
      uncommon.
      
      Since hot-replace was added in 3.3 it can happen more often as
      the same condition can arise if not all possible replacements are
      present.
      
      The problem is that as soon as we submit the last read request, the
      'r1_bio' structure could be freed at any time, so we really should
      stop looking at it.  If the last device is being read from we will
      stop looking at it.  However if the last device is not due to be read
      from, we will still check the bio pointer in the r1_bio, but the
      r1_bio might already be free.
      
      So use the read_targets counter to make sure we stop looking for bios
      to submit as soon as we have submitted them all.
      
      This fix is suitable for any -stable kernel since 2.6.16.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarArnold Schulz <arnysch@gmx.net>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      2d4f4f33
  2. 03 Jul, 2012 14 commits
    • NeilBrown's avatar
      md/raid10: fix careless build error · 10684112
      NeilBrown authored
      build error introduced by commit b357f04a
      
      That function doesn't get extra args until a later patch.  Bother.
      
      Reported-by: Fengguang Wu <wfg@linux.intel.com> 
      Reported-by: default avatarSimon Kirby <sim@hostway.ca>
      Reported-by: default avatarTobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      10684112
    • NeilBrown's avatar
      md: fix up plugging (again). · b357f04a
      NeilBrown authored
      The value returned by "mddev_check_plug" is only valid until the
      next 'schedule' as that will unplug things.  This could happen at any
      call to mempool_alloc.
      So just calling mddev_check_plug at the start doesn't really make
      sense.
      
      So call it just before, or just after, queuing things for the thread.
      As the action that happens at unplug is to wake the thread, this makes
      lots of sense.
      If we cannot add a plug (which requires a small GFP_ATOMIC alloc) we
      wake thread immediately.
      
      RAID5 is a bit different.  Requests are queued for the thread and the
      thread is woken by release_stripe.  So we don't need to wake the
      thread on failure.
      However the thread doesn't perform certain actions when there is any
      active plug, so it is important to install a plug before waking the
      thread.  So for RAID5 we install the plug *before* queuing the request
      and waking the thread.
      
      Without this patch it is possible for raid1 or raid10 to queue a
      request without then waking the thread, resulting in the array locking
      up.
      
      Also change raid10 to only flush_pending_write when there are not
      active plugs, just like raid1.
      
      This patch is suitable for 3.0 or later.  I plan to submit it to
      -stable, but I'll like to let it spend a few weeks in mainline
      first to be sure it is completely safe.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      b357f04a
    • NeilBrown's avatar
      md: support re-add of recovering devices. · f4563091
      NeilBrown authored
      We currently only allow a device to be re-added if it appear to be
      in-sync.  This is overly restrictive as it may be desirable to re-add
      a device that is in the middle of recovery.
      
      So remove the test for "InSync" - the test on rdev->raid_disk is
      sufficient to ensure that the re-add will succeed.
      Reported-by: default avatarAlexander Lyakas <alex.bolshoy@gmail.com>
      Tested-by: default avatarAlexander Lyakas <alex.bolshoy@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      f4563091
    • NeilBrown's avatar
      md/raid1: fix bug in read_balance introduced by hot-replace · 32644afd
      NeilBrown authored
      When we added hot_replace we doubled the number of devices
      that could be in a RAID1 array.  So we doubled how far read_balance
      would search.  Unfortunately we didn't double the point at which
      it looped back to the beginning - so it effectively loops over
      all non-replacement disks twice.
      This doesn't cause bad behaviour, but it pointless and means we
      never read from replacement devices.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      32644afd
    • Shaohua Li's avatar
      raid5: delayed stripe fix · fab363b5
      Shaohua Li authored
      There isn't locking setting STRIPE_DELAYED and STRIPE_PREREAD_ACTIVE bits, but
      the two bits have relationship. A delayed stripe can be moved to hold list only
      when preread active stripe count is below IO_THRESHOLD. If a stripe has both
      the bits set, such stripe will be in delayed list and preread count not 0,
      which will make such stripe never leave delayed list.
      Signed-off-by: default avatarShaohua Li <shli@fusionio.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      fab363b5
    • majianpeng's avatar
      md/raid456: When read error cannot be recovered, record bad block · 2e8ac303
      majianpeng authored
      We may not be able to fix a bad block if:
       - the array is degraded
       - the over-write fails.
      
      In these cases we currently eject the device, but we should
      record a bad block if possible.
      Signed-off-by: default avatarmajianpeng <majianpeng@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      2e8ac303
    • NeilBrown's avatar
      md: make 'name' arg to md_register_thread non-optional. · 0232605d
      NeilBrown authored
      Having the 'name' arg optional and defaulting to the current
      personality name is no necessary and leads to errors, as when
      changing the level of an array we can end up using the
      name of the old level instead of the new one.
      
      So make it non-optional and always explicitly pass the name
      of the level that the array will be.
      Reported-by: default avatarmajianpeng <majianpeng@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      0232605d
    • NeilBrown's avatar
      md/raid10: fix failure when trying to repair a read error. · 055d3747
      NeilBrown authored
      commit 58c54fcc
           md/raid10: handle further errors during fix_read_error better.
      
      in 3.1 added "r10_sync_page_io" which takes an IO size in sectors.
      But we were passing the IO size in bytes!!!
      This resulting in bio_add_page failing, and empty request being sent
      down, and a consequent BUG_ON in scsi_lib.
      
      [fix missing space in error message at same time]
      
      This fix is suitable for 3.1.y and later.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarChristian Balzer <chibi@gol.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      055d3747
    • NeilBrown's avatar
      md/raid5: fix refcount problem when blocked_rdev is set. · 5f066c63
      NeilBrown authored
      commit 43220aa0
          md/raid5: fix a hang on device failure.
      
      fixed a hang, but introduced a refcounting in-balance so
      that if the presence of bad-blocks ever caused an rdev to
      be 'blocked' we would increment the refcount on the rdev and
      never decrement it.
      
      So added the needed rdev_dec_pending when md_wait_for_blocked_rdev
      is not called.
      Reported-by: default avatarmajianpeng <majianpeng@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      5f066c63
    • majianpeng's avatar
      md:Add blk_plug in sync_thread. · 7c2c57c9
      majianpeng authored
      Add blk_plug in sync_thread will increase the performance of sync.
      Because sync_thread did not blk_plug,so when raid sync, the bio merge
      not well.
      
      Testing environment:
      SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI
      Controller.
      OS:Linux xxx 3.5.0-rc2+ #340 SMP Tue Jun 12 09:00:25 CST 2012
      x86_64 x86_64 x86_64 GNU/Linux.
      RAID5: four ST31000524NS disk.
      
      Without blk_plug:recovery speed about 63M/Sec;
      Add blk_plug:recovery speed about 120M/Sec.
      
      Using blktrace:
      blktrace -d /dev/sdb -w 60  -o -|blkparse -i -
      
      without blk_plug:
      Total (8,16):
       Reads Queued:      309811,     1239MiB	 Writes Queued:           0,        0KiB
       Read Dispatches:   283583,     1189MiB	 Write Dispatches:        0,        0KiB
       Reads Requeued:         0		 Writes Requeued:         0
       Reads Completed:   273351,     1149MiB	 Writes Completed:        0,        0KiB
       Read Merges:        23533,    94132KiB	 Write Merges:            0,        0KiB
       IO unplugs:             0        	 Timer unplugs:           0
      
      add blk_plug:
      Total (8,16):
       Reads Queued:      428697,     1714MiB	 Writes Queued:           0,        0KiB
       Read Dispatches:     3954,     1714MiB	 Write Dispatches:        0,        0KiB
       Reads Requeued:         0		 Writes Requeued:         0
       Reads Completed:     3956,     1715MiB	 Writes Completed:        0,        0KiB
       Read Merges:       424743,     1698MiB	 Write Merges:            0,        0KiB
       IO unplugs:             0        	 Timer unplugs:        3384
      
      The ratio of merge will be markedly increased.
      Signed-off-by: default avatarmajianpeng <majianpeng@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      7c2c57c9
    • majianpeng's avatar
      md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev · 1850753d
      majianpeng authored
      In ops_run_io(), the call to md_wait_for_blocked_rdev will decrement
      nr_pending so we lose the reference we hold on the rdev.
      So atomic_inc it first to maintain the reference.
      
      This bug was introduced by commit  73e92e51
          md/raid5.  Don't write to known bad block on doubtful devices.
      
      which appeared in 3.0, so patch is suitable for stable kernels since
      then.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarmajianpeng <majianpeng@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      1850753d
    • majianpeng's avatar
      md/raid5: Do not add data_offset before call to is_badblock · 6c0544e2
      majianpeng authored
      In chunk_aligned_read() we are adding data_offset before calling
      is_badblock.  But is_badblock also adds data_offset, so that is bad.
      
      So move the addition of data_offset to after the call to
      is_badblock.
      
      This bug was introduced by commit 31c176ec
           md/raid5: avoid reading from known bad blocks.
      which first appeared in 3.0.  So that patch is suitable for any
      -stable kernel from 3.0.y onwards.  However it will need minor
      revision for most of those (as the comment didn't appear until
      recently).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarmajianpeng <majianpeng@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      6c0544e2
    • NeilBrown's avatar
      md/raid5: prefer replacing failed devices over want-replacement devices. · 5cfb22a1
      NeilBrown authored
      If a RAID5 has both a failed device and a device marked as
      'WantReplacement', then we should preferentially replace the failed
      device.
      However the current code replaces whichever is found first.
      So split into 2 loops, check fail failed/missing first, and only check
      for WantReplacement if nothing is failed or missing.
      Reported-by: default avatarmajianpeng <majianpeng@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      5cfb22a1
    • NeilBrown's avatar
      md/raid10: Don't try to recovery unmatched (and unused) chunks. · fc448a18
      NeilBrown authored
      If a RAID10 has an odd number of chunks - as might happen when there
      are an odd number of devices - the last chunk has no pair and so is
      not mirrored.  We don't store data there, but when recovering the last
      device in an array we retry to recover that last chunk from a
      non-existent location.  This results in an error, and the recovery
      aborts.
      
      When we get to that last chunk we should just stop - there is nothing
      more to do anyway.
      
      This bug has been present since the introduction of RAID10, so the
      patch is appropriate for any -stable kernel.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarChristian Balzer <chibi@gol.com>
      Tested-by: default avatarChristian Balzer <chibi@gol.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      fc448a18
  3. 24 Jun, 2012 7 commits
  4. 23 Jun, 2012 6 commits
  5. 22 Jun, 2012 9 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs · 369c4f54
      Linus Torvalds authored
      Pull XFS fixes from Ben Myers:
       - Fix stale data exposure with unwritten extents
       - Fix a warning in xfs_alloc_vextent with ODEBUG
       - Fix overallocation and alignment of pages for xfs_bufs
       - Fix a cursor leak
       - Fix a log hang
       - Fix a crash related to xfs_sync_worker
       - Rename xfs log structure from struct log to struct xlog so we can use
         crash dumps effectively
      
      * tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs:
        xfs: rename log structure to xlog
        xfs: shutdown xfs_sync_worker before the log
        xfs: Fix overallocation in xfs_buf_allocate_memory()
        xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
        xfs: check for stale inode before acquiring iflock on push
        xfs: fix debug_object WARN at xfs_alloc_vextent()
        xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)
      369c4f54
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a1163719
      Linus Torvalds authored
      Pull perf updates from Ingo Molnar.
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        ftrace: Make all inline tags also include notrace
        perf: Use css_tryget() to avoid propping up css refcount
        perf tools: Fix synthesizing tracepoint names from the perf.data headers
        perf stat: Fix default output file
        perf tools: Fix endianity swapping for adds_features bitmask
      a1163719
    • Dave Airlie's avatar
      drm: drop comment about this header being autogenerated. · 59bbe27b
      Dave Airlie authored
      This comment is well out of date.
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      59bbe27b
    • Ricardo Neri's avatar
      ARM: OMAP4: hwmod data: Force HDMI in no-idle while enabled · dc57aef5
      Ricardo Neri authored
      As per the OMAP4 documentation, audio over HDMI must be transmitted in
      no-idle mode. This patch adds the HWMOD_SWSUP_SIDLE so that omap_hwmod uses
      no-idle/force-idle settings instead of smart-idle mode.
      
      This is required as the DSS interface clock is used as functional clock
      for the HDMI wrapper audio FIFO. If no-idle mode is not used, audio could
      be choppy, have bad quality or not be audible at all.
      Signed-off-by: default avatarRicardo Neri <ricardo.neri@ti.com>
      [b-cousson@ti.com: Update the subject and align the .flags
      location with the script template]
      Signed-off-by: default avatarBenoit Cousson <b-cousson@ti.com>
      Signed-off-by: default avatarPaul Walmsley <paul@pwsan.com>
      dc57aef5
    • Paul Walmsley's avatar
      ARM: OMAP2+: mux: fix sparse warning · 65e25976
      Paul Walmsley authored
      Commit bbd707ac ("ARM: omap2: use
      machine specific hook for late init") resulted in the addition of this
      sparse warning:
      
      arch/arm/mach-omap2/mux.c:791:12: warning: symbol 'omap_mux_late_init' was not declared. Should it be static?
      
      Fix by including the header file containing the prototype.
      Signed-off-by: default avatarPaul Walmsley <paul@pwsan.com>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: Tony Lindgren <tony@atomide.com>
      65e25976
    • Paul Walmsley's avatar
      ARM: OMAP2+: CM: increase the module disable timeout · b8f15b7e
      Paul Walmsley authored
      Increase the timeout for disabling an IP block to five milliseconds.
      This is to handle the usb_host_fs idle latency, which takes almost
      four milliseconds after a host controller reset.
      
      This is the second of two patches needed to resolve the following
      boot warning:
      
      omap_hwmod: usb_host_fs: _wait_target_disable failed
      
      Thanks to Sergei Shtylyov <sshtylyov@mvista.com> for finding
      an unrelated hunk in a previous version of this patch.
      Signed-off-by: default avatarPaul Walmsley <paul@pwsan.com>
      Cc: Sergei Shtylyov <sshtylyov@mvista.com>
      Cc: Tero Kristo <t-kristo@ti.com>
      b8f15b7e
    • Paul Walmsley's avatar
      ARM: OMAP4: clock data: add clockdomains for clocks used as main clocks · 9a47d32d
      Paul Walmsley authored
      Until the OMAP4 code is converted to disable the use of the clock
      framework-based clockdomain enable/disable sequence, any clock used as
      a hwmod main_clk must have a clockdomain associated with it.  This
      patch populates some clock structure clockdomain names to resolve the
      following warnings during kernel init:
      
      omap_hwmod: dpll_mpu_m2_ck: missing clockdomain for dpll_mpu_m2_ck.
      omap_hwmod: trace_clk_div_ck: missing clockdomain for trace_clk_div_ck.
      omap_hwmod: l3_div_ck: missing clockdomain for l3_div_ck.
      omap_hwmod: ddrphy_ck: missing clockdomain for ddrphy_ck.
      Signed-off-by: default avatarPaul Walmsley <paul@pwsan.com>
      Cc: Rajendra Nayak <rnayak@ti.com>
      Cc: Benoît Cousson <b-cousson@ti.com>
      9a47d32d
    • Paul Walmsley's avatar
      ARM: OMAP4: hwmod data: fix 32k sync timer idle modes · 252a4c54
      Paul Walmsley authored
      The 32k sync timer IP block target idle modes in the hwmod data are
      incorrect.  The IP block does not support any smart-idle modes.
      Update the data to reflect the correct modes.
      
      This problem was initially identified and a diff fragment posted to
      the lists by Benoît Cousson <b-cousson@ti.com>.  A patch description
      bug in the first version was also identified by Benoît.
      Signed-off-by: default avatarPaul Walmsley <paul@pwsan.com>
      Cc: Benoît Cousson <b-cousson@ti.com>
      Cc: Tero Kristo <t-kristo@ti.com>
      252a4c54
    • Djamil Elaidi's avatar
      ARM: OMAP4+: hwmod: fix issue causing IPs not going back to Smart-Standby · 561038f0
      Djamil Elaidi authored
      If an IP is configured in Smart-Standby-Wakeup, when disabling wakeup feature the
      IP will not go back to Smart-Standby, but will remain in Smart-Standby-Wakeup.
      Signed-off-by: default avatarDjamil Elaidi <d-elaidi@ti.com>
      Signed-off-by: default avatarPaul Walmsley <paul@pwsan.com>
      561038f0
  6. 21 Jun, 2012 3 commits