1. 25 Jun, 2015 3 commits
    • Neil Brown's avatar
      md: clear Blocked flag on failed devices when array is read-only. · ab16bfc7
      Neil Brown authored
      The Blocked flag indicates that a device has failed but that this
      fact hasn't been recorded in the metadata yet.  Writes to such
      devices cannot be allowed until the metadata has been updated.
      
      On a read-only array, the Blocked flag will never be cleared.
      This prevents the device being removed from the array.
      
      If the metadata is being handled by the kernel
      (i.e. !mddev->external), then we can be sure that if the array is
      switch to writable, then a metadata update will happen and will
      record the failure.  So we don't need the flag set.
      
      If metadata is externally managed, it is upto the external manager
      to clear the 'blocked' flag.
      Reported-by: default avatarXiaoNi <xni@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      ab16bfc7
    • NeilBrown's avatar
      md: unlock mddev_lock on an error path. · 9a8c0fa8
      NeilBrown authored
      This error path retuns while still holding the lock - bad.
      
      Fixes: 6791875e ("md: make reconfig_mutex optional for writes to md sysfs files.")
      Cc: stable@vger.kernel.org (v4.0+)
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      9a8c0fa8
    • NeilBrown's avatar
      md: clear mddev->private when it has been freed. · bd691922
      NeilBrown authored
      If ->private is set when ->run is called, it is assumed to be
      a 'config'  prepared as part of 'reshape'.
      
      So it is important when we free that config, that we also clear ->private.
      This is not often a problem as the mddev will normally be discarded
      shortly after the config us freed.
      However if an 'assemble' races with a final close, the assemble can use
      the old mddev which has a stale ->private.  This leads to any of
      various sorts of crashes.
      
      So clear ->private after calling ->free().
      Reported-by: default avatarNate Clark <nate@neworld.us>
      Cc: stable@vger.kernel.org (v4.0+)
      Fixes: afa0f557 ("md: rename ->stop to ->free")
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      bd691922
  2. 17 Jun, 2015 6 commits
    • Firo Yang's avatar
      md: fix a build warning · 4e023612
      Firo Yang authored
      Warning like this:
      
      drivers/md/md.c: In function "update_array_info":
      drivers/md/md.c:6394:26: warning: logical not is only applied
      to the left hand side of comparison [-Wlogical-not-parentheses]
            !mddev->persistent  != info->not_persistent||
      
      Fix it as Neil Brown said:
      mddev->persistent != !info->not_persistent ||
      Signed-off-by: default avatarFiro Yang <firogm@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      4e023612
    • Shaohua Li's avatar
      md/raid5: ignore released_stripes check · 713bc5c2
      Shaohua Li authored
      conf->released_stripes list isn't always related to where there are
      free stripes pending. Active stripes can be in the list too.
      And even free stripes were active very recently.
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      713bc5c2
    • Yuanhan Liu's avatar
      md/raid5: per hash value and exclusive wait_for_stripe · e9e4c377
      Yuanhan Liu authored
      I noticed heavy spin lock contention at get_active_stripe() with fsmark
      multiple thread write workloads.
      
      Here is how this hot contention comes from. We have limited stripes, and
      it's a multiple thread write workload. Hence, those stripes will be taken
      soon, which puts later processes to sleep for waiting free stripes. When
      enough stripes(>= 1/4 total stripes) are released, all process are woken,
      trying to get the lock. But there is one only being able to get this lock
      for each hash lock, making other processes spinning out there for acquiring
      the lock.
      
      Thus, it's effectiveless to wakeup all processes and let them battle for
      a lock that permits one to access only each time. Instead, we could make
      it be a exclusive wake up: wake up one process only. That avoids the heavy
      spin lock contention naturally.
      
      To do the exclusive wake up, we've to split wait_for_stripe into multiple
      wait queues, to make it per hash value, just like the hash lock.
      
      Here are some test results I have got with this patch applied(all test run
      3 times):
      
      `fsmark.files_per_sec'
      =====================
      
      next-20150317                 this patch
      -------------------------     -------------------------
      metric_value     ±stddev      metric_value     ±stddev     change      testbox/benchmark/testcase-params
      -------------------------     -------------------------   --------     ------------------------------
            25.600     ±0.0              92.700     ±2.5          262.1%     ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-fsyncBeforeClose
            25.600     ±0.0              77.800     ±0.6          203.9%     ivb44/fsmark/1x-64t-9BRD_6G-RAID5-btrfs-4M-30G-fsyncBeforeClose
            32.000     ±0.0              93.800     ±1.7          193.1%     ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-fsyncBeforeClose
            32.000     ±0.0              81.233     ±1.7          153.9%     ivb44/fsmark/1x-64t-9BRD_6G-RAID5-ext4-4M-30G-fsyncBeforeClose
            48.800     ±14.5             99.667     ±2.0          104.2%     ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-fsyncBeforeClose
             6.400     ±0.0              12.800     ±0.0          100.0%     ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
            63.133     ±8.2              82.800     ±0.7           31.2%     ivb44/fsmark/1x-64t-9BRD_6G-RAID5-xfs-4M-30G-fsyncBeforeClose
           245.067     ±0.7             306.567     ±7.9           25.1%     ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
            17.533     ±0.3              21.000     ±0.8           19.8%     ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
           188.167     ±1.9             215.033     ±3.1           14.3%     ivb44/fsmark/1x-1t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
           254.500     ±1.8             290.733     ±2.4           14.2%     ivb44/fsmark/1x-1t-9BRD_6G-RAID5-btrfs-4M-30G-NoSync
      
      `time.system_time'
      =====================
      
      next-20150317                 this patch
      -------------------------    -------------------------
      metric_value     ±stddev     metric_value     ±stddev     change       testbox/benchmark/testcase-params
      -------------------------    -------------------------    --------     ------------------------------
          7235.603     ±1.2             185.163     ±1.9          -97.4%     ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-fsyncBeforeClose
          7666.883     ±2.9             202.750     ±1.0          -97.4%     ivb44/fsmark/1x-64t-9BRD_6G-RAID5-btrfs-4M-30G-fsyncBeforeClose
         14567.893     ±0.7             421.230     ±0.4          -97.1%     ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
          3697.667     ±14.0            148.190     ±1.7          -96.0%     ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-fsyncBeforeClose
          5572.867     ±3.8             310.717     ±1.4          -94.4%     ivb44/fsmark/1x-64t-9BRD_6G-RAID5-ext4-4M-30G-fsyncBeforeClose
          5565.050     ±0.5             313.277     ±1.5          -94.4%     ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-fsyncBeforeClose
          2420.707     ±17.1            171.043     ±2.7          -92.9%     ivb44/fsmark/1x-64t-9BRD_6G-RAID5-xfs-4M-30G-fsyncBeforeClose
          3743.300     ±4.6             379.827     ±3.5          -89.9%     ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
          3308.687     ±6.3             363.050     ±2.0          -89.0%     ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
      
      Where,
      
           1x: where 'x' means iterations or loop, corresponding to the 'L' option of fsmark
      
           1t, 64t: where 't' means thread
      
           4M: means the single file size, corresponding to the '-s' option of fsmark
           40G, 30G, 120G: means the total test size
      
           4BRD_12G: BRD is the ramdisk, where '4' means 4 ramdisk, and where '12G' means
                     the size of one ramdisk. So, it would be 48G in total. And we made a
                     raid on those ramdisk
      
      As you can see, though there are no much performance gain for hard disk
      workload, the system time is dropped heavily, up to 97%. And as expected,
      the performance increased a lot, up to 260%, for fast device(ram disk).
      
      v2: use bits instead of array to note down wait queue need to wake up.
      Signed-off-by: default avatarYuanhan Liu <yuanhan.liu@linux.intel.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      e9e4c377
    • Yuanhan Liu's avatar
      md/raid5: split wait_for_stripe and introduce wait_for_quiescent · b1b46486
      Yuanhan Liu authored
      I noticed heavy spin lock contention at get_active_stripe(), introduced
      at being wake up stage, where a bunch of processes try to re-hold the
      spin lock again.
      
      After giving some thoughts on this issue, I found the lock could be
      relieved(and even avoided) if we turn the wait_for_stripe to per
      waitqueue for each lock hash and make the wake up exclusive: wake up
      one process each time, which avoids the lock contention naturally.
      
      Before go hacking with wait_for_stripe, I found it actually has 2
      usages: for the array to enter or leave the quiescent state, and also
      to wait for an available stripe in each of the hash lists.
      
      So this patch splits the first usage off into a separate wait_queue,
      wait_for_quiescent, and the next patch will turn the second usage into
      one waitqueue for each hash value, and make it exclusive, to relieve
      the lock contention.
      
      v2: wake_up(wait_for_quiescent) when (active_stripes == 0)
          Commit log refactor suggestion from Neil.
      Signed-off-by: default avatarYuanhan Liu <yuanhan.liu@linux.intel.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      b1b46486
    • Yuanhan Liu's avatar
      wait: introduce wait_event_exclusive_cmd · 9f3520c3
      Yuanhan Liu authored
      It's just a variant of wait_event_cmd(), with exclusive flag being set.
      
      For cases like RAID5, which puts many processes to sleep until 1/4
      resources are free, a wake_up wakes up all processes to run, but
      there is one process being able to get the resource as it's protected
      by a spin lock. That ends up introducing heavy lock contentions, and
      hurts performance badly.
      
      Here introduce wait_event_exclusive_cmd to relieve the lock contention
      naturally by letting wake_up just wake up one process.
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      v2: its assumed that wait*() and __wait*() have the same arguments - peterz
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarYuanhan Liu <yuanhan.liu@linux.intel.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      9f3520c3
    • Alexey Dobriyan's avatar
      md: convert to kstrto*() · 4c9309c0
      Alexey Dobriyan authored
      Convert away from deprecated simple_strto*() functions.
      
      Add "fit into sector_t" checks.
      Signed-off-by: default avatarAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      4c9309c0
  3. 16 Jun, 2015 1 commit
  4. 15 Jun, 2015 7 commits
  5. 14 Jun, 2015 1 commit
    • Linus Torvalds's avatar
      Merge tag 'sound-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2fbbada1
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Most of commits are regression fixes for HD-audio: a few corner case
        fixes for regmap transition, and i915 binding issues.
      
        In addition, a quirk for another USB-audio device supporting DSD"
      
      * tag 'sound-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Abort the probe without i915 binding for HSW/BDW
        ALSA: hda - Re-add the lost fake mute support
        ALSA: hda - Continue probing even if i915 binding fails
        ALSA: hda - Don't actually write registers for caps overwrites
        ALSA: hda - fix number of devices query on hotplug
        ALSA: usb-audio: add native DSD support for JLsounds I2SoverUSB
      2fbbada1
  6. 13 Jun, 2015 3 commits
    • Jaedon Shin's avatar
      MPI: MIPS: Fix compilation error with GCC 5.1 · 36f58113
      Jaedon Shin authored
      This patch fixes mips compilation error:
      
      lib/mpi/generic_mpih-mul1.c: In function 'mpihelp_mul_1':
      lib/mpi/longlong.h:651:2: error: impossible constraint in 'asm'
      Signed-off-by: default avatarJaedon Shin <jaedon.shin@gmail.com>
      Cc: Linux-MIPS <linux-mips@linux-mips.org>
      Patchwork: https://patchwork.linux-mips.org/patch/10546/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      36f58113
    • Rabin Vincent's avatar
      IRQCHIP: mips-gic: Don't nest calls to do_IRQ() · 1b3ed367
      Rabin Vincent authored
      The GIC chained handlers use do_IRQ() to call the subhandlers.  This
      means that irq_enter() calls get nested, which leads to preempt count
      looking like we're in nested interrupts, which in turn leads to all
      system time being accounted as IRQ time in account_system_time().
      
      Fix it by using generic_handle_irq().  Since these same functions are
      used in some systems (if cpu_has_veic) from a low-level vectored
      interrupt handler which does not go throught do_IRQ(), we need to do it
      conditionally.
      Signed-off-by: default avatarRabin Vincent <rabin.vincent@axis.com>
      Reviewed-by: default avatarAndrew Bresticker <abrestic@chromium.org>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: linux-mips@linux-mips.org
      Cc: tglx@linutronix.de
      Cc: jason@lakedaemon.net
      Patchwork: https://patchwork.linux-mips.org/patch/10545/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      1b3ed367
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c8d17b45
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix uninitialized struct station_info in cfg80211_wireless_stats(),
          from Johannes Berg.
      
       2) Revert commit attempt to fix ipv6 protocol resubmission, it adds
          regressions.
      
       3) Endless loops can be created in bridge port lists, fix from Nikolay
          Aleksandrov.
      
       4) Don't WARN_ON() if sk->sk_forward_alloc is non-zero in
          sk_clear_memalloc, it is a legal situation during swap deactivation.
          Fix from Mel Gorman.
      
       5) Fix order of disabling interrupts and unlocking NAPI in enic driver
          to avoid a race.  From Govindarajulu Varadarajan.
      
       6) High and low register writes are swapped when programming the start
          of periodic output in igb driver.  From Richard Cochran.
      
       7) Fix device rename handling in mpls stack, from Robert Shearman.
      
       8) Do not trigger compaction synchronously when optimistically trying
          to allocate an order 3 page in alloc_skb_with_frags() and
          skb_page_frag_refill().  From Shaohua Li.
      
       9) Authentication with COOKIE_ECHO is not handled properly in SCTP, fix
          from Marcelo Ricardo Leitner.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        Doc: networking: Fix URL for wiki.wireshark.org in udplite.txt
        sctp: allow authenticating DATA chunks that are bundled with COOKIE_ECHO
        net: don't wait for order-3 page allocation
        mpls: handle device renames for per-device sysctls
        net: igb: fix the start time for periodic output signals
        enic: fix memory leak in rq_clean
        enic: check return value for stat dump
        enic: unlock napi busy poll before unmasking intr
        net, swap: Remove a warning and clarify why sk_mem_reclaim is required when deactivating swap
        bridge: fix multicast router rlist endless loop
        tipc: disconnect socket directly after probe failure
        Revert "ipv6: Fix protocol resubmission"
        cfg80211: wext: clear sinfo struct before calling driver
      c8d17b45
  7. 12 Jun, 2015 15 commits
    • Masanari Iida's avatar
      Doc: networking: Fix URL for wiki.wireshark.org in udplite.txt · b07d4961
      Masanari Iida authored
      This patch fix URL (http to https) for wiki.wireshark.org.
      Signed-off-by: default avatarMasanari Iida <standby24x7@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b07d4961
    • Marcelo Ricardo Leitner's avatar
      sctp: allow authenticating DATA chunks that are bundled with COOKIE_ECHO · ae36806a
      Marcelo Ricardo Leitner authored
      Currently, we can ask to authenticate DATA chunks and we can send DATA
      chunks on the same packet as COOKIE_ECHO, but if you try to combine
      both, the DATA chunk will be sent unauthenticated and peer won't accept
      it, leading to a communication failure.
      
      This happens because even though the data was queued after it was
      requested to authenticate DATA chunks, it was also queued before we
      could know that remote peer can handle authenticating, so
      sctp_auth_send_cid() returns false.
      
      The fix is whenever we set up an active key, re-check send queue for
      chunks that now should be authenticated. As a result, such packet will
      now contain COOKIE_ECHO + AUTH + DATA chunks, in that order.
      Reported-by: default avatarLiu Wei <weliu@redhat.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ae36806a
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · b85dfd30
      Linus Torvalds authored
      Pull block layer fixes from Jens Axboe:
       "Remember about a week ago when I sent the last pull request for 4.1?
        Well, I lied.  Now, I don't want to shift the blame, but Dan, Ming,
        and Richard made a liar out of me.
      
        Here are three small patches that should go into 4.1.  More
        specifically, this pull request contains:
      
         - A Kconfig dependency for the pmem block driver, so it can't be
           selected if HAS_IOMEM isn't availble.  From Richard Weinberger.
      
         - A fix for genhd, making the ext_devt_lock softirq safe.  This makes
           lockdep happier, since we also end up grabbing this lock on release
           off the softirq path.  From Dan Williams.
      
         - A blk-mq software queue release fix from Ming Lei.
      
        Last two are headed to stable, first fixes an issue introduced in this
        cycle"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: pmem: Add dependency on HAS_IOMEM
        block: fix ext_dev_lock lockdep report
        blk-mq: free hctx->ctxs in queue's release handler
      b85dfd30
    • Linus Torvalds's avatar
      Merge tag 'md/4.1-rc7-fixes' of git://neil.brown.name/md · 7b565d9d
      Linus Torvalds authored
      Pull three more md fixes from Neil Brown:
       "Hasn't been a good cycle for md has it :-(
      
        The main issue fixed here is a rare race which can result in two
        reshape threads running at once, which doesn't end well.
      
        Also a minor issue with a write to a sysfs file returning the wrong
        value.  Backports to 4.0-stable are indicated"
      
      * tag 'md/4.1-rc7-fixes' of git://neil.brown.name/md:
        md: make sure MD_RECOVERY_DONE is clear before starting recovery/resync
        md: Close race when setting 'action' to 'idle'.
        md: don't return 0 from array_state_store
      7b565d9d
    • Linus Torvalds's avatar
      Merge git://git.infradead.org/intel-iommu · c39f3bc6
      Linus Torvalds authored
      Pull VT-d hardware workarounds from David Woodhouse:
       "This contains a workaround for hardware issues which I *thought* were
        never going to be seen on production hardware.  I'm glad I checked
        that before the 4.1 release...
      
        Firstly, PASID support is so broken on existing chips that we're just
        going to declare the old capability bit 28 as 'reserved' and change
        the VT-d spec to move PASID support to another bit.  So any existing
        hardware doesn't support SVM; it only sets that (now) meaningless bit
        28.
      
        That patch *wasn't* imperative for 4.1 because we don't have PASID
        support yet.  But *even* the extended context tables are broken — if
        you just enable the wider tables and use none of the new bits in them,
        which is precisely what 4.1 does, you find that translations don't
        work.  It's this problem which I thought was caught in time to be
        fixed before production, but wasn't.
      
        To avoid triggering this issue, we now *only* enable the extended
        context tables on hardware which also advertises "we have PASID
        support and we actually tested it this time" with the new PASID
        feature bit.
      
        In addition, I've added an 'intel_iommu=ecs_off' command line
        parameter to allow us to disable it manually if we need to"
      
      * git://git.infradead.org/intel-iommu:
        iommu/vt-d: Only enable extended context tables if PASID is supported
        iommu/vt-d: Change PASID support to bit 40 of Extended Capability Register
      c39f3bc6
    • David Woodhouse's avatar
      iommu/vt-d: Only enable extended context tables if PASID is supported · c83b2f20
      David Woodhouse authored
      Although the extended tables are theoretically a completely orthogonal
      feature to PASID and anything else that *uses* the newly-available bits,
      some of the early hardware has problems even when all we do is enable
      them and use only the same bits that were in the old context tables.
      
      For now, there's no motivation to support extended tables unless we're
      going to use PASID support to do SVM. So just don't use them unless
      PASID support is advertised too. Also add a command-line bailout just in
      case later chips also have issues.
      
      The equivalent problem for PASID support has already been fixed with the
      upcoming VT-d spec update and commit bd00c606 ("iommu/vt-d: Change
      PASID support to bit 40 of Extended Capability Register"), because the
      problematic platforms use the old definition of the PASID-capable bit,
      which is now marked as reserved and meaningless.
      
      So with this change, we'll magically start using ECS again only when we
      see the new hardware advertising "hey, we have PASID support and we
      actually tested it this time" on bit 40.
      
      The VT-d hardware architect has promised that we are not going to have
      any reason to support ECS *without* PASID any time soon, and he'll make
      sure he checks with us before changing that.
      
      In the future, if hypothetical new features also use new bits in the
      context tables and can be seen on implementations *without* PASID support,
      we might need to add their feature bits to the ecs_enabled() macro.
      Signed-off-by: default avatarDavid Woodhouse <David.Woodhouse@intel.com>
      c83b2f20
    • NeilBrown's avatar
      md: make sure MD_RECOVERY_DONE is clear before starting recovery/resync · ea358cd0
      NeilBrown authored
      MD_RECOVERY_DONE is normally cleared by md_check_recovery after a
      resync etc finished.  However it is possible for raid5_start_reshape
      to race and start a reshape before MD_RECOVERY_DONE is cleared.  This
      can lean to multiple reshapes running at the same time, which isn't
      good.
      
      To make sure it is cleared before starting a reshape, and also clear
      it when reaping a thread, just to be safe.
      Signed-off-by: default avatarNeilBrown  <neilb@suse.de>
      ea358cd0
    • NeilBrown's avatar
      md: Close race when setting 'action' to 'idle'. · 8e8e2518
      NeilBrown authored
      Checking ->sync_thread without holding the mddev_lock()
      isn't really safe, even after flushing the workqueue which
      ensures md_start_sync() has been run.
      
      While this code is waiting for the lock, md_check_recovery could reap
      the thread itself, and then start another thread (e.g. recovery might
      finish, then reshape starts).  When this thread gets the lock
      md_start_sync() hasn't run so it doesn't get reaped, but
      MD_RECOVERY_RUNNING gets cleared.  This allows two threads to start
      which leads to confusion.
      
      So don't both if MD_RECOVERY_RUNNING isn't set, but if it is do
      the flush and the test and the reap all under the mddev_lock to
      avoid any race with md_check_recovery.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Fixes: 6791875e ("md: make reconfig_mutex optional for writes to md sysfs files.")
      Cc: stable@vger.kernel.org (v4.0+)
      8e8e2518
    • NeilBrown's avatar
      md: don't return 0 from array_state_store · c008f1d3
      NeilBrown authored
      Returning zero from a 'store' function is bad.
      The return value should be either len length of the string
      or an error.
      
      So use 'len' if 'err' is zero.
      
      Fixes: 6791875e ("md: make reconfig_mutex optional for writes to md sysfs files.")
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Cc: stable@vger.kernel (v4.0+)
      c008f1d3
    • Krzysztof Kozlowski's avatar
      dmaengine: Fix choppy sound because of unimplemented resume · 88d04643
      Krzysztof Kozlowski authored
      Some drivers implement only pause operation (no resuming). Example is
      pl330 where pause is needed for getting residuum. pl330 does not support
      resume operation, transfer must be stopped after pause.
      
      However for slaves this is exposed always as "pause and resume" which
      introduces subtle errors on Odroid U3 board (Exynos4412 with pl330).
      After adding pause function to pl330 driver the audio playback
      (utilizing DMA) gets choppy after some time (approximately 24 hours).
      
      Fix this by exposing "cmd_pause" if and only if pause and resume are
      implemented.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Reported-by: gabriel@unseen.is
      Reported-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Cc: <stable@vger.kernel.org>
      Fixes: 88987d2c ("dmaengine: pl330: add DMA_PAUSE feature")
      Acked-by: default avatarMaxime Ripard <maxime.ripard@free-electrons.com>
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
      88d04643
    • Takashi Iwai's avatar
      ALSA: hda - Abort the probe without i915 binding for HSW/BDW · 535115b5
      Takashi Iwai authored
      The previous patch tried to continue the probe if i915 binding fails.
      For for simplicity reason, we haven't implemented abort even for
      controller chips that are dedicated for HDMI/DP on HSW and BDW.
      However, Mengdong suggested that this can be dangerous; BIOS may
      disable gfx power well although the PCI entry for HD-audio is left,
      and this may result in the unexpected behavior, kernel errors, etc.
      
      For avoiding this situation, abort the probe at i915 binding failure
      only for HSW/BDW chips selectively.  For other chips, it still
      continues.
      
      Fixes: bf06848b ('ALSA: hda - Continue probing even if i915 binding fails')
      Reported-by: default avatarMengdong Lin <mengdong.lin@intel.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      535115b5
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · df5f4158
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "i915 and radeon fixes:
      
        i915:
            fix for connector oops regression
            DDC probing fix
      
        radeon:
            two radeon reverts, along with a freeze workaround and a fix"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/radeon: Make sure radeon_vm_bo_set_addr always unreserves the BO
        Revert "drm/radeon: adjust pll when audio is not enabled"
        Revert "drm/radeon: don't share plls if monitors differ in audio support"
        drm/radeon: fix freeze for laptop with Turks/Thames GPU.
        drm/i915: Fix DDC probe for passive adapters
        drm/i915: Properly initialize SDVO analog connectors
      df5f4158
    • Shaohua Li's avatar
      net: don't wait for order-3 page allocation · fb05e7a8
      Shaohua Li authored
      We saw excessive direct memory compaction triggered by skb_page_frag_refill.
      This causes performance issues and add latency. Commit 5640f768
      introduces the order-3 allocation. According to the changelog, the order-3
      allocation isn't a must-have but to improve performance. But direct memory
      compaction has high overhead. The benefit of order-3 allocation can't
      compensate the overhead of direct memory compaction.
      
      This patch makes the order-3 page allocation atomic. If there is no memory
      pressure and memory isn't fragmented, the alloction will still success, so we
      don't sacrifice the order-3 benefit here. If the atomic allocation fails,
      direct memory compaction will not be triggered, skb_page_frag_refill will
      fallback to order-0 immediately, hence the direct memory compaction overhead is
      avoided. In the allocation failure case, kswapd is waken up and doing
      compaction, so chances are allocation could success next time.
      
      alloc_skb_with_frags is the same.
      
      The mellanox driver does similar thing, if this is accepted, we must fix
      the driver too.
      
      V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
      V2: make the changelog clearer
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Debabrata Banerjee <dbavatar@gmail.com>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fb05e7a8
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2015-06-11' of git://anongit.freedesktop.org/drm-intel into drm-fixes · 6e2eb00f
      Dave Airlie authored
      Fix for the regression Linus called out, and another for probing
      dongles.
      
      * tag 'drm-intel-fixes-2015-06-11' of git://anongit.freedesktop.org/drm-intel:
        drm/i915: Fix DDC probe for passive adapters
        drm/i915: Properly initialize SDVO analog connectors
      6e2eb00f
    • Dave Airlie's avatar
      Merge branch 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 950c3707
      Dave Airlie authored
      Two regression reverts, and two fixes, one for a dpm boot freeze.
      
      * 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux:
        drm/radeon: Make sure radeon_vm_bo_set_addr always unreserves the BO
        Revert "drm/radeon: adjust pll when audio is not enabled"
        Revert "drm/radeon: don't share plls if monitors differ in audio support"
        drm/radeon: fix freeze for laptop with Turks/Thames GPU.
      950c3707
  8. 11 Jun, 2015 4 commits
    • Robert Shearman's avatar
      mpls: handle device renames for per-device sysctls · 0fae3bf0
      Robert Shearman authored
      If a device is renamed and the original name is subsequently reused
      for a new device, the following warning is generated:
      
      sysctl duplicate entry: /net/mpls/conf/veth0//input
      CPU: 3 PID: 1379 Comm: ip Not tainted 4.1.0-rc4+ #20
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
       0000000000000000 0000000000000000 ffffffff81566aaf 0000000000000000
       ffffffff81236279 ffff88002f7d7f00 0000000000000000 ffff88000db336d8
       ffff88000db33698 0000000000000005 ffff88002e046000 ffff8800168c9280
      Call Trace:
       [<ffffffff81566aaf>] ? dump_stack+0x40/0x50
       [<ffffffff81236279>] ? __register_sysctl_table+0x289/0x5a0
       [<ffffffffa051a24f>] ? mpls_dev_notify+0x1ff/0x300 [mpls_router]
       [<ffffffff8108db7f>] ? notifier_call_chain+0x4f/0x70
       [<ffffffff81470e72>] ? register_netdevice+0x2b2/0x480
       [<ffffffffa0524748>] ? veth_newlink+0x178/0x2d3 [veth]
       [<ffffffff8147f84c>] ? rtnl_newlink+0x73c/0x8e0
       [<ffffffff8147f27a>] ? rtnl_newlink+0x16a/0x8e0
       [<ffffffff81459ff2>] ? __kmalloc_reserve.isra.30+0x32/0x90
       [<ffffffff8147ccfd>] ? rtnetlink_rcv_msg+0x8d/0x250
       [<ffffffff8145b027>] ? __alloc_skb+0x47/0x1f0
       [<ffffffff8149badb>] ? __netlink_lookup+0xab/0xe0
       [<ffffffff8147cc70>] ? rtnetlink_rcv+0x30/0x30
       [<ffffffff8149e7a0>] ? netlink_rcv_skb+0xb0/0xd0
       [<ffffffff8147cc64>] ? rtnetlink_rcv+0x24/0x30
       [<ffffffff8149df17>] ? netlink_unicast+0x107/0x1a0
       [<ffffffff8149e4be>] ? netlink_sendmsg+0x50e/0x630
       [<ffffffff8145209c>] ? sock_sendmsg+0x3c/0x50
       [<ffffffff81452beb>] ? ___sys_sendmsg+0x27b/0x290
       [<ffffffff811bd258>] ? mem_cgroup_try_charge+0x88/0x110
       [<ffffffff811bd5b6>] ? mem_cgroup_commit_charge+0x56/0xa0
       [<ffffffff811d7700>] ? do_filp_open+0x30/0xa0
       [<ffffffff8145336e>] ? __sys_sendmsg+0x3e/0x80
       [<ffffffff8156c3f2>] ? system_call_fastpath+0x16/0x75
      
      Fix this by unregistering the previous sysctl table (registered for
      the path containing the original device name) and re-registering the
      table for the path containing the new device name.
      
      Fixes: 37bde799 ("mpls: Per-device enabling of packet input")
      Reported-by: default avatarScott Feldman <sfeldma@gmail.com>
      Signed-off-by: default avatarRobert Shearman <rshearma@brocade.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0fae3bf0
    • Richard Cochran's avatar
      net: igb: fix the start time for periodic output signals · 58c98be1
      Richard Cochran authored
      When programming the start of a periodic output, the code wrongly places
      the seconds value into the "low" register and the nanoseconds into the
      "high" register.  Even though this is backwards, it slipped through my
      testing, because the re-arming code in the interrupt service routine is
      correct, and the signal does appear starting with the second edge.
      
      This patch fixes the issue by programming the registers correctly.
      Signed-off-by: default avatarRichard Cochran <richardcochran@gmail.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Acked-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      58c98be1
    • Richard Weinberger's avatar
      block: pmem: Add dependency on HAS_IOMEM · b6f2098f
      Richard Weinberger authored
      Not all architectures have io memory.
      
      Fixes:
      drivers/block/pmem.c: In function ‘pmem_alloc’:
      drivers/block/pmem.c:146:2: error: implicit declaration of function ‘ioremap_nocache’ [-Werror=implicit-function-declaration]
        pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
        ^
      drivers/block/pmem.c:146:18: warning: assignment makes pointer from integer without a cast [enabled by default]
        pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
                        ^
      drivers/block/pmem.c:182:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
        iounmap(pmem->virt_addr);
        ^
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Reviewed-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      b6f2098f
    • Linus Torvalds's avatar
      Merge tag 'trace-rb-bm-fix-4.1-rc7' of... · cff100f5
      Linus Torvalds authored
      Merge tag 'trace-rb-bm-fix-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull ring buffer benchmark buglet fix from Steven Rostedt:
       "Wang Long fixed a minor bug in the module parameter for the ring
        buffer benchmark, where the produce_fifo was being ignored and the
        producer thread's priority was being set with the consumer_fifo
        parameter"
      
      * tag 'trace-rb-bm-fix-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ring-buffer-benchmark: Fix the wrong sched_priority of producer
      cff100f5