1. 09 Jul, 2014 2 commits
  2. 23 Jun, 2014 2 commits
    • rcu: Reduce overhead of cond_resched() checks for RCU · 4a81e832
      Paul E. McKenney authored
      Commit ac1bea85 (Make cond_resched() report RCU quiescent states)
      fixed a problem where a CPU looping in the kernel with but one runnable
      task would give RCU CPU stall warnings, even if the in-kernel loop
      contained cond_resched() calls.  Unfortunately, in so doing, it introduced
      performance regressions in Anton Blanchard's will-it-scale "open1" test.
      The problem appears to be not so much the increased cond_resched() path
      length as an increase in the rate at which grace periods complete, which
      increased per-update grace-period overhead.
      
      This commit takes a different approach to fixing this bug, mainly by
      moving the RCU-visible quiescent state from cond_resched() to
      rcu_note_context_switch(), and by further reducing the check to a
      simple non-zero test of a single per-CPU variable.  However, this
      approach requires that the force-quiescent-state processing send
      resched IPIs to the offending CPUs.  These will be sent only once
      the grace period has reached an age specified by the boot/sysfs
      parameter rcutree.jiffies_till_sched_qs, or once the grace period
      reaches an age halfway to the point at which RCU CPU stall warnings
      will be emitted, whichever comes first.
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Lameter <cl@gentwo.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      [ paulmck: Made rcu_momentary_dyntick_idle() as suggested by the
        ktest build robot.  Also fixed smp_mb() comment as noted by
        Oleg Nesterov. ]
      
      Merge with e552592e (Reduce overhead of cond_resched() checks for RCU)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      4a81e832
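The timing rule described in commit 4a81e832 above (send resched IPIs once the grace period reaches rcutree.jiffies_till_sched_qs, or half the stall-warning timeout, whichever comes first) can be sketched as a minimum of two ages. This is only an illustration: the function names and the plain `unsigned long` jiffies arguments are assumptions, not the kernel's actual code.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers for illustration only; these are not kernel functions.
 * All ages and timeouts are expressed in jiffies.
 */

/* Age at which resched IPIs may start: the earlier of the two limits. */
unsigned long resched_ipi_age(unsigned long jiffies_till_sched_qs,
                              unsigned long stall_timeout)
{
    unsigned long half_stall = stall_timeout / 2;

    return jiffies_till_sched_qs < half_stall ? jiffies_till_sched_qs
                                              : half_stall;
}

/* True once the current grace period is old enough to warrant an IPI. */
bool should_send_resched_ipi(unsigned long gp_age,
                             unsigned long jiffies_till_sched_qs,
                             unsigned long stall_timeout)
{
    return gp_age >= resched_ipi_age(jiffies_till_sched_qs, stall_timeout);
}
```

With a parameter value of 100 and a stall timeout of 21000, the IPI age is 100; with a parameter value of 20000 it is capped at 10500, half the stall timeout.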
    • rcu: Export debug_init_rcu_head() and debug_rcu_head_free() · 546a9d85
      Paul E. McKenney authored
      Currently, call_rcu() relies on implicit allocation and initialization
      for the debug-objects handling of RCU callbacks.  If you hammer the
      kernel hard enough with Sasha's modified version of trinity, you can end
      up with the sl*b allocators recursing into themselves via this implicit
      call_rcu() allocation.
      
      This commit therefore exports the debug_init_rcu_head() and
      debug_rcu_head_free() functions, which permits the allocators to allocate
      and pre-initialize the debug-objects information, so that there is no longer
      any need for call_rcu() to do that initialization, which in turn prevents
      the recursion into the memory allocators.
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Looks-good-to: Christoph Lameter <cl@linux.com>
      546a9d85
  3. 16 Jun, 2014 4 commits
    • Linux 3.16-rc1 · 7171511e
      Linus Torvalds authored
      7171511e
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · a9be2242
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix checksumming regressions, from Tom Herbert.
      
       2) Undo unintentional permissions changes for SCTP rto_alpha and
          rto_beta sysfs knobs, from Daniel Borkmann.
      
       3) VXLAN, like other IP tunnels, should advertise its encapsulation
          size using dev->needed_headroom instead of dev->hard_header_len.
          From Cong Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: sctp: fix permissions for rto_alpha and rto_beta knobs
        vxlan: Checksum fixes
        net: add skb_pop_rcv_encapsulation
        udp: call __skb_checksum_complete when doing full checksum
        net: Fix save software checksum complete
        net: Fix GSO constants to match NETIF flags
        udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup
        vxlan: use dev->needed_headroom instead of dev->hard_header_len
        MAINTAINERS: update cxgb4 maintainer
      a9be2242
    • Merge tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux · dd1845af
      Linus Torvalds authored
      Pull more clock framework updates from Mike Turquette:
       "This contains the second half of the clk changes for 3.16.
      
        They are simply fixes and code refactoring for the OMAP clock drivers.
        The sunxi clock driver changes include splitting out the one
        mega-driver into several smaller pieces and adding support for the A31
        SoC clocks"
      
      * tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux: (25 commits)
        clk: sunxi: document PRCM clock compatible strings
        clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support
        clk: sun6i: Protect SDRAM gating bit
        clk: sun6i: Protect CPU clock
        clk: sunxi: Rework clock protection code
        clk: sunxi: Move the GMAC clock to a file of its own
        clk: sunxi: Move the 24M oscillator to a file of its own
        clk: sunxi: Remove calls to clk_put
        clk: sunxi: document new A31 USB clock compatible
        clk: sunxi: Implement A31 USB clock
        ARM: dts: OMAP5/DRA7: use omap5-mpu-dpll-clock capable of dealing with higher frequencies
        CLK: TI: dpll: support OMAP5 MPU DPLL that need special handling for higher frequencies
        ARM: OMAP5+: dpll: support Duty Cycle Correction(DCC)
        CLK: TI: clk-54xx: Set the rate for dpll_abe_m2x2_ck
        CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic)
        dt:/bindings: DRA7 ATL (Audio Tracking Logic) clock bindings
        ARM: dts: dra7xx-clocks: Correct name for atl clkin3 clock
        CLK: TI: gate: add composite interface clock to OMAP2 only build
        ARM: OMAP2: clock: add DT boot support for cpufreq_ck
        CLK: TI: OMAP2: add clock init support
        ...
      dd1845af
    • Merge git://git.infradead.org/users/willy/linux-nvme · b55b3902
      Linus Torvalds authored
      Pull NVMe update from Matthew Wilcox:
       "Mostly bugfixes again for the NVMe driver.  I'd like to call out the
        exported tracepoint in the block layer; I believe Keith has cleared
        this with Jens.
      
        We've had a few reports from people who're really pounding on NVMe
        devices at scale, hence the timeout changes (and new module
        parameters), hotplug cpu deadlock, tracepoints, and minor performance
        tweaks"
      
      [ Jens hadn't seen that tracepoint thing, but is ok with it - it will
        end up going away when mq conversion happens ]
      
      * git://git.infradead.org/users/willy/linux-nvme: (22 commits)
        NVMe: Fix START_STOP_UNIT Scsi->NVMe translation.
        NVMe: Use Log Page constants in SCSI emulation
        NVMe: Define Log Page constants
        NVMe: Fix hot cpu notification dead lock
        NVMe: Rename io_timeout to nvme_io_timeout
        NVMe: Use last bytes of f/w rev SCSI Inquiry
        NVMe: Adhere to request queue block accounting enable/disable
        NVMe: Fix nvme get/put queue semantics
        NVMe: Delete NVME_GET_FEAT_TEMP_THRESH
        NVMe: Make admin timeout a module parameter
        NVMe: Make iod bio timeout a parameter
        NVMe: Prevent possible NULL pointer dereference
        NVMe: Fix the buffer size passed in GetLogPage(CDW10.NUMD)
        NVMe: Update data structures for NVMe 1.2
        NVMe: Enable BUILD_BUG_ON checks
        NVMe: Update namespace and controller identify structures to the 1.1a spec
        NVMe: Flush with data support
        NVMe: Configure support for block flush
        NVMe: Add tracepoints
        NVMe: Protect against badly formatted CQEs
        ...
      b55b3902
  4. 15 Jun, 2014 11 commits
    • net: sctp: fix permissions for rto_alpha and rto_beta knobs · b58537a1
      Daniel Borkmann authored
      Commit 3fd091e7 ("[SCTP]: Remove multiple levels of msecs
      to jiffies conversions.") has silently changed permissions for
      rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
      this was to discourage users from tweaking rto_alpha and
      rto_beta knobs in production environments since they are key
      to correctly compute rtt/srtt.
      
      RFC4960 under section 6.3.1. RTO Calculation says regarding
      rto_alpha and rto_beta under rule C3 and C4:
      
        [...]
        C3)  When a new RTT measurement R' is made, set
      
             RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|
      
             and
      
             SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'
      
             Note: The value of SRTT used in the update to RTTVAR
             is its value before updating SRTT itself using the
             second assignment. After the computation, update
             RTO <- SRTT + 4 * RTTVAR.
      
        C4)  When data is in flight and when allowed by rule C5
             below, a new RTT measurement MUST be made each round
             trip. Furthermore, new RTT measurements SHOULD be
             made no more than once per round trip for a given
             destination transport address. There are two reasons
             for this recommendation: First, it appears that
             measuring more frequently often does not in practice
             yield any significant benefit [ALLMAN99]; second,
             if measurements are made more often, then the values
             of RTO.Alpha and RTO.Beta in rule C3 above should be
             adjusted so that SRTT and RTTVAR still adjust to
             changes at roughly the same rate (in terms of how many
             round trips it takes them to reflect new values) as
             they would if making only one measurement per
             round-trip and using RTO.Alpha and RTO.Beta as given
             in rule C3. However, the exact nature of these
             adjustments remains a research issue.
        [...]
      
      While it is discouraged to adjust rto_alpha and rto_beta
      and not further specified how to adjust them, the RFC also
      doesn't explicitly forbid it, but rather gives a RECOMMENDED
      default value (rto_alpha=3, rto_beta=2). We have a couple
      of users relying on the old permissions before they got
      changed. That said, if someone really has the urge to adjust
      them, we could allow it with a warning in the log.
      
      Fixes: 3fd091e7 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b58537a1
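Rule C3 of RFC 4960, quoted in commit b58537a1 above, can be sketched in integer arithmetic as follows. This is a sketch under a stated assumption: the knob values 3 and 2 are treated as right-shift counts, so that RTO.Alpha = 1/8 and RTO.Beta = 1/4. The struct and function names are hypothetical and are not the SCTP stack's own code.

```c
/* Hypothetical state for one destination transport address. */
struct rto_state {
    long srtt;    /* smoothed round-trip time */
    long rttvar;  /* round-trip time variation */
    long rto;     /* resulting retransmission timeout */
};

/*
 * Apply rule C3 for a new RTT measurement r.
 * alpha_shift and beta_shift are assumed to encode RTO.Alpha = 1/2^alpha_shift
 * and RTO.Beta = 1/2^beta_shift.
 */
void rto_update(struct rto_state *s, long r,
                unsigned int alpha_shift, unsigned int beta_shift)
{
    long err = s->srtt - r;

    if (err < 0)
        err = -err;

    /* RTTVAR must use the SRTT value from before this update. */
    s->rttvar = s->rttvar - (s->rttvar >> beta_shift) + (err >> beta_shift);
    s->srtt = s->srtt - (s->srtt >> alpha_shift) + (r >> alpha_shift);
    s->rto = s->srtt + 4 * s->rttvar;
}
```

Starting from SRTT = 800 and RTTVAR = 400, a measurement of 1600 with shifts 3 and 2 gives RTTVAR = 500, SRTT = 900 and RTO = 2900, which matches the floating-point form of the rule.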
    • Merge branch 'csum_fixes' · e4f7ae93
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      Fixes related to some recent checksum modifications.
      
      - Fix GSO constants to match NETIF flags
      - Fix logic in saving checksum complete in __skb_checksum_complete
      - Call __skb_checksum_complete from UDP if we are checksumming over
        whole packet in order to save checksum.
      - Fixes to VXLAN to work correctly with checksum complete
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e4f7ae93
    • vxlan: Checksum fixes · f79b064c
      Tom Herbert authored
      Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet
      header to work properly with checksum complete.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f79b064c
    • net: add skb_pop_rcv_encapsulation · e5eb4e30
      Tom Herbert authored
      This function is used by UDP encapsulation protocols in RX when
      crossing encapsulation boundary. If ip_summed is set to
      CHECKSUM_UNNECESSARY and encapsulation is not set, change to
      CHECKSUM_NONE since the checksum has not been validated within the
      encapsulation. Clears csum_valid by the same rationale.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e5eb4e30
    • udp: call __skb_checksum_complete when doing full checksum · bbdff225
      Tom Herbert authored
      In __udp_lib_checksum_complete, check whether the checksum is being done
      over all the data (len is equal to skb->len) and, if it is, call
      __skb_checksum_complete instead of __skb_checksum_complete_head. This
      allows the checksum to be saved in checksum complete.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bbdff225
    • net: Fix save software checksum complete · 46fb51eb
      Tom Herbert authored
      Geert reported issues regarding checksum complete and UDP.
      The logic introduced in commit 7e3cead5
      ("net: Save software checksum complete") is not correct.
      
      This patch:
      1) Restores code in __skb_checksum_complete_head except for setting
         CHECKSUM_UNNECESSARY. This function may be calculating checksum on
         something less than skb->len.
      2) Adds saving checksum to __skb_checksum_complete. The full packet
         checksum 0..skb->len is calculated without adding in pseudo header.
         This value is saved in skb->csum and then the pseudo header is added
         to that to derive the checksum for validation.
      3) In both __skb_checksum_complete_head and __skb_checksum_complete,
         set skb->csum_valid to whether checksum of zero was computed. This
         allows skb_csum_unnecessary to return true without changing to
         CHECKSUM_UNNECESSARY which was done previously.
      4) Copy new csum related bits in __copy_skb_header.
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      46fb51eb
    • net: Fix GSO constants to match NETIF flags · 4b28252c
      Tom Herbert authored
      Joseph Gasparakis reported that VXLAN GSO offload stopped working with
      i40e device after recent UDP changes. The problem is that the
      SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This
      patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several
      GSO constants that were missing to avoid the problem in the future.
      Reported-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4b28252c
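The idea behind the BUILD_BUG_ON checks mentioned in commit 4b28252c above is to make the build fail whenever two related sets of bit constants drift apart. A minimal userspace sketch follows, using C11 `_Static_assert` in place of BUILD_BUG_ON. All constant names and values here are hypothetical stand-ins, not the kernel's SKB_GSO_* or NETIF definitions.

```c
#include <stdbool.h>

/* Hypothetical per-packet GSO type bits (stand-ins for the SKB_GSO_* bits). */
enum {
    PKT_GSO_TCPV4 = 1 << 0,
    PKT_GSO_UDP   = 1 << 1,
    PKT_GSO_TCPV6 = 1 << 2,
};

/* Hypothetical device feature bits (stand-ins for the NETIF flags). */
#define DEV_GSO_SHIFT 16
#define DEV_F_TSO     (1u << 16)
#define DEV_F_UFO     (1u << 17)
#define DEV_F_TSO6    (1u << 18)

/* Compile-time checks: each packet bit must map onto its feature bit. */
_Static_assert(PKT_GSO_TCPV4 == (DEV_F_TSO >> DEV_GSO_SHIFT),
               "TCPv4 GSO bit out of sync");
_Static_assert(PKT_GSO_UDP == (DEV_F_UFO >> DEV_GSO_SHIFT),
               "UDP GSO bit out of sync");
_Static_assert(PKT_GSO_TCPV6 == (DEV_F_TSO6 >> DEV_GSO_SHIFT),
               "TCPv6 GSO bit out of sync");

/* A device can segment a packet only if it offers every requested type. */
bool gso_ok(unsigned int dev_features, unsigned int gso_type)
{
    unsigned int needed = gso_type << DEV_GSO_SHIFT;

    return (dev_features & needed) == needed;
}
```

Because the shift-based test in `gso_ok()` silently gives wrong answers when the two sets of bits disagree, having one static check per constant turns that class of bug into a compile error.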
    • Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · abf04af7
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "This is just a couple of drivers (hpsa and lpfc) that got left out for
        further testing in linux-next.  We also have one fix to a prior
        submission (qla2xxx sparse)"
      
      * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (36 commits)
        qla2xxx: fix sparse warnings introduced by previous target mode t10-dif patch
        lpfc: Update lpfc version to driver version 10.2.8001.0
        lpfc: Fix ExpressLane priority setup
        lpfc: mark old devices as obsolete
        lpfc: Fix for initializing RRQ bitmap
        lpfc: Fix for cleaning up stale ring flag and sp_queue_event entries
        lpfc: Update lpfc version to driver version 10.2.8000.0
        lpfc: Update Copyright on changed files from 8.3.45 patches
        lpfc: Update Copyright on changed files
        lpfc: Fixed locking for scsi task management commands
        lpfc: Convert runtime references to old xlane cfg param to fof cfg param
        lpfc: Fix FW dump using sysfs
        lpfc: Fix SLI4 s abort loop to process all FCP rings and under ring_lock
        lpfc: Fixed kernel panic in lpfc_abort_handler
        lpfc: Fix locking for postbufq when freeing
        lpfc: Fix locking for lpfc_hba_down_post
        lpfc: Fix dynamic transitions of FirstBurst from on to off
        hpsa: fix handling of hpsa_volume_offline return value
        hpsa: return -ENOMEM not -1 on kzalloc failure in hpsa_get_device_id
        hpsa: remove messages about volume status VPD inquiry page not supported
        ...
      abf04af7
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 16d52ef7
      Linus Torvalds authored
      Pull more btrfs updates from Chris Mason:
       "This has a few fixes since our last pull and a new ioctl for doing
        btree searches from userland.  It's very similar to the existing
        ioctl, but lets us return larger items back down to the app"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        btrfs: fix error handling in create_pending_snapshot
        btrfs: fix use of uninit "ret" in end_extent_writepage()
        btrfs: free ulist in qgroup_shared_accounting() error path
        Btrfs: fix qgroups sanity test crash or hang
        btrfs: prevent RCU warning when dereferencing radix tree slot
        Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting
        btrfs: new ioctl TREE_SEARCH_V2
        btrfs: tree_search, search_ioctl: direct copy to userspace
        btrfs: new function read_extent_buffer_to_user
        btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW
        btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer
        btrfs: tree_search, search_ioctl: accept varying buffer
        btrfs: tree_search: eliminate redundant nr_items check
      16d52ef7
    • Merge git://git.kvack.org/~bcrl/aio-next · a311c480
      Linus Torvalds authored
      Pull aio fix and cleanups from Ben LaHaise:
       "This consists of a couple of code cleanups plus a minor bug fix"
      
      * git://git.kvack.org/~bcrl/aio-next:
        aio: cleanup: flatten kill_ioctx()
        aio: report error from io_destroy() when threads race in io_destroy()
        fs/aio.c: Remove ctx parameter in kiocb_cancel
      a311c480
    • fix __swap_writepage() compile failure on old gcc versions · 05064084
      Al Viro authored
      Tetsuo Handa wrote:
       "Commit 62a8067a ("bio_vec-backed iov_iter") introduced an unnamed
        union inside a struct which gcc-4.4.7 cannot handle.  Name the unnamed
        union as u in order to fix build failure"
      
      Let's do this instead: there is only one place in the entire tree that
      steps into this breakage.  Anon structs and unions work in older gcc
      versions; as a matter of fact, we have those in the tree - see e.g.
      struct ieee80211_tx_info in include/net/mac80211.h
      
      What doesn't work is handling their initializers:
      
      struct {
      	int a;
      	union {
      		int b;
      		char c;
      	};
      } x[2] = {{.a = 1, .c = 'a'}, {.a = 0, .b = 1}};
      
      is the obvious syntax for initializer, perfectly fine for C11 and
      handled correctly by gcc-4.7 or later.
      
      Earlier versions, though, break on it - declaration is fine and so's
      access to fields (i.e.  x[0].c = 'a'; would produce the right code), but
      members of the anon structs and unions are not inserted into the right
      namespace.  Tellingly, those older versions will not barf on struct {int
      a; struct {int a;};}; - looks like they just have it hacked up somewhere
      around the handling of .  and -> instead of doing the right thing.
      
      The easiest way to deal with that crap is to turn initialization of
      those fields (in the only place where we have such initializer of
      iov_iter) into plain assignment.
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05064084
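The workaround described in commit 05064084 above, replacing a designated initializer that names anonymous-union members with plain assignments, can be sketched as follows. The struct mirrors the example in the commit message; the helper names are hypothetical, and this is not the actual change made to __swap_writepage().

```c
#include <string.h>

struct item {
    int a;
    union {          /* anonymous union: members are reached as x.b or x.c */
        int b;
        char c;
    };
};

/*
 * Build an item without writing ".c = ..." in an initializer: declare the
 * object, zero it, then assign the fields one by one. Per the commit message,
 * field access through anonymous unions works on older gcc versions even
 * though their initializers do not.
 */
struct item make_item_with_c(int a, char c)
{
    struct item x;

    memset(&x, 0, sizeof(x));
    x.a = a;
    x.c = c;
    return x;
}

/* Same pattern for the other union member. */
struct item make_item_with_b(int a, int b)
{
    struct item x;

    memset(&x, 0, sizeof(x));
    x.a = a;
    x.b = b;
    return x;
}
```

The cost is a few extra lines at the one site that needs it, which avoids naming the union and touching every user of its members.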
  5. 14 Jun, 2014 4 commits
  6. 13 Jun, 2014 17 commits
    • udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup · 63c6f81c
      Eric Dumazet authored
      It's too easy to add thousands of UDP sockets on a particular bucket,
      and slow down an innocent multicast receiver.
      
      Early demux is supposed to be an optimization, we should avoid spending
      too much time in it.
      
      It is interesting to note __udp4_lib_demux_lookup() only tries to
      match first socket in the chain.
      
      10 is the threshold we already have in __udp4_lib_lookup() to switch
      to secondary hash.
      
      Fixes: 421b3885 ("udp: ipv4: Add udp early demux")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: David Held <drheld@google.com>
      Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      63c6f81c
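The approach in commit 63c6f81c above, giving up on the early-demux optimization when a hash bucket holds too many sockets, can be sketched with a simple linked-list bucket. The types and function below are hypothetical simplifications; only the threshold of 10 comes from the commit message.

```c
#include <stddef.h>

#define CHAIN_SCAN_LIMIT 10  /* threshold named in the commit message */

/* Hypothetical, heavily simplified socket chain entry. */
struct sock_node {
    unsigned short port;
    struct sock_node *next;
};

/* Hypothetical hash bucket that tracks its chain length. */
struct bucket {
    struct sock_node *head;
    unsigned int count;
};

/*
 * Optimistic lookup: returns a match, or NULL if there is none or the chain
 * is too long to be worth scanning. A NULL result means the caller falls back
 * to the regular lookup path, so correctness does not depend on this function.
 */
struct sock_node *early_demux_lookup(const struct bucket *b,
                                     unsigned short port)
{
    if (b->count > CHAIN_SCAN_LIMIT)
        return NULL;

    for (struct sock_node *n = b->head; n != NULL; n = n->next) {
        if (n->port == port)
            return n;
    }
    return NULL;
}
```

Because the early lookup is only an optimization, bailing out bounds the worst-case cost per packet at the price of occasionally missing a shortcut.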
    • vxlan: use dev->needed_headroom instead of dev->hard_header_len · 2853af6a
      Cong Wang authored
      When we mirror packets from a vxlan tunnel to another device,
      the mirror device should see the same packets (that is, without the
      outer header). Because the vxlan tunnel sets dev->hard_header_len,
      tcf_mirred() resets the mac header back to the outer mac, so the mirror
      device actually sees packets with outer headers.
      
      Vxlan tunnel should set dev->needed_headroom instead of
      dev->hard_header_len, like what other ip tunnels do. This fixes
      the above problem.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: stephen hemminger <stephen@networkplumber.org>
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2853af6a
    • MAINTAINERS: update cxgb4 maintainer · 56f16c74
      Dimitris Michailidis authored
      Hari's been doing the patch submissions for a while now and he'll be
      taking over as maintainer.
      Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56f16c74
    • x86/vdso: Fix vdso_install · a934fb5b
      Andy Lutomirski authored
      "make vdso_install" installs unstripped versions of the vdso objects
      for the benefit of the debugger.  This was broken by checkin:
      
      6f121e54 x86, vdso: Reimplement vdso.so preparation in build-time C
      
      The filenames are different now, so update the Makefile to cope.
      
      This still installs the 64-bit vdso as vdso64.so.  We believe this
      will be okay, as the only known user is a patched gdb which is known
      to use build-ids, but if it turns out to be a problem we may have to
      add a link.
      
      Inspired by a patch from Sam Ravnborg.
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
      Tested-by: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Link: http://lkml.kernel.org/r/b10299edd8ba98d17e07dafcd895b8ecf4d99eff.1402586707.git.luto@amacapital.net
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      a934fb5b
    • NVMe: Fix START_STOP_UNIT Scsi->NVMe translation. · b8e08084
      Dan McLeran authored
      This patch contains several fixes for Scsi START_STOP_UNIT. The previous
      code did not account for signed vs. unsigned arithmetic, which resulted
      in an invalid lowest power state calculation when the device only supports
      1 power state.
      
      The code for Power Condition == 2 (Idle) was not following the spec. The
      spec calls for setting the device to specific power states, depending
      upon Power Condition Modifier, without accounting for the number of
      power states supported by the device.
      
      The code for Power Condition == 3 (Standby) was using a hard-coded '0'
      which is replaced with the macro POWER_STATE_0.
      Signed-off-by: Dan McLeran <daniel.mcleran@intel.com>
      Reviewed-by: Vishal Verma <vishal.l.verma@linux.intel.com>
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      b8e08084
    • btrfs: fix error handling in create_pending_snapshot · 47a306a7
      Eric Sandeen authored
      fcebe456 cut and pasted some code to a later point
      in create_pending_snapshot(), but didn't switch
      to the appropriate error handling for this stage
      of the function.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      47a306a7
    • btrfs: fix use of uninit "ret" in end_extent_writepage() · 3e2426bd
      Eric Sandeen authored
      If this condition in end_extent_writepage() is false:
      
      	if (tree->ops && tree->ops->writepage_end_io_hook)
      
      we will then test an uninitialized "ret" at:
      
      	ret = ret < 0 ? ret : -EIO;
      
      The test for ret is for the case where ->writepage_end_io_hook
      failed, and we'd choose that ret as the error; but if
      there is no ->writepage_end_io_hook, nothing sets ret.
      
      Initializing ret to 0 should be sufficient; if
      writepage_end_io_hook wasn't set, (!uptodate) means
      non-zero err was passed in, so we choose -EIO in that case.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      3e2426bd
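The control flow fixed by commit 3e2426bd above can be sketched in isolation. The function below is a hypothetical reduction of the logic described in the commit message, not the btrfs source: an optional hook may set `ret`, and the error selection afterwards must still see a defined value when no hook exists.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the optional ->writepage_end_io_hook callback. */
typedef int (*end_io_hook_t)(int err);

/* Example hook that fails, used to show that a hook's error wins over -EIO. */
int failing_hook(int err)
{
    (void)err;
    return -ENOSPC;
}

int end_writepage_error(end_io_hook_t hook, int err, int uptodate)
{
    int ret = 0;  /* the fix: ret is defined even when there is no hook */

    if (hook != NULL)
        ret = hook(err);

    if (!uptodate)
        ret = ret < 0 ? ret : -EIO;  /* prefer the hook's error, else -EIO */

    return ret;
}
```

Without the initializer, the `ret < 0` test reads an indeterminate value whenever `hook` is NULL and `uptodate` is false.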
    • btrfs: free ulist in qgroup_shared_accounting() error path · d7372780
      Eric Sandeen authored
      If tmp = ulist_alloc(GFP_NOFS) fails, we return without
      freeing the previously allocated qgroups = ulist_alloc(GFP_NOFS)
      and cause a memory leak.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      d7372780
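The leak fixed by commit d7372780 above follows a common two-allocation pattern: when the second allocation fails, the first must be released before returning. A userspace sketch, with `malloc()` standing in for ulist_alloc() and hypothetical names throughout:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's ulist type. */
struct ulist_sketch {
    int dummy;
};

/*
 * fail_second_alloc simulates a failure of the second allocation so that
 * the error path can be exercised deterministically.
 */
int shared_accounting_sketch(int fail_second_alloc)
{
    struct ulist_sketch *qgroups;
    struct ulist_sketch *tmp;

    qgroups = malloc(sizeof(*qgroups));
    if (qgroups == NULL)
        return -ENOMEM;

    tmp = fail_second_alloc ? NULL : malloc(sizeof(*tmp));
    if (tmp == NULL) {
        free(qgroups);  /* the fix: do not leak the first allocation */
        return -ENOMEM;
    }

    /* ... the real accounting work would go here ... */

    free(tmp);
    free(qgroups);
    return 0;
}
```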
    • Btrfs: fix qgroups sanity test crash or hang · b050f9f6
      Filipe Manana authored
      Running the qgroups sanity test often resulted in a crash or a hang.
      This is because the extent buffer the test uses for the root node doesn't
      have a header level explicitly set, leaving it with a random level value.
      A non-zero level is a problem for the btrfs_search_slot() calls the test
      ends up doing, resulting in crashes or hangs such as the following:
      
      [ 6454.127192] Btrfs loaded, debug=on, assert=on, integrity-checker=on
      (...)
      [ 6454.127760] BTRFS: selftest: Running qgroup tests
      [ 6454.127964] BTRFS: selftest: Running test_test_no_shared_qgroup
      [ 6454.127966] BTRFS: selftest: Qgroup basic add
      [ 6480.152005] BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:5383]
      [ 6480.152005] Modules linked in: btrfs(+) xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc i2c_piix4 i2c_core pcspkr evbug psmouse serio_raw e1000 [last unloaded: btrfs]
      [ 6480.152005] irq event stamp: 188448
      [ 6480.152005] hardirqs last  enabled at (188447): [<ffffffff8168ef5c>] restore_args+0x0/0x30
      [ 6480.152005] hardirqs last disabled at (188448): [<ffffffff81698e6a>] apic_timer_interrupt+0x6a/0x80
      [ 6480.152005] softirqs last  enabled at (188446): [<ffffffff810516cf>] __do_softirq+0x1cf/0x450
      [ 6480.152005] softirqs last disabled at (188441): [<ffffffff81051c25>] irq_exit+0xb5/0xc0
      [ 6480.152005] CPU: 0 PID: 5383 Comm: modprobe Not tainted 3.15.0-rc8-fdm-btrfs-next-33+ #4
      [ 6480.152005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 6480.152005] task: ffff8802146125a0 ti: ffff8800d0d00000 task.ti: ffff8800d0d00000
      [ 6480.152005] RIP: 0010:[<ffffffff81349a63>]  [<ffffffff81349a63>] __write_lock_failed+0x13/0x20
      [ 6480.152005] RSP: 0018:ffff8800d0d038e8  EFLAGS: 00000287
      [ 6480.152005] RAX: 0000000000000000 RBX: ffffffff8168ef5c RCX: 000005deb8525852
      [ 6480.152005] RDX: 0000000000000000 RSI: 0000000000001d45 RDI: ffff8802105000b8
      [ 6480.152005] RBP: ffff8800d0d038e8 R08: fffffe12710f63db R09: ffffffffa03196fb
      [ 6480.152005] R10: ffff8802146125a0 R11: ffff880214612e28 R12: ffff8800d0d03858
      [ 6480.152005] R13: 0000000000000000 R14: ffff8800d0d00000 R15: ffff8802146125a0
      [ 6480.152005] FS:  00007f14ff804700(0000) GS:ffff880215e00000(0000) knlGS:0000000000000000
      [ 6480.152005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 6480.152005] CR2: 00007fff4df0dac8 CR3: 00000000d1796000 CR4: 00000000000006f0
      [ 6480.152005] Stack:
      [ 6480.152005]  ffff8800d0d03908 ffffffff810ae967 0000000000000001 ffff8802105000b8
      [ 6480.152005]  ffff8800d0d03938 ffffffff8168e57e ffffffffa0319c16 0000000000000007
      [ 6480.152005]  ffff880210500000 ffff880210500100 ffff8800d0d039b8 ffffffffa0319c16
      [ 6480.152005] Call Trace:
      [ 6480.152005]  [<ffffffff810ae967>] do_raw_write_lock+0x47/0xa0
      [ 6480.152005]  [<ffffffff8168e57e>] _raw_write_lock+0x5e/0x80
      [ 6480.152005]  [<ffffffffa0319c16>] ? btrfs_tree_lock+0x116/0x270 [btrfs]
      [ 6480.152005]  [<ffffffffa0319c16>] btrfs_tree_lock+0x116/0x270 [btrfs]
      [ 6480.152005]  [<ffffffffa02b2acb>] btrfs_lock_root_node+0x3b/0x50 [btrfs]
      [ 6480.152005]  [<ffffffffa02b81a6>] btrfs_search_slot+0x916/0xa20 [btrfs]
      [ 6480.152005]  [<ffffffff811a727f>] ? create_object+0x23f/0x300
      [ 6480.152005]  [<ffffffffa02b9958>] btrfs_insert_empty_items+0x78/0xd0 [btrfs]
      [ 6480.152005]  [<ffffffffa036041a>] insert_normal_tree_ref.constprop.4+0xa2/0x19a [btrfs]
      [ 6480.152005]  [<ffffffffa03605c3>] test_no_shared_qgroup+0xb1/0x1ca [btrfs]
      [ 6480.152005]  [<ffffffff8108cad6>] ? local_clock+0x16/0x30
      [ 6480.152005]  [<ffffffffa035ef8e>] btrfs_test_qgroups+0x1ae/0x1d7 [btrfs]
      [ 6480.152005]  [<ffffffffa03a69d2>] ? ftrace_define_fields_btrfs_space_reservation+0xfd/0xfd [btrfs]
      [ 6480.152005]  [<ffffffffa03a6a86>] init_btrfs_fs+0xb4/0x153 [btrfs]
      [ 6480.152005]  [<ffffffff81000352>] do_one_initcall+0x102/0x150
      [ 6480.152005]  [<ffffffff8103d223>] ? set_memory_nx+0x43/0x50
      [ 6480.152005]  [<ffffffff81682668>] ? set_section_ro_nx+0x6d/0x74
      [ 6480.152005]  [<ffffffff810d91cc>] load_module+0x1cdc/0x2630
      (...)
      
      Therefore initialize the extent buffer as an empty leaf (level 0).
      
      The issue is easy to reproduce when btrfs is built as a module, via:
      
          $ for ((i = 1; i <= 1000000; i++)); do rmmod btrfs; modprobe btrfs; done
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      b050f9f6
    • Sasha Levin's avatar
      btrfs: prevent RCU warning when dereferencing radix tree slot · f1e3c289
      Sasha Levin authored
      Mark the dereference as protected by the lock. Not doing so triggers
      an RCU warning, since the radix tree code assumes that RCU is in use.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      f1e3c289
    • Wang Shilong's avatar
      Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting · 5fbc7c59
      Wang Shilong authored
      Steps to reproduce:
      
       # mkfs.btrfs -f /dev/sd[b-f] -m raid5 -d raid5
       # mkfs.ext4 /dev/sdc   ---> corrupt one of the btrfs devices
       # mount /dev/sdb /mnt -o degraded
       # btrfs scrub start -BRd /mnt
      
      The readahead thread never finishes because readahead skips missing
      devices. That is not safe for RAID5/6, because REQ_GET_READ_MIRRORS
      returns 1 for RAID5/6 block mappings. If the expected data is located on
      the missing device, the readahead thread never calls __readahead_hook(),
      so the wait on the event @rc->elems=0 lasts forever.

      Fix this problem by checking the return value of btrfs_map_block(): a
      missing device can only be skipped safely if there are several mirrors.
      Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      5fbc7c59
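      The rule described in this commit message can be sketched as a small
      standalone C model. This is not the kernel code: `struct ra_device`,
      `ra_can_skip_device()` and the `num_mirrors` parameter are hypothetical
      names chosen for illustration, and `num_mirrors` stands in for the
      mirror count that the commit message says btrfs_map_block() reports.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-stripe device state readahead inspects. */
struct ra_device {
    bool present;   /* false models a missing device in a degraded mount */
};

/*
 * Decide whether readahead may skip this device.
 * num_mirrors models the mirror count reported by the block mapping; per the
 * commit message it is 1 for RAID5/6, so a missing device is never skipped
 * in that case.
 */
static bool ra_can_skip_device(const struct ra_device *dev, int num_mirrors)
{
    if (dev->present)
        return false;           /* the device is readable, nothing to skip */
    return num_mirrors > 1;     /* skipping is safe only with several mirrors */
}
```

      With a single mirror the model refuses to skip a missing device, which
      is the case that otherwise leaves the waiter blocked.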
    • Gerhard Heift's avatar
      btrfs: new ioctl TREE_SEARCH_V2 · cc68a8a5
      Gerhard Heift authored
      This new ioctl call allows the user to supply a buffer of varying size in which
      a tree search can store its results. This is much more flexible if you want to
      receive items which are larger than the current fixed buffer of 3992 bytes or
      if you want to fetch more items at once. Some items of type EXTENT_CSUM,
      for example, are larger than this buffer.
      Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
      Signed-off-by: Chris Mason <clm@fb.com>
      Acked-by: David Sterba <dsterba@suse.cz>
      cc68a8a5
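      A minimal userspace sketch of calling the new ioctl, assuming the
      `struct btrfs_ioctl_search_args_v2` and `BTRFS_IOC_TREE_SEARCH_V2`
      definitions exported in <linux/btrfs.h>. The helper names
      `search_v2_alloc()` and `search_v2_run()` are hypothetical, and the
      match-everything key is only one possible choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>

/*
 * Allocate a TREE_SEARCH_V2 argument block with room for buf_size bytes of
 * results. The key set up here matches every item in the given tree;
 * narrowing it is left to the caller. Returns NULL on allocation failure.
 */
static struct btrfs_ioctl_search_args_v2 *search_v2_alloc(uint64_t tree_id,
                                                          size_t buf_size)
{
    struct btrfs_ioctl_search_args_v2 *args;

    args = calloc(1, sizeof(*args) + buf_size);
    if (!args)
        return NULL;

    args->key.tree_id = tree_id;
    args->key.max_objectid = UINT64_MAX;
    args->key.max_offset = UINT64_MAX;
    args->key.max_transid = UINT64_MAX;
    args->key.max_type = UINT32_MAX;
    args->key.nr_items = UINT32_MAX;   /* upper limit on items to return */
    args->buf_size = buf_size;         /* capacity of the trailing buffer */
    return args;
}

/*
 * Run the search on an open file descriptor inside a btrfs filesystem.
 * Returns the ioctl() result: 0 on success, -1 with errno set on failure.
 */
static int search_v2_run(int fd, struct btrfs_ioctl_search_args_v2 *args)
{
    assert(args != NULL);
    return ioctl(fd, BTRFS_IOC_TREE_SEARCH_V2, args);
}
```

      A caller would open a path on the mounted filesystem, allocate for
      example a 64 KiB buffer with `search_v2_alloc()`, and free the block
      with `free()` once the results have been consumed.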
    • Matthew Wilcox's avatar
      NVMe: Use Log Page constants in SCSI emulation · ef351b97
      Matthew Wilcox authored
      The nvme-scsi file defined its own Log Page constant.  Use the
      newly-defined one from the header file instead.
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      ef351b97
    • Matthew Wilcox's avatar
      NVMe: Define Log Page constants · 3d69bb6e
      Matthew Wilcox authored
      Taken from the 1.1a version of the spec.
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      3d69bb6e
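      For orientation, a sketch of what Log Page constants of this kind could
      look like. The numeric identifiers are the Get Log Page identifiers
      from the NVMe 1.1 specification; the enum name, the constant names and
      the `nvme_log_page_name()` helper are assumptions for illustration and
      are not copied from this commit.

```c
/* Log Page Identifiers for the NVMe Get Log Page command (illustrative names). */
enum nvme_log_page_id {
    NVME_LOG_ERROR       = 0x01,  /* Error Information */
    NVME_LOG_SMART       = 0x02,  /* SMART / Health Information */
    NVME_LOG_FW_SLOT     = 0x03,  /* Firmware Slot Information */
    NVME_LOG_RESERVATION = 0x80,  /* Reservation Notification */
};

/* Hypothetical helper: map a log page identifier to a readable name. */
static const char *nvme_log_page_name(unsigned int lid)
{
    switch (lid) {
    case NVME_LOG_ERROR:       return "Error Information";
    case NVME_LOG_SMART:       return "SMART / Health Information";
    case NVME_LOG_FW_SLOT:     return "Firmware Slot Information";
    case NVME_LOG_RESERVATION: return "Reservation Notification";
    default:                   return "unknown";
    }
}
```

      Sharing one set of named identifiers lets other code, such as the SCSI
      emulation layer, avoid defining private copies of the same numbers.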
    • Keith Busch's avatar
      NVMe: Fix hot cpu notification dead lock · f3db22fe
      Keith Busch authored
      There is a potential deadlock if a CPU event occurs during nvme probe,
      since each device registered for hot CPU notification from probe. This
      fixes the race by having the module register for notification outside
      of probe rather than having each device register.
      
      The actual work is done in a scheduled work queue instead of in the
      notifier since assigning IO queues has the potential to block if the
      driver creates additional queues.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      f3db22fe
    • Linus Torvalds's avatar
      Merge tag 'sound-fix-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 6391f34e
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Most of the changes are small and easy cleanups or fixes:
      
         - a few HD-audio Realtek codec fixes and quirks
         - Intel HDMI audio fixes for Broadwell and Haswell / ValleyView
         - FireWire sound stack cleanups
         - a couple of sequencer core fixes
         - compress ABI fix for 64bit
         - conversion to modern ktime*() API"
      
      * tag 'sound-fix-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
        ALSA: hda/realtek - Add more entry for enable HP mute led
        ALSA: hda - Add quirk for external mic on Lifebook U904
        ALSA: hda - fix a fixup value for codec alc293 in the pin_quirk table
        ALSA: intel8x0: Use ktime and ktime_get()
        ALSA: core: Use ktime_get_ts()
        ALSA: hda - verify pin:converter connection on unsol event for HSW and VLV
        ALSA: compress: Cancel the optimization of compiler and fix the size of struct for all platform.
        ALSA: hda - Add quirk for ABit AA8XE
        Revert "ALSA: hda - mask buggy stream DMA0 for Broadwell display controller"
        ALSA: hda - using POS_FIX_LPIB on Broadwell HDMI Audio
        ALSA: hda/realtek - Add support of ALC667 codec
        ALSA: hda/realtek - Add more codec rename
        ALSA: hda/realtek - New vendor ID for ALC233
        ALSA: hda - add two new pin tables
        ALSA: hda/realtek - Add support of ALC891 codec
        ALSA: seq: Continue broadcasting events to ports if one of them fails
        ALSA: bebob: Remove unused function prototype
        ALSA: fireworks: Remove meaningless mutex_destroy()
        ALSA: fireworks: Remove a constant over width to which it's applied
        ALSA: fireworks: Improve comments about Fireworks transaction
        ...
      6391f34e
    • Linus Torvalds's avatar
      Merge tag 'dlm-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 4bdeb312
      Linus Torvalds authored
      Pull dlm fix from David Teigland:
       "This contains one small fix related to resending SCTP messages"
      
      * tag 'dlm-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
        dlm: keep listening connection alive with sctp mode
      4bdeb312