1. 14 Sep, 2016 2 commits
      Merge branches 'doc.2016.08.22c', 'exp.2016.08.22c', 'fixes.2016.09.14a',... · d74b62bc
      Paul E. McKenney authored
      Merge branches 'doc.2016.08.22c', 'exp.2016.08.22c', 'fixes.2016.09.14a', 'hotplug.2016.08.22c' and 'torture.2016.08.22c' into HEAD
      
      doc.2016.08.22c: Documentation updates
      exp.2016.08.22c: Expedited grace-period updates
      fixes.2016.09.14a: Miscellaneous fixes
      hotplug.2016.08.22c: CPU-hotplug changes
      torture.2016.08.22c: Torture-test changes
      list: Expand list_first_entry_or_null() · 12adfd88
      Chris Wilson authored
      Due to the use of READ_ONCE() in list_empty() the compiler cannot
      optimise !list_empty() ? list_first_entry() : NULL very well. By
      manually expanding list_first_entry_or_null() we can take advantage of
      the READ_ONCE() to avoid the list element changing under the test while
      the compiler can generate smaller code.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
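The manual expansion described in the list_first_entry_or_null() commit above can be sketched in userspace C roughly as follows. This is a minimal illustration, not the kernel source: READ_ONCE() and container_of() are simplified stand-ins, and struct item, list_init(), list_add_tail() and first_value_or_minus_one() are hypothetical helpers added only for the example. The macro relies on GNU statement expressions, so it assumes gcc or clang.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Userspace stand-in for the kernel's READ_ONCE(): a single volatile load. */
#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Simplified container_of(): step back from the member to its enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Expanded form: load head->next exactly once, then both test and convert
 * that same value, so the element cannot change between test and use.
 */
#define list_first_entry_or_null(ptr, type, member) ({ \
	struct list_head *head__ = (ptr); \
	struct list_head *pos__ = READ_ONCE(head__->next); \
	pos__ != head__ ? container_of(pos__, type, member) : NULL; \
})

/* Hypothetical payload type used only for this example. */
struct item {
	int value;
	struct list_head node;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Returns the first item's value, or -1 when the list is empty. */
static int first_value_or_minus_one(struct list_head *head)
{
	struct item *it = list_first_entry_or_null(head, struct item, node);

	return it ? it->value : -1;
}
```

The point of the single load is that the emptiness test and the returned entry are derived from the same pointer value, which is what the commit message credits for both the safety and the smaller generated code.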
  2. 22 Aug, 2016 21 commits
      torture: TOROUT_STRING(): Insert a space between flag and message · 489bb3d2
      SeongJae Park authored
      The TOROUT_STRING() macro does not insert a space between the flag and
      the message.  In contrast, other similar torture-test dmesg messages
      consistently supply a single space character.  This difference makes the
      output hard to read and to mechanically parse.  This commit therefore
      adds a space character between flag and message in TOROUT_STRING() output.
      Signed-off-by: SeongJae Park <sj38.park@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
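The effect of the TOROUT_STRING() change above can be illustrated with a small userspace sketch. It assumes the flag text is "-torture:" and that the message is built from a torture-type prefix, the flag, and the message string; format_torout_string() is a hypothetical helper that writes into a buffer instead of calling pr_alert().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed flag text; the commit message itself does not spell it out. */
#define TORTURE_FLAG "-torture:"

/*
 * Hypothetical stand-in for the macro's output: the single space after
 * the flag is what the commit adds.
 */
static int format_torout_string(char *buf, size_t len,
				const char *torture_type, const char *msg)
{
	return snprintf(buf, len, "%s" TORTURE_FLAG " %s\n", torture_type, msg);
}
```

With the space in place, a parser can split each line on the first space after the flag rather than special-casing this one message type.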
      rcuperf: Consistently insert space between flag and message · a56fefa2
      SeongJae Park authored
      A few rcuperf dmesg output messages have no space between the flag and
      the start of the message. In contrast, every other message consistently
      supplies a single space.  This difference makes rcuperf dmesg output
      hard to read and to mechanically parse.  This commit therefore fixes
      this problem by modifying a pr_alert() call and the PERFOUT_STRING()
      macro to provide that single space.
      Signed-off-by: SeongJae Park <sj38.park@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcutorture: Print out barrier error as document says · 472213a6
      SeongJae Park authored
      Tests for rcu_barrier() were introduced by commit fae4b54f ("rcu:
      Introduce rcutorture testing for rcu_barrier()").  That commit updated
      the documentation to say that the "rtbe" field in rcutorture's dmesg
      output indicates test failure.  However, the code was not updated, only
      the documentation.  This commit therefore updates the code to match the
      updated documentation.
      Signed-off-by: SeongJae Park <sj38.park@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      torture: Add task state to writer-task stall printk()s · 4ffa6699
      Paul E. McKenney authored
      This commit adds a dump of the scheduler state for stalled rcutorture
      writer tasks.  This addition provides yet more debug for the intermittent
      "failures to proceed", where grace periods move ahead but the rcutorture
      writer tasks fail to do so.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      torture: Convert torture_shutdown() to hrtimer · 31257c3c
      Paul E. McKenney authored
      Upcoming changes to the timer wheel introduce significant inaccuracy
      and possibly also an ultimate limit on timeout duration.  This is a
      problem for the current implementation of torture_shutdown() because
      (1) shutdown times are user-specified, and can therefore be quite long,
      and (2) the torture scripting will kill a test instance that runs for
      more than a few minutes longer than scheduled.  This commit therefore
      converts the torture_shutdown() timed waits to an hrtimer, thus avoiding
      too-short torture test runs as well as death by scripting.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      rcutorture: Convert to hotplug state machine · 0ffd374b
      Sebastian Andrzej Siewior authored
      Install the callbacks via the state machine and let the core invoke
      the callbacks on the already online CPUs.
      
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      cpu/hotplug: Get rid of CPU_STARTING reference · 0c6d4576
      Sebastian Andrzej Siewior authored
      CPU_STARTING is scheduled for removal. There is no use of it in drivers
      and core code uses it only for compatibility with old-style CPU-hotplug
      notifiers.  This patch therefore removes CPU_STARTING from an
      RCU-related comment.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Provide exact CPU-online tracking for RCU · 7ec99de3
      Paul E. McKenney authored
      Up to now, RCU has assumed that the CPU-online process makes it from
      CPU_UP_PREPARE to set_cpu_online() within one jiffy.  Given the recent
      rise of virtualized environments, this assumption is very clearly
      obsolete.  Failing to meet this deadline can result in RCU paying
      attention to an incoming CPU for one jiffy, then ignoring it until the
      grace period following the one in which that CPU sets itself online.
      This situation might prove to be fatally disappointing to any RCU
      read-side critical sections that had the misfortune to execute during
      the time in which RCU was ignoring the slow-to-come-online CPU.
      
      This commit therefore updates RCU's internal CPU state-tracking
      information at notify_cpu_starting() time, thus providing RCU with
      an exact transition of the CPU's state from offline to online.
      
      Note that this means that incoming CPUs must not use RCU read-side
      critical sections (other than those of SRCU) until notify_cpu_starting()
      time.  Note also that the CPU_STARTING notifiers -are- allowed to use
      RCU read-side critical sections.  (Of course, CPU-hotplug notifiers are
      rapidly becoming obsolete, so you need to act fast!)
      
      If a given architecture or CPU family needs to use RCU read-side
      critical sections earlier, the call to rcu_cpu_starting() from
      notify_cpu_starting() will need to be architecture-specific, with
      architectures that need early use being required to hand-place
      the call to rcu_cpu_starting() at some point preceding the call to
      notify_cpu_starting().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Avoid redundant quiescent-state chasing · 3563a438
      Paul E. McKenney authored
      Currently, __note_gp_changes() checks to see if the CPU has slept through
      multiple grace periods.  If it has, it resynchronizes that CPU's view
      of the grace-period state, which includes whether or not the current
      grace period needs a quiescent state from this CPU.  The fact of this
      need (or lack thereof) needs to be in two places, rdp->cpu_no_qs.b.norm
      and rdp->core_needs_qs.  The former tells RCU's context-switch code to
      go get a quiescent state and the latter says that it needs to be reported.
      The current code unconditionally sets the former to true, but correctly
      sets the latter.
      
      This does not result in failures, but it does unnecessarily increase
      the amount of work done on average at context-switch time.  This commit
      therefore correctly sets both fields.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
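The fix described in the quiescent-state commit above amounts to deriving both fields from one computed value instead of forcing one of them to true. The following sketch shows the idea under stated assumptions: the struct layouts are simplified stand-ins, cpu_no_qs_norm stands in for rdp->cpu_no_qs.b.norm, and the qsmask/grpmask test used to decide whether a quiescent state is needed is an assumption, since the commit message does not say how the need is computed.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's rcu_node and rcu_data structures. */
struct rcu_node_sketch {
	unsigned long qsmask;	/* CPUs still owing a quiescent state (assumed) */
};

struct rcu_data_sketch {
	unsigned long grpmask;	/* this CPU's bit in qsmask (assumed) */
	bool cpu_no_qs_norm;	/* stand-in for rdp->cpu_no_qs.b.norm */
	bool core_needs_qs;	/* quiescent state must be reported */
};

/*
 * Resynchronize after sleeping through grace periods: compute the need
 * once and store it in both places, rather than unconditionally setting
 * cpu_no_qs_norm to true.
 */
static void resync_qs_state(struct rcu_data_sketch *rdp,
			    const struct rcu_node_sketch *rnp)
{
	bool need_qs = (rnp->qsmask & rdp->grpmask) != 0;

	rdp->cpu_no_qs_norm = need_qs;
	rdp->core_needs_qs = need_qs;
}
```

Keeping the two fields in agreement means the context-switch path only chases a quiescent state when one will actually be reported.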
      rcu: Don't use modular infrastructure in non-modular code · e77b7041
      Paul Gortmaker authored
      The Kconfig currently controlling compilation of tree.c is:
      
      init/Kconfig:config TREE_RCU
      init/Kconfig:   bool
      
      ...and update.c and sync.c are "obj-y" meaning that none are ever
      built as a module by anyone.
      
      Since MODULE_ALIAS is a no-op for non-modular code, we can remove
      them from these files.
      
      We leave moduleparam.h behind since the files instantiate some boot
      time configuration parameters with module_param() still.
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      sched: Make wake_up_nohz_cpu() handle CPUs going offline · 379d9ecb
      Paul E. McKenney authored
      Both timers and hrtimers are maintained on the outgoing CPU until
      CPU_DEAD time, at which point they are migrated to a surviving CPU.  If a
      mod_timer() executes between CPU_DYING and CPU_DEAD time, x86 systems
      will splat in native_smp_send_reschedule() when attempting to wake up
      the just-now-offlined CPU, as shown below from a NO_HZ_FULL kernel:
      
      [ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40
      [ 7976.741595] Modules linked in:
      [ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1
      [ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      [ 7976.741595]  0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000
      [ 7976.741595]  0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18
      [ 7976.741595]  0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00
      [ 7976.741595] Call Trace:
      [ 7976.741595]  [<ffffffff8138ab2e>] dump_stack+0x67/0x99
      [ 7976.741595]  [<ffffffff8105cabc>] __warn+0xcc/0xf0
      [ 7976.741595]  [<ffffffff8105cb98>] warn_slowpath_null+0x18/0x20
      [ 7976.741595]  [<ffffffff8103cba9>] native_smp_send_reschedule+0x39/0x40
      [ 7976.741595]  [<ffffffff81089bc2>] wake_up_nohz_cpu+0x82/0x190
      [ 7976.741595]  [<ffffffff810d275a>] internal_add_timer+0x7a/0x80
      [ 7976.741595]  [<ffffffff810d3ee7>] mod_timer+0x187/0x2b0
      [ 7976.741595]  [<ffffffff810c89dd>] rcu_torture_reader+0x33d/0x380
      [ 7976.741595]  [<ffffffff810c66f0>] ? sched_torture_read_unlock+0x30/0x30
      [ 7976.741595]  [<ffffffff810c86a0>] ? rcu_bh_torture_read_lock+0x80/0x80
      [ 7976.741595]  [<ffffffff8108068f>] kthread+0xdf/0x100
      [ 7976.741595]  [<ffffffff819dd83f>] ret_from_fork+0x1f/0x40
      [ 7976.741595]  [<ffffffff810805b0>] ? kthread_create_on_node+0x200/0x200
      
      However, in this case, the wakeup is redundant, because the timer
      migration will reprogram timer hardware as needed.  Note that the fact
      that preemption is disabled does not avoid the splat, as the offline
      operation has already passed both the synchronize_sched() and the
      stop_machine() that would be blocked by disabled preemption.
      
      This commit therefore modifies wake_up_nohz_cpu() to avoid attempting
      to wake up offline CPUs.  It also adds a comment stating that the
      caller must tolerate lost wakeups when the target CPU is going offline,
      and suggesting the CPU_DEAD notifier as a recovery mechanism.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      rcu: Use rcu_gp_kthread_wake() to wake up grace period kthreads · 94d44776
      Jisheng Zhang authored
      Commit abedf8e2 ("rcu: Use simple wait queues where possible in
      rcutree") converts Tree RCU's wait queues to simple wait queues,
      but it incorrectly reverts the commit 2aa792e6 ("rcu: Use
      rcu_gp_kthread_wake() to wake up grace period kthreads").  This can
      result in redundant self-wakeups.
      
      This commit therefore replaces the simple wait-queue wakeups with
      rcu_gp_kthread_wake(), thus avoiding the redundant wakeups.
      Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Use RCU's online-CPU state for expedited IPI retry · 385c859f
      Paul E. McKenney authored
      This commit improves the accuracy of the interaction between CPU hotplug
      operations and RCU's expedited grace periods by using RCU's online-CPU
      state to determine when failed IPIs should be retried.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Exclude RCU-offline CPUs from expedited grace periods · 98834b83
      Paul E. McKenney authored
      The expedited RCU grace periods currently rely on a failure indication
      from smp_call_function_single() to determine that a given CPU is offline.
      This works after a fashion, but is more contorted and less precise than
      relying on RCU's internal state.  This commit therefore takes a first
      step towards relying on internal state.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Make expedited RCU CPU stall warnings respond to controls · 24a6cff2
      Paul E. McKenney authored
      The expedited RCU CPU stall warnings currently respond to neither the
      panic_on_rcu_stall sysctl setting nor the rcupdate.rcu_cpu_stall_suppress
      kernel boot parameter.  This commit therefore updates the expedited code
      to respond to these two controls.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Stop disabling expedited RCU CPU stall warnings · 908d2c1f
      Paul E. McKenney authored
      Now that RCU expedited grace periods are always driven by a workqueue,
      there is no need to account for signal reception, and thus no need
      to disable expedited RCU CPU stall warnings due to signal reception.
      This commit therefore removes the signal-reception checks, leaving a
      WARN_ON() to catch possible future bugs.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Drive expedited grace periods from workqueue · 8b355e3b
      Paul E. McKenney authored
      The current implementation of expedited grace periods has the user
      task drive the grace period.  This works, but has downsides: (1) The
      user task must awaken tasks piggybacking on this grace period, which
      can result in latencies rivaling that of the grace period itself, and
      (2) User tasks can receive signals, which interfere with RCU CPU stall
      warnings.
      
      This commit therefore uses workqueues to drive the grace periods, so
      that the user task need not do the awakening.  A subsequent commit
      will remove the now-unnecessary code allowing for signals.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Consolidate expedited grace period machinery · f7b8eb84
      Paul E. McKenney authored
      The functions synchronize_rcu_expedited() and synchronize_sched_expedited()
      have nearly identical code.  This commit therefore consolidates this code
      into a new _synchronize_rcu_expedited() function.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      documentation: Record reason for rcu_head two-byte alignment · ed2bec07
      Paul E. McKenney authored
      There is an assertion in __call_rcu() that checks only the bottom
      bit of the rcu_head pointer, rather than the bottom two (as might be
      expected for 32-bit systems) or the bottom three (as might be expected
      for 64-bit systems).  This choice might be a bit surprising in these days
      of ubiquitous 32-bit and 64-bit systems.  This commit therefore records
      the reason for this odd alignment check, namely that m68k guarantees
      only two-byte alignment despite being a 32-bit architecture.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
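The alignment check discussed in the rcu_head documentation commit above tests only the lowest address bit. A minimal userspace sketch of such a check, with rcu_head_misaligned() as a hypothetical helper name rather than the kernel's actual assertion, might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether a pointer fails the two-byte
 * alignment requirement. Only bit 0 is examined, so an address that is
 * two-byte aligned but not four-byte aligned (as m68k may produce)
 * still passes.
 */
static bool rcu_head_misaligned(const void *head)
{
	return ((uintptr_t)head & 0x1) != 0;
}
```

Checking the bottom two or three bits instead would reject pointers that are perfectly valid on an architecture guaranteeing only two-byte alignment.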
      rcutorture: Remove outdated config option description · e1ef6921
      SeongJae Park authored
      CONFIG_RCU_TORTURE_TEST_RUNNABLE was removed by commit 4e9a073f
      ("torture: Remove CONFIG_RCU_TORTURE_TEST_RUNNABLE, simplify code"),
      but the documentation was not updated accordingly.  This commit therefore
      updates the documentation to reflect CONFIG_RCU_TORTURE_TEST_RUNNABLE's
      removal and to add a description for the alternative module parameter.
      Signed-off-by: SeongJae Park <sj38.park@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      rcu: Fix soft lockup for rcu_nocb_kthread · bedc1969
      Ding Tianhong authored
      Carrying out the following steps results in a softlockup in the
      RCU callback-offload (rcuo) kthreads:
      
      1. Connect to ixgbevf, and set the speed to 10Gb/s.
      2. Use ifconfig to bring the nic up and down repeatedly.
      
      [  317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
      [  368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
      [  368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  368.106005] task: ffff88057dd8a220 ti: ffff88057dd9c000 task.ti: ffff88057dd9c000
      [  368.106005] RIP: 0010:[<ffffffff81579e04>]  [<ffffffff81579e04>] fib_table_lookup+0x14/0x390
      [  368.106005] RSP: 0018:ffff88061fc83ce8  EFLAGS: 00000286
      [  368.106005] RAX: 0000000000000001 RBX: 00000000020155c0 RCX: 0000000000000001
      [  368.106005] RDX: ffff88061fc83d50 RSI: ffff88061fc83d70 RDI: ffff880036d11a00
      [  368.106005] RBP: ffff88061fc83d08 R08: 0000000000000001 R09: 0000000000000000
      [  368.106005] R10: ffff880036d11a00 R11: ffffffff819e0900 R12: ffff88061fc83c58
      [  368.106005] R13: ffffffff816154dd R14: ffff88061fc83d08 R15: 00000000020155c0
      [  368.106005] FS:  0000000000000000(0000) GS:ffff88061fc80000(0000) knlGS:0000000000000000
      [  368.106005] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  368.106005] CR2: 00007f8c2aee9c40 CR3: 000000057b222000 CR4: 00000000000407e0
      [  368.106005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  368.106005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  368.106005] Stack:
      [  368.106005]  00000000010000c0 ffff88057b766000 ffff8802e380b000 ffff88057af03e00
      [  368.106005]  ffff88061fc83dc0 ffffffff815349a6 ffff88061fc83d40 ffffffff814ee146
      [  368.106005]  ffff8802e380af00 00000000e380af00 ffffffff819e0900 020155c0010000c0
      [  368.106005] Call Trace:
      [  368.106005]  <IRQ>
      [  368.106005]
      [  368.106005]  [<ffffffff815349a6>] ip_route_input_noref+0x516/0xbd0
      [  368.106005]  [<ffffffff814ee146>] ? skb_release_data+0xd6/0x110
      [  368.106005]  [<ffffffff814ee20a>] ? kfree_skb+0x3a/0xa0
      [  368.106005]  [<ffffffff8153698f>] ip_rcv_finish+0x29f/0x350
      [  368.106005]  [<ffffffff81537034>] ip_rcv+0x234/0x380
      [  368.106005]  [<ffffffff814fd656>] __netif_receive_skb_core+0x676/0x870
      [  368.106005]  [<ffffffff814fd868>] __netif_receive_skb+0x18/0x60
      [  368.106005]  [<ffffffff814fe4de>] process_backlog+0xae/0x180
      [  368.106005]  [<ffffffff814fdcb2>] net_rx_action+0x152/0x240
      [  368.106005]  [<ffffffff81077b3f>] __do_softirq+0xef/0x280
      [  368.106005]  [<ffffffff8161619c>] call_softirq+0x1c/0x30
      [  368.106005]  <EOI>
      [  368.106005]
      [  368.106005]  [<ffffffff81015d95>] do_softirq+0x65/0xa0
      [  368.106005]  [<ffffffff81077174>] local_bh_enable+0x94/0xa0
      [  368.106005]  [<ffffffff81114922>] rcu_nocb_kthread+0x232/0x370
      [  368.106005]  [<ffffffff81098250>] ? wake_up_bit+0x30/0x30
      [  368.106005]  [<ffffffff811146f0>] ? rcu_start_gp+0x40/0x40
      [  368.106005]  [<ffffffff8109728f>] kthread+0xcf/0xe0
      [  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140
      [  368.106005]  [<ffffffff816147d8>] ret_from_fork+0x58/0x90
      [  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140
      
      ==================================cut here==============================
      
      It turns out that the rcuos callback-offload kthread is busy processing
      a very large quantity of RCU callbacks, and it is not relinquishing the
      CPU while doing so.  This commit therefore adds a cond_resched_rcu_qs()
      within the loop to allow other tasks to run.
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      [ paulmck: Substituted cond_resched_rcu_qs for cond_resched. ]
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  3. 08 Aug, 2016 1 commit
  4. 07 Aug, 2016 10 commits
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 857953d7
      Linus Torvalds authored
      Pull more block fixes from Jens Axboe:
       "As mentioned in the pull the other day, a few more fixes for this
        round, all related to the bio op changes in this series.
      
        Two fixes, and then a cleanup, renaming bio->bi_rw to bio->bi_opf.  I
        wanted to do that change right after or right before -rc1, so that
        risk of conflict was reduced.  I just rebased the series on top of
        current master, and no new ->bi_rw usage has snuck in"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: rename bio bi_rw to bi_opf
        target: iblock_execute_sync_cache() should use bio_set_op_attrs()
        mm: make __swap_writepage() use bio_set_op_attrs()
        block/mm: make bdev_ops->rw_page() take a bool for read/write
      Merge tag 'drm-for-v4.8-zpos' of git://people.freedesktop.org/~airlied/linux · 635a4ba1
      Linus Torvalds authored
      Pull drm zpos property support from Dave Airlie:
       "This tree was waiting on some media stuff I hadn't had time to get a
        stable branchpoint off, so I just waited until it was all in your tree
        first.
      
        It's been around a bit on the list and shouldn't affect anything
        outside adding the generic API and moving some ARM drivers to using
        it"
      
      * tag 'drm-for-v4.8-zpos' of git://people.freedesktop.org/~airlied/linux:
        drm: rcar: use generic code for managing zpos plane property
        drm/exynos: use generic code for managing zpos plane property
        drm: sti: use generic zpos for plane
        drm: add generic zpos property
      block: rename bio bi_rw to bi_opf · 1eff9d32
      Jens Axboe authored
      Since commit 63a4cc24, bio->bi_rw contains flags in the lower
      portion and the op code in the higher portions. This means that
      old code that relies on manually setting bi_rw is most likely
      going to be broken. Instead of letting that brokenness linger,
      rename the member, to force old and out-of-tree code to break
      at compile time instead of at runtime.
      
      No intended functional changes in this commit.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      target: iblock_execute_sync_cache() should use bio_set_op_attrs() · 31c64f78
      Jens Axboe authored
      The original commit missed this function; it needs to mark the bio
      as a write flush.
      
      Cc: Mike Christie <mchristi@redhat.com>
      Fixes: e742fc32 ("target: use bio op accessors")
      Signed-off-by: Jens Axboe <axboe@fb.com>
      mm: make __swap_writepage() use bio_set_op_attrs() · ba13e83e
      Jens Axboe authored
      Cleaner than manipulating bio->bi_rw flags directly.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      block/mm: make bdev_ops->rw_page() take a bool for read/write · c11f0c0b
      Jens Axboe authored
      Commit abf54548 changed it from an 'rw' flags type to the
      newer ops based interface, but now we're effectively leaking
      some bdev internals to the rest of the kernel. Since we only
      care about whether it's a read or a write at that level, just
      pass in a bool 'is_write' parameter instead.
      
      Then we can also move op_is_write() and friends back under
      CONFIG_BLOCK protection.
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Merge tag 'doc-4.8-fixes' of git://git.lwn.net/linux · 52ddb7e9
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "Three fixes for the docs build, including removing an annoying warning
        on 'make help' if sphinx isn't present"
      
      * tag 'doc-4.8-fixes' of git://git.lwn.net/linux:
        DocBook: use DOCBOOKS="" to ignore DocBooks instead of IGNORE_DOCBOOKS=1
        Documenation: update cgroup's document path
        Documentation/sphinx: do not warn about missing tools in 'make help'
      Merge tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc · e9d488c3
      Linus Torvalds authored
      Pull binfmt_misc update from James Bottomley:
       "This update is to allow architecture emulation containers to function
        such that the emulation binary can be housed outside the container
        itself.  The container and fs parts both have acks from relevant
        experts.
      
        To use the new feature you have to add an F option to your binfmt_misc
        configuration"
      
      From the docs:
       "The usual behaviour of binfmt_misc is to spawn the binary lazily when
        the misc format file is invoked.  However, this doesn't work very well
        in the face of mount namespaces and changeroots, so the F mode opens
        the binary as soon as the emulation is installed and uses the opened
        image to spawn the emulator, meaning it is always available once
        installed, regardless of how the environment changes"
      
      * tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc:
        binfmt_misc: add F option description to documentation
        binfmt_misc: add persistent opened binary handler for containers
        fs: add filp_clone_open API
      fs: return EPERM on immutable inode · 337684a1
      Eryu Guan authored
      In most cases, EPERM is returned on immutable inode, and there're only a
      few places returning EACCES. I noticed this when running LTP on
      overlayfs, setxattr03 failed due to unexpected EACCES on immutable
      inode.
      
      So convert all EACCES to EPERM on immutable inode.
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · fe64f328
      Linus Torvalds authored
      Pull more vfs updates from Al Viro:
       "Assorted cleanups and fixes.
      
        In the "trivial API change" department - ->d_compare() losing 'parent'
        argument"
      
      * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        cachefiles: Fix race between inactivating and culling a cache object
        9p: use clone_fid()
        9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
        vfs: make dentry_needs_remove_privs() internal
        vfs: remove file_needs_remove_privs()
        vfs: fix deadlock in file_remove_privs() on overlayfs
        get rid of 'parent' argument of ->d_compare()
        cifs, msdos, vfat, hfs+: don't bother with parent in ->d_compare()
        affs ->d_compare(): don't bother with ->d_inode
        fold _d_rehash() and __d_rehash() together
        fold dentry_rcuwalk_invalidate() into its only remaining caller
  5. 06 Aug, 2016 6 commits
      Merge tag 'xfs-rmap-for-linus-4.8-rc1' of... · 0cbbc422
      Linus Torvalds authored
      Merge tag 'xfs-rmap-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
      
      Pull more xfs updates from Dave Chinner:
       "This is the second part of the XFS updates for this merge cycle, and
        contains the new reverse block mapping feature for XFS.
      
        Reverse mapping allows us to track the owner of a specific block on
        disk precisely.  It is implemented as a set of btrees (one per
        allocation group) that track the owners of allocated extents.
        Effectively it is a "used space tree" that is updated when we allocate
        or free extents.  i.e. it is coherent with the free space btrees we
        already maintain and never overlaps with them.
      
        This reverse mapping infrastructure is the building block of several
        upcoming features - reflink, copy-on-write data, dedupe, online
        metadata and data scrubbing, highly accurate bad sector/data loss
        reporting to users, and significantly improved reconstruction of
        damaged and corrupted filesystems.  There's a lot of new stuff coming
        along in the next couple of cycles, and it all builds on the rmap
        infrastructure.
      
        As such, it's a huge chunk of new code with new on-disk format
        features and internal infrastructure.  It warns at mount time as an
        experimental feature and that it may eat data (as we do with all new
        on-disk features until they stabilise).  We have not released
        userspace support for it yet - userspace support currently requires
        downloading Darrick's xfsprogs repo and building from source, so
        access to this feature is really developer/tester only at this point.
        Initial userspace support will be released at the same time as the
        kernel with this code in it.
      
        The new rmap enabled code regresses 3 xfstests - all are ENOSPC
        related corner cases, one of which Darrick posted a fix for a few
        hours ago.  The other two are fixed by infrastructure that is part of
        the upcoming reflink patchset.  This new ENOSPC infrastructure
        requires an on-disk format tweak to keep mount times in
        check - we need to keep an on-disk count of allocated rmapbt blocks so
        we don't have to scan the entire btrees at mount time to count them.
      
        This is currently being tested and will be part of the fixes sent in
        the next week or two so users will not be exposed to this change"
      
      * tag 'xfs-rmap-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (52 commits)
        xfs: move (and rename) the deferred bmap-free tracepoints
        xfs: collapse single use static functions
        xfs: remove unnecessary parentheses from log redo item recovery functions
        xfs: remove the extents array from the rmap update done log item
        xfs: in btree_lshift, only allocate temporary cursor when needed
        xfs: remove unnecesary lshift/rshift key initialization
        xfs: remove the get*keys and update_keys btree ops pointers
        xfs: enable the rmap btree functionality
        xfs: don't update rmapbt when fixing agfl
        xfs: disable XFS_IOC_SWAPEXT when rmap btree is enabled
        xfs: add rmap btree block detection to log recovery
        xfs: add rmap btree geometry feature flag
        xfs: propagate bmap updates to rmapbt
        xfs: enable the xfs_defer mechanism to process rmaps to update
        xfs: log rmap intent items
        xfs: create rmap update intent log items
        xfs: add rmap btree insert and delete helpers
        xfs: convert unwritten status of reverse mappings
        xfs: remove an extent from the rmap btree
        xfs: add an extent to the rmap btree
        ...
      0cbbc422
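The "used space tree" idea described in the pull message above can be illustrated with a small userspace sketch. It is only an illustration, not the XFS implementation: a flat array stands in for the per-allocation-group btree, and every name here (`toy_rmap`, `toy_rmap_alloc`, `toy_rmap_free`, `toy_rmap_owner`) is hypothetical. The point it shows is that a record is added when an extent is allocated and dropped when it is freed, so a block can be mapped back to its owner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RECS 16
#define TOY_NO_OWNER UINT64_MAX

/* One reverse-mapping record: an allocated extent and its owner. */
struct toy_rmap_rec {
    uint64_t start;   /* first block of the extent */
    uint64_t len;     /* number of blocks in the extent */
    uint64_t owner;   /* hypothetical owner id, e.g. an inode number */
};

/* Toy stand-in for one allocation group's rmap btree (a flat array). */
struct toy_rmap {
    struct toy_rmap_rec recs[TOY_MAX_RECS];
    size_t nrecs;
};

/* On allocation: record who owns the extent. Returns 0 on success. */
int toy_rmap_alloc(struct toy_rmap *rm, uint64_t start, uint64_t len,
                   uint64_t owner)
{
    if (rm->nrecs == TOY_MAX_RECS || len == 0)
        return -1;
    rm->recs[rm->nrecs++] = (struct toy_rmap_rec){ start, len, owner };
    return 0;
}

/* On free: drop the matching record. Returns 0 if one was found. */
int toy_rmap_free(struct toy_rmap *rm, uint64_t start, uint64_t len)
{
    for (size_t i = 0; i < rm->nrecs; i++) {
        if (rm->recs[i].start == start && rm->recs[i].len == len) {
            /* Fill the hole with the last record; order is not kept. */
            rm->recs[i] = rm->recs[--rm->nrecs];
            return 0;
        }
    }
    return -1;
}

/* Reverse lookup: which owner holds this block, if any? */
uint64_t toy_rmap_owner(const struct toy_rmap *rm, uint64_t block)
{
    for (size_t i = 0; i < rm->nrecs; i++) {
        const struct toy_rmap_rec *r = &rm->recs[i];
        if (block >= r->start && block - r->start < r->len)
            return r->owner;
    }
    return TOY_NO_OWNER;
}
```

A real implementation would use an ordered structure so that lookups do not scan every record; the linear scan here only keeps the sketch short.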
    • Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 835c92d4
      Linus Torvalds authored
      Pull qstr constification updates from Al Viro:
       "Fairly self-contained bunch - a surprising number of places pass struct
        qstr * as an argument when const struct qstr * would suffice; it
        complicates analysis for no good reason.
      
        I'd prefer to feed that separately from the assorted fixes (those are
        in #for-linus and with somewhat trickier topology)"
      
      * 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        qstr: constify instances in adfs
        qstr: constify instances in lustre
        qstr: constify instances in f2fs
        qstr: constify instances in ext2
        qstr: constify instances in vfat
        qstr: constify instances in procfs
        qstr: constify instances in fuse
        qstr constify instances in fs/dcache.c
        qstr: constify instances in nfs
        qstr: constify instances in ocfs2
        qstr: constify instances in autofs4
        qstr: constify instances in hfs
        qstr: constify instances in hfsplus
        qstr: constify instances in logfs
        qstr: constify dentry_init_security
      835c92d4
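As a rough illustration of the constification described above, the sketch below uses a simplified, hypothetical `struct toy_qstr` (not the kernel's `struct qstr`) and a hypothetical comparison helper. Taking `const struct toy_qstr *` lets callers pass read-only objects and makes it visible from the signature that the helper does not modify its arguments.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for a length-counted name; fields are assumptions. */
struct toy_qstr {
    unsigned int len;
    const unsigned char *name;
};

/*
 * The helper only reads its arguments, so both are const-qualified.
 * With a non-const parameter, passing a const object would need a cast
 * and a reader would have to inspect the body to rule out writes.
 */
int toy_qstr_equal(const struct toy_qstr *a, const struct toy_qstr *b)
{
    return a->len == b->len && memcmp(a->name, b->name, a->len) == 0;
}
```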
    • Merge tag 'media/v4.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · ce804bf5
      Linus Torvalds authored
      Pull mailmap fixlets from Mauro Carvalho Chehab:
       "A small fixup for my and Shuah's entries in .mailmap.
      
        Basically, those entries used a syntax that makes
        get_maintainer.pl do the wrong thing"
      
      * tag 'media/v4.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        .mailmap: Correct entries for Mauro Carvalho Chehab and Shuah Khan
      ce804bf5
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 0803e040
      Linus Torvalds authored
      Pull virtio/vhost updates from Michael Tsirkin:
      
       - new vsock device support in host and guest
      
       - platform IOMMU support in host and guest, including compatibility
         quirks for legacy systems.
      
       - misc fixes and cleanups.
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        VSOCK: Use kvfree()
        vhost: split out vringh Kconfig
        vhost: detect 32 bit integer wrap around
        vhost: new device IOTLB API
        vhost: drop vringh dependency
        vhost: convert pre sorted vhost memory array to interval tree
        vhost: introduce vhost memory accessors
        VSOCK: Add Makefile and Kconfig
        VSOCK: Introduce vhost_vsock.ko
        VSOCK: Introduce virtio_transport.ko
        VSOCK: Introduce virtio_vsock_common.ko
        VSOCK: defer sock removal to transports
        VSOCK: transport-specific vsock_transport functions
        vhost: drop vringh dependency
        vop: pull in vhost Kconfig
        virtio: new feature to detect IOMMU device quirk
        balloon: check the number of available pages in leak balloon
        vhost: lockless enqueuing
        vhost: simplify work flushing
      0803e040
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 80fac0f5
      Linus Torvalds authored
      Pull more KVM updates from Paolo Bonzini:
       - ARM bugfix and MSI injection support
       - x86 nested virt tweak and OOPS fix
       - Simplify pvclock code (vdso bits acked by Andy Lutomirski).
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        nvmx: mark ept single context invalidation as supported
        nvmx: remove comment about missing nested vpid support
        KVM: lapic: fix access preemption timer stuff even if kernel_irqchip=off
        KVM: documentation: fix KVM_CAP_X2APIC_API information
        x86: vdso: use __pvclock_read_cycles
        pvclock: introduce seqcount-like API
        arm64: KVM: Set cpsr before spsr on fault injection
        KVM: arm: vgic-irqfd: Workaround changing kvm_set_routing_entry prototype
        KVM: arm/arm64: Enable MSI routing
        KVM: arm/arm64: Enable irqchip routing
        KVM: Move kvm_setup_default/empty_irq_routing declaration in arch specific header
        KVM: irqchip: Convey devid to kvm_set_msi
        KVM: Add devid in kvm_kernel_irq_routing_entry
        KVM: api: Pass the devid in the msi routing entry
      80fac0f5
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 4305f424
      Linus Torvalds authored
      Pull MIPS updates from Ralf Baechle:
       "This is the main pull request for MIPS for 4.8.  Also included is a
        minor SSB cleanup as SSB code traditionally is merged through the MIPS
        tree:
      
        ATH25:
          - MIPS: Add default configuration for ath25
      
        Boot:
          - For zboot, copy appended dtb to the end of the kernel
          - store the appended dtb address in a variable
      
        BPF:
          - Fix off by one error in offset allocation
      
        Cobalt code:
          - Fix typos
      
        Core code:
          - debugfs_create_file returns NULL on error, so don't use IS_ERR for
            testing for errors.
          - Fix double locking issue in RM7000 S-cache code.  This would only
            affect RM7000 ARC systems on reboot.
          - Fix page table corruption on THP permission changes.
          - Use compat_sys_keyctl for 32 bit userspace on 64 bit kernels.
            David says there are no compatibility issues raised by this fix.
          - Move some signal code around.
          - Rewrite r4k count/compare clockevent device registration such that
            min_delta_ticks/max_delta_ticks files are guaranteed to be
            initialized.
          - Only register r4k count/compare as clockevent device if we can
            assume the clock to be constant.
          - Fix MSA asm warnings in control reg accessors
          - uasm and tlbex fixes and tweaking.
          - Print segment physical address when EU=1.
          - Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO.
          - CP: Allow booting by VP other than VP 0
          - Cache handling fixes and optimizations for r4k class caches
          - Add hotplug support for R6 processors
          - Cleanup hotplug bits in kconfig
          - traps: return correct si code for accessing nonmapped addresses
          - Remove cpu_has_safe_index_cacheops
      
        Lantiq:
          - Register IRQ handler for virtual IRQ number
          - Fix EIU interrupt loading code
          - Use the real EXIN count
          - Fix build error.
      
        Loongson 3:
          - Increase HPET_MIN_PROG_DELTA and decrease HPET_MIN_CYCLES
      
        Octeon:
          - Delete built-in DTB pruning code for D-Link DSR-1000N.
          - Clean up GPIO definitions in dlink_dsr-1000n.dts.
          - Add more LEDs to the DSR-1000N DTS
          - Fix off by one in octeon_irq_gpio_map()
          - Typo fixes
          - Enable SATA by default in cavium_octeon_defconfig
          - Support readq/writeq()
          - Remove forced mappings of USB interrupts.
          - Ensure DMA descriptors are always in the low 4GB
          - Improve USB reset code for OCTEON II.
      
        Pistachio:
          - Add maintainers entry for pistachio SoC Support
          - Remove plat_setup_iocoherency
      
        Ralink:
          - Fix pwm UART in spis group pinmux.
      
        SSB:
          - Change bare unsigned to unsigned int to suit coding style
      
        Tools:
          - Fix reloc tool compiler warnings.
      
        Other:
          - Delete use of ARCH_WANT_OPTIONAL_GPIOLIB"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (61 commits)
        MIPS: mm: Fix definition of R6 cache instruction
        MIPS: tools: Fix relocs tool compiler warnings
        MIPS: Cobalt: Fix typo
        MIPS: Octeon: Fix typo
        MIPS: Lantiq: Fix build failure
        MIPS: Use CPHYSADDR to implement mips32 __pa
        MIPS: Octeon: Dlink_dsr-1000n.dts: add more leds.
        MIPS: Octeon: Clean up GPIO definitions in dlink_dsr-1000n.dts.
        MIPS: Octeon: Delete built-in DTB pruning code for D-Link DSR-1000N.
        MIPS: store the appended dtb address in a variable
        MIPS: ZBOOT: copy appended dtb to the end of the kernel
        MIPS: ralink: fix spis group pinmux
        MIPS: Factor o32 specific code into signal_o32.c
        MIPS: non-exec stack & heap when non-exec PT_GNU_STACK is present
        MIPS: Use per-mm page to execute branch delay slot instructions
        MIPS: Modify error handling
        MIPS: c-r4k: Use SMP calls for CM indexed cache ops
        MIPS: c-r4k: Avoid small flush_icache_range SMP calls
        MIPS: c-r4k: Local flush_icache_range cache op override
        MIPS: c-r4k: Split r4k_flush_kernel_vmap_range()
        ...
      4305f424
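The "Core code" note above, that an `IS_ERR` test is the wrong check for a function that reports failure by returning NULL, can be shown with a userspace sketch. The helpers below are hypothetical stand-ins that mimic the kernel's error-pointer convention (small negative errno values encoded in the top of the address range); they are not the kernel macros themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (mimics ERR_PTR). */
void *toy_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True only for pointers in the top TOY_MAX_ERRNO bytes (mimics IS_ERR). */
int toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Check that also catches a NULL failure return (mimics IS_ERR_OR_NULL). */
int toy_is_err_or_null(const void *p)
{
    return p == NULL || toy_is_err(p);
}
```

Because NULL is address zero, `toy_is_err(NULL)` is false, so a caller that relies on the error-pointer test alone would treat a NULL failure return as success. For a function that signals errors with NULL, a plain `!ptr` test is the matching check.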