1. 25 Jul, 2013 22 commits
    • cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression · 916f4dbc
      Srivatsa S. Bhat authored
      commit e8d05276 upstream.
      
      commit 2f7021a8 "cpufreq: protect 'policy->cpus' from offlining
      during __gov_queue_work()" caused a regression in CPU hotplug,
      because it led to a deadlock between the cpufreq governor worker thread
      and the CPU hotplug writer task.
      
      Lockdep splat corresponding to this deadlock is shown below:
      
      [   60.277396] ======================================================
      [   60.277400] [ INFO: possible circular locking dependency detected ]
      [   60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
      [   60.277411] -------------------------------------------------------
      [   60.277417] bash/2225 is trying to acquire lock:
      [   60.277422]  ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
      [   60.277444] but task is already holding lock:
      [   60.277449]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
      [   60.277465] which lock already depends on the new lock.
      
      [   60.277472] the existing dependency chain (in reverse order) is:
      [   60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
      [   60.277490]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
      [   60.277503]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
      [   60.277514]        [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
      [   60.277522]        [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
      [   60.277532]        [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
      [   60.277543]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
      [   60.277552]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
      [   60.277560]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
      [   60.277569]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
      [   60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
      [   60.277592]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
      [   60.277600]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
      [   60.277608]        [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
      [   60.277616]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
      [   60.277624]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
      [   60.277633]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
      [   60.277640]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
      [   60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
      [   60.277661]        [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
      [   60.277669]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
      [   60.277677]        [<ffffffff810621ed>] flush_work+0x3d/0x280
      [   60.277685]        [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
      [   60.277693]        [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
      [   60.277701]        [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
      [   60.277709]        [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
      [   60.277719]        [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
      [   60.277728]        [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
      [   60.277737]        [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
      [   60.277747]        [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
      [   60.277759]        [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
      [   60.277768]        [<ffffffff815a0a68>] _cpu_down+0x88/0x330
      [   60.277779]        [<ffffffff815a0d46>] cpu_down+0x36/0x50
      [   60.277788]        [<ffffffff815a2748>] store_online+0x98/0xd0
      [   60.277796]        [<ffffffff81452a28>] dev_attr_store+0x18/0x30
      [   60.277806]        [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
      [   60.277818]        [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
      [   60.277826]        [<ffffffff811686fc>] SyS_write+0x4c/0xa0
      [   60.277834]        [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
      [   60.277842] other info that might help us debug this:
      
      [   60.277848] Chain exists of:
        (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
      
      [   60.277864]  Possible unsafe locking scenario:
      
      [   60.277869]        CPU0                    CPU1
      [   60.277873]        ----                    ----
      [   60.277877]   lock(cpu_hotplug.lock);
      [   60.277885]                                lock(&j_cdbs->timer_mutex);
      [   60.277892]                                lock(cpu_hotplug.lock);
      [   60.277900]   lock((&(&j_cdbs->work)->work));
      [   60.277907]  *** DEADLOCK ***
      
      [   60.277915] 6 locks held by bash/2225:
      [   60.277919]  #0:  (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
      [   60.277937]  #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
      [   60.277954]  #2:  (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
      [   60.277972]  #3:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
      [   60.277990]  #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
      [   60.278007]  #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
      [   60.278023] stack backtrace:
      [   60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
      [   60.278037] Hardware name: Acer             Aspire 5741G    /Aspire 5741G    , BIOS V1.20 02/08/2011
      [   60.278042]  ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
      [   60.278055]  ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
      [   60.278068]  ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
      [   60.278081] Call Trace:
      [   60.278091]  [<ffffffff815b3d90>] dump_stack+0x19/0x1b
      [   60.278101]  [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
      [   60.278111]  [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
      [   60.278123]  [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
      [   60.278134]  [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
      [   60.278142]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
      [   60.278151]  [<ffffffff810621ed>] flush_work+0x3d/0x280
      [   60.278159]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
      [   60.278169]  [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
      [   60.278178]  [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
      [   60.278188]  [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
      [   60.278196]  [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
      [   60.278206]  [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
      [   60.278214]  [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
      [   60.278225]  [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
      [   60.278234]  [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
      [   60.278244]  [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
      [   60.278255]  [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
      [   60.278265]  [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
      [   60.278275]  [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
      [   60.278284]  [<ffffffff815a0a68>] _cpu_down+0x88/0x330
      [   60.278292]  [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
      [   60.278302]  [<ffffffff815a0d46>] cpu_down+0x36/0x50
      [   60.278311]  [<ffffffff815a2748>] store_online+0x98/0xd0
      [   60.278320]  [<ffffffff81452a28>] dev_attr_store+0x18/0x30
      [   60.278329]  [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
      [   60.278337]  [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
      [   60.278347]  [<ffffffff81185950>] ? fget_light+0x320/0x4b0
      [   60.278355]  [<ffffffff811686fc>] SyS_write+0x4c/0xa0
      [   60.278364]  [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
      [   60.280582] smpboot: CPU 1 is now offline
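The chain reported in the splat can be modelled outside the kernel as a small directed graph. The following is only a sketch of the idea of lock-order checking, not lockdep's actual implementation; the enum names, `struct lock_graph`, `reachable()` and `record_dependency()` are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative lock classes named after the ones in the splat. */
enum lock_class { WORK_ITEM, TIMER_MUTEX, CPU_HOTPLUG_LOCK, NR_LOCK_CLASSES };

/* dep[a][b] is true once "b was acquired while a was held" has been recorded. */
struct lock_graph {
	bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns false, and records nothing, if the new edge would close a cycle.
 */
bool record_dependency(struct lock_graph *g, int held, int acquired)
{
	bool seen[NR_LOCK_CLASSES] = { false };

	assert(held >= 0 && held < NR_LOCK_CLASSES);
	assert(acquired >= 0 && acquired < NR_LOCK_CLASSES);

	/* If 'acquired' already leads back to 'held', the orders conflict. */
	if (reachable(g, acquired, held, seen))
		return false;
	g->dep[held][acquired] = true;
	return true;
}
```

Recording WORK_ITEM -> TIMER_MUTEX and TIMER_MUTEX -> CPU_HOTPLUG_LOCK succeeds. A later attempt to record CPU_HOTPLUG_LOCK -> WORK_ITEM, which is what the hotplug writer does when it flushes the work item, is rejected as a cycle.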
      
      The intention of that commit was to avoid warnings during CPU
      hotplug, which indicated that offline CPUs were getting IPIs from the
      cpufreq governor's work items.  But the real root-cause of that
      problem was commit a66b2e50 (cpufreq: Preserve sysfs files across
      suspend/resume) because it totally skipped all the cpufreq callbacks
      during CPU hotplug in the suspend/resume path, and hence it never
      actually shut down the cpufreq governor's worker threads during CPU
      offline in the suspend/resume path.
      
      Looking back, the reason we never suspected that commit as the
      root cause earlier was that the original issue was reported with
      just the halt command, and nobody had brought suspend/resume into
      the equation.
      
      The reason for _that_ in turn, as it turns out, is that earlier
      halt/shutdown was being done by disabling non-boot CPUs while tasks
      were frozen, just like suspend/resume....  but commit cf7df378
      (reboot: migrate shutdown/reboot to boot cpu) which came somewhere
      along that very same time changed that logic: shutdown/halt no longer
      takes CPUs offline.  Thus, the test-cases for reproducing the bug
      were vastly different and thus we went totally off the trail.
      
      Overall, it was one hell of a confusion with so many commits
      affecting each other and also affecting the symptoms of the problems
      in subtle ways.  Finally, now since the original problematic commit
      (a66b2e50) has been completely reverted, revert this intermediate fix
      too (2f7021a8), to fix the CPU hotplug deadlock.  Phew!
      Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Tested-by: Peter Wu <lekensteyn@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cpufreq: Revert commit a66b2e to fix suspend/resume regression · 9d3ce4af
      Srivatsa S. Bhat authored
      commit aae760ed upstream.
      
      commit a66b2e (cpufreq: Preserve sysfs files across suspend/resume)
      has unfortunately caused several things in the cpufreq subsystem to
      break subtly after a suspend/resume cycle.
      
      The intention of that patch was to retain the file permissions of the
      cpufreq related sysfs files across suspend/resume.  To achieve that,
      the commit completely removed the calls to cpufreq_add_dev() and
      __cpufreq_remove_dev() during suspend/resume transitions.  But the
      problem is that those functions do 2 kinds of things:
        1. Low-level initialization/teardown that is critical to the
           correct functioning of cpufreq-core.
        2. Kobject and sysfs related initialization/teardown.
      
      Ideally we should have reorganized the code to cleanly separate these
      two responsibilities, and skipped only the sysfs related parts during
      suspend/resume.  Since we skipped the entire callbacks instead (which
      also included some CPU and cpufreq-specific critical components),
      cpufreq subsystem started behaving erratically after suspend/resume.
      
      So revert the commit to fix the regression.  We'll revisit and address
      the original goal of that commit separately, since it involves quite a
      bit of careful code reorganization and appears to be non-trivial.
      
      (While reverting the commit, note that another commit f51e1eb6
       (cpufreq: Fix cpufreq regression after suspend/resume) already
       reverted part of the original set of changes.  So revert only the
       remaining ones).
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Tested-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/perf: Don't enable if we have zero events · 382b9efb
      Michael Ellerman authored
      commit 4ea355b5 upstream.
      
      In power_pmu_enable() we still enable the PMU even if we have zero
      events. This should have no effect but doesn't make much sense. Instead
      just return after telling the hypervisor that we are not using the PMCs.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/perf: Use existing out label in power_pmu_enable() · b26eb911
      Michael Ellerman authored
      commit 0a48843d upstream.
      
      In power_pmu_enable() we can use the existing out label to reduce the
      number of return paths.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/perf: Freeze PMC5/6 if we're not using them · 8f6c5b6c
      Michael Ellerman authored
      commit 7a7a41f9 upstream.
      
      On Power8 we can freeze PMC5 and 6 if we're not using them. Normally they
      run all the time.
      
      As noticed by Anshuman, we should unfreeze them when we disable the PMU
      as there are legacy tools which expect them to run all the time.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/perf: Rework disable logic in pmu_disable() · 8cf3478f
      Michael Ellerman authored
      commit 378a6ee9 upstream.
      
      In pmu_disable() we disable the PMU by setting the FC (Freeze Counters)
      bit in MMCR0. In order to do this we have to read/modify/write MMCR0.
      
      It's possible that we read a value from MMCR0 which has PMAO (PMU Alert
      Occurred) set. When we write that value back it will cause an interrupt
      to occur. We will then end up in the PMU interrupt handler even though
      we are supposed to have just disabled the PMU.
      
      We can avoid this by making sure we never write PMAO back. We should not
      lose interrupts because when the PMU is re-enabled the overflowed values
      will cause another interrupt.
      
      We also reorder the clearing of SAMPLE_ENABLE so that is done after the
      PMU is frozen. Otherwise there is a small window between the clearing of
      SAMPLE_ENABLE and the setting of FC where we could take an interrupt and
      incorrectly see SAMPLE_ENABLE not set. This would for example change the
      logic in perf_read_regs().
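
The read/modify/write rule described above can be sketched in plain C. This is a minimal illustration, not the kernel's pmu_disable(). The `SKETCH_` bit values and the function name `mmcr0_disable_value()` are assumptions made for the example, and the real code reads and writes the register with SPR accessors rather than passing a value around.

```c
#include <stdint.h>

/* Illustrative bit values (assumed); the real ones live in the powerpc headers. */
#define SKETCH_MMCR0_FC   UINT64_C(0x80000000)	/* Freeze Counters */
#define SKETCH_MMCR0_PMAO UINT64_C(0x00000080)	/* PMU Alert Occurred */

/*
 * Given the value just read from MMCR0, compute the value to write back
 * when disabling the PMU: set FC and never carry PMAO over, so that the
 * write cannot itself raise a PMU interrupt.
 */
uint64_t mmcr0_disable_value(uint64_t mmcr0)
{
	mmcr0 |= SKETCH_MMCR0_FC;
	mmcr0 &= ~SKETCH_MMCR0_PMAO;
	return mmcr0;
}
```

All other bits pass through unchanged, which matches the intent of a read/modify/write sequence.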
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/perf: Check that events only include valid bits on Power8 · a9514fe5
      Michael Ellerman authored
      commit d8bec4c9 upstream.
      
      A mistake we have made in the past is that we pull out the fields we
      need from the event code, but don't check that there are no unknown bits
      set. This means that we can't ever assign meaning to those unknown bits
      in future.
      
      Although we have once again failed to do this at release, it is still
      early days for Power8 so I think we can still slip this in and get away
      with it.
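
A check of this kind usually amounts to a single mask test. The sketch below assumes a made-up layout in which only the low 20 bits of an event code are defined; `SKETCH_EVENT_VALID_MASK` and `event_is_valid()` are hypothetical and do not reflect the real Power8 event encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: only the low 20 bits of an event code carry meaning. */
#define SKETCH_EVENT_VALID_MASK UINT64_C(0xFFFFF)

/* Reject any event code that sets a bit outside the known fields. */
bool event_is_valid(uint64_t event)
{
	return (event & ~SKETCH_EVENT_VALID_MASK) == 0;
}
```

Rejecting unknown bits now is what allows them to be given a meaning later without changing the behaviour of existing event codes.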
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/numa: Do not update sysfs cpu registration from invalid context · 910a1658
      Nathan Fontenot authored
      commit dd023217 upstream.
      
      The topology update code that updates the cpu node registration in sysfs
      should not be called while in stop_machine(). The register/unregister
      calls take a lock and may sleep.
      
      This patch moves these calls outside of the call to stop_machine().
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/smp: Section mismatch from smp_release_cpus to __initdata spinning_secondaries · 0288917d
      Chen Gang authored
      commit 8246aca7 upstream.
      
      smp_release_cpus() is a normal function that is called in normal
      environments, but it references the __initdata variable
      spinning_secondaries. The annotation of spinning_secondaries needs to
      be changed to match smp_release_cpus().
      
      The related warning:
        (the linker reports boot_paca.33377, but it should be spinning_secondaries)
      
      -----------------------------------------------------------------------------
      
      WARNING: arch/powerpc/kernel/built-in.o(.text+0x23176): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377
      The function .smp_release_cpus() references
      the variable __initdata boot_paca.33377.
      This is often because .smp_release_cpus lacks a __initdata
      annotation or the annotation of boot_paca.33377 is wrong.
      
      WARNING: arch/powerpc/kernel/built-in.o(.text+0x231fe): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377
      The function .smp_release_cpus() references
      the variable __initdata boot_paca.33377.
      This is often because .smp_release_cpus lacks a __initdata
      annotation or the annotation of boot_paca.33377 is wrong.
      
      -----------------------------------------------------------------------------
      Signed-off-by: Chen Gang <gang.chen@asianux.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc: Wire up the HV facility unavailable exception · d24966cf
      Michael Ellerman authored
      commit b14b6260 upstream.
      
      Similar to the facility unavailable exception, except the facilities are
      controlled by HFSCR.
      
      Adapt the facility_unavailable_exception() so it can be called for
      either the regular or Hypervisor facility unavailable exceptions.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc: Rename and flesh out the facility unavailable exception handler · 8e0af91a
      Michael Ellerman authored
      commit 021424a1 upstream.
      
      The exception at 0xf60 is not the TM (Transactional Memory) unavailable
      exception; it is the "Facility Unavailable Exception". Rename it as
      such.
      
      Flesh out the handler to acknowledge the fact that it can be called for
      many reasons, one of which is TM being unavailable.
      
      Use STD_EXCEPTION_COMMON() for the exception body. For some reason we
      had it open-coded; I've checked that the generated code is identical.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc: Remove KVMTEST from RELON exception handlers · 77d8caac
      Michael Ellerman authored
      commit c9f69518 upstream.
      
      KVMTEST is a macro which checks whether we are taking an exception from
      guest context, if so we branch out of line and eventually call into the
      KVM code to handle the switch.
      
      When running real guests on bare metal (HV KVM) the hardware ensures
      that we never take a relocation on exception when transitioning from
      guest to host. For PR KVM we disable relocation on exceptions ourselves in
      kvmppc_core_init_vm(), as of commit a413f474 "Disable relocation on
      exceptions whenever PR KVM is active".
      
      So convert all the RELON macros to use NOTEST, and drop the remaining
      KVM_HANDLER() definitions we have for 0xe40 and 0xe80.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc: Remove unreachable relocation on exception handlers · 497f0957
      Michael Ellerman authored
      commit 1d567cb4 upstream.
      
      We have relocation on exception handlers defined for h_data_storage and
      h_instr_storage. However we will never take relocation on exceptions for
      these because they can only come from a guest, and we never take
      relocation on exceptions when we transition from guest to host.
      
      We also have a handler for hmi_exception (Hypervisor Maintenance) which
      is defined in the architecture to never be delivered with relocation on,
      see v2.07 Book III-S section 6.5.
      
      So remove the handlers, leaving a branch to self just to be double extra
      paranoid.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/tm: Fix return of active 64bit signals · 81bcd526
      Michael Neuling authored
      commit 87b4e539 upstream.
      
      Currently we only restore signals which are transactionally suspended but it's
      possible that the transaction can be restored even when it's active.  Most
      likely this will result in a transactional rollback by the hardware as the
      transaction will have been doomed by an earlier treclaim.
      
      The current code is a legacy of earlier kernel implementations which did
      software rollback of active transactions in the kernel.  That code has now gone
      but we didn't correctly fix up this part of the signals code which still makes
      assumptions based on having software rollback.
      
      This changes the signal return code to always restore both contexts on 64 bit
      signal return.  It also ensures that the MSR TM bits are properly restored from
      the signal context which they are not currently.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/tm: Fix return of 32bit rt signals to active transactions · f6ff89fc
      Michael Neuling authored
      commit 55e43418 upstream.
      
      Currently we only restore signals which are transactionally suspended but it's
      possible that the transaction can be restored even when it's active.  Most
      likely this will result in a transactional rollback by the hardware as the
      transaction will have been doomed by an earlier treclaim.
      
      The current code is a legacy of earlier kernel implementations which did
      software rollback of active transactions in the kernel.  That code has now gone
      but we didn't correctly fix up this part of the signals code which still makes
      assumptions based on having software rollback.
      
      This changes the signal return code to always restore both contexts on 32 bit
      rt signal return.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/tm: Fix restoration of MSR on 32bit signal return · bc8ae522
      Michael Neuling authored
      commit 2c27a18f upstream.
      
      Currently we clear out the MSR TM bits on signal return assuming that the
      signal should never return to an active transaction.
      
      This is bogus as the user may do this.  It's most likely the transaction will
      be doomed due to a treclaim but that's a problem for the HW not the kernel.
      
      The current code is a legacy of earlier kernel implementations which did
      software rollback of active transactions in the kernel.  That code has now gone
      but we didn't correctly fix up this part of the signals code which still makes
      the assumption that it must be returning to a suspended transaction.
      
      This pulls out both MSR TM bits from the user supplied context rather than just
      setting TM suspend.  We pull out only the bits needed to ensure the user can't
      do anything dangerous to the MSR.
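
Taking only the TM state bits from a user-supplied value is a masked merge. The sketch below assumes the two transaction-state bits sit at bit positions 33 and 34 of the MSR; the `SKETCH_` macros and `restore_msr_ts()` are illustrative names, not the kernel's definitions.

```c
#include <stdint.h>

/* Assumed positions of the two transaction-state bits (illustrative). */
#define SKETCH_MSR_TS_S    (UINT64_C(1) << 33)	/* suspended */
#define SKETCH_MSR_TS_T    (UINT64_C(1) << 34)	/* transactional */
#define SKETCH_MSR_TS_MASK (SKETCH_MSR_TS_S | SKETCH_MSR_TS_T)

/*
 * Merge the TM state bits from the user-supplied top half of the MSR
 * into the kernel's MSR value. Every other bit of the result comes from
 * kernel_msr, so the user cannot change anything outside the mask.
 */
uint64_t restore_msr_ts(uint64_t kernel_msr, uint32_t user_msr_hi)
{
	uint64_t user = (uint64_t)user_msr_hi << 32;

	return (kernel_msr & ~SKETCH_MSR_TS_MASK) | (user & SKETCH_MSR_TS_MASK);
}
```

Even if the user context has every bit of the top half set, only the two masked bits reach the result.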
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/tm: Fix 32 bit non-rt signals · 74383413
      Michael Neuling authored
      commit fee55450 upstream.
      
      Currently sys_sigreturn() is TM unaware.  Therefore, if we take a 32 bit signal
      without SIGINFO (non RT) inside a transaction, on signal return we don't
      restore the signal frame correctly.
      
      This checks if the signal frame being restored is an active transaction, and
      if so, it copies the additional state to ptregs so it can be restored.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/tm: Fix writing top half of MSR on 32 bit signals · d6ea4422
      Michael Neuling authored
      commit 1d25f11f upstream.
      
      The MSR TM controls are in the top 32 bits of the MSR hence on 32 bit signals,
      we stick the top half of the MSR in the checkpointed signal context so that the
      user can access it.
      
      Unfortunately, we don't currently write anything to the checkpointed signal
      context when coming in from a non-transactional process and hence the top MSR
      bits can contain junk.
      
      This updates the 32 bit signal handling code to always write something to the
      top MSR bits so that users know if the process is transactional or not and the
      kernel can use it on signal return.
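
The "always write something" rule can be illustrated with a toy frame layout. `struct sketch_sigframe32` and `save_msr_32()` are hypothetical; the real signal frames are considerably larger and are written through user-access helpers, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the two places a 32-bit frame keeps MSR halves. */
struct sketch_sigframe32 {
	uint32_t msr_lo;	/* low 32 bits, in the normal context */
	uint32_t ckpt_msr_hi;	/* high 32 bits, in the checkpointed context */
};

/*
 * Store the MSR into a 32-bit frame. The high-half slot is written on
 * every path: the real top bits for a transactional process, zero
 * otherwise, so it never holds leftover junk.
 */
void save_msr_32(struct sketch_sigframe32 *frame, uint64_t msr,
		 bool transactional)
{
	frame->msr_lo = (uint32_t)msr;
	frame->ckpt_msr_hi = transactional ? (uint32_t)(msr >> 32) : 0;
}
```

With the slot always initialised, a reader of the frame can tell a non-transactional process (zero) from a transactional one without guessing.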
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/powernv: Fix iommu initialization again · e544a745
      Benjamin Herrenschmidt authored
      commit 74251fe2 upstream.
      
      So because those things always end up in trainwrecks... In 7846de40
      we moved back the iommu initialization earlier, essentially undoing
      37f02195 which was causing us endless trouble... except that in the
      meantime we had merged 959c9bdd (to work around the original breakage)
      which is now ... broken :-)
      
      This fixes it by doing a partial revert of the latter (we keep the
      ppc_md. path which will be needed in the hotplug case, which happens
      also during some EEH error recovery situations).
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/hw_brk: Fix off by one error when validating DAWR region end · 3b743326
      Michael Neuling authored
      commit e2a800be upstream.
      
      The Data Address Watchpoint Register (DAWR) on POWER8 can take a 512
      byte range but this range must not cross a 512 byte boundary.
      
      Unfortunately we were off by one when calculating the end of the region,
      hence we were not allowing some breakpoint regions which were actually
      valid.  This fixes this error.
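
The boundary rule, and the way an off-by-one in the end address rejects valid regions, can be shown with two small helpers. This is a sketch based only on the 512-byte figures in the text above; the function names are hypothetical, the buggy variant is an assumption about what such a mistake typically looks like, and address wraparound is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DAWR_WINDOW UINT64_C(512)

/* A range is acceptable if its first and last byte share one 512-byte window. */
bool dawr_range_ok(uint64_t start, uint64_t len)
{
	if (len == 0 || len > SKETCH_DAWR_WINDOW)
		return false;

	uint64_t last = start + len - 1;	/* last byte covered, inclusive */

	return start / SKETCH_DAWR_WINDOW == last / SKETCH_DAWR_WINDOW;
}

/* Off-by-one variant: compares against the first byte *after* the range. */
bool dawr_range_ok_off_by_one(uint64_t start, uint64_t len)
{
	if (len == 0 || len > SKETCH_DAWR_WINDOW)
		return false;

	uint64_t end = start + len;		/* exclusive end */

	return start / SKETCH_DAWR_WINDOW == end / SKETCH_DAWR_WINDOW;
}
```

A region starting at 0 with length 512 ends exactly on the boundary: the inclusive check accepts it, while the off-by-one variant wrongly rejects it.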
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/hw_brk: Fix clearing of extraneous IRQ · 277b5ae1
      Michael Neuling authored
      commit 540e07c6 upstream.
      
      In 9422de3e "powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint
      registers" we changed the way we mark extraneous irqs with this:
      
      -	info->extraneous_interrupt = !((bp->attr.bp_addr <= dar) &&
      -			(dar - bp->attr.bp_addr < bp->attr.bp_len));
      +	if (!((bp->attr.bp_addr <= dar) &&
      +	      (dar - bp->attr.bp_addr < bp->attr.bp_len)))
      +		info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
      
      Unfortunately this is bogus as it never clears extraneous IRQ if it's already
      set.
      
      This correctly clears extraneous IRQ before possibly setting it.
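
The clear-then-set pattern can be sketched on a toy structure. The range test is the condition from the diff quoted above; `struct sketch_brk`, `update_extraneous()` and the flag value are assumptions for illustration, not the kernel's types.

```c
#include <stdint.h>

#define SKETCH_EXTRANEOUS_IRQ 0x80u	/* illustrative flag value */

/* Toy stand-in for the per-breakpoint state. */
struct sketch_brk {
	uint64_t addr;		/* requested breakpoint address */
	uint64_t len;		/* requested length in bytes */
	unsigned int type;	/* flag word, includes the extraneous bit */
};

/*
 * Recompute the extraneous flag for a hit at 'dar': clear it first, then
 * set it only if the hit falls outside [addr, addr + len).
 */
void update_extraneous(struct sketch_brk *info, uint64_t dar)
{
	info->type &= ~SKETCH_EXTRANEOUS_IRQ;
	if (!((info->addr <= dar) && (dar - info->addr < info->len)))
		info->type |= SKETCH_EXTRANEOUS_IRQ;
}
```

Without the initial clear, a flag set by one out-of-range hit would stay set for every later in-range hit.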
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/hw_brk: Fix setting of length for exact mode breakpoints · b101957a
      Michael Neuling authored
      commit b0b0aa9c upstream.
      
      The smallest match region for both the DABR and DAWR is 8 bytes, so the
      kernel needs to filter matches when users want to look at regions smaller than
      this.
      
      Currently we set the length of PPC_BREAKPOINT_MODE_EXACT breakpoints to 8.
      This is wrong as in exact mode we should only match on 1 address, hence the
      length should be 1.
      
      This ensures that the kernel will filter out any exact mode hardware breakpoint
      matches on any addresses other than the requested one.
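
The effect of the length on software filtering can be seen by counting how many addresses of one 8-byte hardware match region survive the filter. This is a sketch under the assumption that the filter is a plain range test on the requested address and length; `sketch_hit_wanted()` and `sketch_count_wanted()` are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a hit at 'dar' lies inside the requested range [addr, addr + len). */
bool sketch_hit_wanted(uint64_t addr, uint64_t len, uint64_t dar)
{
	return addr <= dar && dar - addr < len;
}

/* Count how many of the 8 addresses starting at 'addr' pass the filter. */
unsigned int sketch_count_wanted(uint64_t addr, uint64_t len)
{
	unsigned int wanted = 0;

	for (uint64_t off = 0; off < 8; off++)
		if (sketch_hit_wanted(addr, len, addr + off))
			wanted++;
	return wanted;
}
```

With a length of 8 every address in the region is reported; with a length of 1 only the requested address is.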
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 22 Jul, 2013 18 commits