- 07 Aug, 2014 1 commit
-
-
Gavin Shan authored
The function is used by VFIO driver, which might be built as a dynamic module. So it should be exported. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 Aug, 2014 39 commits
-
-
Benjamin Herrenschmidt authored
Some new functions are exposed for use by the IOMMU code but won't build when CONFIG_IOMMU_API isn't set, so shield them appropriately. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Paul Mackerras authored
Some people see things like "Exception: 501" in stack traces in dmesg and assume that means that something has gone badly wrong, when in fact "Exception: 501" just means a device interrupt was taken. This changes "Exception" to "interrupt" to make it clearer that we are just recording the fact of a change in control flow rather than some error condition. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Li Zhong authored
vmemmap_populated() checks whether the [start, start + page_size) has valid pfn numbers, to know whether a vmemmap mapping has been created that includes this range. Some range before end might not be checked by this loop: sec11start......start11..sec11end/sec12start..end....start12..sec12end as the above, for start11(section 11), it checks [sec11start, sec11end), and loop ends as the next start(start12) is bigger than end. However, [sec11end/sec12start, end) is not checked here. So before the loop, adjust the start to be the start of the section, so we don't miss ranges like the above. After we adjust start to be the start of the section, it also means it's aligned with vmemmap as of the sizeof struct page, so we could use page_to_pfn directly in the loop. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Li Zhong authored
vmemmap_free() does the opposite of vmemap_populate(). This patch also puts vmemmap_free() and vmemmap_list_free() into CONFIG_MEMMORY_HOTPLUG. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Li Zhong authored
This is to be called in vmemmap_free(), leave the implementation on BOOK3E empty as before. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Li Zhong authored
This patch implements vmemmap_list_free() for vmemmap_free(). The freed entries will be removed from vmemmap_list, and form a freed list, with next as the header. The next position in the last allocated page is kept at the list tail. When allocation, if there are freed entries left, get it from the freed list; if no freed entries left, get it like before from the last allocated pages. With this change, realmode_pfn_to_page() also needs to be changed to walk all the entries in the vmemmap_list, as the virt_addr of the entries might not be stored in order anymore. It helps to reuse the memory when continuous doing memory hot-plug/remove operations, but didn't reclaim the pages already allocated, so the memory usage will only increase, but won't exceed the value for the largest memory configuration. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Madhusudanan Kandasamy authored
remap_4k_pfn() silently truncates upper bits of input 4K PFN if it cannot be contained in PTE. This leads invalid memory mapping and could result in a system crash when the memory is accessed. This patch fails remap_4k_pfn() and returns -EINVAL if the input 4K PFN cannot be contained in PTE. V3 : Added parentheses to protect 'pfn' and entire macro as suggested by Brian. V2 : Rewritten to avoid helper function as suggested by Stephen Rothwell. Signed-off-by: Madhusudanan Kandasamy <kmadhu@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mahesh Salgaonkar authored
(NOTE: This patch depends on upstream HMI handling patchset at https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-July/119731.html) The current HMI handling on napping cpus does not take care of endianess issue. On LE host kernel when we wake up from nap due to HMI interrupt we would checkstop while jumping into opal call. There is a similar issue in case of fast sleep wakeup where the code invokes opal_resync_tb opal call without handling LE issue. This patch fixes that as well. With this patch applied, HMIs handling on LE host kernel works fine. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mahesh Salgaonkar authored
HMIs are thread specific and can come while thread is in sleep/nap mode. Hence with SMT=off mode we can receive HMIs on sleeping threads. For interrupt received in nap mode, cpu wakes up at system reset vector, clears the interrupt and go back to nap mode again. But HMIs are sticky and they keep happening until we clear reason bits from HMER. Hence add a special check for HMI in reset vector (through power7_wakeup_* functions) and invoke opal call to handle HMI. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mahesh Salgaonkar authored
When we hit the HMI in Linux, invoke opal call to handle/recover from HMI errors in real mode and then in virtual mode during check_irq_replay() invoke opal_poll_events()/opal_do_notifier() to retrieve HMI event from OPAL and act accordingly. Now that we are ready to handle HMI interrupt directly in linux, remove the HMI interrupt registration with firmware. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mahesh Salgaonkar authored
Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements basic infrastructure to handle HMI in Linux host. The design is to invoke opal handle hmi in real mode for recovery and set irq_pending when we hit HMI. During check_irq_replay pull opal hmi event and print hmi info on console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Alexey Kardashevskiy authored
There is a couple of commented debug prints which still use IOMMU_PAGE_SHIFT() which is not defined for POWERPC anymore, replace them with it_page_shift. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The PCI config accessors check for PE frozen state and clear it if EEH isn't functional. The patch handles compound PE in config accessors if PHB supports it. For consistency, all PEs will be put into frozen state if any one in compound group gets frozen by hardware. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The patch handles compound PE for EEH backend. If one specific PE in compound group has been frozen, we enforces to freeze all PEs in the group. If we're enable DMA or MMIO for one PE in compound group, DMA or MMIO of all PEs in the group will be enabled. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The patch introduces 3 PHB callbacks: compound PE state retrieval, force freezing and unfreezing compound PE. The PCI config accessors and PowerNV EEH backend can use them in subsequent patches. We don't export the capability of compound PE to EEH core, which helps avoiding more complexity to EEH core. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
Function ioda_eeh_get_state() is used to fetch EEH state for PHB or PE. We're going to support compound PE and the function becomes more complicated with that. The patch splits the function into two functions for PHB and PE cases separately to improve readability. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The patch synchronizes header file with firmware to have new OPAL API opal_pci_eeh_freeze_set(), which is used to freeze the specified PE in order to support "compound" PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Guo Chao authored
This patch enables M64 aperatus for PHB3. We already had platform hook (ppc_md.pcibios_window_alignment) to affect the PCI resource assignment done in PCI core so that each PE's M32 resource was built on basis of M32 segment size. Similarly, we're using that for M64 assignment on basis of M64 segment size. * We're using last M64 BAR to cover M64 aperatus, and it's shared by all 256 PEs. * We don't support P7IOC yet. However, some function callbacks are added to (struct pnv_phb) so that we can reuse them on P7IOC in future. * PE, corresponding to PCI bus with large M64 BAR device attached, might span multiple M64 segments. We introduce "compound" PE to cover the case. The compound PE is a list of PEs and the master PE is used as before. The slave PEs are just for MMIO isolation. Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The patch allows PE (struct eeh_pe) instance to have auxillary data, whose size is configurable on basis of platform. For PowerNV, the auxillary data will be used to cache PHB diag-data for that PE (frozen PE or fenced PHB). In turn, we can retrieve the diag-data at any later points. It's useful for the case of VFIO PCI devices where the error log should be cached, and then be retrieved by the guest at later point. Also, it can avoid PHB diag-data overwritting if another frozen PE reported and the previous diag-data isn't fetched by guest. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
It's followup of commit ddf0322a ("powerpc/powernv: Fix endianness problems in EEH"). The patch helps to get non-endian-dependent diag-data. Cc: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
pr_warn() is equal to pr_warning(), but the former is a bit more formal according to commit fc62f2f1 ("kernel.h: add pr_warn for symmetry to dev_warn, netdev_warn"). The patch replaces pr_warning() with pr_warn(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The patch prints 4 PCIE or AER config registers each line, which is part of the EEH log so that it looks a bit more compact. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
According to the experiment I did, PCI config access is blocked on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That means we always get 0xFF's while dumping PCI config space of the frozen PE on P7IOC. We don't have the problem on PHB3. So we have to enable I/O prioir to collecting error log. Otherwise, meaningless 0xFF's are always returned. The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is selectively set to indicate the case for: P7IOC on PowerNV platform, pSeries platform. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
There are multiple global EEH flags. Almost each flag has its own accessor, which doesn't make sense. The patch refactors EEH flag accessors so that they look unified: eeh_add_flag(): Add EEH flag eeh_clear_flag(): Clear EEH flag eeh_has_flag(): Check if one specific flag has been set eeh_enabled(): Check if EEH functionality has been enabled Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
Function eeh_iommu_group_to_pe() iterates each PCI device to check the binding IOMMU group with get_iommu_table_base(), which possibly fetches pdev->dev.archdata.dma_data.dma_offset. It's (0x1 << 59) for "bypass" cases. The patch fixes the issue by iterating devices hooked to the IOMMU group and fetch IOMMU table there. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
On PHB3, PCI devices can bypass IOMMU for DMA access. If we pass through one PCI device, whose hose driver ever enable the bypass mode, pdev->dev.archdata.dma_data.iommu_table_base isn't IOMMU table. However, EEH needs access the IOMMU table when the device is owned by guest. The patch fixes pdev->dev.archdata.dma_data.iommu_table when passing through the device to guest in pnv_pci_ioda2_set_bypass(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mike Qiu authored
pci_get_slot() is called with hold of PCI bus semaphore and it's not safe to be called in interrupt context. However, we possibly checks EEH error and calls the function in interrupt context. To avoid using pci_get_slot(), we turn into device tree for fetching location code. Otherwise, we might run into WARN_ON() as following messages indicate: WARNING: at drivers/pci/search.c:223 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc3+ #72 task: c000000001367af0 ti: c000000001444000 task.ti: c000000001444000 NIP: c000000000497b70 LR: c000000000037530 CTR: 000000003003d114 REGS: c000000001446fa0 TRAP: 0700 Not tainted (3.16.0-rc3+) MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 48002422 XER: 20000000 CFAR: c00000000003752c SOFTE: 0 : NIP [c000000000497b70] .pci_get_slot+0x40/0x110 LR [c000000000037530] .eeh_pe_loc_get+0x150/0x190 Call Trace: .of_get_property+0x30/0x60 (unreliable) .eeh_pe_loc_get+0x150/0x190 .eeh_dev_check_failure+0x1b4/0x550 .eeh_check_failure+0x90/0xf0 .lpfc_sli_check_eratt+0x504/0x7c0 [lpfc] .lpfc_poll_eratt+0x64/0x100 [lpfc] .call_timer_fn+0x64/0x190 .run_timer_softirq+0x2cc/0x3e0 Cc: stable@vger.kernel.org Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Lucas Tanure authored
Fix wrong __IO_H definition in boot/io.h Reported-by: Fernando Silveira <fsilveira@gmail.com> Signed-off-by: Lucas Tanure <tanure@linux.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Tyrel Datwyler authored
Commit bcdde7e2 made __sysfs_remove_dir() recursive and introduced a BUG_ON during PHB removal while attempting to delete the power managment attribute group of the bus. This is a result of tearing the bridge and bus devices down out of order in remove_phb_dynamic. Since, the the bus resides below the bridge in the sysfs device tree it should be torn down first. This patch simply moves the device_unregister call for the PHB bridge device after the device_unregister call for the PHB bus. Fixes: bcdde7e2 ("sysfs: make __sysfs_remove_dir() recursive") Cc: stable@vger.kernel.org Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian W Hart authored
powerpc defines various machine-specific routines for handling pci_set_dma_mask(). The routines for machine "PowerNV" may neglect to set dev->dma_mask. This could confuse anyone (e.g. drivers) that consult dev->dma_mask to find the current mask. Set the dma_mask in the PowerNV leaf routine. Signed-off-by: Brian W. Hart <hartb@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Scott Wood authored
This silences a section mismatch warning. early_alloc_pgtable() is called from map_kernel_page() which cannot be __init, but only when slab_is_available() returns false which can only happen during early boot. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Vaidyanathan Srinivasan authored
Flags from device-tree need to be parsed with accessors for interpreting correct value in little-endian. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Vaidyanathan Srinivasan authored
Cpufreq depends on platform firmware to implement PStates. In case of platform firmware failure, cpufreq should not panic host kernel with BUG_ON(). Less severe pr_warn() will suffice. Add firmware_has_feature(FW_FEATURE_OPALv3) check to skip probing for device-tree on non-powernv platforms. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Andrey Utkin authored
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81631Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mike Qiu authored
The sysfs entries are lost because of commit 2213fb14 ("powerpc/eeh: Skip eeh sysfs when eeh is disabled"). That commit added condition to create sysfs entries with EEH_ENABLED, which isn't populated when trying to create sysfs entries on PowerNV platform during system boot time. The patch fixes the issue by: * Reoder EEH initialization functions so that they're same on PowerNV/pSeries. * Cache PE's primary bus by PowerNV platform instead of EEH core to avoid kernel crash caused by the function reorder. Another benefit with this is to avoid one eeh_probe_mode_dev() in EEH core. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The patch adds new IOCTL commands for sPAPR VFIO container device to support EEH functionality for PCI devices, which have been passed through from host to somebody else via VFIO. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
The patch exports functions to be used by new VFIO ioctl command, which will be introduced in subsequent patch, to support EEH functinality for VFIO PCI devices. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gavin Shan authored
We must not handle EEH error on devices which are passed to somebody else. Instead, we expect that the frozen device owner detects an EEH error and recovers from it. This avoids EEH error handling on passed through devices so the device owner gets a chance to handle them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Scott writes: Highlights include e6500 hardware threading support, an e6500 TLB erratum workaround, corenet error reporting, support for a new board, and some minor fixes.
-