1. 30 Sep, 2014 23 commits
  2. 25 Sep, 2014 17 commits
    • powerpc: pci-ioda: Use a single function to emit logging messages · 6d31c2fa
      Joe Perches authored
      No need for 3 functions when a single one will do.
      
      Modify the function declaring macros to call the single function.
      
      Reduces object code size a little:
      
      $ size arch/powerpc/platforms/powernv/pci-ioda.o*
         text	   data	    bss	    dec	    hex	filename
        22303	   1073	   6680	  30056	   7568	arch/powerpc/platforms/powernv/pci-ioda.o.new
        22840	   1121	   6776	  30737	   7811	arch/powerpc/platforms/powernv/pci-ioda.o.old
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
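A minimal user-space sketch of the pattern this commit describes: thin per-level macros that all forward to one worker function. The names `demo_pe`, `demo_pe_level_printk`, `demo_log` and the level strings are illustrative assumptions, not the identifiers used in pci-ioda.c.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the PE structure; only a number is kept. */
struct demo_pe {
    int pe_number;
};

/* Last formatted message, kept in a buffer so it can be inspected. */
static char demo_log[256];

/* The single worker: formats "<level> pe#N: message" into demo_log. */
static void demo_pe_level_printk(const struct demo_pe *pe, const char *level,
                                 const char *fmt, ...)
{
    char msg[128];
    va_list args;

    va_start(args, fmt);
    vsnprintf(msg, sizeof(msg), fmt, args);
    va_end(args);

    snprintf(demo_log, sizeof(demo_log), "%s pe#%d: %s",
             level, pe->pe_number, msg);
}

/* The per-level entry points are macros, so no extra functions are emitted. */
#define demo_pe_err(pe, ...)  demo_pe_level_printk(pe, "<3>", __VA_ARGS__)
#define demo_pe_warn(pe, ...) demo_pe_level_printk(pe, "<4>", __VA_ARGS__)
#define demo_pe_info(pe, ...) demo_pe_level_printk(pe, "<6>", __VA_ARGS__)
```

With one worker, the formatting code exists once in the object file, which is the kind of change that can account for a smaller text segment.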
    • powerpc: pci-ioda: Remove unnecessary return value from printk · 45eb4724
      Joe Perches authored
      The return value is unnecessary and unused, so make the functions
      void instead of int.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh: Fix kernel crash when passing through VF · 2a58222f
      Wei Yang authored
      When passing a VF through with vfio, the kernel crashes with the following
      message:
      
      [  442.656459] Unable to handle kernel paging request for data at address 0x00000060
      [  442.656593] Faulting instruction address: 0xc000000000038b88
      [  442.656706] Oops: Kernel access of bad area, sig: 11 [#1]
      [  442.656798] SMP NR_CPUS=1024 NUMA PowerNV
      [  442.656890] Modules linked in: vfio_pci mlx4_core nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw tg3 nfsd be2net nfs_acl ses lockd ptp enclosure pps_core kvm_hv kvm_pr shpchp binfmt_misc kvm sunrpc uinput lpfc scsi_transport_fc ipr scsi_tgt [last unloaded: mlx4_core]
      [  442.658152] CPU: 40 PID: 14948 Comm: qemu-system-ppc Not tainted 3.10.42yw-pkvm+ #37
      [  442.658219] task: c000000f7e2a9a00 ti: c000000f6dc3c000 task.ti: c000000f6dc3c000
      [  442.658287] NIP: c000000000038b88 LR: c0000000004435a8 CTR: c000000000455bc0
      [  442.658352] REGS: c000000f6dc3f580 TRAP: 0300   Not tainted  (3.10.42yw-pkvm+)
      [  442.658419] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 28004882  XER: 20000000
      [  442.658577] CFAR: c00000000000908c DAR: 0000000000000060 DSISR: 40000000 SOFTE: 1
      GPR00: c0000000004435a8 c000000f6dc3f800 c0000000012b1c10 c00000000da24000
      GPR04: 0000000000000003 0000000000001004 00000000000015b3 000000000000ffff
      GPR08: c00000000127f5d8 0000000000000000 000000000000ffff 0000000000000000
      GPR12: c000000000068078 c00000000fdd6800 000001003c320c80 000001003c3607f0
      GPR16: 0000000000000001 00000000105480c8 000000001055aaa8 000001003c31ab18
      GPR20: 000001003c10fb40 000001003c360ae8 000000001063bcf0 000000001063bdb0
      GPR24: 000001003c15ed70 0000000010548f40 c000001fe5514c88 c000001fe5514cb0
      GPR28: c00000000da24000 0000000000000000 c00000000da24000 0000000000000003
      [  442.659471] NIP [c000000000038b88] .pcibios_set_pcie_reset_state+0x28/0x130
      [  442.659530] LR [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
      [  442.659585] Call Trace:
      [  442.659610] [c000000f6dc3f800] [00000000000719e0] 0x719e0 (unreliable)
      [  442.659677] [c000000f6dc3f880] [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
      [  442.659757] [c000000f6dc3f900] [c000000000455bf8] .reset_fundamental+0x38/0x80
      [  442.659835] [c000000f6dc3f980] [c0000000004562a8] .pci_dev_specific_reset+0xa8/0xf0
      [  442.659913] [c000000f6dc3fa00] [c0000000004448c4] .__pci_dev_reset+0x44/0x430
      [  442.659980] [c000000f6dc3fab0] [c000000000444d5c] .pci_reset_function+0x7c/0xc0
      [  442.660059] [c000000f6dc3fb30] [d00000001c141ab8] .vfio_pci_open+0xe8/0x2b0 [vfio_pci]
      [  442.660139] [c000000f6dc3fbd0] [c000000000586c30] .vfio_group_fops_unl_ioctl+0x3a0/0x630
      [  442.660219] [c000000f6dc3fc90] [c000000000255fbc] .do_vfs_ioctl+0x4ec/0x7c0
      [  442.660286] [c000000f6dc3fd80] [c000000000256364] .SyS_ioctl+0xd4/0xf0
      [  442.660354] [c000000f6dc3fe30] [c000000000009e54] syscall_exit+0x0/0x98
      [  442.660420] Instruction dump:
      [  442.660454] 4bfffce9 4bfffee4 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7e1b78
      [  442.660566] 7c9f2378 60000000 60000000 e93e02c8 <e8690060> 2fa30000 41de00c4 2b9f0002
      [  442.660679] ---[ end trace a64ac9546bcf0328 ]---
      [  442.660724]
      
      The reason is that the VF is not currently EEH enabled.
      
      This patch introduces a macro to convert eeh_dev to eeh_pe. Using it
      prevents the conversion from dereferencing a NULL pointer.
      Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
      Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      CC: Michael Ellerman <mpe@ellerman.id.au>
      
      V3 -> V4:
         1. move the macro definition from include/linux/pci.h to
            arch/powerpc/include/asm/eeh.h
      
      V2 -> V3:
         1. rebased on 3.17-rc4
         2. introduce a macro
         3. use this macro in several other places
      
      V1 -> V2:
         1. code style and patch subject adjustment
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
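A minimal sketch of a NULL-safe conversion of the kind this commit describes. The structure layouts and the `demo_` names are assumptions made for illustration; the real definition lives in arch/powerpc/include/asm/eeh.h and may differ.

```c
#include <stddef.h>

/* Hypothetical, cut-down versions of the two structures. */
struct demo_eeh_pe {
    int addr;
};

struct demo_eeh_dev {
    struct demo_eeh_pe *pe;
};

/*
 * Convert a device to its PE without dereferencing a NULL device.
 * A device that is not EEH enabled has no eeh_dev, so the result is NULL.
 */
#define demo_eeh_dev_to_pe(edev) ((edev) ? (edev)->pe : NULL)
```

Callers then only need to check the returned PE pointer, rather than checking the device pointer before every conversion.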
    • powerpc/mm: Unindent htab_dt_scan_page_sizes() · 9e34992a
      Michael Ellerman authored
      We can unindent the bulk of htab_dt_scan_page_sizes() by returning early
      if the property is not found. That is nice in and of itself, but also
      has the advantage of making it clear that we always return success once
      we have found the ibm,segment-page-sizes property.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
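The early-return shape can be sketched as follows. This is not the code of htab_dt_scan_page_sizes(); `demo_prop` and `demo_scan_prop` are hypothetical names, and the property table is a stand-in for the device tree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical property record standing in for a device tree property. */
struct demo_prop {
    const char *name;
    int value;
};

/* Returns 0 if the property is missing, 1 once it has been found. */
static int demo_scan_prop(const struct demo_prop *props, size_t count,
                          const char *name, int *out)
{
    const struct demo_prop *found = NULL;
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(props[i].name, name) == 0) {
            found = &props[i];
            break;
        }
    }

    /* Bail out early so the main body needs no extra indentation. */
    if (!found)
        return 0;

    /* From here on the property exists and we always return success. */
    *out = found->value;
    return 1;
}
```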
    • powerpc/ppc64: Print CPU/MMU/FW features at boot · 87d99c0e
      Michael Ellerman authored
      "Helps debug funky firmware issues".
      
      After:
        Starting Linux PPC64 #108 SMP Wed Aug 6 19:04:51 EST 2014
        -----------------------------------------------------
        ppc64_pft_size    = 0x1a
        phys_mem_size     = 0x200000000
        cpu_features      = 0x17fc7a6c18500249
          possible        = 0x1fffffff18700649
          always          = 0x0000000000000040
        cpu_user_features = 0xdc0065c2 0xee000000
        mmu_features      = 0x5a000001
        firmware_features = 0x00000001405a440b
        htab_hash_mask    = 0x7ffff
        -----------------------------------------------------
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/ppc64: Clean up the boot-time settings display · bdce97e9
      Michael Ellerman authored
      At boot we display a bunch of low level settings which can be useful to
      know, and can help to spot bugs when things are fundamentally
      misconfigured.
      
      At the moment they are very widely spaced, so that we can accommodate
      the line:
      
        ppc64_caches.dcache_line_size = 0xYY
      
      But we only print that line when the cache line size is not 128, ie.
      almost never, so it just makes the display look odd usually.
      
      The ppc64_caches prefix is redundant so remove it, which means we can
      align things a bit closer for the common case. While we're there
      replace the last use of camelCase (physicalMemorySize), and use
      phys_mem_size.
      
      Before:
        Starting Linux PPC64 #104 SMP Wed Aug 6 18:41:34 EST 2014
        -----------------------------------------------------
        ppc64_pft_size                = 0x1a
        physicalMemorySize            = 0x200000000
        ppc64_caches.dcache_line_size = 0xf0
        ppc64_caches.icache_line_size = 0xf0
        htab_address                  = 0xdeadbeef
        htab_hash_mask                = 0x7ffff
        physical_start                = 0xf000bar
        -----------------------------------------------------
      
      After:
        Starting Linux PPC64 #103 SMP Wed Aug 6 18:38:04 EST 2014
        -----------------------------------------------------
        ppc64_pft_size    = 0x1a
        phys_mem_size     = 0x200000000
        dcache_line_size  = 0xf0
        icache_line_size  = 0xf0
        htab_address      = 0xdeadbeef
        htab_hash_mask    = 0x7ffff
        physical_start    = 0xf000bar
        -----------------------------------------------------
      
      This patch is final, no bike shedding ;)
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
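A small sketch of how the aligned "After" layout can be produced with a fixed-width name column. The width of 17 is inferred from the sample output above, and `demo_format_setting` is a hypothetical helper; the kernel code may simply use hand-aligned format strings.

```c
#include <stddef.h>
#include <stdio.h>

/* Column width inferred from the sample output; an assumption. */
#define DEMO_NAME_WIDTH 17

/* Formats "name<padding> = 0xvalue" with the name left-justified. */
static int demo_format_setting(char *buf, size_t len, const char *name,
                               unsigned long long value)
{
    return snprintf(buf, len, "%-*s = 0x%llx", DEMO_NAME_WIDTH, name, value);
}
```

Dropping the ppc64_caches prefix is what lets the column shrink, since the longest remaining names fit in this width.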
    • powerpc: Fix build failure when CONFIG_USB=y · 92f792ec
      Pranith Kumar authored
      We are enabling USB unconditionally, which results in the following build failure:
      
      drivers/built-in.o: In function `tb_drom_read':
      (.text+0x1b62b70): undefined reference to `usb_speed_string'
      make: *** [vmlinux] Error
      
      Enable USB only if USB_SUPPORT is set, to avoid such failures.
      Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
      Acked-by: Alistair Popple <alistair@popple.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Fix build failure on 44x · a9303e1b
      Pranith Kumar authored
      Fix the following build failure:
      
      drivers/built-in.o: In function `nhi_init':
      nhi.c:(.init.text+0x63390): undefined reference to `ehci_init_driver'
      
      by adding a dependency on USB_EHCI_HCD, which supplies ehci_init_driver().
      
      We also need to depend on USB_OHCI_HCD in the same way.
      Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
      Acked-by: Alistair Popple <alistair@popple.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: some changes in numa_setup_cpu() · 297cf502
      Li Zhong authored
      This patch changes some of the error handling logic in numa_setup_cpu()
      for the case where the cpu node is not found:
      
      If the cpu is possible but not present, -1 is kept in numa_cpu_lookup_table,
      so that if the cpu is added later, we can set the correct numa information for it.
      
      If the cpu is present, we set the first online node in
      numa_cpu_lookup_table instead of 0 (since 0 might not be an online node).
      
      Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
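The two cases can be sketched as a small decision function. This is a simplified model, not the body of numa_setup_cpu(); `demo_pick_node` and `DEMO_NO_NODE` are hypothetical names, and a negative `found_node` stands for "cpu node not found".

```c
#include <stdbool.h>

/* Marker kept in the lookup table for cpus without a node yet. */
#define DEMO_NO_NODE (-1)

/* Decide which node id to record for a cpu. */
static int demo_pick_node(int found_node, bool cpu_present,
                          int first_online_node)
{
    /* Normal case: the cpu node was found. */
    if (found_node >= 0)
        return found_node;

    /* Present cpu without a node: fall back to the first online node. */
    if (cpu_present)
        return first_online_node;

    /* Possible but not present: keep -1 so it can be set up on hot-add. */
    return DEMO_NO_NODE;
}
```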
    • powerpc: Only set numa node information for present cpus at boottime · bc3c4327
      Li Zhong authored
      As Nish suggested, it makes more sense to init the numa node information
      for present cpus at boottime, which also avoids the WARN_ON(1) in
      numa_setup_cpu().
      
      With this change, we also need to change the smp_prepare_cpus() to set up
      numa information only on present cpus.
      
      For those possible, but not present cpus, their numa information
      will be set up after they are started, as the original code did before commit
      2fabf084.
      
      Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Tested-by: Cyril Bur <cyril.bur@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Fix warning reported by verify_cpu_node_mapping() · 70ad2375
      Li Zhong authored
      With commit 2fabf084 ("powerpc: reorder per-cpu NUMA information's
      initialization"), cpu_numa_callback() is called earlier during boottime
      (before the cpus are online) for each cpu, and verify_cpu_node_mapping()
      uses cpu_to_node() to check whether siblings are in the same node.
      
      It skips the check for siblings that are not online yet. So the only
      check done here is for the bootcpu, which is online at that time. But
      the per-cpu numa_node that cpu_to_node() uses hasn't been set up yet (it
      will be set up in smp_prepare_cpus()).
      
      So I saw something like the following reported:
      [    0.000000] CPU thread siblings 1/2/3 and 0 don't belong to the same
      node!
      
      As we don't actually do the checking during this early stage, we could
      call numa_setup_cpu() directly in do_init_bootmem().
      
      Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Implement emulation of string loads and stores · c9f6f4ed
      Paul Mackerras authored
      The size field of the op.type word is now the total number of bytes
      to be loaded or stored.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
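A rough sketch of what "size is the total number of bytes" means for a string load. The bit layout (`DEMO_TYPE_MASK`, `DEMO_SIZE`, `DEMO_GETSIZE`) and the helper name are assumptions for illustration, not the kernel's definitions. The register filling follows the lswi convention: bytes are packed left to right into the low 32 bits of consecutive registers, wrapping from r31 to r0, with unused bytes zeroed.

```c
#include <stdint.h>

/* Hypothetical packing: low bits hold the op type, upper bits the size. */
#define DEMO_TYPE_MASK   0x1fu
#define DEMO_SIZE(n)     ((unsigned)(n) << 8)
#define DEMO_GETSIZE(w)  ((unsigned)(w) >> 8)
#define DEMO_LOAD_MULTI  3u

/* Load GETSIZE(op_type) bytes from mem into registers starting at rd. */
static void demo_load_string(uint32_t gpr[32], unsigned rd,
                             const uint8_t *mem, unsigned op_type)
{
    unsigned nb = DEMO_GETSIZE(op_type);
    unsigned i;

    for (i = 0; i < nb; i++) {
        unsigned r = (rd + i / 4) % 32;   /* wraps from r31 to r0 */

        if (i % 4 == 0)
            gpr[r] = 0;                   /* zero unfilled bytes */
        gpr[r] |= (uint32_t)mem[i] << (24 - 8 * (i % 4));
    }
}
```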
    • powerpc: Emulate icbi, mcrf and conditional-trap instructions · cf87c3f6
      Paul Mackerras authored
      This extends the instruction emulation done by analyse_instr() and
      emulate_step() to handle a few more instructions that are found in
      the kernel.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Split out instruction analysis part of emulate_step() · be96f633
      Paul Mackerras authored
      This splits out the instruction analysis part of emulate_step() into
      a separate analyse_instr() function, which decodes the instruction,
      but doesn't execute any load or store instructions.  It does execute
      integer instructions and branches which can be executed purely by
      updating register values in the pt_regs struct.  For other instructions,
      it returns the instruction type and other details in a new
      instruction_op struct.  emulate_step() then uses that information
      to execute loads, stores, cache operations, mfmsr, mtmsr[d], and
      (on 64-bit) sc instructions.
      
      The reason for doing this is so that the KVM code can use it instead
      of having its own separate instruction emulation code.  Possibly the
      alignment interrupt handler could also use this.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
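The split can be illustrated with a toy instruction set. Everything here is a sketch under stated assumptions: the 3-opcode encoding, the `demo_` names and the return convention (1 = done, 0 = op filled in, -1 = unknown) are invented for the example and are not the powerpc decoder.

```c
#include <stddef.h>
#include <stdint.h>

enum demo_op_type { DEMO_OP_LOAD, DEMO_OP_STORE };

struct demo_regs {
    uint64_t gpr[4];
};

/* Details handed from the analysis step to the execution step. */
struct demo_instruction_op {
    enum demo_op_type type;
    int reg;
    uint32_t addr;
};

/* Toy encoding: bits 31-28 opcode, 25-24 register, 23-0 immediate. */
enum { DEMO_OPC_ADDI = 1, DEMO_OPC_LOAD = 2, DEMO_OPC_STORE = 3 };

/* Decode only. Register-only instructions are completed here. */
static int demo_analyse_instr(struct demo_instruction_op *op,
                              struct demo_regs *regs, uint32_t instr)
{
    uint32_t opcode = instr >> 28;
    int reg = (int)((instr >> 24) & 3);
    uint32_t imm = instr & 0xffffffu;

    switch (opcode) {
    case DEMO_OPC_ADDI:
        regs->gpr[reg] += imm;      /* no memory access needed */
        return 1;
    case DEMO_OPC_LOAD:
    case DEMO_OPC_STORE:
        op->type = (opcode == DEMO_OPC_LOAD) ? DEMO_OP_LOAD : DEMO_OP_STORE;
        op->reg = reg;
        op->addr = imm;
        return 0;                   /* caller must perform the access */
    default:
        return -1;
    }
}

/* Full emulation: analyse first, then do any load or store. */
static int demo_emulate_step(struct demo_regs *regs, uint8_t *mem,
                             size_t mem_len, uint32_t instr)
{
    struct demo_instruction_op op;
    int done = demo_analyse_instr(&op, regs, instr);

    if (done != 0)
        return done;
    if (op.addr >= mem_len)
        return -1;
    if (op.type == DEMO_OP_LOAD)
        regs->gpr[op.reg] = mem[op.addr];
    else
        mem[op.addr] = (uint8_t)regs->gpr[op.reg];
    return 1;
}
```

A second consumer, such as a hypervisor with its own view of guest memory, could call the analysis function alone and perform the access itself.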
    • powerpc: Check flat device tree version at boot · ad72a279
      Michael Ellerman authored
      In commit e6a6928c "of/fdt: Convert FDT functions to use libfdt",
      the kernel stopped supporting old flat device tree formats. The minimum
      supported version is now 0x10.
      
      There was a checking function added, early_init_dt_verify(), but it's
      not called on powerpc.
      
      The result is, if you boot with an old flat device tree, the kernel will
      fail to parse it correctly, think you have no memory etc. and hilarity
      ensues.
      
      We can't really fix it, but we can at least catch the fact that the
      device tree is in an unsupported format and panic(). We can't call
      BUG(), it's too early.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
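A hand-rolled sketch of such a version check, for illustration only. It relies on the flattened device tree header layout (big-endian 32-bit fields, magic 0xd00dfeed at offset 0, version at offset 20) and on the minimum version of 0x10 stated above. `demo_fdt_version_ok` is a hypothetical helper; the kernel goes through early_init_dt_verify() and libfdt instead.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FDT_MAGIC        0xd00dfeedu
#define DEMO_FDT_MIN_VERSION  0x10u
#define DEMO_FDT_VERSION_OFF  20u

/* Read a big-endian 32-bit value regardless of host endianness. */
static uint32_t demo_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 1 if the blob looks like a supported FDT, 0 otherwise. */
static int demo_fdt_version_ok(const uint8_t *blob, size_t len)
{
    if (len < DEMO_FDT_VERSION_OFF + 4)
        return 0;
    if (demo_be32(blob) != DEMO_FDT_MAGIC)
        return 0;
    return demo_be32(blob + DEMO_FDT_VERSION_OFF) >= DEMO_FDT_MIN_VERSION;
}
```

In the boot path the failing case would lead to panic() rather than a return value, as the message above explains.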
    • powerpc/powernv: Don't call generic code on offline cpus · d6a4f709
      Paul Mackerras authored
      On PowerNV platforms, when a CPU is offline, we put it into nap mode.
      It's possible that the CPU wakes up from nap mode while it is still
      offline due to a stray IPI.  A misdirected device interrupt could also
      potentially cause it to wake up.  In that circumstance, we need to clear
      the interrupt so that the CPU can go back to nap mode.
      
      In the past the clearing of the interrupt was accomplished by briefly
      enabling interrupts and allowing the normal interrupt handling code
      (do_IRQ() etc.) to handle the interrupt.  This has the problem that
      this code calls irq_enter() and irq_exit(), which call functions such
      as account_system_vtime() which use RCU internally.  Use of RCU is not
      permitted on offline CPUs and will trigger errors if RCU checking is
      enabled.
      
      To avoid calling into any generic code which might use RCU, we adopt
      a different method of clearing interrupts on offline CPUs.  Since we
      are on the PowerNV platform, we know that the system interrupt
      controller is a XICS being driven directly (i.e. not via hcalls) by
      the kernel.  Hence this adds a new icp_native_flush_interrupt()
      function to the native-mode XICS driver and arranges to call that
      when an offline CPU is woken from nap.  This new function reads the
      interrupt from the XICS.  If it is an IPI, it clears the IPI; if it
      is a device interrupt, it prints a warning and disables the source.
      Then it does the end-of-interrupt processing for the interrupt.
      
      The other thing that briefly enabling interrupts did was to check and
      clear the irq_happened flag in this CPU's PACA.  Therefore, after
      flushing the interrupt from the XICS, we also clear all bits except
      the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the
      irq_happened flag.  The PACA_IRQ_HARD_DIS flag is set by power7_nap()
      and is left set to indicate that interrupts are hard disabled.  This
      means we then have to ignore that flag in power7_nap(), which is
      reasonable since it doesn't indicate that any interrupt event needs
      servicing.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
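The two steps described above can be modelled roughly as follows. The flag values, vector numbers and `demo_` names are assumptions for illustration and are not taken from the PACA or XICS headers.

```c
#include <stdint.h>

/* Hypothetical irq_happened bits; only HARD_DIS matters for the sketch. */
#define DEMO_IRQ_HARD_DIS 0x01u
#define DEMO_IRQ_DBELL    0x02u
#define DEMO_IRQ_EE       0x04u

/* Hypothetical interrupt vectors reported by the controller. */
#define DEMO_VEC_SPURIOUS 0u
#define DEMO_VEC_IPI      2u

enum demo_flush_action {
    DEMO_NOTHING,         /* nothing pending */
    DEMO_CLEAR_IPI,       /* stray IPI: just clear it */
    DEMO_DISABLE_SOURCE   /* device interrupt: warn and disable the source */
};

/* Decide how to get rid of an interrupt seen on an offline cpu. */
static enum demo_flush_action demo_flush_action_for(unsigned vec)
{
    if (vec == DEMO_VEC_SPURIOUS)
        return DEMO_NOTHING;
    if (vec == DEMO_VEC_IPI)
        return DEMO_CLEAR_IPI;
    return DEMO_DISABLE_SOURCE;
}

/* Drop every pending-event bit, keeping only "hard disabled". */
static uint8_t demo_clear_irq_happened(uint8_t irq_happened)
{
    return (uint8_t)(irq_happened & DEMO_IRQ_HARD_DIS);
}
```

Neither step calls into generic interrupt entry code, which is the point of the change: nothing here can reach RCU on an offline cpu.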
    • powerpc: Use CONFIG_ARCH_HAS_FAST_MULTIPLIER · 423216ed
      Anton Blanchard authored
      I ran some tests to compare hash_64 using shifts and multiplies.
      The results for multiplies relative to shifts:
      
      POWER6: ~2x slower
      POWER7: ~2x faster
      POWER8: ~2x faster
      
      Now that we have a proper config option, select
      CONFIG_ARCH_HAS_FAST_MULTIPLIER on POWER7 and POWER8.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
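For reference, a user-space sketch of the two variants being compared. The constant and the shift sequence are written from memory of include/linux/hash.h from around that time and should be treated as assumptions. The shift sequence adds up to 1 - 2^18 - 2^51 + 2^54 - 2^57 + 2^61 + 2^63, which is the same value as the constant, so both functions compute the same hash.

```c
#include <stdint.h>

/* 2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1 */
#define DEMO_GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001ULL

/* Multiply variant, for cpus with a fast multiplier. 1 <= bits <= 64. */
static uint64_t demo_hash_64_mul(uint64_t val, unsigned bits)
{
    return (val * DEMO_GOLDEN_RATIO_PRIME_64) >> (64 - bits);
}

/* Shift-and-add variant of the same multiplication. 1 <= bits <= 64. */
static uint64_t demo_hash_64_shift(uint64_t val, unsigned bits)
{
    uint64_t hash = val;
    uint64_t n = val;

    n <<= 18; hash -= n;   /* - 2^18 */
    n <<= 33; hash -= n;   /* - 2^51 */
    n <<= 3;  hash += n;   /* + 2^54 */
    n <<= 3;  hash -= n;   /* - 2^57 */
    n <<= 4;  hash += n;   /* + 2^61 */
    n <<= 2;  hash += n;   /* + 2^63 */

    return hash >> (64 - bits);
}
```

The config option only selects which of the two equivalent forms is compiled, so the hash values themselves do not change.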