1. 23 Jun, 2016 10 commits
    • David S. Miller's avatar
      sparc64: Fix return from trap window fill crashes. · 9219a052
      David S. Miller authored
      [ Upstream commit 7cafc0b8 ]
      
      We must handle data access exception as well as memory address unaligned
      exceptions from return from trap window fill faults, not just normal
      TLB misses.
      
      Otherwise we can get an OOPS that looks like this:
      
      ld-linux.so.2(36808): Kernel bad sw trap 5 [#1]
      CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 #34
      task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
      TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
      TPC: <do_sparc64_fault+0x5c4/0x700>
      g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
      g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
      o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
      o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
      RPC: <do_sparc64_fault+0x5bc/0x700>
      l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
      l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
      i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
      i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
      I7: <user_rtt_fill_fixup+0x6c/0x7c>
      Call Trace:
       [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c
      
      The window trap handlers are slightly clever, the trap table entries for them are
      composed of two pieces of code.  First comes the code that actually performs
      the window fill or spill trap handling, and then there are three instructions at
      the end which are for exception processing.
      
      The userland register window fill handler is:
      
      	add	%sp, STACK_BIAS + 0x00, %g1;		\
      	ldxa	[%g1 + %g0] ASI, %l0;			\
      	mov	0x08, %g2;				\
      	mov	0x10, %g3;				\
      	ldxa	[%g1 + %g2] ASI, %l1;			\
      	mov	0x18, %g5;				\
      	ldxa	[%g1 + %g3] ASI, %l2;			\
      	ldxa	[%g1 + %g5] ASI, %l3;			\
      	add	%g1, 0x20, %g1;				\
      	ldxa	[%g1 + %g0] ASI, %l4;			\
      	ldxa	[%g1 + %g2] ASI, %l5;			\
      	ldxa	[%g1 + %g3] ASI, %l6;			\
      	ldxa	[%g1 + %g5] ASI, %l7;			\
      	add	%g1, 0x20, %g1;				\
      	ldxa	[%g1 + %g0] ASI, %i0;			\
      	ldxa	[%g1 + %g2] ASI, %i1;			\
      	ldxa	[%g1 + %g3] ASI, %i2;			\
      	ldxa	[%g1 + %g5] ASI, %i3;			\
      	add	%g1, 0x20, %g1;				\
      	ldxa	[%g1 + %g0] ASI, %i4;			\
      	ldxa	[%g1 + %g2] ASI, %i5;			\
      	ldxa	[%g1 + %g3] ASI, %i6;			\
      	ldxa	[%g1 + %g5] ASI, %i7;			\
      	restored;					\
      	retry; nop; nop; nop; nop;			\
      	b,a,pt	%xcc, fill_fixup_dax;			\
      	b,a,pt	%xcc, fill_fixup_mna;			\
      	b,a,pt	%xcc, fill_fixup;
      
      And the way this works is that if any of those memory accesses
      generate an exception, the exception handler can revector to one of
      those final three branch instructions depending upon which kind of
      exception the memory access took.  In this way, the fault handler
      doesn't have to know if it was a spill or a fill that it's handling
      the fault for.  It just always branches to the last instruction in
      the parent trap's handler.
      
      For example, for a regular fault, the code goes:
      
      winfix_trampoline:
      	rdpr	%tpc, %g3
      	or	%g3, 0x7c, %g3
      	wrpr	%g3, %tnpc
      	done
      
      All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
      trap time program counter, we'll get that final instruction in the
      trap handler.
      
      On return from trap, we have to pull the register window in but we do
      this by hand instead of just executing a "restore" instruction for
      several reasons.  The largest being that from Niagara and onward we
      simply don't have enough levels in the trap stack to fully resolve all
      possible exception cases of a window fault when we are already at
      trap level 1 (which we enter to get ready to return from the original
      trap).
      
      This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
      code branches directly to these to do the window fill by hand if
      necessary.  Now if you look at them, we'll see at the end:
      
      	    ba,a,pt    %xcc, user_rtt_fill_fixup;
      	    ba,a,pt    %xcc, user_rtt_fill_fixup;
      	    ba,a,pt    %xcc, user_rtt_fill_fixup;
      
      And oops, all three cases are handled like a fault.
      
      This doesn't work because each of these trap types (data access
      exception, memory address unaligned, and faults) store their auxiliary
      info in different registers to pass on to the C handler which does the
      real work.
      
      So in the case where the stack was unaligned, the unaligned trap
      handler sets up the arg registers one way, and then we branched to
      the fault handler which expects them setup another way.
      
      So the FAULT_TYPE_* value ends up basically being garbage, and
      randomly would generate the backtrace seen above.
      Reported-by: default avatarNick Alcock <nix@esperi.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9219a052
    • David S. Miller's avatar
      sparc: Harden signal return frame checks. · b3dbf811
      David S. Miller authored
      [ Upstream commit d11c2a0d ]
      
      All signal frames must be at least 16-byte aligned, because that is
      the alignment we explicitly create when we build signal return stack
      frames.
      
      All stack pointers must be at least 8-byte aligned.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b3dbf811
    • David S. Miller's avatar
      sparc64: Take ctx_alloc_lock properly in hugetlb_setup(). · acec837d
      David S. Miller authored
      [ Upstream commit 9ea46abe ]
      
      On cheetahplus chips we take the ctx_alloc_lock in order to
      modify the TLB lookup parameters for the indexed TLBs, which
      are stored in the context register.
      
      This is called with interrupts disabled, however ctx_alloc_lock
      is an IRQ safe lock, therefore we must take acquire/release it
      properly with spin_{lock,unlock}_irq().
      Reported-by: default avatarMeelis Roos <mroos@linux.ee>
      Tested-by: default avatarMeelis Roos <mroos@linux.ee>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      acec837d
    • Babu Moger's avatar
      sparc/PCI: Fix for panic while enabling SR-IOV · 11885c1b
      Babu Moger authored
      [ Upstream commit d0c31e02 ]
      
      We noticed this panic while enabling SR-IOV in sparc.
      
      mlx4_core: Mellanox ConnectX core driver v2.2-1 (Jan  1 2015)
      mlx4_core: Initializing 0007:01:00.0
      mlx4_core 0007:01:00.0: Enabling SR-IOV with 5 VFs
      mlx4_core: Initializing 0007:01:00.1
      Unable to handle kernel NULL pointer dereference
      insmod(10010): Oops [#1]
      CPU: 391 PID: 10010 Comm: insmod Not tainted
      		4.1.12-32.el6uek.kdump2.sparc64 #1
      TPC: <dma_supported+0x20/0x80>
      I7: <__mlx4_init_one+0x324/0x500 [mlx4_core]>
      Call Trace:
       [00000000104c5ea4] __mlx4_init_one+0x324/0x500 [mlx4_core]
       [00000000104c613c] mlx4_init_one+0xbc/0x120 [mlx4_core]
       [0000000000725f14] local_pci_probe+0x34/0xa0
       [0000000000726028] pci_call_probe+0xa8/0xe0
       [0000000000726310] pci_device_probe+0x50/0x80
       [000000000079f700] really_probe+0x140/0x420
       [000000000079fa24] driver_probe_device+0x44/0xa0
       [000000000079fb5c] __device_attach+0x3c/0x60
       [000000000079d85c] bus_for_each_drv+0x5c/0xa0
       [000000000079f588] device_attach+0x88/0xc0
       [000000000071acd0] pci_bus_add_device+0x30/0x80
       [0000000000736090] virtfn_add.clone.1+0x210/0x360
       [00000000007364a4] sriov_enable+0x2c4/0x520
       [000000000073672c] pci_enable_sriov+0x2c/0x40
       [00000000104c2d58] mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
       [00000000104c49ac] mlx4_load_one+0x42c/0xd40 [mlx4_core]
      Disabling lock debugging due to kernel taint
      Caller[00000000104c5ea4]: __mlx4_init_one+0x324/0x500 [mlx4_core]
      Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
      Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
      Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
      Caller[0000000000726310]: pci_device_probe+0x50/0x80
      Caller[000000000079f700]: really_probe+0x140/0x420
      Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
      Caller[000000000079fb5c]: __device_attach+0x3c/0x60
      Caller[000000000079d85c]: bus_for_each_drv+0x5c/0xa0
      Caller[000000000079f588]: device_attach+0x88/0xc0
      Caller[000000000071acd0]: pci_bus_add_device+0x30/0x80
      Caller[0000000000736090]: virtfn_add.clone.1+0x210/0x360
      Caller[00000000007364a4]: sriov_enable+0x2c4/0x520
      Caller[000000000073672c]: pci_enable_sriov+0x2c/0x40
      Caller[00000000104c2d58]: mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
      Caller[00000000104c49ac]: mlx4_load_one+0x42c/0xd40 [mlx4_core]
      Caller[00000000104c5f90]: __mlx4_init_one+0x410/0x500 [mlx4_core]
      Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
      Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
      Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
      Caller[0000000000726310]: pci_device_probe+0x50/0x80
      Caller[000000000079f700]: really_probe+0x140/0x420
      Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
      Caller[000000000079fb08]: __driver_attach+0x88/0xa0
      Caller[000000000079d90c]: bus_for_each_dev+0x6c/0xa0
      Caller[000000000079f29c]: driver_attach+0x1c/0x40
      Caller[000000000079e35c]: bus_add_driver+0x17c/0x220
      Caller[00000000007a02d4]: driver_register+0x74/0x120
      Caller[00000000007263fc]: __pci_register_driver+0x3c/0x60
      Caller[00000000104f62bc]: mlx4_init+0x60/0xcc [mlx4_core]
      Kernel panic - not syncing: Fatal exception
      Press Stop-A (L1-A) to return to the boot prom
      ---[ end Kernel panic - not syncing: Fatal exception
      
      Details:
      Here is the call sequence
      virtfn_add->__mlx4_init_one->dma_set_mask->dma_supported
      
      The panic happened at line 760(file arch/sparc/kernel/iommu.c)
      
      758 int dma_supported(struct device *dev, u64 device_mask)
      759 {
      760         struct iommu *iommu = dev->archdata.iommu;
      761         u64 dma_addr_mask = iommu->dma_addr_mask;
      762
      763         if (device_mask >= (1UL << 32UL))
      764                 return 0;
      765
      766         if ((device_mask & dma_addr_mask) == dma_addr_mask)
      767                 return 1;
      768
      769 #ifdef CONFIG_PCI
      770         if (dev_is_pci(dev))
      771		return pci64_dma_supported(to_pci_dev(dev), device_mask);
      772 #endif
      773
      774         return 0;
      775 }
      776 EXPORT_SYMBOL(dma_supported);
      
      Same panic happened with Intel ixgbe driver also.
      
      SR-IOV code looks for arch specific data while enabling
      VFs. When VF device is added, driver probe function makes set
      of calls to initialize the pci device. Because the VF device is
      added different way than the normal PF device(which happens via
      of_create_pci_dev for sparc), some of the arch specific initialization
      does not happen for VF device.  That causes panic when archdata is
      accessed.
      
      To fix this, I have used already defined weak function
      pcibios_setup_device to copy archdata from PF to VF.
      Also verified the fix.
      Signed-off-by: default avatarBabu Moger <babu.moger@oracle.com>
      Signed-off-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Reviewed-by: default avatarEthan Zhao <ethan.zhao@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      11885c1b
    • David S. Miller's avatar
      sparc64: Fix sparc64_set_context stack handling. · b98a9a51
      David S. Miller authored
      [ Upstream commit 397d1533 ]
      
      Like a signal return, we should use synchronize_user_stack() rather
      than flush_user_windows().
      Reported-by: default avatarIlya Malakhov <ilmalakhovthefirst@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b98a9a51
    • David S. Miller's avatar
      sparc64: Fix bootup regressions on some Kconfig combinations. · 40fd3377
      David S. Miller authored
      [ Upstream commit 49fa5230 ]
      
      The system call tracing bug fix mentioned in the Fixes tag
      below increased the amount of assembler code in the sequence
      of assembler files included by head_64.S
      
      This caused to total set of code to exceed 0x4000 bytes in
      size, which overflows the expression in head_64.S that works
      to place swapper_tsb at address 0x408000.
      
      When this is violated, the TSB is not properly aligned, and
      also the trap table is not aligned properly either.  All of
      this together results in failed boots.
      
      So, do two things:
      
      1) Simplify some code by using ba,a instead of ba/nop to get
         those bytes back.
      
      2) Add a linker script assertion to make sure that if this
         happens again the build will fail.
      
      Fixes: 1a40b953 ("sparc: Fix system call tracing register handling.")
      Reported-by: default avatarMeelis Roos <mroos@linux.ee>
      Reported-by: default avatarJoerg Abraham <joerg.abraham@nokia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      40fd3377
    • Mike Frysinger's avatar
      sparc: Fix system call tracing register handling. · 3428d1d1
      Mike Frysinger authored
      [ Upstream commit 1a40b953 ]
      
      A system call trace trigger on entry allows the tracing
      process to inspect and potentially change the traced
      process's registers.
      
      Account for that by reloading the %g1 (syscall number)
      and %i0-%i5 (syscall argument) values.  We need to be
      careful to revalidate the range of %g1, and reload the
      system call table entry it corresponds to into %l7.
      Reported-by: default avatarMike Frysinger <vapier@gentoo.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Tested-by: default avatarMike Frysinger <vapier@gentoo.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3428d1d1
    • Russell Currey's avatar
      powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge · f856f1da
      Russell Currey authored
      commit 871e178e upstream.
      
      In the "ibm,configure-pe" and "ibm,configure-bridge" RTAS calls, the
      spec states that values of 9900-9905 can be returned, indicating that
      software should delay for 10^x (where x is the last digit, i.e. 990x)
      milliseconds and attempt the call again. Currently, the kernel doesn't
      know about this, and respecting it fixes some PCI failures when the
      hypervisor is busy.
      
      The delay is capped at 0.2 seconds.
      Signed-off-by: default avatarRussell Currey <ruscur@russell.cc>
      Acked-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f856f1da
    • Ralf Baechle's avatar
      MIPS: Fix 64k page support for 32 bit kernels. · 05546689
      Ralf Baechle authored
      commit d7de4134 upstream.
      
      TASK_SIZE was defined as 0x7fff8000UL which for 64k pages is not a
      multiple of the page size.  Somewhere further down the math fails
      such that executing an ELF binary fails.
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Tested-by: default avatarJoshua Henderson <joshua.henderson@microchip.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      05546689
    • Taku Izumi's avatar
      PCI/AER: Clear error status registers during enumeration and restore · 3d4036cb
      Taku Izumi authored
      commit b07461a8 upstream.
      
      AER errors might be recorded when powering-on devices.  These errors can be
      ignored, so firmware usually clears them before the OS enumerates devices.
      However, firmware is not involved when devices are added via hotplug, so
      the OS may discover power-up errors that should be ignored.  The same may
      happen when powering up devices when resuming after suspend.
      
      Clear the AER error status registers during enumeration and resume.
      
      [bhelgaas: changelog, remove repetitive comments]
      Signed-off-by: default avatarTaku Izumi <izumi.taku@jp.fujitsu.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3d4036cb
  2. 15 Jun, 2016 30 commits