1. 30 Nov, 2008 6 commits
    • Paul Mackerras's avatar
      powerpc: Fix system calls on Cell entered with XER.SO=1 · ab598b66
      Paul Mackerras authored
      It turns out that on Cell, on a kernel with CONFIG_VIRT_CPU_ACCOUNTING
      = y, if a program sets the SO (summary overflow) bit in the XER and
      then does a system call, the SO bit in CR0 will be set on return
      regardless of whether the system call detected an error.  Since CR0.SO
      is used as the error indication from the system call, this means that
      all system calls appear to fail.
      
      The reason is that the workaround for the timebase bug on Cell uses a
      compare instruction.  With CONFIG_VIRT_CPU_ACCOUNTING = y, the
      ACCOUNT_CPU_USER_ENTRY macro reads the timebase, so we end up doing a
      compare instruction, which copies XER.SO to CR0.SO.  Since we were
      doing this in the system call entry patch after clearing CR0.SO but
      before saving the CR, this meant that the saved CR image had CR0.SO
      set if XER.SO was set on entry.
      
      This fixes it by moving the clearing of CR0.SO to after the
      ACCOUNT_CPU_USER_ENTRY call in the system call entry path.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ab598b66
    • Arnd Bergmann's avatar
      powerpc/cell: Fix GDB watchpoints, again · 960cedb4
      Arnd Bergmann authored
      An earlier patch from Jens Osterkamp attempted to fix GDB
      watchpoints by enabling the DABRX register at boot time.
      Unfortunately, this did not work on SMP setups, where
      secondary CPUs were still using the power-on DABRX value.
      
      This introduces the same change for secondary CPUs on cell
      as well.
      Reported-by: default avatarUlrich Weigand <Ulrich.Weigand@de.ibm.com>
      Tested-by: default avatarUlrich Weigand <Ulrich.Weigand@de.ibm.com>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      960cedb4
    • Arnd Bergmann's avatar
      powerpc/mpic: Don't reset affinity for secondary MPIC on boot · cc353c30
      Arnd Bergmann authored
      Kexec/kdump currently fails on the IBM QS2x blades when the kexec happens
      on a CPU other than the initial boot CPU.  It turns out that this is the
      result of mpic_init trying to set affinity of each interrupt vector to the
      current boot CPU.
      
      As far as I can tell,  the same problem is likely to exist on any
      secondary MPIC, because they have to deliver interrupts to the first
      output all the time. There are two potential solutions for this: either
      not set up affinity at all for secondary MPICs, or assume that a single
      CPU output is connected to the upstream interrupt controller and hardcode
      affinity to that per architecture.
      
      This patch implements the second approach, defaulting to the first output.
      Currently, all known secondary MPICs are routed to their upstream port
      using the first destination, so we hardcode that.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      cc353c30
    • Arnd Bergmann's avatar
      powerpc/cell/axon-msi: Retry on missing interrupt · d015fe99
      Arnd Bergmann authored
      The MSI capture logic for the axon bridge can sometimes
      lose interrupts in case of high DMA and interrupt load,
      when it signals an MSI interrupt to the MPIC interrupt
      controller while we are already handling another MSI.
      
      Each MSI vector gets written into a FIFO buffer in main
      memory using DMA, and that DMA access is normally flushed
      by the actual interrupt packet on the IOIF.  An MMIO
      register in the MSIC holds the position of the last
      entry in the FIFO buffer that was written.  However,
      reading that position does not flush the DMA, so that
      we can observe stale data in the buffer.
      
      In a stress test, we have observed the DMA to arrive
      up to 14 microseconds after reading the register.
      
      This patch works around this problem by retrying the
      access to the FIFO buffer.
      
      We can reliably detect the conditioning by writing
      an invalid MSI vector into the FIFO buffer after
      reading from it, assuming that all MSIs we get
      are valid.  After detecting an invalid MSI vector,
      we udelay(1) in the interrupt cascade for up to
      100 times before giving up.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      d015fe99
    • Dave Hansen's avatar
      powerpc: Fix boot freeze on machine with empty memory node · 4a618669
      Dave Hansen authored
      I got a bug report about a distro kernel not booting on a particular
      machine.  It would freeze during boot:
      
      > ...
      > Could not find start_pfn for node 1
      > [boot]0015 Setup Done
      > Built 2 zonelists in Node order, mobility grouping on.  Total pages: 123783
      > Policy zone: DMA
      > Kernel command line:
      > [boot]0020 XICS Init
      > [boot]0021 XICS Done
      > PID hash table entries: 4096 (order: 12, 32768 bytes)
      > clocksource: timebase mult[7d0000] shift[22] registered
      > Console: colour dummy device 80x25
      > console handover: boot [udbg0] -> real [hvc0]
      > Dentry cache hash table entries: 1048576 (order: 7, 8388608 bytes)
      > Inode-cache hash table entries: 524288 (order: 6, 4194304 bytes)
      > freeing bootmem node 0
      
      I've reproduced this on 2.6.27.7.  It is caused by commit
      8f64e1f2 ("powerpc: Reserve in bootmem
      lmb reserved regions that cross NUMA nodes").
      
      The problem is that Jon took a loop which was (in pseudocode):
      
      	for_each_node(nid)
      		NODE_DATA(nid) = careful_alloc(nid);
      		setup_bootmem(nid);
      		reserve_node_bootmem(nid);
      
      and broke it up into:
      
      	for_each_node(nid)
      		NODE_DATA(nid) = careful_alloc(nid);
      		setup_bootmem(nid);
      	for_each_node(nid)
      		reserve_node_bootmem(nid);
      
      The issue comes in when the 'careful_alloc()' is called on a node with
      no memory.  It falls back to using bootmem from a previously-initialized
      node.  But, bootmem has not yet been reserved when Jon's patch is
      applied.  It gives back bogus memory (0xc000000000000000) and pukes
      later in boot.
      
      The following patch collapses the loop back together.  It also breaks
      the mark_reserved_regions_for_nid() code out into a function and adds
      some comments.  I think a huge part of introducing this bug is because
      for loop was too long and hard to read.
      
      The actual bug fix here is the:
      
      +		if (end_pfn <= node->node_start_pfn ||
      +		    start_pfn >= node_end_pfn)
      +			continue;
      Signed-off-by: default avatarDave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      4a618669
    • Adhemerval Zanella's avatar
      powerpc: Fix IRQ assignment for some PCIe devices · 4b824de9
      Adhemerval Zanella authored
      Currently, some PCIe devices on POWER6 machines do not get interrupts
      assigned correctly.  The problem is that OF doesn't create an
      "interrupt" property for them.  The fix is for of_irq_map_pci to fall
      back to using the value in the PCI interrupt-pin register in config
      space, as we do when there is no OF device-tree node for the device.
      
      I have verified that this works fine with a pair of Squib-E SAS
      adapter on a P6-570.
      Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      4b824de9
  2. 24 Nov, 2008 3 commits
  3. 20 Nov, 2008 1 commit
    • Jeremy Kerr's avatar
      powerpc/spufs: Fix spinning in spufs_ps_fault on signal · 60657263
      Jeremy Kerr authored
      Currently, we can end up in an infinite loop if we get a signal
      while the kernel has faulted in spufs_ps_fault. Eg:
      
       alarm(1);
      
       write(fd, some_spu_psmap_register_address, 4);
      
      - the write's copy_from_user will fault on the ps mapping, and
      signal_pending will be non-zero. Because returning from the fault
      handler will never clear TIF_SIGPENDING, so we'll just keep faulting,
      resulting in an unkillable process using 100% of CPU.
      
      This change returns VM_FAULT_SIGBUS if there's a fatal signal pending,
      letting us escape the loop.
      Signed-off-by: default avatarJeremy Kerr <jk@ozlabs.org>
      60657263
  4. 19 Nov, 2008 3 commits
  5. 18 Nov, 2008 25 commits
  6. 17 Nov, 2008 2 commits