1. 02 Jun, 2017 14 commits
    • Michal Suchanek's avatar
      powerpc/fadump: Return error when fadump registration fails · 98b8cd7f
      Michal Suchanek authored
       - log an error message when registration fails and no error code listed
         in the switch is returned
       - translate the hv error code to posix error code and return it from
         fw_register
       - return the posix error code from fw_register to the process writing
         to sysfs
       - return EEXIST on re-registration
       - return success on deregistration when fadump is not registered
       - return ENODEV when no memory is reserved for fadump
      Signed-off-by: default avatarMichal Suchanek <msuchanek@suse.de>
      Tested-by: default avatarHari Bathini <hbathini@linux.vnet.ibm.com>
      [mpe: Use pr_err() to shrink the error print]
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      98b8cd7f
    • Christophe Leroy's avatar
      powerpc: Remove __ilog2()s and use generic ones · f782ddf2
      Christophe Leroy authored
      With the __ilog2() function as defined in
      arch/powerpc/include/asm/bitops.h, GCC will not optimise the code
      in case of constant parameter.
      
      The generic ilog2() function in include/linux/log2.h is written
      to handle the case of the constant parameter.
      
      This patch discards the three __ilog2() functions and
      defines __ilog2() as ilog2()
      
      For non constant calls, the generated code is doing the same:
      int test__ilog2(unsigned long x)
      {
      	return __ilog2(x);
      }
      
      int test__ilog2_u32(u32 n)
      {
      	return __ilog2_u32(n);
      }
      
      int test__ilog2_u64(u64 n)
      {
      	return __ilog2_u64(n);
      }
      
      On PPC32 before the patch:
      00000000 <test__ilog2>:
         0:	7c 63 00 34 	cntlzw  r3,r3
         4:	20 63 00 1f 	subfic  r3,r3,31
         8:	4e 80 00 20 	blr
      
      0000000c <test__ilog2_u32>:
         c:	7c 63 00 34 	cntlzw  r3,r3
        10:	20 63 00 1f 	subfic  r3,r3,31
        14:	4e 80 00 20 	blr
      
      On PPC32 after the patch:
      00000000 <test__ilog2>:
         0:	7c 63 00 34 	cntlzw  r3,r3
         4:	20 63 00 1f 	subfic  r3,r3,31
         8:	4e 80 00 20 	blr
      
      0000000c <test__ilog2_u32>:
         c:	7c 63 00 34 	cntlzw  r3,r3
        10:	20 63 00 1f 	subfic  r3,r3,31
        14:	4e 80 00 20 	blr
      
      On PPC64 before the patch:
      0000000000000000 <.test__ilog2>:
         0:	7c 63 00 74 	cntlzd  r3,r3
         4:	20 63 00 3f 	subfic  r3,r3,63
         8:	7c 63 07 b4 	extsw   r3,r3
         c:	4e 80 00 20 	blr
      
      0000000000000010 <.test__ilog2_u32>:
        10:	7c 63 00 34 	cntlzw  r3,r3
        14:	20 63 00 1f 	subfic  r3,r3,31
        18:	7c 63 07 b4 	extsw   r3,r3
        1c:	4e 80 00 20 	blr
      
      0000000000000020 <.test__ilog2_u64>:
        20:	7c 63 00 74 	cntlzd  r3,r3
        24:	20 63 00 3f 	subfic  r3,r3,63
        28:	7c 63 07 b4 	extsw   r3,r3
        2c:	4e 80 00 20 	blr
      
      On PPC64 after the patch:
      0000000000000000 <.test__ilog2>:
         0:	7c 63 00 74 	cntlzd  r3,r3
         4:	20 63 00 3f 	subfic  r3,r3,63
         8:	7c 63 07 b4 	extsw   r3,r3
         c:	4e 80 00 20 	blr
      
      0000000000000010 <.test__ilog2_u32>:
        10:	7c 63 00 34 	cntlzw  r3,r3
        14:	20 63 00 1f 	subfic  r3,r3,31
        18:	7c 63 07 b4 	extsw   r3,r3
        1c:	4e 80 00 20 	blr
      
      0000000000000020 <.test__ilog2_u64>:
        20:	7c 63 00 74 	cntlzd  r3,r3
        24:	20 63 00 3f 	subfic  r3,r3,63
        28:	7c 63 07 b4 	extsw   r3,r3
        2c:	4e 80 00 20 	blr
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      f782ddf2
    • Christophe Leroy's avatar
      powerpc: Replace ffz() by equivalent generic function · 22ef33b3
      Christophe Leroy authored
      With the ffz() function as defined in arch/powerpc/include/asm/bitops.h
      GCC will not optimise the code in case of constant parameter.
      
      This patch replaces ffz() by the generic function.
      
      The generic ffz(x) expects to never be called with ~x == 0
      as written in the comment in include/asm-generic/bitops/ffz.h
      The only user of ffz() within arch/powerpc/ is
      platforms/512x/mpc5121_ads_cpld.c, which checks if x is not 0xff
      
      For non constant calls, the generated code is doing the same:
      
      unsigned long testffz(unsigned long x)
      {
      	return ffz(x);
      }
      
      On PPC32, before the patch:
      00000018 <testffz>:
        18:	7c 63 18 f9 	not.    r3,r3
        1c:	40 82 00 0c 	bne     28 <testffz+0x10>
        20:	38 60 00 20 	li      r3,32
        24:	4e 80 00 20 	blr
        28:	7d 23 00 d0 	neg     r9,r3
        2c:	7d 23 18 38 	and     r3,r9,r3
        30:	7c 63 00 34 	cntlzw  r3,r3
        34:	20 63 00 1f 	subfic  r3,r3,31
        38:	4e 80 00 20 	blr
      
      On PPC32, after the patch:
      00000018 <testffz>:
        18:	39 23 00 01 	addi    r9,r3,1
        1c:	7d 23 18 78 	andc    r3,r9,r3
        20:	7c 63 00 34 	cntlzw  r3,r3
        24:	20 63 00 1f 	subfic  r3,r3,31
        28:	4e 80 00 20 	blr
      
      On PPC64, before the patch:
      0000000000000030 <.testffz>:
        30:	7c 60 18 f9 	not.    r0,r3
        34:	38 60 00 40 	li      r3,64
        38:	4d 82 00 20 	beqlr
        3c:	7c 60 00 d0 	neg     r3,r0
        40:	7c 63 00 38 	and     r3,r3,r0
        44:	7c 63 00 74 	cntlzd  r3,r3
        48:	20 63 00 3f 	subfic  r3,r3,63
        4c:	7c 63 07 b4 	extsw   r3,r3
        50:	4e 80 00 20 	blr
      
      On PPC64, after the patch:
      0000000000000030 <.testffz>:
        30:	38 03 00 01 	addi    r0,r3,1
        34:	7c 03 18 78 	andc    r3,r0,r3
        38:	7c 63 00 74 	cntlzd  r3,r3
        3c:	20 63 00 3f 	subfic  r3,r3,63
        40:	4e 80 00 20 	blr
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      22ef33b3
    • Christophe Leroy's avatar
      powerpc: Use builtin functions for fls()/__fls()/fls64() · 2fcff790
      Christophe Leroy authored
      With the fls() functions as defined in arch/powerpc/include/asm/bitops.h
      GCC will not optimise the code in case of constant parameter.
      
      This patch replaces __fls() by the builtin function, and modifies
      fls() and fls64() to use builtins instead of inline assembly
      
      For non constant calls, the generated code is doing the same:
      
      int testfls(unsigned int x)
      {
      	return fls(x);
      }
      
      unsigned long test__fls(unsigned long x)
      {
      	return __fls(x);
      }
      
      int testfls64(__u64 x)
      {
      	return fls64(x);
      }
      
      On PPC32, before the patch:
      00000064 <testfls>:
        64:	7c 63 00 34 	cntlzw  r3,r3
        68:	20 63 00 20 	subfic  r3,r3,32
        6c:	4e 80 00 20 	blr
      
      00000070 <test__fls>:
        70:	7c 63 00 34 	cntlzw  r3,r3
        74:	20 63 00 1f 	subfic  r3,r3,31
        78:	4e 80 00 20 	blr
      
      0000007c <testfls64>:
        7c:	2c 03 00 00 	cmpwi   r3,0
        80:	40 82 00 10 	bne     90 <testfls64+0x14>
        84:	7c 83 00 34 	cntlzw  r3,r4
        88:	20 63 00 20 	subfic  r3,r3,32
        8c:	4e 80 00 20 	blr
        90:	7c 63 00 34 	cntlzw  r3,r3
        94:	20 63 00 40 	subfic  r3,r3,64
        98:	4e 80 00 20 	blr
      
      On PPC32, after the patch:
      00000054 <testfls>:
        54:	7c 63 00 34 	cntlzw  r3,r3
        58:	20 63 00 20 	subfic  r3,r3,32
        5c:	4e 80 00 20 	blr
      
      00000060 <test__fls>:
        60:	7c 63 00 34 	cntlzw  r3,r3
        64:	20 63 00 1f 	subfic  r3,r3,31
        68:	4e 80 00 20 	blr
      
      0000006c <testfls64>:
        6c:	2c 03 00 00 	cmpwi   r3,0
        70:	41 82 00 10 	beq     80 <testfls64+0x14>
        74:	7c 63 00 34 	cntlzw  r3,r3
        78:	20 63 00 40 	subfic  r3,r3,64
        7c:	4e 80 00 20 	blr
        80:	7c 83 00 34 	cntlzw  r3,r4
        84:	20 63 00 40 	subfic  r3,r3,32
        88:	4e 80 00 20 	blr
      
      On PPC64, before the patch:
      00000000000000a0 <.testfls>:
        a0:	7c 63 00 34 	cntlzw  r3,r3
        a4:	20 63 00 20 	subfic  r3,r3,32
        a8:	7c 63 07 b4 	extsw   r3,r3
        ac:	4e 80 00 20 	blr
      
      00000000000000b0 <.test__fls>:
        b0:	7c 63 00 74 	cntlzd  r3,r3
        b4:	20 63 00 3f 	subfic  r3,r3,63
        b8:	7c 63 07 b4 	extsw   r3,r3
        bc:	4e 80 00 20 	blr
      
      00000000000000c0 <.testfls64>:
        c0:	7c 63 00 74 	cntlzd  r3,r3
        c4:	20 63 00 40 	subfic  r3,r3,64
        c8:	7c 63 07 b4 	extsw   r3,r3
        cc:	4e 80 00 20 	blr
      
      On PPC64, after the patch:
      0000000000000090 <.testfls>:
        90:	7c 63 00 34 	cntlzw  r3,r3
        94:	20 63 00 20 	subfic  r3,r3,32
        98:	7c 63 07 b4 	extsw   r3,r3
        9c:	4e 80 00 20 	blr
      
      00000000000000a0 <.test__fls>:
        a0:	7c 63 00 74 	cntlzd  r3,r3
        a4:	20 63 00 3f 	subfic  r3,r3,63
        a8:	4e 80 00 20 	blr
        ac:	60 00 00 00 	nop
      
      00000000000000b0 <.testfls64>:
        b0:	7c 63 00 74 	cntlzd  r3,r3
        b4:	20 63 00 40 	subfic  r3,r3,64
        b8:	7c 63 07 b4 	extsw   r3,r3
        bc:	4e 80 00 20 	blr
      
      Those builtins have been in GCC since at least 3.4.6 (see
      https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Other-Builtins.html )
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      2fcff790
    • Christophe Leroy's avatar
      powerpc: Discard ffs()/__ffs() function and use builtin functions instead · f83647d6
      Christophe Leroy authored
      With the ffs() function as defined in arch/powerpc/include/asm/bitops.h
      GCC will not optimise the code in case of constant parameter, as shown
      by the small exemple below.
      
      int ffs_test(void)
      {
      	return 4 << ffs(31);
      }
      
      c0012334 <ffs_test>:
      c0012334:       39 20 00 01     li      r9,1
      c0012338:       38 60 00 04     li      r3,4
      c001233c:       7d 29 00 34     cntlzw  r9,r9
      c0012340:       21 29 00 20     subfic  r9,r9,32
      c0012344:       7c 63 48 30     slw     r3,r3,r9
      c0012348:       4e 80 00 20     blr
      
      With this patch, the same function will compile as follows:
      
      c0012334 <ffs_test>:
      c0012334:       38 60 00 08     li      r3,8
      c0012338:       4e 80 00 20     blr
      
      The same happens with __ffs()
      
      For non constant calls, the generated code is doing the same,
      allthought it is slightly different on 64 bits for ffs():
      
      unsigned long test__ffs(unsigned long x)
      {
      	return __ffs(x);
      }
      
      int testffs(int x)
      {
      	return ffs(x);
      }
      
      On PPC32, before the patch:
      0000003c <test__ffs>:
        3c:	7d 23 00 d0 	neg     r9,r3
        40:	7d 23 18 38 	and     r3,r9,r3
        44:	7c 63 00 34 	cntlzw  r3,r3
        48:	20 63 00 1f 	subfic  r3,r3,31
        4c:	4e 80 00 20 	blr
      
      00000050 <testffs>:
        50:	7d 23 00 d0 	neg     r9,r3
        54:	7d 23 18 38 	and     r3,r9,r3
        58:	7c 63 00 34 	cntlzw  r3,r3
        5c:	20 63 00 20 	subfic  r3,r3,32
        60:	4e 80 00 20 	blr
      
      On PPC32, after the patch:
      0000002c <test__ffs>:
        2c:	7d 23 00 d0 	neg     r9,r3
        30:	7d 23 18 38 	and     r3,r9,r3
        34:	7c 63 00 34 	cntlzw  r3,r3
        38:	20 63 00 1f 	subfic  r3,r3,31
        3c:	4e 80 00 20 	blr
      
      00000040 <testffs>:
        40:	7d 23 00 d0 	neg     r9,r3
        44:	7d 23 18 38 	and     r3,r9,r3
        48:	7c 63 00 34 	cntlzw  r3,r3
        4c:	20 63 00 20 	subfic  r3,r3,32
        50:	4e 80 00 20 	blr
      
      On PPC64, before the patch:
      0000000000000060 <.test__ffs>:
        60:	7c 03 00 d0 	neg     r0,r3
        64:	7c 03 18 38 	and     r3,r0,r3
        68:	7c 63 00 74 	cntlzd  r3,r3
        6c:	20 63 00 3f 	subfic  r3,r3,63
        70:	7c 63 07 b4 	extsw   r3,r3
        74:	4e 80 00 20 	blr
      
      0000000000000080 <.testffs>:
        80:	7c 03 00 d0 	neg     r0,r3
        84:	7c 03 18 38 	and     r3,r0,r3
        88:	7c 63 00 74 	cntlzd  r3,r3
        8c:	20 63 00 40 	subfic  r3,r3,64
        90:	7c 63 07 b4 	extsw   r3,r3
        94:	4e 80 00 20 	blr
      
      On PPC64, after the patch:
      0000000000000050 <.test__ffs>:
        50:	7c 03 00 d0 	neg     r0,r3
        54:	7c 03 18 38 	and     r3,r0,r3
        58:	7c 63 00 74 	cntlzd  r3,r3
        5c:	20 63 00 3f 	subfic  r3,r3,63
        60:	4e 80 00 20 	blr
      
      0000000000000070 <.testffs>:
        70:	7c 03 00 d0 	neg     r0,r3
        74:	7c 03 18 38 	and     r3,r0,r3
        78:	7c 63 00 34 	cntlzw  r3,r3
        7c:	20 63 00 20 	subfic  r3,r3,32
        80:	7c 63 07 b4 	extsw   r3,r3
        84:	4e 80 00 20 	blr
      (ffs() operates on an int so cntlzw is equivalent to cntlzd)
      
      In addition, when reading the generated vmlinux, we can observe
      that with the builtin functions, GCC sometimes efficiently spreads
      the instructions within the generated functions while the inline
      assembly force them to remain grouped together.
      
      __builtin_ffs() is already used in arch/powerpc/include/asm/page_32.h
      
      Those builtins have been in GCC since at least 3.4.6 (see
      https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Other-Builtins.html )
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      f83647d6
    • Christophe Leroy's avatar
      powerpc: Handle simultaneous interrupts at once · 45cb08f4
      Christophe Leroy authored
      It often happens to have simultaneous interrupts, for instance
      when having double Ethernet attachment. With the current
      implementation, we suffer the cost of kernel entry/exit for each
      interrupt.
      
      This patch introduces a loop in __do_irq() to handle all interrupts
      at once before returning.
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      45cb08f4
    • Christophe Leroy's avatar
      powerpc/8xx: fix mpc8xx_get_irq() return on no irq · 3c29b603
      Christophe Leroy authored
      IRQ 0 is a valid HW interrupt. So get_irq() shall return 0 when
      there is no irq, instead of returning irq_linear_revmap(... ,0)
      
      Fixes: f2a0bd37 ("[POWERPC] 8xx: powerpc port of core CPM PIC")
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      3c29b603
    • Christophe Leroy's avatar
    • Christophe Leroy's avatar
      powerpc/mm: The 8xx doesn't call do_page_fault() for breakpoints · 92aa2fe0
      Christophe Leroy authored
      The 8xx has a dedicated exception for breakpoints, that directly
      calls do_break()
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      92aa2fe0
    • Christophe Leroy's avatar
      powerpc/mm: Evaluate user_mode(regs) only once in do_page_fault() · da929f6a
      Christophe Leroy authored
      Analysis of the assembly code shows that when using user_mode(regs),
      at least the 'andi.' is redone all the time, and also
      the 'lwz ,132(r31)' most of the time. With the new form, the 'is_user'
      is mapped to cr4, then all further use of is_user results in just
      things like 'beq cr4,218 <do_page_fault+0x218>'
      
      Without the patch:
      
        50:	81 1e 00 84 	lwz     r8,132(r30)
        54:	71 09 40 00 	andi.   r9,r8,16384
        58:	40 82 00 0c 	bne     64 <do_page_fault+0x64>
      
        84:	81 3e 00 84 	lwz     r9,132(r30)
        8c:	71 2a 40 00 	andi.   r10,r9,16384
        90:	41 a2 01 64 	beq     1f4 <do_page_fault+0x1f4>
      
        d4:	81 3e 00 84 	lwz     r9,132(r30)
        dc:	71 28 40 00 	andi.   r8,r9,16384
        e0:	41 82 02 08 	beq     2e8 <do_page_fault+0x2e8>
      
       108:	81 3e 00 84 	lwz     r9,132(r30)
       110:	71 28 40 00 	andi.   r8,r9,16384
       118:	41 82 02 28 	beq     340 <do_page_fault+0x340>
      
       1e4:	81 3e 00 84 	lwz     r9,132(r30)
       1e8:	71 2a 40 00 	andi.   r10,r9,16384
       1ec:	40 82 01 68 	bne     354 <do_page_fault+0x354>
      
       228:	81 3e 00 84 	lwz     r9,132(r30)
       22c:	71 28 40 00 	andi.   r8,r9,16384
       230:	41 82 ff c4 	beq     1f4 <do_page_fault+0x1f4>
      
       288:	71 2a 40 00 	andi.   r10,r9,16384
       294:	41 a2 fe 60 	beq     f4 <do_page_fault+0xf4>
      
       50c:	81 3e 00 84 	lwz     r9,132(r30)
       514:	71 2a 40 00 	andi.   r10,r9,16384
       518:	40 a2 fc e0 	bne     1f8 <do_page_fault+0x1f8>
      
       534:	81 3e 00 84 	lwz     r9,132(r30)
       53c:	71 2a 40 00 	andi.   r10,r9,16384
       540:	41 82 fc b8 	beq     1f8 <do_page_fault+0x1f8>
      
      This patch creates a local var called 'is_user' which contains the
      result of user_mode(regs)
      
      With the patch:
      
        20:	81 03 00 84 	lwz     r8,132(r3)
        48:	55 09 97 fe 	rlwinm  r9,r8,18,31,31
        58:	2e 09 00 00 	cmpwi   cr4,r9,0
        5c:	40 92 00 0c 	bne     cr4,68 <do_page_fault+0x68>
      
        88:	41 b2 01 90 	beq     cr4,218 <do_page_fault+0x218>
      
        d4:	40 92 01 d0 	bne     cr4,2a4 <do_page_fault+0x2a4>
      
       120:	41 b2 00 f8 	beq     cr4,218 <do_page_fault+0x218>
      
       138:	41 b2 ff a0 	beq     cr4,d8 <do_page_fault+0xd8>
      
       1d4:	40 92 00 e0 	bne     cr4,2b4 <do_page_fault+0x2b4>
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      da929f6a
    • Christophe Leroy's avatar
      powerpc/mm: Remove a redundant test in do_page_fault() · 97a011e6
      Christophe Leroy authored
      The result of (trap == 0x400) is already in is_exec.
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      97a011e6
    • Christophe Leroy's avatar
      powerpc/mm: Only call store_updates_sp() on stores in do_page_fault() · e8de85ca
      Christophe Leroy authored
      Function store_updates_sp() checks whether the faulting
      instruction is a store updating r1. Therefore we can limit its calls
      to store exceptions.
      
      This patch is an improvement of commit a7a9dcd8 ("powerpc: Avoid
      taking a data miss on every userspace instruction miss")
      
      With the same microbenchmark app, run with 500 as argument, on an
      MPC885 we get:
      
      Before this patch: 152000 DTLB misses
      After this patch:  147000 DTLB misses
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      e8de85ca
    • Christophe Leroy's avatar
      powerpc/mm: Remove __this_fixmap_does_not_exist() · 9affa9e2
      Christophe Leroy authored
      This function has not been used since commit 9494a1e8
      ("powerpc: use generic fixmap.h)
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      9affa9e2
    • Balbir Singh's avatar
      powerpc/mm/ptdump: Dump the first entry of the linear mapping as well · e63739b1
      Balbir Singh authored
      The check in hpte_find() should be < and not <= for PAGE_OFFSET
      Signed-off-by: default avatarBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      e63739b1
  2. 30 May, 2017 26 commits