MIPS: Reinstate platform `__div64_32' handler
Our current MIPS platform `__div64_32' handler is inactive, because it is incorrectly only enabled for 64-bit configurations, for which generic `do_div' code does not call it anyway. The handler is not suitable for being called from there though as it only calculates 32 bits of the quotient under the assumption the 64-bit divident has been suitably reduced. Code for such reduction used to be there, however it has been incorrectly removed with commit c21004cd ("MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0."), which should have only updated an obsoleted constraint for an inline asm involving $hi and $lo register outputs, while possibly wiring the original MIPS variant of the `do_div' macro as `__div64_32' handler for the generic `do_div' implementation Correct the handler as follows then: - Revert most of the commit referred, however retaining the current formatting, except for the final two instructions of the inline asm sequence, which the original commit missed. Omit the original 64-bit parts though. - Rename the original `do_div' macro to `__div64_32'. Use the combined `x' constraint referring to the MD accumulator as a whole, replacing the original individual `h' and `l' constraints used for $hi and $lo registers respectively, of which `h' has been obsoleted with GCC 4.4. Update surrounding code accordingly. We have since removed support for GCC versions before 4.9, so no need for a special arrangement here; GCC has supported the `x' constraint since forever anyway, or at least going back to 1991. - Rename the `__base' local variable in `__div64_32' to `__radix' to avoid a conflict with a local variable in `do_div'. - Actually enable this code for 32-bit rather than 64-bit configurations by qualifying it with BITS_PER_LONG being 32 instead of 64. Include <asm/bitsperlong.h> for this macro rather than <linux/types.h> as we don't need anything else. - Finally include <asm-generic/div64.h> last rather than first. This has passed correctness verification with test_div64 and reduced the module's average execution time down to 1.0668s and 0.2629s from 2.1529s and 0.5647s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz. For a reference 64-bit `do_div' code where we have the DDIVU instruction available to do the whole calculation right away averages at 0.0660s for the latter CPU. Fixes: c21004cd ("MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0.") Reported-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Showing
Please register or sign in to comment