    x86: change x86 to use generic find_next_bit · 6fd92b63
    Alexander van Heukelum authored
    The versions with inline assembly are in fact slower on the machines I
    tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
    Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
    crashing the benchmark.
    
    Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
    1...512, for each possible bitmap with one bit set, for each possible
    offset: find the position of the first bit starting at offset. If you
    follow ;). Times include setup of the bitmap and checking of the
    results.
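    The sweep described above could be sketched roughly as follows. This
    is an illustration, not the original benchmark program: the names
    find_fn, naive_find_next_bit and run_sweep are hypothetical, and the
    naive bit-by-bit scan only stands in for whichever implementation is
    being measured.

```c
#include <limits.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MAX_BITS 512UL

/* Hypothetical signature shared by the implementations under test. */
typedef unsigned long (*find_fn)(const unsigned long *addr,
                                 unsigned long size, unsigned long offset);

/* Stand-in implementation (assumption): naive bit-by-bit scan. */
static unsigned long naive_find_next_bit(const unsigned long *addr,
                                         unsigned long size,
                                         unsigned long offset)
{
    for (; offset < size; offset++)
        if (addr[offset / BITS_PER_LONG] & (1UL << (offset % BITS_PER_LONG)))
            return offset;
    return size;
}

/*
 * For each bitmap size 1..max_size, for each bitmap with exactly one bit
 * set, for each offset: search from offset and check the result.
 * Returns the number of mismatches.
 */
static unsigned long run_sweep(find_fn fn, unsigned long max_size)
{
    unsigned long bitmap[MAX_BITS / BITS_PER_LONG];
    unsigned long size, bit, offset, expected, errors = 0;

    if (max_size > MAX_BITS)
        max_size = MAX_BITS;

    for (size = 1; size <= max_size; size++) {
        for (bit = 0; bit < size; bit++) {
            /* Setup: clear the bitmap, then set a single bit. */
            memset(bitmap, 0, sizeof(bitmap));
            bitmap[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);

            for (offset = 0; offset < size; offset++) {
                /* The bit is found only if the search starts at or before it. */
                expected = (offset <= bit) ? bit : size;
                if (fn(bitmap, size, offset) != expected)
                    errors++;
            }
        }
    }
    return errors;
}
```

    Calling run_sweep(naive_find_next_bit, 512) from a small main() and
    timing the whole program would include bitmap setup and result
    checking in the measurement, as the numbers below do.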
    
    		Athlon		Xeon		Opteron 32/64bit
    x86-specific:	0m3.692s	0m2.820s	0m3.196s / 0m2.480s
    generic:	0m2.622s	0m1.662s	0m2.100s / 0m1.572s
    
    If the bitmap size is not a multiple of BITS_PER_LONG, and no set
    (cleared) bit is found, the x86-specific find_next_bit
    (find_next_zero_bit) returns a value outside of the range [0, size].
    The generic version always returns exactly size. The generic version
    also uses unsigned long everywhere, while the x86 versions use a
    mishmash of int, unsigned (int), long and unsigned long.
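    To illustrate the "returns exactly size" behaviour, here is a minimal
    sketch of a word-at-a-time search that uses unsigned long throughout.
    It is not the kernel's generic implementation; the name
    sketch_find_next_bit is hypothetical and the final bit scan is kept
    deliberately simple.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Sketch: return the index of the first set bit at or after offset,
 * or exactly size if there is none below size.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    unsigned long word;

    if (offset >= size)
        return size;

    /* First word: ignore the bits below offset. */
    word = addr[offset / BITS_PER_LONG] & (~0UL << (offset % BITS_PER_LONG));
    offset -= offset % BITS_PER_LONG;   /* round down to the word start */

    /* Skip over words that contain no set bits. */
    while (word == 0) {
        offset += BITS_PER_LONG;
        if (offset >= size)
            return size;
        word = addr[offset / BITS_PER_LONG];
    }

    /* Locate the lowest set bit within the word. */
    while ((word & 1UL) == 0) {
        word >>= 1;
        offset++;
    }

    /* A bit found in the padding past size counts as "not found". */
    return offset < size ? offset : size;
}
```

    The final clamp is what keeps the result inside [0, size] when size
    is not a multiple of BITS_PER_LONG and the last word has stray bits
    set beyond the end of the bitmap.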
    
    Using the generic version does give a slightly bigger kernel, though
    (66 bytes more text on 32 bit, 63 bytes more on 64 bit).
    
    defconfig:	   text    data     bss     dec     hex filename
    x86-specific:	4738555  481232  626688 5846475  5935cb vmlinux (32 bit)
    generic:	4738621  481232  626688 5846541  59360d vmlinux (32 bit)
    x86-specific:	5392395  846568  724424 6963387  6a40bb vmlinux (64 bit)
    generic:	5392458  846568  724424 6963450  6a40fa vmlinux (64 bit)
    
    Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>