Commit b455111c authored by Dominik Brodowski, committed by Linus Torvalds

[PATCH] add 1 in __const_udelay()

__const_udelay() keeps only the upper 32 bits of the "mull" result and
discards the lower 32 bits, so the loop count is always rounded down.  This
is an issue for small ndelay()s for _all_ values of loops_per_jiffy, and
for certain {n,u}delay()s for many loops_per_jiffy values.

Assuming

LPJ = 1501115

udelay(87)

results in

130597 loops to be spent.

However, 1000 * 130597 / 1501115 is 86.999997 us, so we're actually
_rounding down_ and delaying for less than the requested 87 us.
1000 * 130598 / 1501115 is 87.000663 us, which is the technically
correct result.  Of course, for the TSC case this won't matter as the
maths take some time, so the actual delay is
1000 * __udelay(x) / lpj + __OVERHEAD(x)

Anybody worried about the additional loop, given that the overhead
already takes some time to run, should add a check

        if (unlikely(xloops < OVERHEAD))
                return;
        xloops -= OVERHEAD;

to the delay() routines in arch/i386/kernel/timers/*.c, after determining
what OVERHEAD is.

Signed-off-by: Dominik Brodowski <linux@brodo.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent 5bad9c0a
@@ -35,7 +35,7 @@ inline void __const_udelay(unsigned long xloops)
 	__asm__("mull %0"
 		:"=d" (xloops), "=&a" (d0)
 		:"1" (xloops),"0" (current_cpu_data.loops_per_jiffy * (HZ/4)));
-	__delay(xloops);
+	__delay(++xloops);
 }
 
 void __udelay(unsigned long usecs)