-
Andreas Mohr authored
During some performance diagnostics I stumbled on this slightly wasteful code in pcnet_cs.c which I made the patch included at the bottom for (two minor comment fixes included). Improvement: instead of *always* calculating lea 0x2c0(%edx),%ebx and then additionally doing the mov %edx,0xc0(%ebx) addition *if we need it*, we now do the *whole* calculation of mov %edx,0x380(%ebx) *only* if we need it. This even manages to save us a whole 16-byte alignment buffer loss in this compilation case. Result: slightly improves IRQ handler performance in both shared and non-shared IRQ case, which should make my rusty P3/700 a slight bit happier. Thank you for your support, Andreas Mohr old asm result (using gcc 3.3.5): 000015a0 <ei_irq_wrapper>: 15a0: 55 push %ebp 15a1: 89 e5 mov %esp,%ebp 15a3: 53 push %ebx 15a4: 8d 9a c0 02 00 00 lea 0x2c0(%edx),%ebx 15aa: e8 fc ff ff ff call 15ab <ei_irq_wrapper+0xb> 15af: 83 f8 01 cmp $0x1,%eax 15b2: 74 03 je 15b7 <ei_irq_wrapper+0x17> 15b4: 5b pop %ebx 15b5: 5d pop %ebp 15b6: c3 ret 15b7: 31 d2 xor %edx,%edx 15b9: 89 93 c0 00 00 00 mov %edx,0xc0(%ebx) 15bf: eb f3 jmp 15b4 <ei_irq_wrapper+0x14> 15c1: eb 0d jmp 15d0 <ei_watchdog> 15c3: 90 nop 15c4: 90 nop 15c5: 90 nop 15c6: 90 nop 15c7: 90 nop 15c8: 90 nop 15c9: 90 nop 15ca: 90 nop 15cb: 90 nop 15cc: 90 nop 15cd: 90 nop 15ce: 90 nop 15cf: 90 nop 000015d0 <ei_watchdog>: new asm result: 000015a0 <ei_irq_wrapper>: 15a0: 55 push %ebp 15a1: 89 e5 mov %esp,%ebp 15a3: 53 push %ebx 15a4: 89 d3 mov %edx,%ebx 15a6: e8 fc ff ff ff call 15a7 <ei_irq_wrapper+0x7> 15ab: 83 f8 01 cmp $0x1,%eax 15ae: 74 03 je 15b3 <ei_irq_wrapper+0x13> 15b0: 5b pop %ebx 15b1: 5d pop %ebp 15b2: c3 ret 15b3: 31 d2 xor %edx,%edx 15b5: 89 93 80 03 00 00 mov %edx,0x380(%ebx) 15bb: eb f3 jmp 15b0 <ei_irq_wrapper+0x10> 15bd: 8d 76 00 lea 0x0(%esi),%esi 000015c0 <ei_watchdog>: Signed-off-by: Andrew Morton <akpm@osdl.org>
93ad4fb0