    x86: fix C1E && nx6325 stability problem · e2079c43
    Rafael J. Wysocki authored
    The problems are that, with the ACPI vs timer override issue _fixed_,
    after using the box for some time (between several seconds and 1 hour, at
    random) processes get very high CPU loads (once I've got X using 107% of
    the CPU, for example) and the system becomes unresponsive, as though there
    were interrupts lost or something similar.
    
    Andreas Herrmann reproduced similar problems:
    
    > Ok, now I've reproduced the stability problem.
    > - Using tip/master,
    > - reverting e38502eb8aa82314d5ab0eba45f50e6790dadd88 and
    > - applying your patch from this posting
    >   http://marc.info/?l=linux-kernel&m=121539354224562&w=4
    >
    > Starting X, firefox, gimp, tuxpaint and doing some drawing in tuxpaint
    > results in a slow system. Drawing is almost not possible anymore --
    > Selections of new colors, cursors etc. is performed with huge delay
    > if it's performed at all.
    >
    > BTW, the code sets up timer IRQ as Virtual Wire IRQ:
    >
    > Jul  8 14:57:58 kodscha IO-APIC (apicid-pin) 2-22, 2-23 not connected.
    > Jul  8 14:57:58 kodscha ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    > Jul  8 14:57:58 kodscha ...trying to set up timer as Virtual Wire IRQ... works.
    >
    > and both INT0 and INT2 of IOAPIC are masked:
    >
    > Jul  8 14:57:58 kodscha NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
    > Jul  8 14:57:58 kodscha 00 000 1    0    0   0   0    0    0    00
    > Jul  8 14:57:58 kodscha 01 003 0    0    0   0   0    1    1    31
    > Jul  8 14:57:58 kodscha 02 003 1    0    0   0   0    0    0    30
    >
    > I've also seen strange CPU utilization -- with syslog-ng:
    >
    > top - 15:33:06 up 35 min,  4 users,  load average: 1.70, 0.68, 0.37
    > Tasks:  64 total,   4 running,  60 sleeping,   0 stopped,   0 zombie
    > Cpu0  :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    > Cpu1  :  6.4%us, 87.2%sy,  0.0%ni,  5.8%id,  0.0%wa,  0.6%hi,  0.0%si,  0.0%st
    > Mem:    895384k total,   283568k used,   611816k free,    35492k buffers
    > Swap:  1959920k total,        0k used,  1959920k free,   163044k cached
    >
    >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    >  4632 root      20   0 17216  800  580 S  104  0.1   0:34.22 syslog-ng
    > 28505 root      20   0  205m  11m 4024 S    6  1.3   0:21.16 X
    > 28518 root      20   0 56292 5652 4492 S    1  0.6   0:01.80 fluxbox
    >     1 root      20   0  3724  608  508 S    0  0.1   0:00.36 init
    >
    > So far I have no clue why C1E-idle in conjunction with virtual wire
    > mode causes this strange behaviour.
    >
    > ... and I start to think about the root cause of all this.
    >
    > I've performed similar tests under X with the IRQ0/INT0 configuration and
    > I did not see above symptoms.
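
    The masked state of INT0 and INT2 in the quoted IO-APIC dump can be
    checked mechanically. The following is a minimal sketch, not part of the
    patch: the rows are copied from the log above, and the helper names
    (parse_ioapic_rows, masked_pins) are hypothetical. It assumes the NR
    column is printed in hexadecimal.

```python
# Header and rows copied from the quoted IO-APIC dump (syslog prefix removed).
HEADER = "NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:"
ROWS = [
    "00 000 1    0    0   0   0    0    0    00",
    "01 003 0    0    0   0   0    1    1    31",
    "02 003 1    0    0   0   0    0    0    30",
]


def parse_ioapic_rows(header, rows):
    """Return one dict per pin, keyed by the column names in the header."""
    names = header.rstrip(":").split()
    return [dict(zip(names, row.split())) for row in rows]


def masked_pins(entries):
    """Return the pin numbers whose Mask bit is set.

    Assumption: the NR column is hexadecimal.
    """
    return [int(entry["NR"], 16) for entry in entries if entry["Mask"] == "1"]


entries = parse_ioapic_rows(HEADER, ROWS)
print(masked_pins(entries))  # pins 0 and 2 are masked in the quoted dump
```

    With both pins masked, the timer interrupt is not routed through the
    IO-APIC, which matches the "Virtual Wire IRQ" message in the log.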
    
    So let's fall back to the IRQ0/INT0 configuration on this box.
    
    This basically restores the dont-use-the-lapic-timer exception mechanism
    that was unconditional on this box prior to commit 8750bf5 ("x86: add C1E
    aware idle function").
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
io_apic_32.c 70.1 KB