    net: cpu offline cause napi stall · 264524d5
    Heiko Carstens authored
    Frank Blaschka reported:
    <quote>
      During heavy network load we turn off/on cpus.
      Sometimes this causes a stall on the network device.
      Digging into the dump I found out following:
    
      napi is scheduled but does not run. From the I/O buffers
      and the napi state I see napi/rx_softirq processing has stopped
      because the budget was reached. napi stays in the
      softnet_data poll_list and the rx_softirq was raised again.
    
      I assume at this time the cpu offline comes in,
      the rx softirq is raised/moved to another cpu but napi stays in the
      poll_list of the softnet_data of the now offline cpu.
    
      Reviewing dev_cpu_callback (net/core/dev.c) I did not find that the
      poll_list is transferred to the new cpu.
    </quote>
    
    This patch is a straightforward implementation of Frank's suggestion:
    
    Transfer the poll_list and trigger NET_RX_SOFTIRQ on the new cpu.
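    
    The idea can be illustrated with a small user-space model. This is only
    a sketch under stated assumptions, not the kernel change itself: the
    names softnet_model, napi_model, poll_list_add and migrate_poll_list are
    hypothetical stand-ins, a singly linked list replaces the kernel's list
    handling, and a boolean flag stands in for a raised NET_RX_SOFTIRQ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct napi_model {
    struct napi_model *next;   /* next pending entry on the same poll list */
    int id;                    /* illustrative identifier only */
};

struct softnet_model {
    struct napi_model *poll_head;  /* first pending NAPI entry */
    struct napi_model *poll_tail;  /* last pending NAPI entry */
    bool rx_softirq_pending;       /* models a raised NET_RX_SOFTIRQ */
};

/* Append one entry to the tail of a CPU's poll list. */
void poll_list_add(struct softnet_model *sd, struct napi_model *n)
{
    n->next = NULL;
    if (sd->poll_tail)
        sd->poll_tail->next = n;
    else
        sd->poll_head = n;
    sd->poll_tail = n;
}

/*
 * Move every pending entry from the offline CPU's list to the CPU that
 * handles the offline event, preserving order, and mark the RX softirq
 * as pending there so the moved entries get polled.
 * Returns the number of entries moved.
 */
int migrate_poll_list(struct softnet_model *dead, struct softnet_model *live)
{
    int moved = 0;

    assert(dead != live);

    while (dead->poll_head) {
        struct napi_model *n = dead->poll_head;

        dead->poll_head = n->next;   /* unlink before re-adding */
        poll_list_add(live, n);
        moved++;
    }
    dead->poll_tail = NULL;
    dead->rx_softirq_pending = false;

    /* Without this step the moved entries would sit unpolled: the stall. */
    if (moved)
        live->rx_softirq_pending = true;

    return moved;
}
```

    In this model, skipping the migration leaves the entries on the dead
    CPU's list, where nothing will ever poll them; that mirrors the stall
    described in the report above.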
    
    Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
    Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>