    nexthop: Fix performance regression in nexthop deletion · df6afe2f
    Ido Schimmel authored
    While insertion of 16k nexthops all using the same netdev ('dummy10')
    takes less than a second, deletion takes about 130 seconds:
    
    # time -p ip -b nexthop.batch
    real 0.29
    user 0.01
    sys 0.15
    
    # time -p ip link set dev dummy10 down
    real 131.03
    user 0.06
    sys 0.52
    
    This is because of repeated calls to synchronize_rcu() whenever a
    nexthop is removed from a nexthop group:
    
    # /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K
    ...
        b'finish_task_switch'
        b'schedule'
        b'schedule_timeout'
        b'wait_for_completion'
        b'__wait_rcu_gp'
        b'synchronize_rcu.part.0'
        b'synchronize_rcu'
        b'__remove_nexthop'
        b'remove_nexthop'
        b'nexthop_flush_dev'
        b'nh_netdev_event'
        b'raw_notifier_call_chain'
        b'call_netdevice_notifiers_info'
        b'__dev_notify_flags'
        b'dev_change_flags'
        b'do_setlink'
        b'__rtnl_newlink'
        b'rtnl_newlink'
        b'rtnetlink_rcv_msg'
        b'netlink_rcv_skb'
        b'rtnetlink_rcv'
        b'netlink_unicast'
        b'netlink_sendmsg'
        b'____sys_sendmsg'
        b'___sys_sendmsg'
        b'__sys_sendmsg'
        b'__x64_sys_sendmsg'
        b'do_syscall_64'
        b'entry_SYSCALL_64_after_hwframe'
        -                ip (277)
            126554955
    
    Since nexthops are always deleted under RTNL, synchronize_net() can be
    used instead. With RTNL held it calls synchronize_rcu_expedited(), which
    only blocks for several microseconds, as opposed to the multiple
    milliseconds of synchronize_rcu().
    
    With this patch deletion of 16k nexthops takes less than a second:
    
    # time -p ip link set dev dummy10 down
    real 0.12
    user 0.00
    sys 0.04
    
    Tested with fib_nexthops.sh which includes torture tests that prompted
    the initial change:
    
    # ./fib_nexthops.sh
    ...
    Tests passed: 134
    Tests failed:   0
    
    Fixes: 90f33bff ("nexthops: don't modify published nexthop groups")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: David Ahern <dsahern@gmail.com>
    Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Link: https://lore.kernel.org/r/20201016172914.643282-1-idosch@idosch.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
nexthop.c 44.8 KB