• Wei Wang's avatar
    ipv4: fix race condition between route lookup and invalidation · 5018c596
    Wei Wang authored
    Jesse and Ido reported the following race condition:
    <CPU A, t0> - Received packet A is forwarded and cached dst entry is
    taken from the nexthop ('nhc->nhc_rth_input'). Calls skb_dst_set()
    
    <t1> - Given Jesse has busy routers ("ingesting full BGP routing tables
    from multiple ISPs"), route is added / deleted and rt_cache_flush() is
    called
    
    <CPU B, t2> - Received packet B tries to use the same cached dst entry
    from t0, but rt_cache_valid() is no longer true and it is replaced in
    rt_cache_route() by the newer one. This calls dst_dev_put() on the
    original dst entry which assigns the blackhole netdev to 'dst->dev'
    
    <CPU A, t3> - dst_input(skb) is called on packet A and it is dropped due
    to 'dst->dev' being the blackhole netdev
    
    There are 2 issues in the v4 routing code:
    1. A per-netns counter is used to do the validation of the route. That
    means whenever a route is changed in the netns, users of all routes in
    the netns needs to redo lookup. v6 has an implementation of only
    updating fn_sernum for routes that are affected.
    2. When rt_cache_valid() returns false, rt_cache_route() is called to
    throw away the current cache, and create a new one. This seems
    unnecessary because as long as this route does not change, the route
    cache does not need to be recreated.
    
    To fully solve the above 2 issues, it probably needs quite some code
    changes and requires careful testing, and does not suite for net branch.
    
    So this patch only tries to add the deleted cached rt into the uncached
    list, so user could still be able to use it to receive packets until
    it's done.
    
    Fixes: 95c47f9c ("ipv4: call dst_dev_put() properly")
    Signed-off-by: default avatarWei Wang <weiwan@google.com>
    Reported-by: default avatarIdo Schimmel <idosch@idosch.org>
    Reported-by: default avatarJesse Hathaway <jesse@mbuki-mvuki.org>
    Tested-by: default avatarJesse Hathaway <jesse@mbuki-mvuki.org>
    Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
    Cc: David Ahern <dsahern@gmail.com>
    Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    5018c596
route.c 86.5 KB