    net: fix IP early demux races · 5037e9ef
    Eric Dumazet authored
    David Wilder reported crashes caused by dst reuse.
    
    <quote David>
      I am seeing a crash on a distro V4.2.3 kernel caused by a double
      release of a dst_entry.  In ipv4_dst_destroy() the call to
      list_empty() finds a poisoned next pointer, indicating the dst_entry
      has already been removed from the list and freed. The crash occurs
      18 to 24 hours into a run of a network stress exerciser.
    </quote>
    
    Thanks to his detailed report and analysis, we were able to understand
    the core issue.
    
    IP early demux can associate a dst to skb, after a lookup in TCP/UDP
    sockets.
    
    When socket cache is not properly set, we want to store into
    sk->sk_dst_cache the dst for future IP early demux lookups,
    by acquiring a stable refcount on the dst.
    
    The problem is that this acquisition simply uses atomic_inc(),
    which works well unless the dst was already queued for destruction
    by dst_release(), after it noticed that the dst refcount went to zero
    while DST_NOCACHE was set on the dst.
    
    We need to make sure current refcount is not zero before incrementing
    it, or risk double free as David reported.
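    
    A minimal userspace sketch of the "increment only if not zero" idea
    follows. It is an illustration under stated assumptions, not the code
    of this patch: struct ref_obj, ref_hold(), ref_hold_safe() and ref_put()
    are hypothetical names, and C11 atomics stand in for the kernel's
    atomic_t primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a dst entry. */
struct ref_obj {
	atomic_int refcnt;
};

/* Unsafe variant: increments blindly, even if the count already hit zero
 * and the object is queued for destruction. */
static inline void ref_hold(struct ref_obj *obj)
{
	atomic_fetch_add(&obj->refcnt, 1);
}

/* Safe variant: takes a reference only while the count is non-zero.
 * Returns false if the object is already on its way to being freed. */
static inline bool ref_hold_safe(struct ref_obj *obj)
{
	int old = atomic_load(&obj->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&obj->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drops one reference; returns true if the caller released the last one
 * and is therefore responsible for destroying the object. */
static inline bool ref_put(struct ref_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcnt, 1);

	assert(old > 0); /* catch refcount underflow in this sketch */
	return old == 1;
}
```

    With ref_hold(), a count that already dropped to zero would go back
    to one, and the later release would destroy the object a second time.
    With ref_hold_safe(), the caller sees false and must not cache the
    pointer.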
    
    This patch, being a stable candidate, adds two new helpers and uses
    them only in the problematic IP early demux paths.
    
    It might be possible to merge skb_dst_force() and skb_dst_force_safe()
    in net-next, but I prefer having the smallest patch for stable
    kernels: some skb_dst_force() callers may not expect that skb->dst
    can suddenly be cleared.
    
    Can probably be backported to linux-3.6 kernels.
    
    Reported-by: David J. Wilder <dwilder@us.ibm.com>
    Tested-by: David J. Wilder <dwilder@us.ibm.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_ipv4.c 60.7 KB