• Stefano Brivio's avatar
    ipv4: Return -ENETUNREACH if we can't create route but saddr is valid · 595e0651
    Stefano Brivio authored
    ...instead of -EINVAL. An issue was found with older kernel versions
    while unplugging a NFS client with pending RPCs, and the wrong error
    code here prevented it from recovering once link is back up with a
    configured address.
    
    Incidentally, this is not an issue anymore since commit 4f8943f8
    ("SUNRPC: Replace direct task wakeups from softirq context"), included
    in 5.2-rc7, had the effect of decoupling the forwarding of this error
    by using SO_ERROR in xs_wake_error(), as pointed out by Benjamin
    Coddington.
    
    To the best of my knowledge, this isn't currently causing any further
    issue, but the error code doesn't look appropriate anyway, and we
    might hit this in other paths as well.
    
    In detail, as analysed by Gonzalo Siero, once the route is deleted
    because the interface is down, and can't be resolved and we return
    -EINVAL here, this ends up, courtesy of inet_sk_rebuild_header(),
    as the socket error seen by tcp_write_err(), called by
    tcp_retransmit_timer().
    
    In turn, tcp_write_err() indirectly calls xs_error_report(), which
    wakes up the RPC pending tasks with a status of -EINVAL. This is then
    seen by call_status() in the SUN RPC implementation, which aborts the
    RPC call calling rpc_exit(), instead of handling this as a
    potentially temporary condition, i.e. as a timeout.
    
    Return -EINVAL only if the input parameters passed to
    ip_route_output_key_hash_rcu() are actually invalid (this is the case
    if the specified source address is multicast, limited broadcast or
    all zeroes), but return -ENETUNREACH in all cases where, at the given
    moment, the given source address doesn't allow resolving the route.
    
    While at it, drop the initialisation of err to -ENETUNREACH, which
    was added to __ip_route_output_key() back then by commit
    0315e382 ("net: Fix behaviour of unreachable, blackhole and
    prohibit routes"), but actually had no effect, as it was, and is,
    overwritten by the fib_lookup() return code assignment, and anyway
    ignored in all other branches, including the if (fl4->saddr) one:
    I find this rather confusing, as it would look like -ENETUNREACH is
    the "default" error, while that statement has no effect.
    
    Also note that after commit fc75fc83 ("ipv4: dont create routes
    on down devices"), we would get -ENETUNREACH if the device is down,
    but -EINVAL if the source address is specified and we can't resolve
    the route, and this appears to be rather inconsistent.
    Reported-by: default avatarStefan Walter <walteste@inf.ethz.ch>
    Analysed-by: default avatarBenjamin Coddington <bcodding@redhat.com>
    Analysed-by: default avatarGonzalo Siero <gsierohu@redhat.com>
    Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    595e0651
route.c 86.5 KB