• John Fastabend's avatar
    bpf, sockmap: Fix partial copy_page_to_iter so progress can still be made · c9c89dcd
    John Fastabend authored
    If copy_page_to_iter() fails or even partially completes, but with fewer
    bytes copied than expected we currently reset sg.start and return EFAULT.
    This proves problematic if we already copied data into the user buffer
    before we return an error. Because we leave the copied data in the user
    buffer and fail to unwind the scatterlist so kernel side believes data
    has been copied and user side believes data has _not_ been received.
    
    Expected behavior should be to return number of bytes copied and then
    on the next read we need to return the error assuming its still there. This
    can happen if we have a copy length spanning multiple scatterlist elements
    and one or more complete before the error is hit.
    
    The error is rare enough though that my normal testing with server side
    programs, such as nginx, httpd, envoy, etc., I have never seen this. The
    only reliable way to reproduce that I've found is to stream movies over
    my browser for a day or so and wait for it to hang. Not very scientific,
    but with a few extra WARN_ON()s in the code the bug was obvious.
    
    When we review the errors from copy_page_to_iter() it seems we are hitting
    a page fault from copy_page_to_iter_iovec() where the code checks
    fault_in_pages_writeable(buf, copy) where buf is the user buffer. It
    also seems typical server applications don't hit this case.
    
    The other way to try and reproduce this is run the sockmap selftest tool
    test_sockmap with data verification enabled, but it doesn't reproduce the
    fault. Perhaps we can trigger this case artificially somehow from the
    test tools. I haven't sorted out a way to do that yet though.
    
    Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
    Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
    Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/160556566659.73229.15694973114605301063.stgit@john-XPS-13-9370
    c9c89dcd
tcp_bpf.c 14.5 KB