• Andrey Ignatov's avatar
    bpf: Hooks for sys_connect · d74bad4e
    Andrey Ignatov authored
    == The problem ==
    
    See description of the problem in the initial patch of this patch set.
    
    == The solution ==
    
    The patch provides much more reliable in-kernel solution for the 2nd
    part of the problem: making outgoing connecttion from desired IP.
    
    It adds new attach types `BPF_CGROUP_INET4_CONNECT` and
    `BPF_CGROUP_INET6_CONNECT` for program type
    `BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that can be used to override both
    source and destination of a connection at connect(2) time.
    
    Local end of connection can be bound to desired IP using newly
    introduced BPF-helper `bpf_bind()`. It allows to bind to only IP though,
    and doesn't support binding to port, i.e. leverages
    `IP_BIND_ADDRESS_NO_PORT` socket option. There are two reasons for this:
    * looking for a free port is expensive and can affect performance
      significantly;
    * there is no use-case for port.
    
    As for remote end (`struct sockaddr *` passed by user), both parts of it
    can be overridden, remote IP and remote port. It's useful if an
    application inside cgroup wants to connect to another application inside
    same cgroup or to itself, but knows nothing about IP assigned to the
    cgroup.
    
    Support is added for IPv4 and IPv6, for TCP and UDP.
    
    IPv4 and IPv6 have separate attach types for same reason as sys_bind
    hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields
    when user passes sockaddr_in since it'd be out-of-bound.
    
    == Implementation notes ==
    
    The patch introduces new field in `struct proto`: `pre_connect` that is
    a pointer to a function with same signature as `connect` but is called
    before it. The reason is in some cases BPF hooks should be called way
    before control is passed to `sk->sk_prot->connect`. Specifically
    `inet_dgram_connect` autobinds socket before calling
    `sk->sk_prot->connect` and there is no way to call `bpf_bind()` from
    hooks from e.g. `ip4_datagram_connect` or `ip6_datagram_connect` since
    it'd cause double-bind. On the other hand `proto.pre_connect` provides a
    flexible way to add BPF hooks for connect only for necessary `proto` and
    call them at desired time before `connect`. Since `bpf_bind()` is
    allowed to bind only to IP and autobind in `inet_dgram_connect` binds
    only port there is no chance of double-bind.
    
    bpf_bind() sets `force_bind_address_no_port` to bind to only IP despite
    of value of `bind_address_no_port` socket field.
    
    bpf_bind() sets `with_lock` to `false` when calling to __inet_bind()
    and __inet6_bind() since all call-sites, where bpf_bind() is called,
    already hold socket lock.
    Signed-off-by: default avatarAndrey Ignatov <rdna@fb.com>
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    d74bad4e
tcp_ipv4.c 67 KB