Commit b617158d authored by Eric Dumazet, committed by David S. Miller

tcp: be more careful in tcp_fragment()

Some applications set tiny SO_SNDBUF values and expect
TCP to just work. Recent patches to address CVE-2019-11478
broke them in case of losses, since retransmits might
be prevented.

We should allow these flows to make progress.
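
For context, here is a minimal userspace sketch of the pattern that regressed: an application shrinking its send buffer via setsockopt(). The 4096-byte value and the helper name are illustrative assumptions, not taken from any specific affected application.

#include <sys/socket.h>

/* Illustrative only: a socket with a deliberately tiny SO_SNDBUF.
 * With the f070ef2a limit alone, losses on such a socket could leave
 * tcp_fragment() refusing to split an skb for retransmit. */
static int shrink_sndbuf(int fd)
{
	int val = 4096;	/* assumed tiny value */

	return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &val, sizeof(val));
}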

This patch allows the first and last skb in retransmit queue
to be split even if memory limits are hit.

It also adds some room, since tcp_sendmsg()
and tcp_sendpage() might overshoot sk_wmem_queued by about one full
TSO skb (64KB in size). Note this allowance was already present
in stable backports for kernels < 4.15.
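
To make the allowance concrete, here is a back-of-the-envelope sketch of the new limit, written as standalone C rather than kernel code. GSO_MAX_SIZE is 65536 in the kernel; SKB_TRUESIZE() also counts per-skb metadata whose size depends on kernel config, so the 512-byte overhead below is an assumed stand-in, not the real struct sizes.

#include <stdio.h>

#define GSO_MAX_SIZE	65536		/* one full-size TSO skb payload */
#define SKB_OVERHEAD	512		/* assumed metadata truesize */
#define SKB_TRUESIZE(x)	((x) + SKB_OVERHEAD)

int main(void)
{
	long sndbuf = 4096;		/* a tiny SO_SNDBUF */
	long limit = sndbuf + 2 * SKB_TRUESIZE(GSO_MAX_SIZE);

	/* With these assumed numbers, limit = 136192 bytes: even a tiny
	 * sndbuf now leaves headroom for the one-TSO-skb overshoot from
	 * tcp_sendmsg()/tcp_sendpage(). */
	printf("limit = %ld bytes\n", limit);
	return 0;
}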

Note for < 4.15 backports:
 tcp_rtx_queue_tail() will probably look like:

static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
{
	struct sk_buff *skb = tcp_send_head(sk);

	return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
}
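
The shape of that helper follows from the pre-4.15 queue layout, where the retransmit and write queues shared a single sk_write_queue list: tcp_send_head() points at the first not-yet-sent skb, so the skb just before it is the last one eligible for retransmit, and when everything queued has already been sent (tcp_send_head() returns NULL) the retransmit tail is simply the write queue tail.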

Fixes: f070ef2a ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Christoph Paasch <cpaasch@apple.com>
Cc: Jonathan Looney <jtl@netflix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent be4363bd
include/net/tcp.h
@@ -1709,6 +1709,11 @@ static inline struct sk_buff *tcp_rtx_queue_head(const struct sock *sk)
 	return skb_rb_first(&sk->tcp_rtx_queue);
 }
 
+static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
+{
+	return skb_rb_last(&sk->tcp_rtx_queue);
+}
+
 static inline struct sk_buff *tcp_write_queue_head(const struct sock *sk)
 {
 	return skb_peek(&sk->sk_write_queue);
...
net/ipv4/tcp_output.c
@@ -1288,6 +1288,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *buff;
 	int nsize, old_factor;
+	long limit;
 	int nlen;
 	u8 flags;
 
@@ -1298,8 +1299,16 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
 	if (nsize < 0)
 		nsize = 0;
 
-	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
-		     tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
+	/* tcp_sendmsg() can overshoot sk_wmem_queued by one full size skb.
+	 * We need some allowance to not penalize applications setting small
+	 * SO_SNDBUF values.
+	 * Also allow first and last skb in retransmit queue to be split.
+	 */
+	limit = sk->sk_sndbuf + 2 * SKB_TRUESIZE(GSO_MAX_SIZE);
+	if (unlikely((sk->sk_wmem_queued >> 1) > limit &&
+		     tcp_queue != TCP_FRAG_IN_WRITE_QUEUE &&
+		     skb != tcp_rtx_queue_head(sk) &&
+		     skb != tcp_rtx_queue_tail(sk))) {
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
 		return -ENOMEM;
 	}
...