Commit 5a308f40 authored by Eric Dumazet's avatar Eric Dumazet Committed by David S. Miller

netem: refine early skb orphaning

netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
or any mechanism relying on socket sk_wmem_alloc feedback.

Ideally, we should perform this orphaning after the rate module and
before the delay module, to mimic what happens on a real link :

skb orphaning is indeed normally done at TX completion, before the
transit on the link.

+-------+   +--------+  +---------------+  +-----------------+
+ Qdisc +---> Device +--> TX completion +--> links / hops    +->
+       +   +  xmit  +  + skb orphaning +  + propagation     +
+-------+   +--------+  +---------------+  +-----------------+
      < rate limiting >                  < delay, drops, reorders >

If netem is used without delay feature (drops, reorders, rate
limiting), then we should avoid early skb orphaning, to keep pressure
on sockets as long as packets are still in qdisc queue.

Ideally, netem should be refactored to implement delay module
as the last stage. Current algorithm merges the two phases
(rate limiting + delay) so its not correct.
Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Mark Gordon <msg@google.com>
Cc: Andreas Terzis <aterzis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: default avatarStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parent 7ff65cde
...@@ -380,7 +380,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) ...@@ -380,7 +380,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
} }
skb_orphan(skb); /* If a delay is expected, orphan the skb. (orphaning usually takes
* place at TX completion time, so _before_ the link transit delay)
* Ideally, this orphaning should be done after the rate limiting
* module, because this breaks TCP Small Queue, and other mechanisms
* based on socket sk_wmem_alloc.
*/
if (q->latency || q->jitter)
skb_orphan(skb);
/* /*
* If we need to duplicate packet, then re-insert at top of the * If we need to duplicate packet, then re-insert at top of the
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment