Commit cb06e37a authored by Aaron Knister's avatar Aaron Knister Committed by Kleber Sacilotto de Souza

IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler

BugLink: https://bugs.launchpad.net/bugs/1798587

commit 816e846c upstream.

Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which leaves
a window where cm_rep_handler can run, set the connection up, dequeue
pending packets and leave the subsequently queued packets by start_xmit()
sitting on neigh->queue until they're dropped when the connection is torn
down. This only applies to connected mode. These dropped packets can
really upset TCP, for example, and cause multi-minute delays in
transmission for open connections.

Here's the code in start_xmit where we check to see if the connection is
up:

       if (ipoib_cm_get(neigh)) {
               if (ipoib_cm_up(neigh)) {
                       ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                       goto unref;
               }
       }

The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv->lock to dequeue pending skb's) but before the below code snippet in
start_xmit where packets are queued.

       if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
               push_pseudo_header(skb, phdr->hwaddr);
               spin_lock_irqsave(&priv->lock, flags);
               __skb_queue_tail(&neigh->queue, skb);
               spin_unlock_irqrestore(&priv->lock, flags);
       } else {
               ++dev->stats.tx_dropped;
               dev_kfree_skb_any(skb);
       }

The patch acquires the netif tx lock in cm_rep_handler for the section
where it sets the connection up and dequeues and retransmits deferred
skb's.

Fixes: 839fcaba ("IPoIB: Connected mode experimental support")
Cc: stable@vger.kernel.org
Signed-off-by: default avatarAaron Knister <aaron.s.knister@nasa.gov>
Tested-by: default avatarIra Weiny <ira.weiny@intel.com>
Reviewed-by: default avatarIra Weiny <ira.weiny@intel.com>
Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
Signed-off-by: default avatarKleber Sacilotto de Souza <kleber.souza@canonical.com>
parent 2f60d386
...@@ -992,12 +992,14 @@ static int ipoib_cm_rep_handler(struct ib_cm_id *cm_id, struct ib_cm_event *even ...@@ -992,12 +992,14 @@ static int ipoib_cm_rep_handler(struct ib_cm_id *cm_id, struct ib_cm_event *even
skb_queue_head_init(&skqueue); skb_queue_head_init(&skqueue);
netif_tx_lock_bh(p->dev);
spin_lock_irq(&priv->lock); spin_lock_irq(&priv->lock);
set_bit(IPOIB_FLAG_OPER_UP, &p->flags); set_bit(IPOIB_FLAG_OPER_UP, &p->flags);
if (p->neigh) if (p->neigh)
while ((skb = __skb_dequeue(&p->neigh->queue))) while ((skb = __skb_dequeue(&p->neigh->queue)))
__skb_queue_tail(&skqueue, skb); __skb_queue_tail(&skqueue, skb);
spin_unlock_irq(&priv->lock); spin_unlock_irq(&priv->lock);
netif_tx_unlock_bh(p->dev);
while ((skb = __skb_dequeue(&skqueue))) { while ((skb = __skb_dequeue(&skqueue))) {
skb->dev = p->dev; skb->dev = p->dev;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment