    bonding: check if clients MAC addr has changed · 42d782ac
    Flavio Leitner authored
    When two systems using bonding devices in adaptive load
    balancing (ALB) mode communicate with each other, an endless
    ping-pong of ARP replies starts between them.
    
    What happens? In ALB mode, the bonding driver keeps track
    of each connected client in a hash table so that it can do
    receive load balancing (RLB). When an ARP reply is received,
    the driver looks up the client entry, updates its MAC address,
    and flags it to be announced later. Two seconds later, the ALB
    monitor runs and sends two ARP replies for each updated client
    entry, updating that specific client. The same process happens
    on the receiving system, causing the endless ping-pong of ARP
    replies.
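
    The update-and-announce path described above can be modelled in
    userspace roughly as follows. This is only a sketch: the struct
    and function names are hypothetical stand-ins, not the driver's
    own definitions, and the hash table is reduced to a single entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, simplified stand-in for one RLB client hash table entry. */
struct client_entry {
    uint32_t ip;             /* client IPv4 address */
    uint8_t  mac[ETH_ALEN];  /* last MAC address learned for the client */
    int      ntt;            /* "need to transmit": announce on next monitor run */
};

/* Behaviour described above: every ARP reply rewrites the MAC and sets ntt,
 * even when the MAC carried by the reply is the one already stored. */
static void update_entry_unconditional(struct client_entry *e,
                                       const uint8_t *mac_src)
{
    assert(e != NULL && mac_src != NULL);
    memcpy(e->mac, mac_src, ETH_ALEN);
    e->ntt = 1;
}

/* One monitor pass: returns how many ARP replies would be sent for this
 * entry and clears the flag. */
static int monitor_pass(struct client_entry *e)
{
    if (!e->ntt)
        return 0;
    e->ntt = 0;
    return 2; /* two ARP replies per updated client entry */
}
```

    In this model, an ARP reply that carries the MAC address already
    stored still flags the entry, so the next monitor pass sends two
    more replies to the peer.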
    
    The diagram below shows the sequence, including the relevant
    functions:
    
       System 1                          System 2
        bond0                             bond0
    
       ping <system2>
        ARP request  --------->
                               <--------- ARP reply
    
    +->rlb_arp_recv  <---------------------+   <--- loop begins
    |  rlb_update_entry_from_arp           |
    |  client_info->ntt = 1;               |
    |  bond_info->rx_ntt = 1;              |
    |                                      |
    |         <communication succeeds>     |
    |                                      |
    |  bond_alb_monitor                    |
    |  rlb_update_rx_clients               |
    |  rlb_update_client                   |
    |  arp_create(ARPOP_REPLY)             |
    |   send ARP reply -------------->     V
    |   send ARP reply -------------->
    |                               rlb_arp_recv
    |                               rlb_update_entry_from_arp
    |                               client_info->ntt = 1;
    |                               bond_info->rx_ntt = 1;
    |                           < snipped, same as in system 1>
    +-------           <-------------- send ARP reply
                       <-------------- send ARP reply
    
    Besides the unneeded network traffic, this loop breaks
    a cluster because a backup system can't take over the IP
    address: there is always one system sending ARP replies
    that poison the network.
    
    This patch fixes the problem by adding a check for the MAC
    address before updating it. Thus, if the MAC address didn't
    change, there is no need to update it or to announce it later.
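
    The effect of the check can be illustrated with a small userspace
    simulation of two peers. This is a sketch under stated assumptions,
    not the patch itself: struct peer, arp_recv() and
    count_announce_rounds() are hypothetical names, each system tracks
    only the other one, and system 2 is assumed to already know
    system 1's MAC address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical model of one bonded system tracking a single client. */
struct peer {
    uint8_t own_mac[ETH_ALEN];   /* MAC this system announces */
    uint8_t known_mac[ETH_ALEN]; /* MAC recorded for the other system */
    bool    ntt;                 /* entry must be announced by the monitor */
};

/* Handle a received ARP reply. With check_mac set, an unchanged MAC is
 * ignored, so the entry is neither rewritten nor flagged. */
static void arp_recv(struct peer *p, const uint8_t *mac_src, bool check_mac)
{
    assert(p != NULL && mac_src != NULL);
    if (check_mac && memcmp(p->known_mac, mac_src, ETH_ALEN) == 0)
        return;
    memcpy(p->known_mac, mac_src, ETH_ALEN);
    p->ntt = true;
}

/* Count monitor rounds in which at least one system still sends ARP
 * replies, giving up after max_rounds. */
static int count_announce_rounds(bool check_mac, int max_rounds)
{
    struct peer a = { .own_mac = {2, 0, 0, 0, 0, 1} };
    struct peer b = { .own_mac = {2, 0, 0, 0, 0, 2},
                      .known_mac = {2, 0, 0, 0, 0, 1} };
    int rounds;

    /* System 1 pings system 2 and learns its MAC from the ARP reply. */
    arp_recv(&a, b.own_mac, check_mac);

    for (rounds = 0; rounds < max_rounds; rounds++) {
        bool a_sends = a.ntt, b_sends = b.ntt;

        if (!a_sends && !b_sends)
            break;              /* nothing left to announce */
        a.ntt = b.ntt = false;
        if (a_sends)
            arp_recv(&b, a.own_mac, check_mac);
        if (b_sends)
            arp_recv(&a, b.own_mac, check_mac);
    }
    return rounds;
}
```

    Calling count_announce_rounds(false, 10) uses up all ten rounds
    because each reply re-flags the receiver, whereas
    count_announce_rounds(true, 10) stops after the single round in
    which system 1 announces its genuinely new entry.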
    Signed-off-by: Flavio Leitner <fleitner@redhat.com>
    Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
bond_alb.c 44.8 KB