• Andrew Lunn's avatar
    batman-adv: Avoid endless loop in bat-on-bat netdevice check · 1bc4e2b0
    Andrew Lunn authored
    batman-adv checks in different situation if a new device is already on top
    of a different batman-adv device. This is done by getting the iflink of a
    device and all its parent. It assumes that this iflink is always a parent
    device in an acyclic graph. But this assumption is broken by devices like
    veth which are actually a pair of two devices linked to each other. The
    recursive check would therefore get veth0 when calling dev_get_iflink on
    veth1. And it gets veth0 when calling dev_get_iflink with veth1.
    
    Creating a veth pair and loading batman-adv freezes parts of the system
    
        ip link add veth0 type veth peer name veth1
        modprobe batman-adv
    
    An RCU stall will be detected on the system which cannot be fixed.
    
        INFO: rcu_sched self-detected stall on CPU
                1: (5264 ticks this GP) idle=3e9/140000000000001/0
        softirq=144683/144686 fqs=5249
                 (t=5250 jiffies g=46 c=45 q=43)
        Task dump for CPU 1:
        insmod          R  running task        0   247    245 0x00000008
         ffffffff8151f140 ffffffff8107888e ffff88000fd141c0 ffffffff8151f140
         0000000000000000 ffffffff81552df0 ffffffff8107b420 0000000000000001
         ffff88000e3fa700 ffffffff81540b00 ffffffff8107d667 0000000000000001
        Call Trace:
         <IRQ>  [<ffffffff8107888e>] ? rcu_dump_cpu_stacks+0x7e/0xd0
         [<ffffffff8107b420>] ? rcu_check_callbacks+0x3f0/0x6b0
         [<ffffffff8107d667>] ? hrtimer_run_queues+0x47/0x180
         [<ffffffff8107cf9d>] ? update_process_times+0x2d/0x50
         [<ffffffff810873fb>] ? tick_handle_periodic+0x1b/0x60
         [<ffffffff810290ae>] ? smp_trace_apic_timer_interrupt+0x5e/0x90
         [<ffffffff813bbae2>] ? apic_timer_interrupt+0x82/0x90
         <EOI>  [<ffffffff812c3fd7>] ? __dev_get_by_index+0x37/0x40
         [<ffffffffa0031f3e>] ? batadv_hard_if_event+0xee/0x3a0 [batman_adv]
         [<ffffffff812c5801>] ? register_netdevice_notifier+0x81/0x1a0
        [...]
    
    This can be avoided by checking if two devices are each others parent and
    stopping the check in this situation.
    
    Fixes: b7eddd0b ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
    Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
    [sven@narfation.org: rewritten description, extracted fix]
    Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
    Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
    Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
    1bc4e2b0
hard-interface.c 21.7 KB