    mlxsw: core: Free EMAD transactions using kfree_rcu() · 3c8ce24b
    Ido Schimmel authored
    The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is
    managed using RCU. They are freed using kfree_rcu() once the transaction
    ends.
    
    However, if the transaction fails, it is freed immediately with kfree() after
    being removed from the active transactions list. This is problematic because
    a different CPU can still dereference the transaction from an RCU read-side
    critical section while traversing the active transactions list in
    mlxsw_emad_rx_listener_func(), in which case a use-after-free is triggered
    [1].
    
    Fix this by calling kfree_rcu() in the error path as well, so that the
    transaction is freed only after a grace period has elapsed.
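
The ordering problem described above can be illustrated with a small userspace model. This is only a single-threaded sketch, not the kernel RCU API and not the driver code: every `toy_*` name is hypothetical, the "grace period" is approximated by a plain reader counter, and `struct toy_trans` merely stands in for 'struct mlxsw_reg_trans'. It shows why deferring the free until no reader remains keeps the reader's later dereference valid.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for 'struct mlxsw_reg_trans'; all fields are hypothetical. */
struct toy_trans {
	int tid;                     /* illustrative transaction ID */
	struct toy_trans *next_free; /* link on the deferred-free list */
};

static int toy_readers;                /* open read-side sections */
static struct toy_trans *toy_deferred; /* objects awaiting reclamation */
static int toy_freed;                  /* number of objects actually freed */

/* Free everything that was queued; only called when no reader is active. */
static void toy_reclaim(void)
{
	while (toy_deferred) {
		struct toy_trans *t = toy_deferred;

		toy_deferred = t->next_free;
		free(t);
		toy_freed++;
	}
}

/* Model of entering a read-side critical section. */
static void toy_read_lock(void)
{
	toy_readers++;
}

/* Model of leaving it; the last reader out ends the "grace period". */
static void toy_read_unlock(void)
{
	assert(toy_readers > 0);
	if (--toy_readers == 0)
		toy_reclaim();
}

/* Deferred free: queue the object, reclaim only once no reader remains. */
static void toy_free_deferred(struct toy_trans *t)
{
	t->next_free = toy_deferred;
	toy_deferred = t;
	if (toy_readers == 0)
		toy_reclaim();
}

/*
 * A reader picks up the transaction, the error path then releases it, and
 * the reader dereferences it afterwards. Returns the tid the reader saw,
 * or -1 if the allocation failed.
 */
int toy_demo(void)
{
	struct toy_trans *trans = calloc(1, sizeof(*trans));
	struct toy_trans *found;
	int freed_before = toy_freed;
	int seen;

	if (!trans)
		return -1;
	trans->tid = 42;

	toy_read_lock();          /* reader enters its critical section */
	found = trans;            /* reader found the object on the "list" */

	/* Error path. Calling free(trans) here instead would be the bug. */
	toy_free_deferred(trans);
	assert(toy_freed == freed_before); /* still allocated for the reader */

	seen = found->tid;        /* safe: the memory has not been released */
	toy_read_unlock();        /* last reader leaves, object is reclaimed */
	assert(toy_freed == freed_before + 1);
	return seen;
}
```

Calling `toy_demo()` from a test harness exercises the deferred path. Replacing `toy_free_deferred(trans)` with a direct `free(trans)` would reproduce, in miniature, the kind of access that KASAN reports in [1].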
    
    [1]
    BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
    Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881
    
    CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ #44
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
    Call Trace:
     <IRQ>
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack+0xf6/0x16e lib/dump_stack.c:118
     print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
     __kasan_report mm/kasan/report.c:513 [inline]
     kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
     mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
     mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061
     mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
     mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
     tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
     __do_softirq+0x223/0x964 kernel/softirq.c:292
     asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
     </IRQ>
     __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
     run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
     do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77
     invoke_softirq kernel/softirq.c:387 [inline]
     __irq_exit_rcu kernel/softirq.c:417 [inline]
     irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429
     sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091
     asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587
    RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
    RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
    RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191
    Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d
    RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286
    RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9
    RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1
    R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000
     __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874
     do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x473dbd
    Code: Bad RIP value.
    RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
    RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040
    RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c
    R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80
    
    Allocated by task 871:
     save_stack+0x1b/0x40 mm/kasan/common.c:48
     set_track mm/kasan/common.c:56 [inline]
     __kasan_kmalloc mm/kasan/common.c:494 [inline]
     __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
     kmalloc include/linux/slab.h:555 [inline]
     kzalloc include/linux/slab.h:669 [inline]
     mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812
     mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
     mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
     update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
     process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
     worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
     kthread+0x355/0x470 kernel/kthread.c:291
     ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293
    
    Freed by task 871:
     save_stack+0x1b/0x40 mm/kasan/common.c:48
     set_track mm/kasan/common.c:56 [inline]
     kasan_set_free_info mm/kasan/common.c:316 [inline]
     __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
     slab_free_hook mm/slub.c:1474 [inline]
     slab_free_freelist_hook mm/slub.c:1507 [inline]
     slab_free mm/slub.c:3072 [inline]
     kfree+0xe6/0x320 mm/slub.c:4052
     mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819
     mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
     mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
     update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
     process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
     worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
     kthread+0x355/0x470 kernel/kthread.c:291
     ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293
    
    The buggy address belongs to the object at ffff88800b796400
     which belongs to the cache kmalloc-512 of size 512
    The buggy address is located 232 bytes inside of
     512-byte region [ffff88800b796400, ffff88800b796600)
    The buggy address belongs to the page:
    page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0
    flags: 0x100000000010200(slab|head)
    raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500
    raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                              ^
     ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    
    Fixes: caf7297e ("mlxsw: core: Introduce support for asynchronous EMAD register access")
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Reviewed-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>