• Pablo Neira Ayuso's avatar
    netfilter: flowtable: fix stuck flows on cleanup due to pending work · 9afb4b27
    Pablo Neira Ayuso authored
    To clear the flow table on flow table free, the following sequence
    normally happens in order:
    
      1) gc_step work is stopped to disable any further stats/del requests.
      2) All flow table entries are set to teardown state.
      3) Run gc_step which will queue HW del work for each flow table entry.
      4) Waiting for the above del work to finish (flush).
      5) Run gc_step again, deleting all entries from the flow table.
      6) Flow table is freed.
    
    But if a flow table entry already has pending HW stats or HW add work
    step 3 will not queue HW del work (it will be skipped), step 4 will wait
    for the pending add/stats to finish, and step 5 will queue HW del work
    which might execute after freeing of the flow table.
    
    To fix the above, this patch flushes the pending work, then it sets the
    teardown flag to all flows in the flowtable and it forces a garbage
    collector run to queue work to remove the flows from hardware, then it
    flushes this new pending work and (finally) it forces another garbage
    collector run to remove the entry from the software flowtable.
    
    Stack trace:
    [47773.882335] BUG: KASAN: use-after-free in down_read+0x99/0x460
    [47773.883634] Write of size 8 at addr ffff888103b45aa8 by task kworker/u20:6/543704
    [47773.885634] CPU: 3 PID: 543704 Comm: kworker/u20:6 Not tainted 5.12.0-rc7+ #2
    [47773.886745] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
    [47773.888438] Workqueue: nf_ft_offload_del flow_offload_work_handler [nf_flow_table]
    [47773.889727] Call Trace:
    [47773.890214]  dump_stack+0xbb/0x107
    [47773.890818]  print_address_description.constprop.0+0x18/0x140
    [47773.892990]  kasan_report.cold+0x7c/0xd8
    [47773.894459]  kasan_check_range+0x145/0x1a0
    [47773.895174]  down_read+0x99/0x460
    [47773.899706]  nf_flow_offload_tuple+0x24f/0x3c0 [nf_flow_table]
    [47773.907137]  flow_offload_work_handler+0x72d/0xbe0 [nf_flow_table]
    [47773.913372]  process_one_work+0x8ac/0x14e0
    [47773.921325]
    [47773.921325] Allocated by task 592159:
    [47773.922031]  kasan_save_stack+0x1b/0x40
    [47773.922730]  __kasan_kmalloc+0x7a/0x90
    [47773.923411]  tcf_ct_flow_table_get+0x3cb/0x1230 [act_ct]
    [47773.924363]  tcf_ct_init+0x71c/0x1156 [act_ct]
    [47773.925207]  tcf_action_init_1+0x45b/0x700
    [47773.925987]  tcf_action_init+0x453/0x6b0
    [47773.926692]  tcf_exts_validate+0x3d0/0x600
    [47773.927419]  fl_change+0x757/0x4a51 [cls_flower]
    [47773.928227]  tc_new_tfilter+0x89a/0x2070
    [47773.936652]
    [47773.936652] Freed by task 543704:
    [47773.937303]  kasan_save_stack+0x1b/0x40
    [47773.938039]  kasan_set_track+0x1c/0x30
    [47773.938731]  kasan_set_free_info+0x20/0x30
    [47773.939467]  __kasan_slab_free+0xe7/0x120
    [47773.940194]  slab_free_freelist_hook+0x86/0x190
    [47773.941038]  kfree+0xce/0x3a0
    [47773.941644]  tcf_ct_flow_table_cleanup_work
    
    Original patch description and stack trace by Paul Blakey.
    
    Fixes: c29f74e0 ("netfilter: nf_flow_table: hardware offload support")
    Reported-by: default avatarPaul Blakey <paulb@nvidia.com>
    Tested-by: default avatarPaul Blakey <paulb@nvidia.com>
    Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
    9afb4b27
nf_flow_table.h 9.16 KB