1. 29 Oct, 2019 14 commits
    • net/mlx5e: Initialize on stack link modes bitmap · 926b37f7
      Aya Levin authored
      Initialize link modes bitmap on stack before using it, otherwise the
      outcome of ethtool set link ksettings might have unexpected values.
      
      Fixes: 4b95840a ("net/mlx5e: Fix matching of speed to PRM link modes")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      926b37f7
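The pattern this commit describes, clearing an on-stack bitmap before setting bits in it, can be sketched in user-space C. This is a minimal illustration, not the driver's code: LINK_MODE_BITS, build_link_modes() and the bitmap helpers are hypothetical names introduced here.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define LINK_MODE_BITS 96 /* hypothetical bitmap width */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((LINK_MODE_BITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Set one bit in a bitmap stored as an array of unsigned long. */
static void bitmap_set_bit(unsigned long *map, unsigned int bit)
{
    assert(bit < LINK_MODE_BITS);
    map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Return 1 if the bit is set, 0 otherwise. */
static int bitmap_test_bit(const unsigned long *map, unsigned int bit)
{
    return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/* Count the set bits among the first LINK_MODE_BITS bits. */
static unsigned int bitmap_weight(const unsigned long *map)
{
    unsigned int i, n = 0;

    for (i = 0; i < LINK_MODE_BITS; i++)
        n += (unsigned int)bitmap_test_bit(map, i);
    return n;
}

/* Build the bitmap in a stack buffer, then copy it to the caller. */
static void build_link_modes(unsigned long *out, const unsigned int *modes,
                             size_t count)
{
    unsigned long tmp[BITMAP_WORDS];
    size_t i;

    /* Without this memset, tmp holds whatever was on the stack. */
    memset(tmp, 0, sizeof(tmp));
    for (i = 0; i < count; i++)
        bitmap_set_bit(tmp, modes[i]);
    memcpy(out, tmp, sizeof(tmp));
}
```

Only the requested bits end up set in the output, regardless of what the stack held before the call.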
    • net/mlx5e: Fix ethtool self test: link speed · 534e7366
      Aya Levin authored
      Ethtool self test contains a test for link speed. This test reads the
      PTYS register and determines whether the current speed is valid or not.
      Change current implementation to use the function mlx5e_port_linkspeed()
      that does the same check and fails when speed is invalid. This code
      redundancy led to a bug when mlx5e_port_linkspeed() was updated with
      expanded speeds and the self test was not.
      
      Fixes: 2c81bfd5 ("net/mlx5e: Move port speed code from en_ethtool.c to en/port.c")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      534e7366
    • net/mlx5e: Fix handling of compressed CQEs in case of low NAPI budget · 9df86bdb
      Maxim Mikityanskiy authored
      When CQE compression is enabled, compressed CQEs use the following
      structure: a title is followed by one or many blocks, each containing 8
      mini CQEs (except the last, which may contain fewer mini CQEs).
      
      Due to NAPI budget restriction, a complete structure is not always
      parsed in one NAPI run, and some blocks with mini CQEs may be deferred
      to the next NAPI poll call - we have the mlx5e_decompress_cqes_cont call
      in the beginning of mlx5e_poll_rx_cq. However, if the budget is
      extremely low, some blocks may be left even after that, but the code
      that follows the mlx5e_decompress_cqes_cont call doesn't check it and
      assumes that a new CQE begins, which may not be the case. In such cases,
      random memory corruptions occur.
      
      An extremely low NAPI budget of 8 is used when busy_poll or busy_read is
      active.
      
      This commit adds a check to make sure that the previous compressed CQE
      has been completely parsed after mlx5e_decompress_cqes_cont, otherwise
      it prevents a new CQE from being fetched in the middle of a compressed
      CQE.
      
      This commit fixes random crashes in __build_skb, __page_pool_put_page
      and other places not directly related, which used to happen when both
      CQE compression and busy_poll/busy_read were enabled.
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      9df86bdb
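The budget check described above can be illustrated with a small user-space simulation. This is a sketch under simplifying assumptions, not the driver's code: struct rx_queue, decompress_cont() and poll_rx() are hypothetical names, and every newly fetched CQE is treated as an ordinary, always-available uncompressed entry.

```c
#include <assert.h>

/* Hypothetical, simplified view of an RX queue. */
struct rx_queue {
    int mini_left; /* mini CQEs still pending from a compressed CQE */
    int new_cqes;  /* new CQEs fetched so far */
};

/* Continue parsing the pending compressed CQE, limited by the budget. */
static int decompress_cont(struct rx_queue *rq, int budget)
{
    int n;

    assert(budget > 0);
    n = rq->mini_left < budget ? rq->mini_left : budget;
    rq->mini_left -= n;
    return n;
}

/* One poll run: finish leftovers first, then fetch new CQEs. */
static int poll_rx(struct rx_queue *rq, int budget)
{
    int work_done = 0;

    if (rq->mini_left) {
        work_done += decompress_cont(rq, budget);
        /*
         * The added check: if mini CQEs are still pending, stop here
         * instead of treating the next slot as the start of a new CQE.
         */
        if (rq->mini_left)
            return work_done;
    }

    while (work_done < budget) {
        rq->new_cqes++; /* stand-in for handling one new CQE */
        work_done++;
    }
    return work_done;
}
```

With a budget of 8 and 20 pending mini CQEs, the first two polls only continue decompression; new CQEs are fetched only in the third poll, once the compressed CQE has been fully parsed.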
    • net/mlx5e: Don't store direct pointer to action's tunnel info · 2a4b6526
      Vlad Buslov authored
      The Geneve implementation changed mlx5 tc to use a direct pointer to the
      tunnel_key action's internal struct ip_tunnel_info instance. However, this
      leads to a use-after-free error when the initial filter that caused
      creation of a new encap entry is deleted, or when the tunnel_key action is
      manually overwritten through the action API. Moreover, with the recent TC
      offloads API unlocking change, struct flow_action_entry->tunnel points to
      a temporary copy of tunnel info that is deallocated after the filter is
      offloaded to hardware. This makes the bug reproduce every time a new
      filter is attached to an existing encap entry, with the following KASAN
      report:
      
      [  314.885555] ==================================================================
      [  314.886641] BUG: KASAN: use-after-free in memcmp+0x2c/0x60
      [  314.886864] Read of size 1 at addr ffff88886c746280 by task tc/2682
      
      [  314.887179] CPU: 22 PID: 2682 Comm: tc Not tainted 5.3.0-rc7+ #703
      [  314.887188] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [  314.887195] Call Trace:
      [  314.887215]  dump_stack+0x9a/0xf0
      [  314.887236]  print_address_description+0x67/0x323
      [  314.887248]  ? memcmp+0x2c/0x60
      [  314.887257]  ? memcmp+0x2c/0x60
      [  314.887272]  __kasan_report.cold+0x1a/0x3d
      [  314.887474]  ? __mlx5e_tc_del_fdb_peer_flow+0x100/0x1b0 [mlx5_core]
      [  314.887484]  ? memcmp+0x2c/0x60
      [  314.887509]  kasan_report+0xe/0x12
      [  314.887521]  memcmp+0x2c/0x60
      [  314.887662]  mlx5e_tc_add_fdb_flow+0x51b/0xbe0 [mlx5_core]
      [  314.887838]  ? mlx5e_encap_take+0x110/0x110 [mlx5_core]
      [  314.887902]  ? lockdep_init_map+0x87/0x2c0
      [  314.887924]  ? __init_waitqueue_head+0x4f/0x60
      [  314.888062]  ? mlx5e_alloc_flow.isra.0+0x18c/0x1c0 [mlx5_core]
      [  314.888207]  __mlx5e_add_fdb_flow+0x2d7/0x440 [mlx5_core]
      [  314.888359]  ? mlx5e_tc_update_neigh_used_value+0x6f0/0x6f0 [mlx5_core]
      [  314.888374]  ? match_held_lock+0x2e/0x240
      [  314.888537]  mlx5e_configure_flower+0x830/0x16a0 [mlx5_core]
      [  314.888702]  ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core]
      [  314.888713]  ? down_read+0x118/0x2c0
      [  314.888728]  ? down_read_killable+0x300/0x300
      [  314.888882]  ? mlx5e_rep_get_ethtool_stats+0x180/0x180 [mlx5_core]
      [  314.888899]  tc_setup_cb_add+0x127/0x270
      [  314.888937]  fl_hw_replace_filter+0x2ac/0x380 [cls_flower]
      [  314.888976]  ? fl_hw_destroy_filter+0x1b0/0x1b0 [cls_flower]
      [  314.888990]  ? fl_change+0xbcf/0x27ef [cls_flower]
      [  314.889030]  ? fl_change+0xa57/0x27ef [cls_flower]
      [  314.889069]  fl_change+0x16bd/0x27ef [cls_flower]
      [  314.889135]  ? __rhashtable_insert_fast.constprop.0+0xa00/0xa00 [cls_flower]
      [  314.889167]  ? __radix_tree_lookup+0xa4/0x130
      [  314.889200]  ? fl_get+0x169/0x240 [cls_flower]
      [  314.889218]  ? fl_walk+0x230/0x230 [cls_flower]
      [  314.889249]  tc_new_tfilter+0x5e1/0xd40
      [  314.889281]  ? __rhashtable_insert_fast.constprop.0+0xa00/0xa00 [cls_flower]
      [  314.889309]  ? tc_del_tfilter+0xa30/0xa30
      [  314.889335]  ? __lock_acquire+0x5b5/0x2460
      [  314.889378]  ? find_held_lock+0x85/0xa0
      [  314.889442]  ? tc_del_tfilter+0xa30/0xa30
      [  314.889465]  rtnetlink_rcv_msg+0x4ab/0x5f0
      [  314.889488]  ? rtnl_dellink+0x490/0x490
      [  314.889518]  ? lockdep_hardirqs_on+0x260/0x260
      [  314.889538]  ? netlink_deliver_tap+0xab/0x5a0
      [  314.889550]  ? match_held_lock+0x1b/0x240
      [  314.889575]  netlink_rcv_skb+0xd0/0x200
      [  314.889588]  ? rtnl_dellink+0x490/0x490
      [  314.889605]  ? netlink_ack+0x440/0x440
      [  314.889635]  ? netlink_deliver_tap+0x161/0x5a0
      [  314.889648]  ? lock_downgrade+0x360/0x360
      [  314.889657]  ? lock_acquire+0xe5/0x210
      [  314.889686]  netlink_unicast+0x296/0x350
      [  314.889707]  ? netlink_attachskb+0x390/0x390
      [  314.889726]  ? _copy_from_iter_full+0xe0/0x3a0
      [  314.889738]  ? __virt_addr_valid+0xbb/0x130
      [  314.889771]  netlink_sendmsg+0x394/0x600
      [  314.889800]  ? netlink_unicast+0x350/0x350
      [  314.889817]  ? move_addr_to_kernel.part.0+0x90/0x90
      [  314.889852]  ? netlink_unicast+0x350/0x350
      [  314.889872]  sock_sendmsg+0x96/0xa0
      [  314.889891]  ___sys_sendmsg+0x482/0x520
      [  314.889919]  ? copy_msghdr_from_user+0x250/0x250
      [  314.889930]  ? __fput+0x1fa/0x390
      [  314.889941]  ? task_work_run+0xb7/0xf0
      [  314.889957]  ? exit_to_usermode_loop+0x117/0x120
      [  314.889972]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  314.889982]  ? do_syscall_64+0x74/0xe0
      [  314.889992]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  314.890012]  ? mark_lock+0xac/0x9a0
      [  314.890028]  ? __lock_acquire+0x5b5/0x2460
      [  314.890053]  ? mark_lock+0xac/0x9a0
      [  314.890083]  ? __lock_acquire+0x5b5/0x2460
      [  314.890112]  ? match_held_lock+0x1b/0x240
      [  314.890144]  ? __fget_light+0xa1/0xf0
      [  314.890166]  ? sockfd_lookup_light+0x91/0xb0
      [  314.890187]  __sys_sendmsg+0xba/0x130
      [  314.890201]  ? __sys_sendmsg_sock+0xb0/0xb0
      [  314.890225]  ? __blkcg_punt_bio_submit+0xd0/0xd0
      [  314.890264]  ? lockdep_hardirqs_off+0xbe/0x100
      [  314.890274]  ? mark_held_locks+0x24/0x90
      [  314.890286]  ? do_syscall_64+0x1e/0xe0
      [  314.890308]  do_syscall_64+0x74/0xe0
      [  314.890325]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  314.890336] RIP: 0033:0x7f00ca33d7b8
      [  314.890348] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [  314.890356] RSP: 002b:00007ffea2983928 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  314.890369] RAX: ffffffffffffffda RBX: 000000005d777d5b RCX: 00007f00ca33d7b8
      [  314.890377] RDX: 0000000000000000 RSI: 00007ffea2983990 RDI: 0000000000000003
      [  314.890384] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
      [  314.890392] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [  314.890400] R13: 000000000047f640 R14: 00007ffea2987b58 R15: 0000000000000021
      
      [  314.890529] Allocated by task 2687:
      [  314.890684]  save_stack+0x1b/0x80
      [  314.890694]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [  314.890705]  __kmalloc_track_caller+0x102/0x340
      [  314.890721]  kmemdup+0x1d/0x40
      [  314.890730]  tc_setup_flow_action+0x731/0x2c27
      [  314.890743]  fl_hw_replace_filter+0x23b/0x380 [cls_flower]
      [  314.890756]  fl_change+0x16bd/0x27ef [cls_flower]
      [  314.890765]  tc_new_tfilter+0x5e1/0xd40
      [  314.890776]  rtnetlink_rcv_msg+0x4ab/0x5f0
      [  314.890786]  netlink_rcv_skb+0xd0/0x200
      [  314.890796]  netlink_unicast+0x296/0x350
      [  314.890805]  netlink_sendmsg+0x394/0x600
      [  314.890815]  sock_sendmsg+0x96/0xa0
      [  314.890825]  ___sys_sendmsg+0x482/0x520
      [  314.890834]  __sys_sendmsg+0xba/0x130
      [  314.890844]  do_syscall_64+0x74/0xe0
      [  314.890854]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [  314.890937] Freed by task 2687:
      [  314.891076]  save_stack+0x1b/0x80
      [  314.891086]  __kasan_slab_free+0x12c/0x170
      [  314.891095]  kfree+0xeb/0x2f0
      [  314.891106]  tc_cleanup_flow_action+0x69/0xa0
      [  314.891119]  fl_hw_replace_filter+0x2c5/0x380 [cls_flower]
      [  314.891132]  fl_change+0x16bd/0x27ef [cls_flower]
      [  314.891140]  tc_new_tfilter+0x5e1/0xd40
      [  314.891151]  rtnetlink_rcv_msg+0x4ab/0x5f0
      [  314.891161]  netlink_rcv_skb+0xd0/0x200
      [  314.891170]  netlink_unicast+0x296/0x350
      [  314.891180]  netlink_sendmsg+0x394/0x600
      [  314.891190]  sock_sendmsg+0x96/0xa0
      [  314.891200]  ___sys_sendmsg+0x482/0x520
      [  314.891208]  __sys_sendmsg+0xba/0x130
      [  314.891218]  do_syscall_64+0x74/0xe0
      [  314.891228]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [  314.891315] The buggy address belongs to the object at ffff88886c746280
                      which belongs to the cache kmalloc-96 of size 96
      [  314.891762] The buggy address is located 0 bytes inside of
                      96-byte region [ffff88886c746280, ffff88886c7462e0)
      [  314.892196] The buggy address belongs to the page:
      [  314.892387] page:ffffea0021b1d180 refcount:1 mapcount:0 mapping:ffff88835d00ef80 index:0x0
      [  314.892398] flags: 0x57ffffc0000200(slab)
      [  314.892413] raw: 0057ffffc0000200 ffffea00219e0340 0000000800000008 ffff88835d00ef80
      [  314.892423] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
      [  314.892430] page dumped because: kasan: bad access detected
      
      [  314.892515] Memory state around the buggy address:
      [  314.892707]  ffff88886c746180: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.892976]  ffff88886c746200: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.893251] >ffff88886c746280: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.893522]                    ^
      [  314.893657]  ffff88886c746300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.893924]  ffff88886c746380: 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc
      [  314.894189] ==================================================================
      
      Fix the issue by duplicating tunnel info into per-encap copy that is
      deallocated with encap structure. Also, duplicate tunnel info in flow parse
      attribute to support cases when flow might be attached asynchronously.
      
      Fixes: 1f6da306 ("net/mlx5e: Geneve, Keep tunnel info as pointer to the original struct")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      2a4b6526
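The fix described above replaces a borrowed pointer with an owned copy. A minimal user-space sketch of that idea follows; struct tun_info, struct encap_entry and the encap_*() functions are hypothetical stand-ins, not the mlx5 types or API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for tunnel info owned by someone else. */
struct tun_info {
    unsigned int id;
    unsigned char opts[8];
};

/* Hypothetical encap entry that keeps its own copy of the info. */
struct encap_entry {
    struct tun_info *tun_info;
};

/* Duplicate the caller's tunnel info instead of storing its pointer. */
static int encap_init(struct encap_entry *e, const struct tun_info *src)
{
    assert(e && src);
    e->tun_info = malloc(sizeof(*src));
    if (!e->tun_info)
        return -1;
    memcpy(e->tun_info, src, sizeof(*src));
    return 0;
}

/* The copy is released together with the entry that owns it. */
static void encap_destroy(struct encap_entry *e)
{
    free(e->tun_info);
    e->tun_info = NULL;
}

/* Compare a lookup key against the entry's private copy. */
static int encap_matches(const struct encap_entry *e,
                         const struct tun_info *key)
{
    return e->tun_info->id == key->id &&
           memcmp(e->tun_info->opts, key->opts, sizeof(key->opts)) == 0;
}
```

Because the entry owns its copy, a later comparison stays valid even after the caller's original buffer has been freed, which is the situation the KASAN report above captures for the pointer-based version.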
    • net/mlx5: Fix NULL pointer dereference in extended destination · 0fd79b1e
      Eli Britstein authored
      The cited commit refactored the encap id into a struct pointed to from
      the destination.
      Fix a NULL pointer dereference in the case where one of the destinations
      has no encap.
      
      Fixes: 2b688ea5 ("net/mlx5: Add flow steering actions to fs_cmd shim layer")
      Signed-off-by: Eli Britstein <elibr@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      0fd79b1e
    • net/mlx5: Fix rtable reference leak · 2347cee8
      Parav Pandit authored
      If the rt entry gateway family is not AF_INET for multipath device,
      rtable reference is leaked.
      Hence, fix it by releasing the reference.
      
      Fixes: 5fb091e8 ("net/mlx5e: Use hint to resolve route when in HW multipath mode")
      Fixes: e32ee6c7 ("net/mlx5e: Support tunnel encap over tagged Ethernet")
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      2347cee8
    • net/mlx5e: Only skip encap flows update when encap init failed · 64d7b685
      Vlad Buslov authored
      When encap entry initialization completes successfully e->compl_result is
      set to positive value and not zero, like mlx5e_rep_update_flows() assumes
      at the moment. Fix the conditional to only skip encap flows update when
      e->compl_result < 0.
      
      Fixes: 2a1f1768 ("net/mlx5e: Refactor neigh update for concurrent execution")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      64d7b685
    • net/mlx5e: Replace kfree with kvfree when free vhca stats · 5dfb6335
      Maor Gottlieb authored
      Memory allocated by kvzalloc should be freed by kvfree.
      
      Fixes: cef35af3 ("net/mlx5e: Add mlx5e HV VHCA stats agent")
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      5dfb6335
    • net/mlx5e: Remove incorrect match criteria assignment line · 752d3dc0
      Dmytro Linkin authored
      The driver has a function that enables match criteria for misc
      parameters depending on eswitch capabilities.
      
      Fixes: 4f5d1bea ("Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux")
      Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
      Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      752d3dc0
    • net/mlx5e: Determine source port properly for vlan push action · d5dbcc4e
      Dmytro Linkin authored
      Termination tables are used for vlan push actions on uplink ports.
      To support RoCE dual port the source port value was placed in a register.
      Fix the code to use an API method returning the source port according to
      the FW capabilities.
      
      Fixes: 10caabda ("net/mlx5e: Use termination table for VLAN push actions")
      Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
      Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      d5dbcc4e
    • net/mlx5: Fix flow counter list auto bits struct · 6dfef396
      Roi Dayan authored
      The union should contain the extended dest and counter list.
      Remove the reserved 0x40 bits, which are redundant.
      This change doesn't break any functionality.
      Everything works today because the code in fs_cmd.c uses the correct
      struct for either the extended dest or the basic dest.
      
      Fixes: 1b115498 ("net/mlx5: Introduce extended destination fields")
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      6dfef396
    • wimax: i2400: Fix memory leak in i2400m_op_rfkill_sw_toggle · 6f3ef5c2
      Navid Emamdoost authored
      In the implementation of i2400m_op_rfkill_sw_toggle() the allocated
      buffer for cmd should be released before returning. The
      documentation for i2400m_msg_to_dev() says when it returns the buffer
      can be reused. Meaning cmd should be released in either case. Move
      kfree(cmd) before return to be reached by all execution paths.
      
      Fixes: 2507e6ab ("wimax: i2400: fix memory leak")
      Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f3ef5c2
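The idea of releasing the command buffer on every execution path can be sketched in user-space C. This is an illustration only: send_msg(), toggle() and the frees counter are hypothetical, and send_msg() merely stands in for a routine whose buffer can be reused once it returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int frees; /* counts releases so the behaviour can be observed */

static void release(void *p)
{
    free(p);
    frees++;
}

/*
 * Hypothetical send routine: by the time it returns, the caller's
 * buffer is no longer needed, whether the send succeeded or failed.
 */
static int send_msg(const char *buf, size_t len, int fail)
{
    assert(buf && len > 0);
    return fail ? -EIO : 0;
}

static int toggle(int fail)
{
    int result;
    char *cmd = calloc(1, 16);

    if (!cmd)
        return -ENOMEM;
    strcpy(cmd, "rfkill");

    result = send_msg(cmd, 16, fail);

    /* Placed before the return so success and failure both reach it. */
    release(cmd);
    return result;
}
```

Calling toggle() once with success and once with failure releases the buffer exactly twice, so neither path leaks.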
    • net: hisilicon: Fix "Trying to free already-free IRQ" · 63a41746
      Jiangfeng Xiao authored
      When running rmmod hip04_eth.ko, we can get the following warning:
      
      Task track: rmmod(1623)>bash(1591)>login(1581)>init(1)
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 1623 at kernel/irq/manage.c:1557 __free_irq+0xa4/0x2ac()
      Trying to free already-free IRQ 200
      Modules linked in: ping(O) pramdisk(O) cpuinfo(O) rtos_snapshot(O) interrupt_ctrl(O) mtdblock mtd_blkdevrtfs nfs_acl nfs lockd grace sunrpc xt_tcpudp ipt_REJECT iptable_filter ip_tables x_tables nf_reject_ipv
      CPU: 0 PID: 1623 Comm: rmmod Tainted: G           O    4.4.193 #1
      Hardware name: Hisilicon A15
      [<c020b408>] (rtos_unwind_backtrace) from [<c0206624>] (show_stack+0x10/0x14)
      [<c0206624>] (show_stack) from [<c03f2be4>] (dump_stack+0xa0/0xd8)
      [<c03f2be4>] (dump_stack) from [<c021a780>] (warn_slowpath_common+0x84/0xb0)
      [<c021a780>] (warn_slowpath_common) from [<c021a7e8>] (warn_slowpath_fmt+0x3c/0x68)
      [<c021a7e8>] (warn_slowpath_fmt) from [<c026876c>] (__free_irq+0xa4/0x2ac)
      [<c026876c>] (__free_irq) from [<c0268a14>] (free_irq+0x60/0x7c)
      [<c0268a14>] (free_irq) from [<c0469e80>] (release_nodes+0x1c4/0x1ec)
      [<c0469e80>] (release_nodes) from [<c0466924>] (__device_release_driver+0xa8/0x104)
      [<c0466924>] (__device_release_driver) from [<c0466a80>] (driver_detach+0xd0/0xf8)
      [<c0466a80>] (driver_detach) from [<c0465e18>] (bus_remove_driver+0x64/0x8c)
      [<c0465e18>] (bus_remove_driver) from [<c02935b0>] (SyS_delete_module+0x198/0x1e0)
      [<c02935b0>] (SyS_delete_module) from [<c0202ed0>] (__sys_trace_return+0x0/0x10)
      ---[ end trace bb25d6123d849b44 ]---
      
      Currently "rmmod hip04_eth.ko" calls free_irq more than once,
      as devres_release_all and hip04_remove both call free_irq.
      This results in a 'Trying to free already-free IRQ' warning.
      To solve the problem, free_irq has been moved out of hip04_remove.
      Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      63a41746
    • fjes: Handle workqueue allocation failure · 85ac30fa
      Will Deacon authored
      In the highly unlikely event that we fail to allocate either of the
      "/txrx" or "/control" workqueues, we should bail cleanly rather than
      blindly march on with NULL queue pointer(s) installed in the
      'fjes_adapter' instance.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Reported-by: Nicolas Waisman <nico@semmle.com>
      Link: https://lore.kernel.org/lkml/CADJ_3a8WFrs5NouXNqS5WYe7rebFP+_A5CheeqAyD_p7DFJJcg@mail.gmail.com/
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      85ac30fa
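Bailing out cleanly when one of two allocations fails can be sketched as follows. This is a user-space illustration, not the fjes code: struct adapter, alloc_wq(), destroy_wq() and adapter_probe() are hypothetical, and fail_after is a test hook used only to inject an allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical adapter holding two queue handles. */
struct adapter {
    void *txrx_wq;
    void *control_wq;
};

static int fail_after = -1; /* test hook: allocations allowed before failing */
static int live_allocs;     /* queues currently allocated */

static void *alloc_wq(void)
{
    void *p;

    if (fail_after == 0)
        return NULL; /* injected failure */
    if (fail_after > 0)
        fail_after--;
    p = malloc(1);
    if (p)
        live_allocs++;
    return p;
}

static void destroy_wq(void *wq)
{
    assert(wq);
    free(wq);
    live_allocs--;
}

/* Allocate both queues, or neither: no NULL handle is left installed. */
static int adapter_probe(struct adapter *a)
{
    a->txrx_wq = alloc_wq();
    if (!a->txrx_wq)
        return -ENOMEM;

    a->control_wq = alloc_wq();
    if (!a->control_wq) {
        /* Undo the first allocation before reporting the error. */
        destroy_wq(a->txrx_wq);
        a->txrx_wq = NULL;
        return -ENOMEM;
    }
    return 0;
}
```

When the second allocation fails, the first queue is destroyed and the caller sees -ENOMEM rather than an adapter with a NULL queue pointer.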
  2. 28 Oct, 2019 13 commits
    • Merge tag 'batadv-net-for-davem-20191025' of git://git.open-mesh.org/linux-merge · 55793d2a
      David S. Miller authored
      Simon Wunderlich says:
      
      ====================
      Here are two batman-adv bugfixes:
      
       * Fix free/alloc race for OGM and OGMv2, by Sven Eckelmann (2 patches)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      55793d2a
    • net: usb: lan78xx: Disable interrupts before calling generic_handle_irq() · 0a29ac5b
      Daniel Wagner authored
      lan78xx_status() will run with interrupts enabled due to the change in
      ed194d13 ("usb: core: remove local_irq_save() around ->complete()
      handler"). generic_handle_irq() expects to be run with IRQs disabled.
      
      [    4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
      [    4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168
      [    4.896294] 000: Modules linked in:
      [    4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39
      [    4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT)
      [    4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
      [    4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168
      [    4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168
      [    4.896339] 000: sp : ffff000010003cc0
      [    4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060
      [    4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b
      [    4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400
      [    4.896373] 000: x23: 000000000000004f x22: ffff000010003d64
      [    4.896381] 000: x21: 0000000000000000 x20: 0000000000000002
      [    4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060
      [    4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb
      [    4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465
      [    4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0
      [    4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08
      [    4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920
      [    4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00
      [    4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510
      [    4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000
      [    4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000
      [    4.896466] 000: Call trace:
      [    4.896471] 000:  __handle_irq_event_percpu+0x154/0x168
      [    4.896481] 000:  handle_irq_event_percpu+0x50/0xb0
      [    4.896489] 000:  handle_irq_event+0x40/0x98
      [    4.896497] 000:  handle_simple_irq+0xa4/0xf0
      [    4.896505] 000:  generic_handle_irq+0x24/0x38
      [    4.896513] 000:  intr_complete+0xb0/0xe0
      [    4.896525] 000:  __usb_hcd_giveback_urb+0x58/0xd8
      [    4.896533] 000:  usb_giveback_urb_bh+0xd0/0x170
      [    4.896539] 000:  tasklet_action_common.isra.0+0x9c/0x128
      [    4.896549] 000:  tasklet_hi_action+0x24/0x30
      [    4.896556] 000:  __do_softirq+0x120/0x23c
      [    4.896564] 000:  irq_exit+0xb8/0xd8
      [    4.896571] 000:  __handle_domain_irq+0x64/0xb8
      [    4.896579] 000:  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
      [    4.896586] 000:  el1_irq+0xb8/0x140
      [    4.896592] 000:  arch_cpu_idle+0x10/0x18
      [    4.896601] 000:  do_idle+0x200/0x280
      [    4.896608] 000:  cpu_startup_entry+0x20/0x28
      [    4.896615] 000:  rest_init+0xb4/0xc0
      [    4.896623] 000:  arch_call_rest_init+0xc/0x14
      [    4.896632] 000:  start_kernel+0x454/0x480
      
      Fixes: ed194d13 ("usb: core: remove local_irq_save() around ->complete() handler")
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Stefan Wahren <wahrenst@gmx.net>
      Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Daniel Wagner <dwagner@suse.de>
      Tested-by: Stefan Wahren <wahrenst@gmx.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a29ac5b
    • net: dsa: sja1105: improve NET_DSA_SJA1105_TAS dependency · 5d294fc4
      Arnd Bergmann authored
      An earlier bugfix introduced a dependency on CONFIG_NET_SCH_TAPRIO,
      but this missed the case of NET_SCH_TAPRIO=m and NET_DSA_SJA1105=y,
      which still causes a link error:
      
      drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_setup_tc_taprio':
      sja1105_tas.c:(.text+0x5c): undefined reference to `taprio_offload_free'
      sja1105_tas.c:(.text+0x3b4): undefined reference to `taprio_offload_get'
      drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_tas_teardown':
      sja1105_tas.c:(.text+0x6ec): undefined reference to `taprio_offload_free'
      
      Change the dependency to only allow selecting the TAS code when it
      can link against the taprio code.
      
      Fixes: a8d570de ("net: dsa: sja1105: Add dependency for NET_DSA_SJA1105_TAS")
      Fixes: 317ab5b8 ("net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d294fc4
    • net: ethernet: ftgmac100: Fix DMA coherency issue with SW checksum · 88824e3b
      Benjamin Herrenschmidt authored
      We are calling the checksum helper after the dma_map_single()
      call to map the packet. This is incorrect as the checksumming
      code will touch the packet from the CPU. This means the cache
      won't be properly flushed (or the bounce buffering will leave
      us with the unmodified packet to DMA).
      
      This moves the calculation of the checksum & vlan tags to
      before the DMA mapping.
      
      This also has the side effect of fixing another bug: If the
      checksum helper fails, we goto "drop" to drop the packet, which
      will not unmap the DMA mapping.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Fixes: 05690d63 ("ftgmac100: Upgrade to NETIF_F_HW_CSUM")
      Reviewed-by: Vijay Khemka <vijaykhemka@fb.com>
      Tested-by: Vijay Khemka <vijaykhemka@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88824e3b
    • net: fix sk_page_frag() recursion from memory reclaim · 20eb4f29
      Tejun Heo authored
      sk_page_frag() optimizes skb_frag allocations by using per-task
      skb_frag cache when it knows it's the only user.  The condition is
      determined by seeing whether the socket allocation mask allows
      blocking - if the allocation may block, it obviously owns the task's
      context and ergo exclusively owns current->task_frag.
      
      Unfortunately, this misses recursion through memory reclaim path.
      Please take a look at the following backtrace.
      
       [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10
           ...
           tcp_sendmsg+0x27/0x40
           sock_sendmsg+0x30/0x40
           sock_xmit.isra.24+0xa1/0x170 [nbd]
           nbd_send_cmd+0x1d2/0x690 [nbd]
           nbd_queue_rq+0x1b5/0x3b0 [nbd]
           __blk_mq_try_issue_directly+0x108/0x1b0
           blk_mq_request_issue_directly+0xbd/0xe0
           blk_mq_try_issue_list_directly+0x41/0xb0
           blk_mq_sched_insert_requests+0xa2/0xe0
           blk_mq_flush_plug_list+0x205/0x2a0
           blk_flush_plug_list+0xc3/0xf0
       [1] blk_finish_plug+0x21/0x2e
           _xfs_buf_ioapply+0x313/0x460
           __xfs_buf_submit+0x67/0x220
           xfs_buf_read_map+0x113/0x1a0
           xfs_trans_read_buf_map+0xbf/0x330
           xfs_btree_read_buf_block.constprop.42+0x95/0xd0
           xfs_btree_lookup_get_block+0x95/0x170
           xfs_btree_lookup+0xcc/0x470
           xfs_bmap_del_extent_real+0x254/0x9a0
           __xfs_bunmapi+0x45c/0xab0
           xfs_bunmapi+0x15/0x30
           xfs_itruncate_extents_flags+0xca/0x250
           xfs_free_eofblocks+0x181/0x1e0
           xfs_fs_destroy_inode+0xa8/0x1b0
           destroy_inode+0x38/0x70
           dispose_list+0x35/0x50
           prune_icache_sb+0x52/0x70
           super_cache_scan+0x120/0x1a0
           do_shrink_slab+0x120/0x290
           shrink_slab+0x216/0x2b0
           shrink_node+0x1b6/0x4a0
           do_try_to_free_pages+0xc6/0x370
           try_to_free_mem_cgroup_pages+0xe3/0x1e0
           try_charge+0x29e/0x790
           mem_cgroup_charge_skmem+0x6a/0x100
           __sk_mem_raise_allocated+0x18e/0x390
           __sk_mem_schedule+0x2a/0x40
       [0] tcp_sendmsg_locked+0x8eb/0xe10
           tcp_sendmsg+0x27/0x40
           sock_sendmsg+0x30/0x40
           ___sys_sendmsg+0x26d/0x2b0
           __sys_sendmsg+0x57/0xa0
           do_syscall_64+0x42/0x100
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      In [0], tcp_sendmsg_locked() was using current->task_frag when it
      called sk_wmem_schedule().  It had already calculated how many bytes
      fit into current->task_frag.  Due to memory pressure,
      sk_wmem_schedule() called into the memory reclaim path, which called
      into xfs and then the IO issue path.  Because the filesystem in question
      is backed by nbd, control goes back into the tcp layer - back into
      tcp_sendmsg_locked().
      
      nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes
      sense - it's in the process of freeing memory and wants to be able to,
      e.g., drop clean pages to make forward progress.  However, this
      confused sk_page_frag() called from [2].  Because it only tests
      whether the allocation allows blocking, which it does, it now thinks
      current->task_frag can be used again although it was already being
      used in [0].
      
      After [2] used current->task_frag, the offset would be increased by
      the used amount.  When control returns to [0],
      current->task_frag's offset is increased and the previously calculated
      number of bytes may now overrun the end of the allocated memory, leading
      to silent memory corruption.
      
      Fix it by adding gfpflags_normal_context() which tests sleepable &&
      !reclaim and use it to determine whether to use current->task_frag.
      
      v2: Eric didn't like gfp flags being tested twice.  Introduce a new
          helper gfpflags_normal_context() and combine the two tests.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      20eb4f29
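      The "sleepable && !reclaim" test described in the commit above can be
      sketched in user-space C. This is only a model: the DEMO_* flag values
      and the demo_* function names are placeholders invented for this
      illustration and are not the kernel's real gfp definitions. The point
      is that a single masked comparison answers both questions at once.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration only; the kernel's real gfp
 * flags are defined elsewhere and have different values. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_DIRECT_RECLAIM 0x1u /* caller may sleep in direct reclaim */
#define DEMO_GFP_MEMALLOC       0x2u /* caller is itself freeing memory */
#define DEMO_GFP_IO             0x4u /* caller may start IO */

#define DEMO_GFP_KERNEL (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_IO)
#define DEMO_GFP_NOIO   (DEMO_GFP_DIRECT_RECLAIM)
#define DEMO_GFP_ATOMIC 0x0u

/* Old-style test: only asks whether the allocation may block. */
bool demo_gfpflags_allow_blocking(demo_gfp_t flags)
{
	return (flags & DEMO_GFP_DIRECT_RECLAIM) != 0;
}

/* Combined test: the allocation may sleep AND does not come from the
 * reclaim path. Both conditions are checked by one masked comparison. */
bool demo_gfpflags_normal_context(demo_gfp_t flags)
{
	return (flags & (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_MEMALLOC)) ==
	       DEMO_GFP_DIRECT_RECLAIM;
}

/* Small self-check showing the case from the commit message. */
void demo_gfp_checks(void)
{
	demo_gfp_t nbd_alloc = DEMO_GFP_NOIO | DEMO_GFP_MEMALLOC;

	/* Blocking is allowed, so the old test says "use the task frag". */
	assert(demo_gfpflags_allow_blocking(nbd_alloc));
	/* The combined test rejects it because reclaim is in progress. */
	assert(!demo_gfpflags_normal_context(nbd_alloc));
	/* An ordinary sleepable allocation still qualifies. */
	assert(demo_gfpflags_normal_context(DEMO_GFP_KERNEL));
}
```

      With the nbd-style flags (NOIO plus MEMALLOC), the blocking-only test
      passes while the combined test fails, which is the distinction the
      fix relies on to keep the nested sender away from the per-task frag.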
    • Eric Dumazet's avatar
      udp: fix data-race in udp_set_dev_scratch() · a793183c
      Eric Dumazet authored
      KCSAN reported a data-race in udp_set_dev_scratch() [1]
      
      The issue here is that we must not write over skb fields
      if skb is shared. A similar issue has been fixed in commit
      89c22d8c ("net: Fix skb csum races when peeking")
      
      While we are at it, use a helper only dealing with
      udp_skb_scratch(skb)->csum_unnecessary, as this allows
      udp_set_dev_scratch() to be called once and thus inlined.
      
      [1]
      BUG: KCSAN: data-race in udp_set_dev_scratch / udpv6_recvmsg
      
      write to 0xffff888120278317 of 1 bytes by task 10411 on cpu 1:
       udp_set_dev_scratch+0xea/0x200 net/ipv4/udp.c:1308
       __first_packet_length+0x147/0x420 net/ipv4/udp.c:1556
       first_packet_length+0x68/0x2a0 net/ipv4/udp.c:1579
       udp_poll+0xea/0x110 net/ipv4/udp.c:2720
       sock_poll+0xed/0x250 net/socket.c:1256
       vfs_poll include/linux/poll.h:90 [inline]
       do_select+0x7d0/0x1020 fs/select.c:534
       core_sys_select+0x381/0x550 fs/select.c:677
       do_pselect.constprop.0+0x11d/0x160 fs/select.c:759
       __do_sys_pselect6 fs/select.c:784 [inline]
       __se_sys_pselect6 fs/select.c:769 [inline]
       __x64_sys_pselect6+0x12e/0x170 fs/select.c:769
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      read to 0xffff888120278317 of 1 bytes by task 10413 on cpu 0:
       udp_skb_csum_unnecessary include/net/udp.h:358 [inline]
       udpv6_recvmsg+0x43e/0xe90 net/ipv6/udp.c:310
       inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 10413 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 2276f58a ("udp: use a separate rx queue for packet reception")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a793183c
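      The rule stated in the commit above (do not write over skb fields
      while the skb is shared) can be illustrated with a small user-space
      model. The struct layout and the demo_* names are hypothetical and
      much simpler than the kernel's sk_buff; only the guard before the
      write is the point of the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a socket buffer. */
struct demo_skb {
	atomic_int users;      /* reference count; more than 1 means shared */
	bool csum_unnecessary; /* stand-in for the flag in the scratch area */
};

void demo_skb_init(struct demo_skb *skb, int users)
{
	atomic_init(&skb->users, users);
	skb->csum_unnecessary = false;
}

/* A buffer is shared when someone else also holds a reference. */
bool demo_skb_shared(struct demo_skb *skb)
{
	return atomic_load(&skb->users) != 1;
}

/* Helper that touches only the one flag, and only when this caller is
 * the sole owner. A shared buffer may be read concurrently (for example
 * by a peeking reader), so it is left untouched. */
void demo_csum_unnecessary_set(struct demo_skb *skb)
{
	if (!demo_skb_shared(skb))
		skb->csum_unnecessary = true;
}

/* Small self-check covering the owned and the shared case. */
void demo_skb_checks(void)
{
	struct demo_skb owned, shared;

	demo_skb_init(&owned, 1);
	demo_skb_init(&shared, 2);
	demo_csum_unnecessary_set(&owned);
	demo_csum_unnecessary_set(&shared);
	assert(owned.csum_unnecessary);
	assert(!shared.csum_unnecessary);
}
```

      Skipping the write for a shared buffer only loses an optimization
      hint in this model; it never changes what another holder observes.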
    • Nishad Kamdar's avatar
      net: dpaa2: Use the correct style for SPDX License Identifier · 7de4344f
      Nishad Kamdar authored
      This patch corrects the SPDX License Identifier style in
      header files related to DPAA2 Ethernet driver supporting
      Freescale SoCs with DPAA2. For C header files,
      Documentation/process/license-rules.rst mandates C-like comments
      (as opposed to C source files, where C++ style should be used).
      
      Changes made by using a script provided by Joe Perches here:
      https://lkml.org/lkml/2019/2/7/46.
      Suggested-by: Joe Perches <joe@perches.com>
      Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7de4344f
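      The comment-style rule mentioned in the commit above can be expressed
      as a tiny checker. This is a sketch, not the script referenced in the
      commit: demo_spdx_style_ok() and its helpers are hypothetical names,
      and the check looks only at the file suffix and the start of the
      first line.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool demo_has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && strcmp(s + n - m, suffix) == 0;
}

static bool demo_has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Headers (.h) are expected to open with a C-style comment, sources
 * (.c) with a C++-style comment. Other file types are out of scope
 * for this sketch and are reported as not matching. */
bool demo_spdx_style_ok(const char *filename, const char *first_line)
{
	if (demo_has_suffix(filename, ".h"))
		return demo_has_prefix(first_line,
				       "/* SPDX-License-Identifier: ");
	if (demo_has_suffix(filename, ".c"))
		return demo_has_prefix(first_line,
				       "// SPDX-License-Identifier: ");
	return false;
}

/* Small self-check with made-up file names. */
void demo_spdx_checks(void)
{
	assert(demo_spdx_style_ok("example.h",
				  "/* SPDX-License-Identifier: GPL-2.0 */"));
	assert(!demo_spdx_style_ok("example.h",
				   "// SPDX-License-Identifier: GPL-2.0"));
	assert(demo_spdx_style_ok("example.c",
				  "// SPDX-License-Identifier: GPL-2.0"));
}
```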
    • David S. Miller's avatar
      Merge branch 'net-avoid-KCSAN-splats' · 20243058
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      net: avoid KCSAN splats
      
      Oftentimes we use skb_queue_empty() without holding a lock,
      meaning that other cpus (or interrupts) can change the queue
      under us. This is fine, but we need to properly annotate
      the lockless intent to make sure the compiler won't
      over-optimize things.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      20243058
    • Eric Dumazet's avatar
      net: add READ_ONCE() annotation in __skb_wait_for_more_packets() · 7c422d0c
      Eric Dumazet authored
      __skb_wait_for_more_packets() can be called while other cpus
      can feed packets to the socket receive queue.
      
      KCSAN reported:
      
      BUG: KCSAN: data-race in __skb_wait_for_more_packets / __udp_enqueue_schedule_skb
      
      write to 0xffff888102e40b58 of 8 bytes by interrupt on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       __udp_enqueue_schedule_skb+0x2d7/0x410 net/ipv4/udp.c:1470
       __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline]
       udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057
       udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074
       udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233
       __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300
       udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      read to 0xffff888102e40b58 of 8 bytes by task 13035 on cpu 1:
       __skb_wait_for_more_packets+0xfa/0x320 net/core/datagram.c:100
       __skb_recv_udp+0x374/0x500 net/ipv4/udp.c:1683
       udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 13035 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7c422d0c
    • Eric Dumazet's avatar
      net: use skb_queue_empty_lockless() in busy poll contexts · 3f926af3
      Eric Dumazet authored
      Busy polling usually runs without locks.
      Let's use skb_queue_empty_lockless() instead of skb_queue_empty().
      
      Also use READ_ONCE() in __skb_try_recv_datagram() to address
      a similar potential problem.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3f926af3
    • Eric Dumazet's avatar
      net: use skb_queue_empty_lockless() in poll() handlers · 3ef7cf57
      Eric Dumazet authored
      Many poll() handlers are lockless. Using skb_queue_empty_lockless()
      instead of skb_queue_empty() is more appropriate.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3ef7cf57
    • Eric Dumazet's avatar
      udp: use skb_queue_empty_lockless() · 137a0dbe
      Eric Dumazet authored
      syzbot reported a data-race [1].
      
      We should use skb_queue_empty_lockless() to document that we are
      not ensuring a mutual exclusion and silence KCSAN.
      
      [1]
      BUG: KCSAN: data-race in __skb_recv_udp / __udp_enqueue_schedule_skb
      
      write to 0xffff888122474b50 of 8 bytes by interrupt on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       __udp_enqueue_schedule_skb+0x2c1/0x410 net/ipv4/udp.c:1470
       __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline]
       udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057
       udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074
       udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233
       __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300
       udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      read to 0xffff888122474b50 of 8 bytes by task 8921 on cpu 1:
       skb_queue_empty include/linux/skbuff.h:1494 [inline]
       __skb_recv_udp+0x18d/0x500 net/ipv4/udp.c:1653
       udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 8921 Comm: syz-executor.4 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      137a0dbe
    • Eric Dumazet's avatar
      net: add skb_queue_empty_lockless() · d7d16a89
      Eric Dumazet authored
      Some paths call skb_queue_empty() without holding
      the queue lock. We must use a barrier in order
      to not let the compiler do strange things, and avoid
      KCSAN splats.
      
      Adding a barrier in skb_queue_empty() might be overkill;
      I prefer adding a new helper to clearly identify
      points where the callers might be lockless. This might
      help us find real bugs.
      
      The corresponding WRITE_ONCE() should add zero cost
      for current compilers.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d7d16a89
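      The idea behind the helper in the commit above can be sketched in
      user-space C. The DEMO_READ_ONCE()/DEMO_WRITE_ONCE() macros are
      simplified stand-ins written for this example (they rely on the
      GCC/Clang __typeof__ extension), and struct demo_node is a
      hypothetical reduction of the queue head. The sketch shows the
      annotation pattern only and runs single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins: a volatile access makes the compiler emit
 * exactly one load or store for the marked access. */
#define DEMO_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* The queue head is itself a node; an empty queue points at itself. */
struct demo_node {
	struct demo_node *next;
	struct demo_node *prev;
};

void demo_queue_init(struct demo_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Lockless emptiness check: a single annotated read of head->next.
 * The answer may be stale by the time the caller acts on it. */
bool demo_queue_empty_lockless(const struct demo_node *head)
{
	return DEMO_READ_ONCE(head->next) == head;
}

/* Writer side (normally run under the queue lock): the stores that a
 * lockless reader may observe are annotated to pair with the read. */
void demo_queue_tail(struct demo_node *head, struct demo_node *node)
{
	struct demo_node *last = head->prev;

	node->next = head;
	node->prev = last;
	DEMO_WRITE_ONCE(last->next, node);
	DEMO_WRITE_ONCE(head->prev, node);
}

/* Small single-threaded self-check. */
void demo_queue_checks(void)
{
	struct demo_node head, item;

	demo_queue_init(&head);
	assert(demo_queue_empty_lockless(&head));
	demo_queue_tail(&head, &item);
	assert(!demo_queue_empty_lockless(&head));
}
```

      Keeping the lockless variant as a separate helper, as the commit
      argues, leaves the plain check for callers that hold the lock and
      makes the intentionally racy call sites easy to find.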
  3. 27 Oct, 2019 2 commits
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · fc11078d
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter/IPVS fixes for net:
      
      1) Fix crash on flowtable due to race between garbage collection
         and insertion.
      
      2) Restore callback unbinding in netfilter offloads.
      
      3) Fix races on IPVS module removal, from Davide Caratti.
      
      4) Make old_secure_tcp per-netns to fix a syzbot report,
         from Eric Dumazet.
      
      5) Validate matching length in netfilter offloads, from wenxu.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fc11078d
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 1a51a474
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2019-10-27
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 7 non-merge commits during the last 11 day(s) which contain
      a total of 7 files changed, 66 insertions(+), 16 deletions(-).
      
      The main changes are:
      
      1) Fix two use-after-free bugs in relation to RCU in jited symbol exposure to
         kallsyms, from Daniel Borkmann.
      
      2) Fix NULL pointer dereference in AF_XDP rx-only sockets, from Magnus Karlsson.
      
      3) Fix hang in netdev unregister for hash based devmap as well as another overflow
         bug on 32 bit archs in memlock cost calculation, from Toke Høiland-Jørgensen.
      
      4) Fix wrong memory access in LWT BPF programs on reroute due to invalid dst.
         Also fix BPF selftests to use more compatible nc options, from Jiri Benc.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a51a474
  4. 26 Oct, 2019 11 commits