1. 27 Sep, 2019 27 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 3c30819d
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2019-09-27
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix libbpf's BTF dumper to not skip anonymous enum definitions, from Andrii.
      
      2) Fix BTF verifier issues when handling the BTF of vmlinux, from Alexei.
      
      3) Fix nested calls into bpf_event_output() from TCP sockops BPF
         programs, from Allan.
      
      4) Fix NULL pointer dereference in AF_XDP's xsk map creation when
         allocation fails, from Jonathan.
      
      5) Remove unneeded 64 byte alignment requirement of the AF_XDP UMEM
         headroom, from Bjorn.
      
      6) Remove unused XDP_OPTIONS getsockopt() call which results in an error
         on older kernels, from Toke.
      
      7) Fix a client/server race in tcp_rtt BPF kselftest case, from Stanislav.
      
      8) Fix indentation issue in BTF's btf_enum_check_kflag_member(), from Colin.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'qdisc-destroy' · 5c7ff181
      David S. Miller authored
      Vlad Buslov says:
      
      ====================
      Fix Qdisc destroy issues caused by adding fine-grained locking to filter API
      
      TC filter API unlocking introduced several new fine-grained locks. The
      change caused sleeping-while-atomic BUGs in several Qdiscs that call cls
      APIs which need to obtain new mutex while holding sch tree spinlock. This
      series fixes affected Qdiscs by ensuring that cls API that became sleeping
      is only called outside of sch tree lock critical section.
      ====================
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: sch_sfb: don't call qdisc_put() while holding tree lock · e3ae1f96
      Vlad Buslov authored
      Recent changes that removed rtnl dependency from rules update path of tc
      also made tcf_block_put() function sleeping. This function is called from
      ops->destroy() of several Qdisc implementations, which in turn is called by
      qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
      which results in a sleeping-while-atomic BUG.
      
      Steps to reproduce for sfb:
      
      tc qdisc add dev ens1f0 handle 1: root sfb
      tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10
      tc qdisc change dev ens1f0 root handle 1: sfb
      
      Resulting dmesg:
      
      [ 7265.938717] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
      [ 7265.940152] in_atomic(): 1, irqs_disabled(): 0, pid: 28579, name: tc
      [ 7265.941455] INFO: lockdep is turned off.
      [ 7265.942744] CPU: 11 PID: 28579 Comm: tc Tainted: G        W         5.3.0-rc8+ #721
      [ 7265.944065] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 7265.945396] Call Trace:
      [ 7265.946709]  dump_stack+0x85/0xc0
      [ 7265.947994]  ___might_sleep.cold+0xac/0xbc
      [ 7265.949282]  __mutex_lock+0x5b/0x960
      [ 7265.950543]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 7265.951803]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 7265.953022]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 7265.954248]  tcf_block_put_ext.part.0+0x21/0x50
      [ 7265.955478]  tcf_block_put+0x50/0x70
      [ 7265.956694]  sfq_destroy+0x15/0x50 [sch_sfq]
      [ 7265.957898]  qdisc_destroy+0x5f/0x160
      [ 7265.959099]  sfb_change+0x175/0x330 [sch_sfb]
      [ 7265.960304]  tc_modify_qdisc+0x324/0x840
      [ 7265.961503]  rtnetlink_rcv_msg+0x170/0x4b0
      [ 7265.962692]  ? netlink_deliver_tap+0x95/0x400
      [ 7265.963876]  ? rtnl_dellink+0x2d0/0x2d0
      [ 7265.965064]  netlink_rcv_skb+0x49/0x110
      [ 7265.966251]  netlink_unicast+0x171/0x200
      [ 7265.967427]  netlink_sendmsg+0x224/0x3f0
      [ 7265.968595]  sock_sendmsg+0x5e/0x60
      [ 7265.969753]  ___sys_sendmsg+0x2ae/0x330
      [ 7265.970916]  ? ___sys_recvmsg+0x159/0x1f0
      [ 7265.972074]  ? do_wp_page+0x9c/0x790
      [ 7265.973233]  ? __handle_mm_fault+0xcd3/0x19e0
      [ 7265.974407]  __sys_sendmsg+0x59/0xa0
      [ 7265.975591]  do_syscall_64+0x5c/0xb0
      [ 7265.976753]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 7265.977938] RIP: 0033:0x7f229069f7b8
      [ 7265.979117] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 7265.981681] RSP: 002b:00007ffd7ed2d158 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 7265.983001] RAX: ffffffffffffffda RBX: 000000005d813ca1 RCX: 00007f229069f7b8
      [ 7265.984336] RDX: 0000000000000000 RSI: 00007ffd7ed2d1c0 RDI: 0000000000000003
      [ 7265.985682] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000165c9a0
      [ 7265.987021] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [ 7265.988309] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000
      
      In sfb_change() function use qdisc_purge_queue() instead of
      qdisc_tree_flush_backlog() to properly reset old child Qdisc and save
      pointer to it into local temporary variable. Put reference to Qdisc after
      sch tree lock is released in order not to call potentially sleeping cls API
      in atomic section. This is safe to do because Qdisc has already been reset
      by qdisc_purge_queue() inside sch tree lock critical section.
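
The pattern described above (reset the old child inside the critical section, drop the reference only after unlocking) can be sketched in plain userspace C. This is a single-threaded model, not the kernel patch: `struct child`, `struct parent`, `child_put()` and `parent_change()` are hypothetical names, and the lock is modelled as a flag so the "no sleeping call under the lock" rule can be checked with an assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a child Qdisc and its parent; not kernel types. */
struct child {
    int refcnt;
    int queued;         /* packets still sitting in the child's queue */
    bool destroyed;
};

struct parent {
    bool tree_locked;   /* models "sch tree spinlock is held" */
    struct child *child;
};

/* Models qdisc_put(): may sleep, so it must not run with the lock held. */
static void child_put(const struct parent *p, struct child *c)
{
    assert(!p->tree_locked);
    if (c && --c->refcnt == 0)
        c->destroyed = true;
}

/* Replace the child: reset the old one under the lock, release it after. */
static void parent_change(struct parent *p, struct child *new_child)
{
    struct child *old;

    p->tree_locked = true;          /* sch_tree_lock() in the kernel */
    old = p->child;
    if (old)
        old->queued = 0;            /* role played by qdisc_purge_queue() */
    p->child = new_child;
    p->tree_locked = false;         /* sch_tree_unlock() in the kernel */

    child_put(p, old);              /* safe: the lock is already dropped */
}
```

Because the old child is already emptied and unlinked while the lock is held, nothing else can reach it by the time the deferred put runs.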
      
      Reported-by: syzbot+ac54455281db908c581e@syzkaller.appspotmail.com
      Fixes: c266f64d ("net: sched: protect block state with mutex")
      Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: multiq: don't call qdisc_put() while holding tree lock · c2999f7f
      Vlad Buslov authored
      Recent changes that removed rtnl dependency from rules update path of tc
      also made tcf_block_put() function sleeping. This function is called from
      ops->destroy() of several Qdisc implementations, which in turn is called by
      qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
      which results in a sleeping-while-atomic BUG.
      
      Steps to reproduce for multiq:
      
      tc qdisc add dev ens1f0 root handle 1: multiq
      tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10
      ethtool -L ens1f0 combined 2
      tc qdisc change dev ens1f0 root handle 1: multiq
      
      Resulting dmesg:
      
      [ 5539.419344] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
      [ 5539.420945] in_atomic(): 1, irqs_disabled(): 0, pid: 27658, name: tc
      [ 5539.422435] INFO: lockdep is turned off.
      [ 5539.423904] CPU: 21 PID: 27658 Comm: tc Tainted: G        W         5.3.0-rc8+ #721
      [ 5539.425400] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 5539.426911] Call Trace:
      [ 5539.428380]  dump_stack+0x85/0xc0
      [ 5539.429823]  ___might_sleep.cold+0xac/0xbc
      [ 5539.431262]  __mutex_lock+0x5b/0x960
      [ 5539.432682]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 5539.434103]  ? __nla_validate_parse+0x51/0x840
      [ 5539.435493]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 5539.436903]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 5539.438327]  tcf_block_put_ext.part.0+0x21/0x50
      [ 5539.439752]  tcf_block_put+0x50/0x70
      [ 5539.441165]  sfq_destroy+0x15/0x50 [sch_sfq]
      [ 5539.442570]  qdisc_destroy+0x5f/0x160
      [ 5539.444000]  multiq_tune+0x14a/0x420 [sch_multiq]
      [ 5539.445421]  tc_modify_qdisc+0x324/0x840
      [ 5539.446841]  rtnetlink_rcv_msg+0x170/0x4b0
      [ 5539.448269]  ? netlink_deliver_tap+0x95/0x400
      [ 5539.449691]  ? rtnl_dellink+0x2d0/0x2d0
      [ 5539.451116]  netlink_rcv_skb+0x49/0x110
      [ 5539.452522]  netlink_unicast+0x171/0x200
      [ 5539.453914]  netlink_sendmsg+0x224/0x3f0
      [ 5539.455304]  sock_sendmsg+0x5e/0x60
      [ 5539.456686]  ___sys_sendmsg+0x2ae/0x330
      [ 5539.458071]  ? ___sys_recvmsg+0x159/0x1f0
      [ 5539.459461]  ? do_wp_page+0x9c/0x790
      [ 5539.460846]  ? __handle_mm_fault+0xcd3/0x19e0
      [ 5539.462263]  __sys_sendmsg+0x59/0xa0
      [ 5539.463661]  do_syscall_64+0x5c/0xb0
      [ 5539.465044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 5539.466454] RIP: 0033:0x7f1fe08177b8
      [ 5539.467863] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 5539.470906] RSP: 002b:00007ffe812de5d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 5539.472483] RAX: ffffffffffffffda RBX: 000000005d8135e3 RCX: 00007f1fe08177b8
      [ 5539.474069] RDX: 0000000000000000 RSI: 00007ffe812de640 RDI: 0000000000000003
      [ 5539.475655] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000182e9b0
      [ 5539.477203] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [ 5539.478699] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000
      
      Rearrange locking in multiq_tune() in following ways:
      
      - In loop that removes Qdiscs from disabled queues, call
        qdisc_purge_queue() instead of qdisc_tree_flush_backlog() on Qdisc that
        is being destroyed. Save the Qdisc in temporary allocated array and call
        qdisc_put() on each element of the array after sch tree lock is released.
        This is safe to do because Qdiscs have already been reset by
        qdisc_purge_queue() inside sch tree lock critical section.
      
      - Do the same change for second loop that initializes Qdiscs for newly
        enabled queues in multiq_tune() function. Since sch tree lock is obtained
        and released on each iteration of this loop, just call qdisc_put()
        directly outside of critical section. Don't verify that old Qdisc is not
        noop_qdisc before releasing reference to it because such check is already
        performed by qdisc_put*() functions.
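
The first bullet above, collecting the Qdiscs to destroy in a temporary array and releasing them once the lock is dropped, might look roughly like this in userspace C. It is an illustrative model only: `struct band_q`, `struct sched`, `band_put()` and `sched_shrink()` are made-up names, `MAX_BANDS` is an arbitrary bound, and the lock is a flag rather than a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_BANDS 8     /* arbitrary bound for this sketch */

/* Hypothetical per-band child queue; not a kernel type. */
struct band_q {
    int queued;
    bool released;
};

struct sched {
    bool locked;                        /* models the sch tree lock */
    int bands;
    struct band_q *queues[MAX_BANDS];
};

/* Models qdisc_put(): may sleep, so the lock must not be held. */
static void band_put(const struct sched *s, struct band_q *q)
{
    assert(!s->locked);
    q->released = true;
}

/* Shrink to new_bands; returns how many queues were released, or -1. */
static int sched_shrink(struct sched *s, int new_bands)
{
    struct band_q **removed;
    int n_removed = 0;
    int i;

    if (new_bands < 0 || new_bands > s->bands)
        return -1;

    /* Allocate before taking the lock: allocation may sleep too. */
    removed = calloc(MAX_BANDS, sizeof(*removed));
    if (!removed)
        return -1;

    s->locked = true;
    for (i = new_bands; i < s->bands; i++) {
        struct band_q *q = s->queues[i];

        if (!q)
            continue;
        q->queued = 0;                  /* reset inside the critical section */
        s->queues[i] = NULL;
        removed[n_removed++] = q;       /* remember it for later */
    }
    s->bands = new_bands;
    s->locked = false;

    for (i = 0; i < n_removed; i++)     /* deferred, lock no longer held */
        band_put(s, removed[i]);

    free(removed);
    return n_removed;
}
```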
      
      Fixes: c266f64d ("net: sched: protect block state with mutex")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: sch_htb: don't call qdisc_put() while holding tree lock · 4ce70b4a
      Vlad Buslov authored
      Recent changes that removed rtnl dependency from rules update path of tc
      also made tcf_block_put() function sleeping. This function is called from
      ops->destroy() of several Qdisc implementations, which in turn is called by
      qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
      which results in a sleeping-while-atomic BUG.
      
      Steps to reproduce for htb:
      
      tc qdisc add dev ens1f0 root handle 1: htb default 12
      tc class add dev ens1f0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
      tc qdisc add dev ens1f0 parent 1:1 handle 40: sfq perturb 10
      tc class add dev ens1f0 parent 1:1 classid 1:2 htb rate 100kbps ceil 100kbps
      
      Resulting dmesg:
      
      [ 4791.148551] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
      [ 4791.151354] in_atomic(): 1, irqs_disabled(): 0, pid: 27273, name: tc
      [ 4791.152805] INFO: lockdep is turned off.
      [ 4791.153605] CPU: 19 PID: 27273 Comm: tc Tainted: G        W         5.3.0-rc8+ #721
      [ 4791.154336] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 4791.155075] Call Trace:
      [ 4791.155803]  dump_stack+0x85/0xc0
      [ 4791.156529]  ___might_sleep.cold+0xac/0xbc
      [ 4791.157251]  __mutex_lock+0x5b/0x960
      [ 4791.157966]  ? console_unlock+0x363/0x5d0
      [ 4791.158676]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 4791.159395]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 4791.160103]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 4791.160815]  tcf_block_put_ext.part.0+0x21/0x50
      [ 4791.161530]  tcf_block_put+0x50/0x70
      [ 4791.162233]  sfq_destroy+0x15/0x50 [sch_sfq]
      [ 4791.162936]  qdisc_destroy+0x5f/0x160
      [ 4791.163642]  htb_change_class.cold+0x5df/0x69d [sch_htb]
      [ 4791.164505]  tc_ctl_tclass+0x19d/0x480
      [ 4791.165360]  rtnetlink_rcv_msg+0x170/0x4b0
      [ 4791.166191]  ? netlink_deliver_tap+0x95/0x400
      [ 4791.166907]  ? rtnl_dellink+0x2d0/0x2d0
      [ 4791.167625]  netlink_rcv_skb+0x49/0x110
      [ 4791.168345]  netlink_unicast+0x171/0x200
      [ 4791.169058]  netlink_sendmsg+0x224/0x3f0
      [ 4791.169771]  sock_sendmsg+0x5e/0x60
      [ 4791.170475]  ___sys_sendmsg+0x2ae/0x330
      [ 4791.171183]  ? ___sys_recvmsg+0x159/0x1f0
      [ 4791.171894]  ? do_wp_page+0x9c/0x790
      [ 4791.172595]  ? __handle_mm_fault+0xcd3/0x19e0
      [ 4791.173309]  __sys_sendmsg+0x59/0xa0
      [ 4791.174024]  do_syscall_64+0x5c/0xb0
      [ 4791.174725]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 4791.175435] RIP: 0033:0x7f0aa41497b8
      [ 4791.176129] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 4791.177532] RSP: 002b:00007fff4e37d588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 4791.178243] RAX: ffffffffffffffda RBX: 000000005d8132f7 RCX: 00007f0aa41497b8
      [ 4791.178947] RDX: 0000000000000000 RSI: 00007fff4e37d5f0 RDI: 0000000000000003
      [ 4791.179662] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000020149a0
      [ 4791.180382] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [ 4791.181100] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000
      
      In htb_change_class() function save parent->leaf.q to local temporary
      variable and put reference to it after sch tree lock is released in order
      not to call potentially sleeping cls API in atomic section. This is safe to
      do because Qdisc has already been reset by qdisc_purge_queue() inside sch
      tree lock critical section.
      
      Fixes: c266f64d ("net: sched: protect block state with mutex")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/rds: Check laddr_check before calling it · 05733434
      Ka-Cheong Poon authored
      In rds_bind(), laddr_check is called without first checking whether it
      is NULL. In addition, rs_transport should be reset if rds_add_bound() fails.
      
      Fixes: c5c1a030 ("net/rds: An rds_sock is added too early to the hash table")
      Reported-by: syzbot+fae39afd2101a17ec624@syzkaller.appspotmail.com
      Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'SO_PRIORITY' · 4e1e83be
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      tcp: provide correct skb->priority
      
      SO_PRIORITY socket option requests TCP egress packets
      to contain a user provided value.
      
      TCP manages to send most packets with the requested values,
      notably for TCP_ESTABLISHED state, but fails to do so for
      a few packets.
      
      These packets are control packets sent on behalf
      of SYN_RECV or TIME_WAIT states.
      
      Note that testing this with packetdrill is a bit
      of a hassle, since packetdrill cannot verify the priority
      of egress packets other than through indirect observations,
      for example by using sch_prio on its tunnel device.
      
      The bad skb priorities cause problems for GCP,
      as this field is one of the keys used in routing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: honor SO_PRIORITY in TIME_WAIT state · f6c0f5d2
      Eric Dumazet authored
      ctl packets sent on behalf of TIME_WAIT sockets currently
      have a zero skb->priority, which can cause various problems.
      
      In this patch we:
      
      - add a tw_priority field in struct inet_timewait_sock.
      
      - populate it from sk->sk_priority when a TIME_WAIT is created.
      
      - For IPv4, change ip_send_unicast_reply() and its two
        callers to propagate tw_priority correctly.
        ip_send_unicast_reply() no longer changes sk->sk_priority.
      
      - For IPv6, make sure TIME_WAIT sockets pass their tw_priority
        field to tcp_v6_send_response() and tcp_v6_send_ack().
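
The idea in the list above is to copy the priority when the TIME_WAIT socket is created and then hand it to the reply path explicitly. A minimal sketch of that idea follows. Only the field names `sk_priority` and `tw_priority` come from the commit message; the structs and the helpers `tw_init()` and `send_ctl_reply()` are simplified, hypothetical stand-ins and not the kernel's definitions.

```c
#include <assert.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct full_sock { unsigned int sk_priority; };
struct tw_sock   { unsigned int tw_priority; };
struct packet    { unsigned int priority; };

/* When the TIME_WAIT socket is created, remember the full socket's priority. */
static void tw_init(struct tw_sock *tw, const struct full_sock *sk)
{
    tw->tw_priority = sk->sk_priority;
}

/*
 * The reply path takes the priority as an explicit argument, so it does not
 * have to modify or consult a shared control socket.
 */
static void send_ctl_reply(struct packet *pkt, unsigned int priority)
{
    pkt->priority = priority;
}
```

A control packet sent for the TIME_WAIT socket would then be built with `send_ctl_reply(&pkt, tw.tw_priority)` instead of carrying a zero priority.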
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: tcp: provide sk->sk_priority to ctl packets · e9a5dcee
      Eric Dumazet authored
      We can populate skb->priority for some ctl packets
      instead of always using zero.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: add priority parameter to ip6_xmit() · 4f6570d7
      Eric Dumazet authored
      Currently, ip6_xmit() sets skb->priority based on sk->sk_priority.
      
      This is not desirable for TCP since TCP shares the same ctl socket
      for a given netns. We want to be able to send RST or ACK packets
      with a non zero skb->priority.
      
      This patch has no functional change.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Fix bpf_event_output re-entry issue · 768fb61f
      Allan Zhang authored
      A BPF_PROG_TYPE_SOCK_OPS program can re-enter bpf_event_output() because
      it can be called from atomic and non-atomic contexts, since we don't have
      bpf_prog_active to prevent it from happening.
      
      This patch enables 3 levels of nesting to support normal, irq and nmi
      context.
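
One way to picture "3 levels of nesting" is a small stack of scratch buffers indexed by a nesting counter, so that a re-entrant call gets its own buffer instead of clobbering the outer caller's. The sketch below is a userspace model under that assumption, not the patch itself: in the kernel such state would be per-CPU, while here a single global is used, and `scratch_get()` and `scratch_put()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEST 3      /* normal, irq and nmi context, per the commit message */

struct scratch {
    char data[64];
    int in_use;
};

/* Per-CPU in a real kernel; one global instance in this sketch. */
static struct scratch bufs[MAX_NEST];
static int nest_level;

/* Claim the buffer for the current nesting level, or NULL if too deep. */
static struct scratch *scratch_get(void)
{
    struct scratch *s;

    if (nest_level >= MAX_NEST)
        return NULL;    /* a fourth level is refused instead of reusing a buffer */
    s = &bufs[nest_level++];
    s->in_use = 1;
    return s;
}

/* Release in strict LIFO order, mirroring how nested contexts unwind. */
static void scratch_put(struct scratch *s)
{
    assert(nest_level > 0 && s == &bufs[nest_level - 1]);
    s->in_use = 0;
    nest_level--;
}
```

Each nested caller sees a distinct buffer, and a caller beyond the supported depth gets an error rather than corrupting data that an interrupted caller is still using.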
      
      We can easily reproduce the issue by running netperf crr mode with 100
      flows and 10 threads from netperf client side.
      
      Here is the whole stack dump:
      
      [  515.228898] WARNING: CPU: 20 PID: 14686 at kernel/trace/bpf_trace.c:549 bpf_event_output+0x1f9/0x220
      [  515.228903] CPU: 20 PID: 14686 Comm: tcp_crr Tainted: G        W        4.15.0-smp-fixpanic #44
      [  515.228904] Hardware name: Intel TBG,ICH10/Ikaria_QC_1b, BIOS 1.22.0 06/04/2018
      [  515.228905] RIP: 0010:bpf_event_output+0x1f9/0x220
      [  515.228906] RSP: 0018:ffff9a57ffc03938 EFLAGS: 00010246
      [  515.228907] RAX: 0000000000000012 RBX: 0000000000000001 RCX: 0000000000000000
      [  515.228907] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff836b0f80
      [  515.228908] RBP: ffff9a57ffc039c8 R08: 0000000000000004 R09: 0000000000000012
      [  515.228908] R10: ffff9a57ffc1de40 R11: 0000000000000000 R12: 0000000000000002
      [  515.228909] R13: ffff9a57e13bae00 R14: 00000000ffffffff R15: ffff9a57ffc1e2c0
      [  515.228910] FS:  00007f5a3e6ec700(0000) GS:ffff9a57ffc00000(0000) knlGS:0000000000000000
      [  515.228910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  515.228911] CR2: 0000537082664fff CR3: 000000061fed6002 CR4: 00000000000226f0
      [  515.228911] Call Trace:
      [  515.228913]  <IRQ>
      [  515.228919]  [<ffffffff82c6c6cb>] bpf_sockopt_event_output+0x3b/0x50
      [  515.228923]  [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10
      [  515.228927]  [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100
      [  515.228930]  [<ffffffff82cf90a5>] ? tcp_init_transfer+0x125/0x150
      [  515.228933]  [<ffffffff82cf9159>] ? tcp_finish_connect+0x89/0x110
      [  515.228936]  [<ffffffff82cf98e4>] ? tcp_rcv_state_process+0x704/0x1010
      [  515.228939]  [<ffffffff82c6e263>] ? sk_filter_trim_cap+0x53/0x2a0
      [  515.228942]  [<ffffffff82d90d1f>] ? tcp_v6_inbound_md5_hash+0x6f/0x1d0
      [  515.228945]  [<ffffffff82d92160>] ? tcp_v6_do_rcv+0x1c0/0x460
      [  515.228947]  [<ffffffff82d93558>] ? tcp_v6_rcv+0x9f8/0xb30
      [  515.228951]  [<ffffffff82d737c0>] ? ip6_route_input+0x190/0x220
      [  515.228955]  [<ffffffff82d5f7ad>] ? ip6_protocol_deliver_rcu+0x6d/0x450
      [  515.228958]  [<ffffffff82d60246>] ? ip6_rcv_finish+0xb6/0x170
      [  515.228961]  [<ffffffff82d5fb90>] ? ip6_protocol_deliver_rcu+0x450/0x450
      [  515.228963]  [<ffffffff82d60361>] ? ipv6_rcv+0x61/0xe0
      [  515.228966]  [<ffffffff82d60190>] ? ipv6_list_rcv+0x330/0x330
      [  515.228969]  [<ffffffff82c4976b>] ? __netif_receive_skb_one_core+0x5b/0xa0
      [  515.228972]  [<ffffffff82c497d1>] ? __netif_receive_skb+0x21/0x70
      [  515.228975]  [<ffffffff82c4a8d2>] ? process_backlog+0xb2/0x150
      [  515.228978]  [<ffffffff82c4aadf>] ? net_rx_action+0x16f/0x410
      [  515.228982]  [<ffffffff830000dd>] ? __do_softirq+0xdd/0x305
      [  515.228986]  [<ffffffff8252cfdc>] ? irq_exit+0x9c/0xb0
      [  515.228989]  [<ffffffff82e02de5>] ? smp_call_function_single_interrupt+0x65/0x120
      [  515.228991]  [<ffffffff82e020e1>] ? call_function_single_interrupt+0x81/0x90
      [  515.228992]  </IRQ>
      [  515.228996]  [<ffffffff82a11ff0>] ? io_serial_in+0x20/0x20
      [  515.229000]  [<ffffffff8259c040>] ? console_unlock+0x230/0x490
      [  515.229003]  [<ffffffff8259cbaa>] ? vprintk_emit+0x26a/0x2a0
      [  515.229006]  [<ffffffff8259cbff>] ? vprintk_default+0x1f/0x30
      [  515.229008]  [<ffffffff8259d9f5>] ? vprintk_func+0x35/0x70
      [  515.229011]  [<ffffffff8259d4bb>] ? printk+0x50/0x66
      [  515.229013]  [<ffffffff82637637>] ? bpf_event_output+0xb7/0x220
      [  515.229016]  [<ffffffff82c6c6cb>] ? bpf_sockopt_event_output+0x3b/0x50
      [  515.229019]  [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10
      [  515.229023]  [<ffffffff82c29e87>] ? release_sock+0x97/0xb0
      [  515.229026]  [<ffffffff82ce9d6a>] ? tcp_recvmsg+0x31a/0xda0
      [  515.229029]  [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100
      [  515.229032]  [<ffffffff82ce77c1>] ? tcp_set_state+0x191/0x1b0
      [  515.229035]  [<ffffffff82ced10e>] ? tcp_disconnect+0x2e/0x600
      [  515.229038]  [<ffffffff82cecbbb>] ? tcp_close+0x3eb/0x460
      [  515.229040]  [<ffffffff82d21082>] ? inet_release+0x42/0x70
      [  515.229043]  [<ffffffff82d58809>] ? inet6_release+0x39/0x50
      [  515.229046]  [<ffffffff82c1f32d>] ? __sock_release+0x4d/0xd0
      [  515.229049]  [<ffffffff82c1f3e5>] ? sock_close+0x15/0x20
      [  515.229052]  [<ffffffff8273b517>] ? __fput+0xe7/0x1f0
      [  515.229055]  [<ffffffff8273b66e>] ? ____fput+0xe/0x10
      [  515.229058]  [<ffffffff82547bf2>] ? task_work_run+0x82/0xb0
      [  515.229061]  [<ffffffff824086df>] ? exit_to_usermode_loop+0x7e/0x11f
      [  515.229064]  [<ffffffff82408171>] ? do_syscall_64+0x111/0x130
      [  515.229067]  [<ffffffff82e0007c>] ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: a5a3a828 ("bpf: add perf event notificaton support for sock_ops")
      Signed-off-by: Allan Zhang <allanzhang@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20190925234312.94063-2-allanzhang@google.com
    • net: dsa: qca8k: Fix port enable for CPU port · 2b6fd3ea
      Andrew Lunn authored
      The CPU port does not have a PHY connected to it, so calling
      phy_support_asym_pause() results in an Oops. As with other DSA
      drivers, add a guard that the port is a user port.
      Reported-by: Michal Vokáč <michal.vokac@ysoft.com>
      Fixes: 0394a63a ("net: dsa: enable and disable all ports")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Tested-by: Michal Vokáč <michal.vokac@ysoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sch_netem: fix rcu splat in netem_enqueue() · 159d2c7d
      Eric Dumazet authored
      Using qdisc_root() from netem_enqueue() triggers a lockdep warning.
      
      __dev_queue_xmit() uses rcu_read_lock_bh(), which is
      not equivalent to rcu_read_lock() + local_bh_disable() as far
      as lockdep is concerned.
      
      WARNING: suspicious RCU usage
      5.3.0-rc7+ #0 Not tainted
      -----------------------------
      include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      3 locks held by syz-executor427/8855:
       #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline]
       #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214
       #1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804
       #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
       #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline]
       #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838
      
      stack backtrace:
      CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357
       qdisc_root include/net/sch_generic.h:492 [inline]
       netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479
       __dev_xmit_skb net/core/dev.c:3527 [inline]
       __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838
       dev_queue_xmit+0x18/0x20 net/core/dev.c:3902
       neigh_hh_output include/net/neighbour.h:500 [inline]
       neigh_output include/net/neighbour.h:509 [inline]
       ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:308 [inline]
       __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290
       ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417
       dst_output include/net/dst.h:436 [inline]
       ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125
       ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555
       udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887
       udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174
       inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:657
       ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
       __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
       __do_sys_sendmmsg net/socket.c:2442 [inline]
       __se_sys_sendmmsg net/socket.c:2439 [inline]
       __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
       do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • kcm: disable preemption in kcm_parse_func_strparser() · 0355d6c1
      Eric Dumazet authored
      After commit a2c11b03 ("kcm: use BPF_PROG_RUN")
      syzbot easily triggers the warning in cant_sleep().
      
      As explained in commit 6cab5e90 ("bpf: run bpf programs
      with preemption disabled") we need to disable preemption before
      running bpf programs.
      
      BUG: assuming atomic context at net/kcm/kcmsock.c:382
      in_atomic(): 0, irqs_disabled(): 0, pid: 7, name: kworker/u4:0
      3 locks held by kworker/u4:0/7:
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: __write_once_size include/linux/compiler.h:226 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: set_work_data kernel/workqueue.c:620 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
       #1: ffff8880a989fdc0 ((work_completion)(&strp->work)){+.+.}, at: process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
       #2: ffff888098998d10 (sk_lock-AF_INET){+.+.}, at: lock_sock include/net/sock.h:1522 [inline]
       #2: ffff888098998d10 (sk_lock-AF_INET){+.+.}, at: strp_sock_lock+0x2e/0x40 net/strparser/strparser.c:440
      CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.3.0+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: kstrp strp_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       __cant_sleep kernel/sched/core.c:6826 [inline]
       __cant_sleep.cold+0xa4/0xbc kernel/sched/core.c:6803
       kcm_parse_func_strparser+0x54/0x200 net/kcm/kcmsock.c:382
       __strp_recv+0x5dc/0x1b20 net/strparser/strparser.c:221
       strp_recv+0xcf/0x10b net/strparser/strparser.c:343
       tcp_read_sock+0x285/0xa00 net/ipv4/tcp.c:1639
       strp_read_sock+0x14d/0x200 net/strparser/strparser.c:366
       do_strp_work net/strparser/strparser.c:414 [inline]
       strp_work+0xe3/0x130 net/strparser/strparser.c:423
       process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
      
      Fixes: a2c11b03 ("kcm: use BPF_PROG_RUN")
      Fixes: 6cab5e90 ("bpf: run bpf programs with preemption disabled")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: stmmac: Fix signedness bug in ipq806x_gmac_of_parse() · 23104218
      Dan Carpenter authored
      The "gmac->phy_mode" variable is an enum and in this context GCC will
      treat it as an unsigned int so the error handling will never be
      triggered.
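      The pattern behind this fix, which recurs in the other signedness
      fixes in this list, can be sketched in plain C. This is a minimal
      illustration and not the driver code: phy_mode_t, get_phy_mode(),
      probe_buggy() and probe_fixed() are hypothetical names, and the
      unsigned behaviour relies on how GCC picks the underlying type of an
      enum that has no negative enumerators.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an enum such as phy_interface_t: it has no
 * negative enumerators, so GCC gives it an unsigned underlying type. */
typedef enum { MODE_NA, MODE_MII, MODE_RGMII } phy_mode_t;

/* Hypothetical lookup helper that reports failure as a negative errno. */
int get_phy_mode(void)
{
    return -22; /* -EINVAL */
}

/* Buggy pattern: the negative value is stored in the enum first. */
int probe_buggy(void)
{
    phy_mode_t mode = get_phy_mode();

    if (mode < 0)   /* never true when the enum is unsigned */
        return -22;
    return 0;       /* the error is silently ignored */
}

/* Fixed pattern: test the int, then convert it to the enum. */
int probe_fixed(phy_mode_t *mode)
{
    int ret = get_phy_mode();

    assert(mode != NULL);
    if (ret < 0)
        return ret;
    *mode = (phy_mode_t)ret;
    return 0;
}
```

      With GCC, probe_buggy() reports success even though the lookup
      failed, while probe_fixed() propagates the error and leaves the
      caller's mode untouched.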
      
      Fixes: b1c17215 ("stmmac: add ipq806x glue layer")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23104218
    • Dan Carpenter's avatar
      net: nixge: Fix a signedness bug in nixge_probe() · 1a4b62a0
      Dan Carpenter authored
      The "priv->phy_mode" is an enum and in this context GCC will treat it
      as an unsigned int so it can never be less than zero.
      
      Fixes: 492caffa ("net: ethernet: nixge: Add support for National Instruments XGE netdev")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a4b62a0
    • Dan Carpenter's avatar
      of: mdio: Fix a signedness bug in of_phy_get_and_connect() · d7eb6512
      Dan Carpenter authored
      The "iface" variable is an enum and in this context GCC treats it as
      an unsigned int so the error handling is never triggered.
      
      Fixes: b7862412 ("of_mdio: Abstract a general interface for phy connect")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d7eb6512
    • Dan Carpenter's avatar
      net: axienet: fix a signedness bug in probe · 73e211e1
      Dan Carpenter authored
      The "lp->phy_mode" is an enum but in this context GCC treats it as an
      unsigned int so the error handling is never triggered.
      
      Fixes: ee06b172 ("net: axienet: add support for standard phy-mode binding")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      73e211e1
    • Dan Carpenter's avatar
      net: stmmac: dwmac-meson8b: Fix signedness bug in probe · f1021051
      Dan Carpenter authored
      The "dwmac->phy_mode" is an enum and in this context GCC treats it as
      an unsigned int so the error handling is never triggered.
      
      Fixes: 566e8251 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f1021051
    • Dan Carpenter's avatar
      net: socionext: Fix a signedness bug in ave_probe() · 7f9e88e6
      Dan Carpenter authored
      The "phy_mode" variable is an enum and in this context GCC treats it as
      an unsigned int so the error handling is never triggered.
      
      Fixes: 4c270b55 ("net: ethernet: socionext: add AVE ethernet driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f9e88e6
    • Dan Carpenter's avatar
      enetc: Fix a signedness bug in enetc_of_get_phy() · ced81eb8
      Dan Carpenter authored
      The "priv->if_mode" is type phy_interface_t which is an enum.  In this
      context GCC will treat the enum as an unsigned int so this error
      handling is never triggered.
      
      Fixes: d4fd0404 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ced81eb8
    • Dan Carpenter's avatar
      net: netsec: Fix signedness bug in netsec_probe() · bd55f8dd
      Dan Carpenter authored
      The "priv->phy_interface" variable is an enum and in this context GCC
      will treat it as an unsigned int so the error handling is never
      triggered.
      
      Fixes: 533dd11a ("net: socionext: Add Synquacer NetSec driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bd55f8dd
    • Dan Carpenter's avatar
      net: broadcom/bcmsysport: Fix signedness in bcm_sysport_probe() · 25a58495
      Dan Carpenter authored
      The "priv->phy_interface" variable is an enum and in this context GCC
      will treat it as unsigned so the error handling will never be
      triggered.
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      25a58495
    • Dan Carpenter's avatar
      net: hisilicon: Fix signedness bug in hix5hd2_dev_probe() · 002dfe80
      Dan Carpenter authored
      The "priv->phy_mode" variable is an enum and in this context GCC will
      treat it as unsigned so the error handling will never trigger.
      
      Fixes: 57c5bc9a ("net: hisilicon: add hix5hd2 mac driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      002dfe80
    • Dan Carpenter's avatar
      cxgb4: Signedness bug in init_one() · 28618314
      Dan Carpenter authored
      The "chip" variable is an enum, and it's treated as unsigned int by GCC
      in this context so the error handling isn't triggered.
      
      Fixes: e8d45292 ("cxgb4: clean up init_one")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28618314
    • Dan Carpenter's avatar
      net: aquantia: Fix aq_vec_isr_legacy() return value · 31aefe14
      Dan Carpenter authored
      The irqreturn_t type is an enum, which GCC treats as an unsigned int.
      That creates two problems: the function can't detect whether the
      self->aq_hw_ops->hw_irq_read() call fails, and at the end it always
      returns IRQ_HANDLED.
      
      drivers/net/ethernet/aquantia/atlantic/aq_vec.c:316 aq_vec_isr_legacy() warn: unsigned 'err' is never less than zero.
      drivers/net/ethernet/aquantia/atlantic/aq_vec.c:329 aq_vec_isr_legacy() warn: always true condition '(err >= 0) => (0-u32max >= 0)'
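      One possible shape for a handler that avoids both warnings is
      sketched below. It is an illustration under assumptions, not the
      aquantia code: hw_irq_read_stub() and isr_fixed() are hypothetical,
      and irqreturn_t is redefined locally so the sketch builds in
      userspace. The point is that the status lives in an int, and the
      irqreturn_t value is derived from it.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the two irqreturn_t values used here. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/* Hypothetical hardware read: returns a negative errno on failure. */
int hw_irq_read_stub(int fail, unsigned int *mask)
{
    assert(mask != NULL);
    if (fail)
        return -5; /* -EIO */
    *mask = 0x1;
    return 0;
}

/* The error code is kept in an int so that "err < 0" is meaningful. */
irqreturn_t isr_fixed(int fail)
{
    unsigned int mask = 0;
    int err = hw_irq_read_stub(fail, &mask);

    if (err < 0)
        return IRQ_NONE;    /* read failed: interrupt not handled */
    return mask ? IRQ_HANDLED : IRQ_NONE;
}
```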
      
      Fixes: 970a2e98 ("net: ethernet: aquantia: Vector operations")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      31aefe14
    • Uwe Kleine-König's avatar
      dimlib: make DIMLIB a hidden symbol · 424adc32
      Uwe Kleine-König authored
      According to Tal Gilboa the only benefit from DIM comes from a driver
      that uses it. So it doesn't make sense to make this symbol user visible;
      instead, all drivers that use it should select it (as is already the case
      AFAICT).
      Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      424adc32
  2. 26 Sep, 2019 13 commits
    • David S. Miller's avatar
      Merge tag 'wireless-drivers-for-davem-2019-09-26' of... · 5a2a828d
      David S. Miller authored
      Merge tag 'wireless-drivers-for-davem-2019-09-26' of https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 5.4
      
      First set of fixes for 5.4 sent during the merge window. Most are
      regression fixes, but the mt7615 problem has been present since the
      driver was merged.
      
      iwlwifi
      
      * fix a build regression related to CONFIG_THERMAL
      
      * avoid using GEO_TX_POWER_LIMIT command on certain firmware versions
      
      rtw88
      
      * fixes for skb leaks
      
      zd1211rw
      
      * fix a compiler warning on 32 bit
      
      mt76
      
      * fix the firmware paths for mt7615 to match with linux-firmware
      
      wil6210
      
      * fix use of skb after free
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a2a828d
    • Colin Ian King's avatar
      bpf: Clean up indentation issue in BTF kflag processing · e3439af4
      Colin Ian King authored
      There is a statement that is indented one level too deeply; remove
      the extraneous tab.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20190925093835.19515-1-colin.king@canonical.com
      e3439af4
    • Andrii Nakryiko's avatar
      libbpf: Teach btf_dumper to emit stand-alone anonymous enum definitions · 39529a99
      Andrii Nakryiko authored
      The BTF-to-C converter previously skipped anonymous enums on the
      assumption that they are embedded in a struct's field definitions.
      This is not always the case, and a lot of kernel constants are defined
      as part of anonymous enums. This change fixes the logic by eagerly
      marking each type as either referenced by some other type or not. This
      is enough to distinguish the two classes of anonymous enums and emit
      the previously omitted enum definitions.
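      The two-pass idea described above can be sketched on a toy type
      table. This is not libbpf's data model or API: struct type_info,
      mark_referenced() and emit_standalone() are hypothetical names, and
      each type is assumed to reference at most one other type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified type record. */
enum kind { KIND_INT, KIND_ENUM, KIND_STRUCT };

struct type_info {
    enum kind kind;
    const char *name;   /* NULL for anonymous types */
    int member_type;    /* index of a referenced type, or -1 for none */
    bool referenced;    /* filled in by mark_referenced() */
};

/* Pass 1: eagerly mark every type that some other type refers to. */
void mark_referenced(struct type_info *types, size_t n)
{
    for (size_t i = 0; i < n; i++)
        types[i].referenced = false;
    for (size_t i = 0; i < n; i++) {
        int ref = types[i].member_type;

        if (ref >= 0) {
            assert((size_t)ref < n);
            types[ref].referenced = true;
        }
    }
}

/* Pass 2: an anonymous enum is emitted on its own only if nothing
 * embeds it; embedded ones are printed with their parent. */
bool emit_standalone(const struct type_info *t)
{
    return t->kind == KIND_ENUM && t->name == NULL && !t->referenced;
}
```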
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20190925203745.3173184-1-andriin@fb.com
      39529a99
    • Jason A. Donenfeld's avatar
      ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule · ca7a03c4
      Jason A. Donenfeld authored
      Commit 7d9e5f42 removed references from certain dsts, but accounting
      for this never translated down into the fib6 suppression code. This bug
      was triggered by WireGuard users who use wg-quick(8), which uses the
      "suppress-prefix" directive to ip-rule(8) for routing all of their
      internet traffic without routing loops. The test case added here
      causes the reference underflow by causing packets to evaluate a suppress
      rule.
      
      Fixes: 7d9e5f42 ("ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca7a03c4
    • Li RongQing's avatar
      openvswitch: change type of UPCALL_PID attribute to NLA_UNSPEC · ea8564c8
      Li RongQing authored
      The userspace openvswitch patch ("dpif-linux: Implement the API
      functions to allow multiple handler threads read upcall")
      changed the type of this attribute from U32 to UNSPEC, but left
      the kernel unchanged.

      After kernel commit 6e237d09 ("netlink: Relax attr validation
      for fixed length types"), this bug is exposed by the warning
      below:
      
      	[   57.215841] netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.
      
      Fixes: 5cd667b0 ("openvswitch: Allow each vport to have an array of 'port_id's")
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ea8564c8
    • Biju Das's avatar
      dt-bindings: net: ravb: Add support for r8a774b1 SoC · c1d419d0
      Biju Das authored
      Document RZ/G2N (R8A774B1) SoC bindings.
      Signed-off-by: Biju Das <biju.das@bp.renesas.com>
      Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1d419d0
    • Thierry Reding's avatar
      net: stmmac: Fix page pool size · 4f28bd95
      Thierry Reding authored
      The size of individual pages in the page pool is given by an order. The
      order is the binary logarithm of the number of pages that make up one of
      the pages in the pool. However, the driver currently passes the number
      of pages rather than the order, so it ends up wasting quite a bit of
      memory.
      
      Fix this by taking the binary logarithm and passing that in the order
      field.
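      The relationship between a page count and an order can be sketched
      as below. The helpers pages_to_order() and order_to_pages() are
      hypothetical and not part of the driver or the page pool API; this
      sketch rounds the logarithm up so a buffer always fits, which is an
      assumption of the example (the commit text only says it takes the
      binary logarithm).

```c
#include <assert.h>

/* Binary logarithm of a page count, rounded up. */
unsigned int pages_to_order(unsigned int pages)
{
    unsigned int order = 0;

    assert(pages >= 1 && pages <= (1u << 31));
    while ((1u << order) < pages)
        order++;
    return order;
}

/* Number of base pages that an allocation of the given order spans. */
unsigned int order_to_pages(unsigned int order)
{
    assert(order < 32);
    return 1u << order;
}
```

      For a buffer of 4 pages the order is 2. Passing the page count 4
      where an order is expected would instead describe
      order_to_pages(4) = 16 pages per pool entry, which is the kind of
      waste the commit describes.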
      
      Fixes: 2af6106a ("net: stmmac: Introducing support for Page Pool")
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f28bd95
    • Xin Long's avatar
      macsec: drop skb sk before calling gro_cells_receive · ba56d8ce
      Xin Long authored
      Fei Liu reported a crash when running netperf on a topology with a
      macsec device over veth:
      
        [  448.919128] refcount_t: underflow; use-after-free.
        [  449.090460] Call trace:
        [  449.092895]  refcount_sub_and_test+0xb4/0xc0
        [  449.097155]  tcp_wfree+0x2c/0x150
        [  449.100460]  ip_rcv+0x1d4/0x3a8
        [  449.103591]  __netif_receive_skb_core+0x554/0xae0
        [  449.108282]  __netif_receive_skb+0x28/0x78
        [  449.112366]  netif_receive_skb_internal+0x54/0x100
        [  449.117144]  napi_gro_complete+0x70/0xc0
        [  449.121054]  napi_gro_flush+0x6c/0x90
        [  449.124703]  napi_complete_done+0x50/0x130
        [  449.128788]  gro_cell_poll+0x8c/0xa8
        [  449.132351]  net_rx_action+0x16c/0x3f8
        [  449.136088]  __do_softirq+0x128/0x320
      
      The issue was caused by the skb's truesize being changed without its
      sk's sk_wmem_alloc being increased in tcp/skb_gro_receive(). Later,
      when the skb is freed and its truesize is subtracted from its sk's
      sk_wmem_alloc in tcp_wfree(), an underflow occurs.
      
      macsec is calling gro_cells_receive() to receive a packet, which
      actually requires skb->sk to be NULL. However when macsec dev is
      over veth, it's possible the skb->sk is still set if the skb was
      not unshared or expanded from the peer veth.
      
      ip_rcv() is calling skb_orphan() to drop the skb's sk for tproxy,
      but it is too late for macsec's calling gro_cells_receive(). So
      fix it by dropping the skb's sk earlier on rx path of macsec.
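      The accounting problem can be modelled with a toy counter. The
      sketch below is not kernel code: struct toy_sock, struct toy_skb and
      the toy_*() helpers are hypothetical simplifications of the socket
      write-memory accounting described above, with a plain integer in
      place of the refcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct sk_buff. */
struct toy_sock { unsigned int wmem_alloc; };
struct toy_skb  { struct toy_sock *sk; unsigned int truesize; };

/* Charge the skb to its owning socket, as the sender does. */
void toy_set_owner(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->sk = sk;
    sk->wmem_alloc += skb->truesize;
}

/* Drop the owner and return the charge (the role of skb_orphan()). */
void toy_orphan(struct toy_skb *skb)
{
    if (skb->sk) {
        assert(skb->sk->wmem_alloc >= skb->truesize);
        skb->sk->wmem_alloc -= skb->truesize;
        skb->sk = NULL;
    }
}

/* GRO-style merge: truesize grows, but no socket is charged for it. */
void toy_gro_merge(struct toy_skb *skb, unsigned int extra)
{
    skb->truesize += extra;
}

/* Free-time uncharge; returns false if it would underflow the counter. */
bool toy_free(struct toy_skb *skb)
{
    if (!skb->sk)
        return true;
    if (skb->sk->wmem_alloc < skb->truesize)
        return false;   /* the underflow from the report */
    skb->sk->wmem_alloc -= skb->truesize;
    skb->sk = NULL;
    return true;
}
```

      In this model, merging an skb that still has an owner makes
      toy_free() report an underflow, while calling toy_orphan() before
      the merge keeps the counter balanced.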
      
      Fixes: 5491e7c6 ("macsec: enable GRO and RPS on macsec devices")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Reported-by: Fei Liu <feliu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ba56d8ce
    • David S. Miller's avatar
      Merge tag 'mlx5-fixes-2019-09-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 2dbf45d1
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2019-09-24
      
      This series introduces some fixes to mlx5 driver.
      For more information please see tag log below.
      
      Please pull and let me know if there is any problem.
      
      For -stable v4.20:
       ('net/mlx5e: Fix traffic duplication in ethtool steering')
      
      For -stable v4.19:
       ('net/mlx5: Add device ID of upcoming BlueField-2')
      
      For -stable v5.3:
       ('net/mlx5e: Fix matching on tunnel addresses type')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2dbf45d1
    • Jason A. Donenfeld's avatar
      net: print proper warning on dst underflow · adecda5b
      Jason A. Donenfeld authored
      Proper warnings with stack traces make it much easier to figure out
      what's doing the double free and create more meaningful bug reports from
      users.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      adecda5b
    • Vinicius Costa Gomes's avatar
      net/sched: cbs: Fix not adding cbs instance to list · 3e8b9bfa
      Vinicius Costa Gomes authored
      When removing a cbs instance when offloading is enabled, the crash
      below can be observed.

      The problem happens because, when offloading is enabled, the cbs
      instance is not added to the list.

      Also, the current code doesn't correctly handle the case when offload
      is disabled without removing the qdisc: if the link speed changes, the
      credit calculations will be wrong. When we create the cbs instance
      with offloading enabled, it's not added to the notification list; when
      we later disable offloading, it's still not in the list, so link speed
      changes will not affect it.

      The solution for both issues is the same: add the cbs instance being
      created unconditionally to the global list, even if the link state
      notification isn't useful "right now".
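      A toy model of the fixed behaviour is sketched below. It is not the
      sch_cbs code: struct list_node, struct toy_cbs and the helper names
      are hypothetical, and the list is a minimal userspace imitation of
      the kernel's circular doubly linked list. It shows why linking the
      instance unconditionally at init time makes an unconditional unlink
      at destroy time safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list. */
struct list_node { struct list_node *next, *prev; };

void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_node(struct list_node *node, struct list_node *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/* Unlinking dereferences both neighbours, so the node must be linked. */
void list_del_node(struct list_node *node)
{
    assert(node->next != NULL && node->prev != NULL);
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node->prev = NULL;
}

/* Hypothetical, simplified cbs instance. */
struct toy_cbs { struct list_node entry; bool offload; };

/* Fixed init: link the instance regardless of the offload setting. */
void toy_cbs_init(struct toy_cbs *q, bool offload, struct list_node *all)
{
    q->offload = offload;
    q->entry.next = q->entry.prev = NULL;
    list_add_node(&q->entry, all);
}

/* Destroy can then unlink without checking the offload setting. */
void toy_cbs_destroy(struct toy_cbs *q)
{
    list_del_node(&q->entry);
}
```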
      
      Crash log:
      
      [518758.189866] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [518758.189870] #PF: supervisor read access in kernel mode
      [518758.189871] #PF: error_code(0x0000) - not-present page
      [518758.189872] PGD 0 P4D 0
      [518758.189874] Oops: 0000 [#1] SMP PTI
      [518758.189876] CPU: 3 PID: 4825 Comm: tc Not tainted 5.2.9 #1
      [518758.189877] Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS ULTRA/Z390 AORUS ULTRA-CF, BIOS F7 03/14/2019
      [518758.189881] RIP: 0010:__list_del_entry_valid+0x29/0xa0
      [518758.189883] Code: 90 48 b8 00 01 00 00 00 00 ad de 55 48 8b 17 4c 8b 47 08 48 89 e5 48 39 c2 74 27 48 b8 00 02 00 00 00 00 ad de 49 39 c0 74 2d <49> 8b 30 48 39 fe 75 3d 48 8b 52 08 48 39 f2 75 4c b8 01 00 00 00
      [518758.189885] RSP: 0018:ffffa27e43903990 EFLAGS: 00010207
      [518758.189887] RAX: dead000000000200 RBX: ffff8bce69f0f000 RCX: 0000000000000000
      [518758.189888] RDX: 0000000000000000 RSI: ffff8bce69f0f064 RDI: ffff8bce69f0f1e0
      [518758.189890] RBP: ffffa27e43903990 R08: 0000000000000000 R09: ffff8bce69e788c0
      [518758.189891] R10: ffff8bce62acd400 R11: 00000000000003cb R12: ffff8bce69e78000
      [518758.189892] R13: ffff8bce69f0f140 R14: 0000000000000000 R15: 0000000000000000
      [518758.189894] FS:  00007fa1572c8f80(0000) GS:ffff8bce6e0c0000(0000) knlGS:0000000000000000
      [518758.189895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [518758.189896] CR2: 0000000000000000 CR3: 000000040a398006 CR4: 00000000003606e0
      [518758.189898] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [518758.189899] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [518758.189900] Call Trace:
      [518758.189904]  cbs_destroy+0x32/0xa0 [sch_cbs]
      [518758.189906]  qdisc_destroy+0x45/0x120
      [518758.189907]  qdisc_put+0x25/0x30
      [518758.189908]  qdisc_graft+0x2c1/0x450
      [518758.189910]  tc_get_qdisc+0x1c8/0x310
      [518758.189912]  ? get_page_from_freelist+0x91a/0xcb0
      [518758.189914]  rtnetlink_rcv_msg+0x293/0x360
      [518758.189916]  ? kmem_cache_alloc_node_trace+0x178/0x260
      [518758.189918]  ? __kmalloc_node_track_caller+0x38/0x50
      [518758.189920]  ? rtnl_calcit.isra.0+0xf0/0xf0
      [518758.189922]  netlink_rcv_skb+0x48/0x110
      [518758.189923]  rtnetlink_rcv+0x10/0x20
      [518758.189925]  netlink_unicast+0x15b/0x1d0
      [518758.189926]  netlink_sendmsg+0x1ea/0x380
      [518758.189929]  sock_sendmsg+0x2f/0x40
      [518758.189930]  ___sys_sendmsg+0x295/0x2f0
      [518758.189932]  ? ___sys_recvmsg+0x151/0x1e0
      [518758.189933]  ? do_wp_page+0x7e/0x450
      [518758.189935]  __sys_sendmsg+0x48/0x80
      [518758.189937]  __x64_sys_sendmsg+0x1a/0x20
      [518758.189939]  do_syscall_64+0x53/0x1f0
      [518758.189941]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [518758.189942] RIP: 0033:0x7fa15755169a
      [518758.189944] Code: 48 c7 c0 ff ff ff ff eb be 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 18 b8 2e 00 00 00 c5 fc 77 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 89 54 24 1c
      [518758.189946] RSP: 002b:00007ffda58b60b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [518758.189948] RAX: ffffffffffffffda RBX: 000055e4b836d9a0 RCX: 00007fa15755169a
      [518758.189949] RDX: 0000000000000000 RSI: 00007ffda58b6128 RDI: 0000000000000003
      [518758.189951] RBP: 00007ffda58b6190 R08: 0000000000000001 R09: 000055e4b9d848a0
      [518758.189952] R10: 0000000000000000 R11: 0000000000000246 R12: 000000005d654b49
      [518758.189953] R13: 0000000000000000 R14: 00007ffda58b6230 R15: 00007ffda58b6210
      [518758.189955] Modules linked in: sch_cbs sch_etf sch_mqprio netlink_diag unix_diag e1000e igb intel_pch_thermal thermal video backlight pcc_cpufreq
      [518758.189960] CR2: 0000000000000000
      [518758.189961] ---[ end trace 6a13f7aaf5376019 ]---
      [518758.189963] RIP: 0010:__list_del_entry_valid+0x29/0xa0
      [518758.189964] Code: 90 48 b8 00 01 00 00 00 00 ad de 55 48 8b 17 4c 8b 47 08 48 89 e5 48 39 c2 74 27 48 b8 00 02 00 00 00 00 ad de 49 39 c0 74 2d <49> 8b 30 48 39 fe 75 3d 48 8b 52 08 48 39 f2 75 4c b8 01 00 00 00
      [518758.189967] RSP: 0018:ffffa27e43903990 EFLAGS: 00010207
      [518758.189968] RAX: dead000000000200 RBX: ffff8bce69f0f000 RCX: 0000000000000000
      [518758.189969] RDX: 0000000000000000 RSI: ffff8bce69f0f064 RDI: ffff8bce69f0f1e0
      [518758.189971] RBP: ffffa27e43903990 R08: 0000000000000000 R09: ffff8bce69e788c0
      [518758.189972] R10: ffff8bce62acd400 R11: 00000000000003cb R12: ffff8bce69e78000
      [518758.189973] R13: ffff8bce69f0f140 R14: 0000000000000000 R15: 0000000000000000
      [518758.189975] FS:  00007fa1572c8f80(0000) GS:ffff8bce6e0c0000(0000) knlGS:0000000000000000
      [518758.189976] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [518758.189977] CR2: 0000000000000000 CR3: 000000040a398006 CR4: 00000000003606e0
      [518758.189979] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [518758.189980] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: e0a7683d ("net/sched: cbs: fix port_rate miscalculation")
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3e8b9bfa
    • Krzysztof Kozlowski's avatar
      drivers: net: Fix Kconfig indentation · 02bc5eb9
      Krzysztof Kozlowski authored
      Adjust indentation from spaces to tab (+optional two spaces) as in
      the coding style, with a command like:
          $ sed -e 's/^        /\t/' -i */Kconfig
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Acked-by: Kalle Valo <kvalo@codeaurora.org>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02bc5eb9
    • Krzysztof Kozlowski's avatar
      net: Fix Kconfig indentation · bf69abad
      Krzysztof Kozlowski authored
      Adjust indentation from spaces to tab (+optional two spaces) as in
      the coding style, with a command like:
          $ sed -e 's/^        /\t/' -i */Kconfig
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Acked-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bf69abad