1. 01 Jul, 2020 27 commits
    • Denis Kirjanov's avatar
      tcp: don't ignore ECN CWR on pure ACK · a4f8b268
      Denis Kirjanov authored
      [ Upstream commit 25702840 ]
      
      there is a problem with the CWR flag set in an incoming ACK segment
      and it leads to the situation when the ECE flag is latched forever
      
      the following packetdrill script shows what happens:
      
      // Stack receives incoming segments with CE set
      +0.1 <[ect0]  . 11001:12001(1000) ack 1001 win 65535
      +0.0 <[ce]    . 12001:13001(1000) ack 1001 win 65535
      +0.0 <[ect0] P. 13001:14001(1000) ack 1001 win 65535
      
      // Stack repsonds with ECN ECHO
      +0.0 >[noecn]  . 1001:1001(0) ack 12001
      +0.0 >[noecn] E. 1001:1001(0) ack 13001
      +0.0 >[noecn] E. 1001:1001(0) ack 14001
      
      // Write a packet
      +0.1 write(3, ..., 1000) = 1000
      +0.0 >[ect0] PE. 1001:2001(1000) ack 14001
      
      // Pure ACK received
      +0.01 <[noecn] W. 14001:14001(0) ack 2001 win 65535
      
      // Since CWR was sent, this packet should NOT have ECE set
      
      +0.1 write(3, ..., 1000) = 1000
      +0.0 >[ect0]  P. 2001:3001(1000) ack 14001
      // but Linux will still keep ECE latched here, with packetdrill
      // flagging a missing ECE flag, expecting
      // >[ect0] PE. 2001:3001(1000) ack 14001
      // in the script
      
      In the situation above we will continue to send ECN ECHO packets
      and trigger the peer to reduce the congestion window. To avoid that
      we can check CWR on pure ACKs received.
      
      v3:
      - Add a sequence check to avoid sending an ACK to an ACK
      
      v2:
      - Adjusted the comment
      - move CWR check before checking for unacknowledged packets
      Signed-off-by: default avatarDenis Kirjanov <denis.kirjanov@suse.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4f8b268
    • Marcelo Ricardo Leitner's avatar
      sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket · 8a3839c7
      Marcelo Ricardo Leitner authored
      [ Upstream commit 471e39df ]
      
      If a socket is set ipv6only, it will still send IPv4 addresses in the
      INIT and INIT_ACK packets. This potentially misleads the peer into using
      them, which then would cause association termination.
      
      The fix is to not add IPv4 addresses to ipv6only sockets.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: default avatarCorey Minyard <cminyard@mvista.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Tested-by: default avatarCorey Minyard <cminyard@mvista.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a3839c7
    • David Howells's avatar
      rxrpc: Fix notification call on completion of discarded calls · a3bc2b29
      David Howells authored
      [ Upstream commit 0041cd5a ]
      
      When preallocated service calls are being discarded, they're passed to
      ->discard_new_call() to have the caller clean up any attached higher-layer
      preallocated pieces before being marked completed.  However, the act of
      marking them completed now invokes the call's notification function - which
      causes a problem because that function might assume that the previously
      freed pieces of memory are still there.
      
      Fix this by setting a dummy notification function on the socket after
      calling ->discard_new_call().
      
      This results in the following kasan message when the kafs module is
      removed.
      
      ==================================================================
      BUG: KASAN: use-after-free in afs_wake_up_async_call+0x6aa/0x770 fs/afs/rxrpc.c:707
      Write of size 1 at addr ffff8880946c39e4 by task kworker/u4:1/21
      
      CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.8.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x18f/0x20d lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
       afs_wake_up_async_call+0x6aa/0x770 fs/afs/rxrpc.c:707
       rxrpc_notify_socket+0x1db/0x5d0 net/rxrpc/recvmsg.c:40
       __rxrpc_set_call_completion.part.0+0x172/0x410 net/rxrpc/recvmsg.c:76
       __rxrpc_call_completed net/rxrpc/recvmsg.c:112 [inline]
       rxrpc_call_completed+0xca/0xf0 net/rxrpc/recvmsg.c:111
       rxrpc_discard_prealloc+0x781/0xab0 net/rxrpc/call_accept.c:233
       rxrpc_listen+0x147/0x360 net/rxrpc/af_rxrpc.c:245
       afs_close_socket+0x95/0x320 fs/afs/rxrpc.c:110
       afs_net_exit+0x1bc/0x310 fs/afs/main.c:155
       ops_exit_list.isra.0+0xa8/0x150 net/core/net_namespace.c:186
       cleanup_net+0x511/0xa50 net/core/net_namespace.c:603
       process_one_work+0x965/0x1690 kernel/workqueue.c:2269
       worker_thread+0x96/0xe10 kernel/workqueue.c:2415
       kthread+0x3b5/0x4a0 kernel/kthread.c:291
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
      
      Allocated by task 6820:
       save_stack+0x1b/0x40 mm/kasan/common.c:48
       set_track mm/kasan/common.c:56 [inline]
       __kasan_kmalloc mm/kasan/common.c:494 [inline]
       __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:467
       kmem_cache_alloc_trace+0x153/0x7d0 mm/slab.c:3551
       kmalloc include/linux/slab.h:555 [inline]
       kzalloc include/linux/slab.h:669 [inline]
       afs_alloc_call+0x55/0x630 fs/afs/rxrpc.c:141
       afs_charge_preallocation+0xe9/0x2d0 fs/afs/rxrpc.c:757
       afs_open_socket+0x292/0x360 fs/afs/rxrpc.c:92
       afs_net_init+0xa6c/0xe30 fs/afs/main.c:125
       ops_init+0xaf/0x420 net/core/net_namespace.c:151
       setup_net+0x2de/0x860 net/core/net_namespace.c:341
       copy_net_ns+0x293/0x590 net/core/net_namespace.c:482
       create_new_namespaces+0x3fb/0xb30 kernel/nsproxy.c:110
       unshare_nsproxy_namespaces+0xbd/0x1f0 kernel/nsproxy.c:231
       ksys_unshare+0x43d/0x8e0 kernel/fork.c:2983
       __do_sys_unshare kernel/fork.c:3051 [inline]
       __se_sys_unshare kernel/fork.c:3049 [inline]
       __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3049
       do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 21:
       save_stack+0x1b/0x40 mm/kasan/common.c:48
       set_track mm/kasan/common.c:56 [inline]
       kasan_set_free_info mm/kasan/common.c:316 [inline]
       __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:455
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x109/0x2b0 mm/slab.c:3757
       afs_put_call+0x585/0xa40 fs/afs/rxrpc.c:190
       rxrpc_discard_prealloc+0x764/0xab0 net/rxrpc/call_accept.c:230
       rxrpc_listen+0x147/0x360 net/rxrpc/af_rxrpc.c:245
       afs_close_socket+0x95/0x320 fs/afs/rxrpc.c:110
       afs_net_exit+0x1bc/0x310 fs/afs/main.c:155
       ops_exit_list.isra.0+0xa8/0x150 net/core/net_namespace.c:186
       cleanup_net+0x511/0xa50 net/core/net_namespace.c:603
       process_one_work+0x965/0x1690 kernel/workqueue.c:2269
       worker_thread+0x96/0xe10 kernel/workqueue.c:2415
       kthread+0x3b5/0x4a0 kernel/kthread.c:291
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
      
      The buggy address belongs to the object at ffff8880946c3800
       which belongs to the cache kmalloc-1k of size 1024
      The buggy address is located 484 bytes inside of
       1024-byte region [ffff8880946c3800, ffff8880946c3c00)
      The buggy address belongs to the page:
      page:ffffea000251b0c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0xfffe0000000200(slab)
      raw: 00fffe0000000200 ffffea0002546508 ffffea00024fa248 ffff8880aa000c40
      raw: 0000000000000000 ffff8880946c3000 0000000100000002 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880946c3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880946c3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880946c3980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                             ^
       ffff8880946c3a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880946c3a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      
      Reported-by: syzbot+d3eccef36ddbd02713e9@syzkaller.appspotmail.com
      Fixes: 5ac0d622 ("rxrpc: Fix missing notification")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3bc2b29
    • Aditya Pakki's avatar
      rocker: fix incorrect error handling in dma_rings_init · 3de8d385
      Aditya Pakki authored
      [ Upstream commit 58d0c864 ]
      
      In rocker_dma_rings_init, the goto blocks in case of errors
      caused by the functions rocker_dma_cmd_ring_waits_alloc() and
      rocker_dma_ring_create() are incorrect. The patch fixes the
      order consistent with cleanup in rocker_dma_rings_fini().
      Signed-off-by: default avatarAditya Pakki <pakki001@umn.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3de8d385
    • Jeremy Kerr's avatar
      net: usb: ax88179_178a: fix packet alignment padding · c191f7cf
      Jeremy Kerr authored
      [ Upstream commit e869e7a1 ]
      
      Using a AX88179 device (0b95:1790), I see two bytes of appended data on
      every RX packet. For example, this 48-byte ping, using 0xff as a
      payload byte:
      
        04:20:22.528472 IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 2447, seq 1, length 64
      	0x0000:  000a cd35 ea50 000a cd35 ea4f 0800 4500
      	0x0010:  0054 c116 4000 4001 f63e c0a8 0101 c0a8
      	0x0020:  0102 0800 b633 098f 0001 87ea cd5e 0000
      	0x0030:  0000 dcf2 0600 0000 0000 ffff ffff ffff
      	0x0040:  ffff ffff ffff ffff ffff ffff ffff ffff
      	0x0050:  ffff ffff ffff ffff ffff ffff ffff ffff
      	0x0060:  ffff 961f
      
      Those last two bytes - 96 1f - aren't part of the original packet.
      
      In the ax88179 RX path, the usbnet rx_fixup function trims a 2-byte
      'alignment pseudo header' from the start of the packet, and sets the
      length from a per-packet field populated by hardware. It looks like that
      length field *includes* the 2-byte header; the current driver assumes
      that it's excluded.
      
      This change trims the 2-byte alignment header after we've set the packet
      length, so the resulting packet length is correct. While we're moving
      the comment around, this also fixes the spelling of 'pseudo'.
      Signed-off-by: default avatarJeremy Kerr <jk@ozlabs.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c191f7cf
    • Eric Dumazet's avatar
      net: increment xmit_recursion level in dev_direct_xmit() · 220e80d9
      Eric Dumazet authored
      [ Upstream commit 0ad6f6e7 ]
      
      Back in commit f60e5990 ("ipv6: protect skb->sk accesses
      from recursive dereference inside the stack") Hannes added code
      so that IPv6 stack would not trust skb->sk for typical cases
      where packet goes through 'standard' xmit path (__dev_queue_xmit())
      
      Alas af_packet had a dev_direct_xmit() path that was not
      dealing yet with xmit_recursion level.
      
      Also change sk_mc_loop() to dump a stack once only.
      
      Without this patch, syzbot was able to trigger :
      
      [1]
      [  153.567378] WARNING: CPU: 7 PID: 11273 at net/core/sock.c:721 sk_mc_loop+0x51/0x70
      [  153.567378] Modules linked in: nfnetlink ip6table_raw ip6table_filter iptable_raw iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 nf_defrag_ipv6 iptable_filter macsec macvtap tap macvlan 8021q hsr wireguard libblake2s blake2s_x86_64 libblake2s_generic udp_tunnel ip6_udp_tunnel libchacha20poly1305 poly1305_x86_64 chacha_x86_64 libchacha curve25519_x86_64 libcurve25519_generic netdevsim batman_adv dummy team bridge stp llc w1_therm wire i2c_mux_pca954x i2c_mux cdc_acm ehci_pci ehci_hcd mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core
      [  153.567386] CPU: 7 PID: 11273 Comm: b159172088 Not tainted 5.8.0-smp-DEV #273
      [  153.567387] RIP: 0010:sk_mc_loop+0x51/0x70
      [  153.567388] Code: 66 83 f8 0a 75 24 0f b6 4f 12 b8 01 00 00 00 31 d2 d3 e0 a9 bf ef ff ff 74 07 48 8b 97 f0 02 00 00 0f b6 42 3a 83 e0 01 5d c3 <0f> 0b b8 01 00 00 00 5d c3 0f b6 87 18 03 00 00 5d c0 e8 04 83 e0
      [  153.567388] RSP: 0018:ffff95c69bb93990 EFLAGS: 00010212
      [  153.567388] RAX: 0000000000000011 RBX: ffff95c6e0ee3e00 RCX: 0000000000000007
      [  153.567389] RDX: ffff95c69ae50000 RSI: ffff95c6c30c3000 RDI: ffff95c6c30c3000
      [  153.567389] RBP: ffff95c69bb93990 R08: ffff95c69a77f000 R09: 0000000000000008
      [  153.567389] R10: 0000000000000040 R11: 00003e0e00026128 R12: ffff95c6c30c3000
      [  153.567390] R13: ffff95c6cc4fd500 R14: ffff95c6f84500c0 R15: ffff95c69aa13c00
      [  153.567390] FS:  00007fdc3a283700(0000) GS:ffff95c6ff9c0000(0000) knlGS:0000000000000000
      [  153.567390] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  153.567391] CR2: 00007ffee758e890 CR3: 0000001f9ba20003 CR4: 00000000001606e0
      [  153.567391] Call Trace:
      [  153.567391]  ip6_finish_output2+0x34e/0x550
      [  153.567391]  __ip6_finish_output+0xe7/0x110
      [  153.567391]  ip6_finish_output+0x2d/0xb0
      [  153.567392]  ip6_output+0x77/0x120
      [  153.567392]  ? __ip6_finish_output+0x110/0x110
      [  153.567392]  ip6_local_out+0x3d/0x50
      [  153.567392]  ipvlan_queue_xmit+0x56c/0x5e0
      [  153.567393]  ? ksize+0x19/0x30
      [  153.567393]  ipvlan_start_xmit+0x18/0x50
      [  153.567393]  dev_direct_xmit+0xf3/0x1c0
      [  153.567393]  packet_direct_xmit+0x69/0xa0
      [  153.567394]  packet_sendmsg+0xbf0/0x19b0
      [  153.567394]  ? plist_del+0x62/0xb0
      [  153.567394]  sock_sendmsg+0x65/0x70
      [  153.567394]  sock_write_iter+0x93/0xf0
      [  153.567394]  new_sync_write+0x18e/0x1a0
      [  153.567395]  __vfs_write+0x29/0x40
      [  153.567395]  vfs_write+0xb9/0x1b0
      [  153.567395]  ksys_write+0xb1/0xe0
      [  153.567395]  __x64_sys_write+0x1a/0x20
      [  153.567395]  do_syscall_64+0x43/0x70
      [  153.567396]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  153.567396] RIP: 0033:0x453549
      [  153.567396] Code: Bad RIP value.
      [  153.567396] RSP: 002b:00007fdc3a282cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [  153.567397] RAX: ffffffffffffffda RBX: 00000000004d32d0 RCX: 0000000000453549
      [  153.567397] RDX: 0000000000000020 RSI: 0000000020000300 RDI: 0000000000000003
      [  153.567398] RBP: 00000000004d32d8 R08: 0000000000000000 R09: 0000000000000000
      [  153.567398] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d32dc
      [  153.567398] R13: 00007ffee742260f R14: 00007fdc3a282dc0 R15: 00007fdc3a283700
      [  153.567399] ---[ end trace c1d5ae2b1059ec62 ]---
      
      f60e5990 ("ipv6: protect skb->sk accesses from recursive dereference inside the stack")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      220e80d9
    • Florian Westphal's avatar
      net: use correct this_cpu primitive in dev_recursion_level · 3a708efa
      Florian Westphal authored
      commit 28b05b92 upstream.
      
      syzbot reports:
      BUG: using __this_cpu_read() in preemptible code:
      caller is dev_recursion_level include/linux/netdevice.h:3052 [inline]
       __this_cpu_preempt_check+0x246/0x270 lib/smp_processor_id.c:47
       dev_recursion_level include/linux/netdevice.h:3052 [inline]
       ip6_skb_dst_mtu include/net/ip6_route.h:245 [inline]
      
      I erronously downgraded a this_cpu_read to __this_cpu_read when
      moving dev_recursion_level() around.
      
      Reported-by: syzbot+51471b4aae195285a4a3@syzkaller.appspotmail.com
      Fixes: 97cdcf37 ("net: place xmit recursion in softnet data")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a708efa
    • Florian Westphal's avatar
      net: place xmit recursion in softnet data · edbe6532
      Florian Westphal authored
      commit 97cdcf37 upstream.
      
      This fills a hole in softnet data, so no change in structure size.
      
      Also prepares for xmit_more placement in the same spot;
      skb->xmit_more will be removed in followup patch.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      edbe6532
    • Yang Yingliang's avatar
      net: fix memleak in register_netdevice() · 061abde3
      Yang Yingliang authored
      [ Upstream commit 814152a8 ]
      
      I got a memleak report when doing some fuzz test:
      
      unreferenced object 0xffff888112584000 (size 13599):
        comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
        hex dump (first 32 bytes):
          74 61 70 30 00 00 00 00 00 00 00 00 00 00 00 00  tap0............
          00 ee d9 19 81 88 ff ff 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000002f60ba65>] __kmalloc_node+0x309/0x3a0
          [<0000000075b211ec>] kvmalloc_node+0x7f/0xc0
          [<00000000d3a97396>] alloc_netdev_mqs+0x76/0xfc0
          [<00000000609c3655>] __tun_chr_ioctl+0x1456/0x3d70
          [<000000001127ca24>] ksys_ioctl+0xe5/0x130
          [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
          [<00000000e1023498>] do_syscall_64+0x56/0xa0
          [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      unreferenced object 0xffff888111845cc0 (size 8):
        comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
        hex dump (first 8 bytes):
          74 61 70 30 00 88 ff ff                          tap0....
        backtrace:
          [<000000004c159777>] kstrdup+0x35/0x70
          [<00000000d8b496ad>] kstrdup_const+0x3d/0x50
          [<00000000494e884a>] kvasprintf_const+0xf1/0x180
          [<0000000097880a2b>] kobject_set_name_vargs+0x56/0x140
          [<000000008fbdfc7b>] dev_set_name+0xab/0xe0
          [<000000005b99e3b4>] netdev_register_kobject+0xc0/0x390
          [<00000000602704fe>] register_netdevice+0xb61/0x1250
          [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70
          [<000000001127ca24>] ksys_ioctl+0xe5/0x130
          [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
          [<00000000e1023498>] do_syscall_64+0x56/0xa0
          [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      unreferenced object 0xffff88811886d800 (size 512):
        comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
        hex dump (first 32 bytes):
          00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
          ff ff ff ff ff ff ff ff c0 66 3d a3 ff ff ff ff  .........f=.....
        backtrace:
          [<0000000050315800>] device_add+0x61e/0x1950
          [<0000000021008dfb>] netdev_register_kobject+0x17e/0x390
          [<00000000602704fe>] register_netdevice+0xb61/0x1250
          [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70
          [<000000001127ca24>] ksys_ioctl+0xe5/0x130
          [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
          [<00000000e1023498>] do_syscall_64+0x56/0xa0
          [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      If call_netdevice_notifiers() failed, then rollback_registered()
      calls netdev_unregister_kobject() which holds the kobject. The
      reference cannot be put because the netdev won't be add to todo
      list, so it will leads a memleak, we need put the reference to
      avoid memleak.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      061abde3
    • Thomas Martitz's avatar
      net: bridge: enfore alignment for ethernet address · 13378a1d
      Thomas Martitz authored
      [ Upstream commit db7202dec92e6caa2706c21d6fc359af318bde2e ]
      
      The eth_addr member is passed to ether_addr functions that require
      2-byte alignment, therefore the member must be properly aligned
      to avoid unaligned accesses.
      
      The problem is in place since the initial merge of multicast to unicast:
      commit 6db6f0ea bridge: multicast to unicast
      
      Fixes: 6db6f0ea ("bridge: multicast to unicast")
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarThomas Martitz <t.martitz@avm.de>
      Acked-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13378a1d
    • Wang Hai's avatar
      mld: fix memory leak in ipv6_mc_destroy_dev() · cf418c02
      Wang Hai authored
      [ Upstream commit ea2fce88 ]
      
      Commit a84d0164 ("mld: fix memory leak in mld_del_delrec()") fixed
      the memory leak of MLD, but missing the ipv6_mc_destroy_dev() path, in
      which mca_sources are leaked after ma_put().
      
      Using ip6_mc_clear_src() to take care of the missing free.
      
      BUG: memory leak
      unreferenced object 0xffff8881113d3180 (size 64):
        comm "syz-executor071", pid 389, jiffies 4294887985 (age 17.943s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 ff 02 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000002cbc483c>] kmalloc include/linux/slab.h:555 [inline]
          [<000000002cbc483c>] kzalloc include/linux/slab.h:669 [inline]
          [<000000002cbc483c>] ip6_mc_add1_src net/ipv6/mcast.c:2237 [inline]
          [<000000002cbc483c>] ip6_mc_add_src+0x7f5/0xbb0 net/ipv6/mcast.c:2357
          [<0000000058b8b1ff>] ip6_mc_source+0xe0c/0x1530 net/ipv6/mcast.c:449
          [<000000000bfc4fb5>] do_ipv6_setsockopt.isra.12+0x1b2c/0x3b30 net/ipv6/ipv6_sockglue.c:754
          [<00000000e4e7a722>] ipv6_setsockopt+0xda/0x150 net/ipv6/ipv6_sockglue.c:950
          [<0000000029260d9a>] rawv6_setsockopt+0x45/0x100 net/ipv6/raw.c:1081
          [<000000005c1b46f9>] __sys_setsockopt+0x131/0x210 net/socket.c:2132
          [<000000008491f7db>] __do_sys_setsockopt net/socket.c:2148 [inline]
          [<000000008491f7db>] __se_sys_setsockopt net/socket.c:2145 [inline]
          [<000000008491f7db>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2145
          [<00000000c7bc11c5>] do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:295
          [<000000005fb7a3f3>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      Fixes: 1666d49e ("mld: do not remove mld souce list info when set link down")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarWang Hai <wanghai38@huawei.com>
      Acked-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf418c02
    • Thomas Falcon's avatar
      ibmveth: Fix max MTU limit · 8a92f002
      Thomas Falcon authored
      [ Upstream commit 5948378b ]
      
      The max MTU limit defined for ibmveth is not accounting for
      virtual ethernet buffer overhead, which is twenty-two additional
      bytes set aside for the ethernet header and eight additional bytes
      of an opaque handle reserved for use by the hypervisor. Update the
      max MTU to reflect this overhead.
      
      Fixes: d894be57 ("ethernet: use net core MTU range checking in more drivers")
      Fixes: 110447f8 ("ethernet: fix min/max MTU typos")
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a92f002
    • Jann Horn's avatar
      apparmor: don't try to replace stale label in ptraceme check · 3f9a54ce
      Jann Horn authored
      [ Upstream commit ca3fde52 ]
      
      begin_current_label_crit_section() must run in sleepable context because
      when label_is_stale() is true, aa_replace_current_label() runs, which uses
      prepare_creds(), which can sleep.
      
      Until now, the ptraceme access check (which runs with tasklist_lock held)
      violated this rule.
      
      Fixes: b2d09ae4 ("apparmor: move ptrace checks to using labels")
      Reported-by: default avatarCyrill Gorcunov <gorcunov@gmail.com>
      Reported-by: default avatarkernel test robot <rong.a.chen@intel.com>
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarJohn Johansen <john.johansen@canonical.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3f9a54ce
    • Kai-Heng Feng's avatar
      ALSA: hda/realtek - Enable micmute LED on and HP system · 0255fb2f
      Kai-Heng Feng authored
      [ Upstream commit 3e0650ab ]
      
      Though the system uses DMIC, headset mic still uses the HDA, let's use
      GPIO 0x1 to control the micmute LED.
      
      The micmute LED GPIO has a different polarity to the mute LED GPIO, we
      can use the newly added micmute_led_polarity to indicate that.
      Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Link: https://lore.kernel.org/r/20200430083255.5093-2-kai.heng.feng@canonical.comSigned-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0255fb2f
    • Kai-Heng Feng's avatar
      ALSA: hda/realtek: Enable mute LED on an HP system · 2a3e71fa
      Kai-Heng Feng authored
      [ Upstream commit f5a88b0a ]
      
      The system in question uses ALC285, and it uses GPIO 0x04 to control its
      mute LED.
      
      The mic mute LED can be controlled by GPIO 0x01, however the system uses
      DMIC so we should use that to control mic mute LED.
      Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200327044626.29582-1-kai.heng.feng@canonical.comSigned-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2a3e71fa
    • Sasha Levin's avatar
      ALSA: hda/realtek - Enable the headset of ASUS B9450FA with ALC294 · 6a5d0417
      Sasha Levin authored
      [ Upstream commit 8b33a134 ]
      
      A headset on the laptop like ASUS B9450FA does not work, until quirk
      ALC294_FIXUP_ASUS_HPE is applied.
      Signed-off-by: default avatarJian-Hong Pan <jian-hong@endlessm.com>
      Signed-off-by: default avatarKailang Yang <kailang@realtek.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200225072920.109199-1-jian-hong@endlessm.comSigned-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6a5d0417
    • Al Viro's avatar
      fix a braino in "sparc32: fix register window handling in genregs32_[gs]et()" · 7c68b649
      Al Viro authored
      [ Upstream commit 9d964e1b ]
      
      lost npc in PTRACE_SETREGSET, breaking PTRACE_SETREGS as well
      
      Fixes: cf51e129 "sparc32: fix register window handling in genregs32_[gs]et()"
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7c68b649
    • Sowjanya Komatineni's avatar
      i2c: tegra: Fix Maximum transfer size · 0479fd1d
      Sowjanya Komatineni authored
      [ Upstream commit b67d4530 ]
      
      Tegra194 supports maximum 64K Bytes transfer per packet.
      Tegra186 and prior supports maximum 4K Bytes transfer per packet.
      
      This patch fixes this payload difference between Tegra194 and prior
      Tegra chipsets using separate i2c_adapter_quirks.
      Signed-off-by: default avatarSowjanya Komatineni <skomatineni@nvidia.com>
      Acked-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0479fd1d
    • Thierry Reding's avatar
      i2c: tegra: Add missing kerneldoc for some fields · 5301f6fb
      Thierry Reding authored
      [ Upstream commit 0604ee4a ]
      
      Not all fields were properly documented. Add kerneldoc for the missing
      fields to prevent the build from flagging them.
      Reported-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5301f6fb
    • Thierry Reding's avatar
      i2c: tegra: Cleanup kerneldoc comments · 4fb46356
      Thierry Reding authored
      [ Upstream commit c990bbaf ]
      
      Some of the kerneldoc uses a strange spelling for abbreviations. Turn
      them into all-uppercase and clean up some whitespace issues while at it.
      Signed-off-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4fb46356
    • Yazen Ghannam's avatar
      EDAC/amd64: Add Family 17h Model 30h PCI IDs · 934a8dab
      Yazen Ghannam authored
      [ Upstream commit 6e846239 ]
      
      Add the new Family 17h Model 30h PCI IDs to the AMD64 EDAC module.
      
      This also fixes a probe failure that appeared when some other PCI IDs
      for Family 17h Model 30h were added to the AMD NB code.
      
      Fixes: be3518a1 (x86/amd_nb: Add PCI device IDs for family 17h, model 30h)
      Signed-off-by: default avatarYazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Tested-by: default avatarKim Phillips <kim.phillips@amd.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20190228153558.127292-1-Yazen.Ghannam@amd.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      934a8dab
    • Valentin Longchamp's avatar
      net: sched: export __netdev_watchdog_up() · 1a182e9c
      Valentin Longchamp authored
      [ Upstream commit 1a3db27a ]
      
      Since the quiesce/activate rework, __netdev_watchdog_up() is directly
      called in the ucc_geth driver.
      
      Unfortunately, this function is not available for modules and thus
      ucc_geth cannot be built as a module anymore. Fix it by exporting
      __netdev_watchdog_up().
      
      Since the commit introducing the regression was backported to stable
      branches, this one should ideally be as well.
      
      Fixes: 79dde73c ("net/ethernet/freescale: rework quiesce/activate for ucc_geth")
      Signed-off-by: default avatarValentin Longchamp <valentin@longchamp.me>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1a182e9c
    • Doug Berger's avatar
      net: bcmgenet: remove HFB_CTRL access · 290ba51c
      Doug Berger authored
      [ Upstream commit 24d476db ]
      
      Commit c5a54bbc ("net: bcmgenet: abort suspend on error")
      mistakenly introduced register accesses that should not occur
      in bcmgenet_wol_power_up_cfg().
      
      Fixes: c5a54bbc ("net: bcmgenet: abort suspend on error")
      Signed-off-by: default avatarDoug Berger <opendmb@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      290ba51c
    • Miquel Raynal's avatar
      mtd: rawnand: marvell: Fix the condition on a return code · 944a95a9
      Miquel Raynal authored
      [ Upstream commit c2707577 ]
      
      In a previous fix, I changed the condition on which the timeout of an
      IRQ is reached from:
      
          if (!ret)
      
      into:
      
          if (ret && !pending)
      
      While having a non-zero return code is usual in the Linux kernel, here
      ret comes from a wait_for_completion_timeout() which returns 0 when
      the waiting period is too long.
      
      Hence, the revised condition should be:
      
          if (!ret && !pending)
      
      The faulty patch did not produce any error because of the !pending
      condition so this change is finally purely cosmetic and does not
      change the actual driver behavior.
      
      Fixes: cafb56dd ("mtd: rawnand: marvell: prevent timeouts on a loaded machine")
      Signed-off-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Reviewed-by: default avatarBoris Brezillon <boris.brezillon@collabora.com>
      Link: https://lore.kernel.org/linux-mtd/20200424164501.26719-2-miquel.raynal@bootlin.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      944a95a9
    • Amir Goldstein's avatar
      fanotify: fix ignore mask logic for events on child and on dir · 62b4f338
      Amir Goldstein authored
      commit 2f02fd3f upstream.
      
      The comments in fanotify_group_event_mask() say:
      
        "If the event is on dir/child and this mark doesn't care about
         events on dir/child, don't send it!"
      
      Specifically, mount and filesystem marks do not care about events
      on child, but they can still specify an ignore mask for those events.
      For example, a group that has:
      - A mount mark with mask 0 and ignore_mask FAN_OPEN
      - An inode mark on a directory with mask FAN_OPEN | FAN_OPEN_EXEC
        with flag FAN_EVENT_ON_CHILD
      
      A child file open for exec would be reported to group with the FAN_OPEN
      event despite the fact that FAN_OPEN is in ignore mask of mount mark,
      because the mark iteration loop skips over non-inode marks for events
      on child when calculating the ignore mask.
      
      Move ignore mask calculation to the top of the iteration loop block
      before excluding marks for events on dir/child.
      
      Link: https://lore.kernel.org/r/20200524072441.18258-1-amir73il@gmail.comReported-by: default avatarJan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/linux-fsdevel/20200521162443.GA26052@quack2.suse.cz/
      Fixes: 55bf882c "fanotify: fix merging marks masks with FAN_ONDIR"
      Fixes: b469e7e4 "fanotify: fix handling of events on child..."
      Signed-off-by: default avatarAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      62b4f338
    • yu kuai's avatar
      block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed · 81abe6ad
      yu kuai authored
      commit a75ca930 upstream.
      
      commit e7bf90e5 ("block/bio-integrity: fix a memory leak bug") added
      a kfree() for 'buf' if bio_integrity_add_page() returns '0'. However,
      the object will be freed in bio_integrity_free() since 'bio->bi_opf' and
      'bio->bi_integrity' were set previousy in bio_integrity_alloc().
      
      Fixes: commit e7bf90e5 ("block/bio-integrity: fix a memory leak bug")
      Signed-off-by: default avataryu kuai <yukuai3@huawei.com>
      Reviewed-by: default avatarMing Lei <ming.lei@redhat.com>
      Reviewed-by: default avatarBob Liu <bob.liu@oracle.com>
      Acked-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81abe6ad
    • Eric Dumazet's avatar
      net: be more gentle about silly gso requests coming from user · 6c834566
      Eric Dumazet authored
      commit 7c6d2ecb upstream.
      
      Recent change in virtio_net_hdr_to_skb() broke some packetdrill tests.
      
      When --mss=XXX option is set, packetdrill always provide gso_type & gso_size
      for its inbound packets, regardless of packet size.
      
      	if (packet->tcp && packet->mss) {
      		if (packet->ipv4)
      			gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
      		else
      			gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
      		gso.gso_size = packet->mss;
      	}
      
      Since many other programs could do the same, relax virtio_net_hdr_to_skb()
      to no longer return an error, but instead ignore gso settings.
      
      This keeps Willem intent to make sure no malicious packet could
      reach gso stack.
      
      Note that TCP stack has a special logic in tcp_set_skb_tso_segs()
      to clear gso_size for small packets.
      
      Fixes: 6dd912f8 ("net: check untrusted gso_size at kernel entry")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6c834566
  2. 25 Jun, 2020 13 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.19.130 · a39e7545
      Greg Kroah-Hartman authored
      a39e7545
    • Sean Christopherson's avatar
      KVM: x86/mmu: Set mmio_value to '0' if reserved #PF can't be generated · be0bcec4
      Sean Christopherson authored
      [ Upstream commit 6129ed87 ]
      
      Set the mmio_value to '0' instead of simply clearing the present bit to
      squash a benign warning in kvm_mmu_set_mmio_spte_mask() that complains
      about the mmio_value overlapping the lower GFN mask on systems with 52
      bits of PA space.
      
      Opportunistically clean up the code and comments.
      
      Cc: stable@vger.kernel.org
      Fixes: d43e2675 ("KVM: x86: only do L1TF workaround on affected processors")
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200527084909.23492-1-sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      be0bcec4
    • Kai Huang's avatar
      kvm: x86: Fix reserved bits related calculation errors caused by MKTME · 940ed58a
      Kai Huang authored
      [ Upstream commit f3ecb59d ]
      
      Intel MKTME repurposes several high bits of physical address as 'keyID'
      for memory encryption thus effectively reduces platform's maximum
      physical address bits. Exactly how many bits are reduced is configured
      by BIOS. To honor such HW behavior, the repurposed bits are reduced from
      cpuinfo_x86->x86_phys_bits when MKTME is detected in CPU detection.
      Similarly, AMD SME/SEV also reduces physical address bits for memory
      encryption, and cpuinfo->x86_phys_bits is reduced too when SME/SEV is
      detected, so for both MKTME and SME/SEV, boot_cpu_data.x86_phys_bits
      doesn't hold physical address bits reported by CPUID anymore.
      
      Currently KVM treats bits from boot_cpu_data.x86_phys_bits to 51 as
      reserved bits, but it's not true anymore for MKTME, since MKTME treats
      those reduced bits as 'keyID', but not reserved bits. Therefore
      boot_cpu_data.x86_phys_bits cannot be used to calculate reserved bits
      anymore, although we can still use it for AMD SME/SEV since SME/SEV
      treats the reduced bits differently -- they are treated as reserved
      bits, the same as other reserved bits in page table entity [1].
      
      Fix by introducing a new 'shadow_phys_bits' variable in KVM x86 MMU code
      to store the effective physical bits w/o reserved bits -- for MKTME,
      it equals to physical address reported by CPUID, and for SME/SEV, it is
      boot_cpu_data.x86_phys_bits.
      
      Note that for the physical address bits reported to guest should remain
      unchanged -- KVM should report physical address reported by CPUID to
      guest, but not boot_cpu_data.x86_phys_bits. Because for Intel MKTME,
      there's no harm if guest sets up 'keyID' bits in guest page table (since
      MKTME only works at physical address level), and KVM doesn't even expose
      MKTME to guest. Arguably, for AMD SME/SEV, guest is aware of SEV thus it
      should adjust boot_cpu_data.x86_phys_bits when it detects SEV, therefore
      KVM should still reports physcial address reported by CPUID to guest.
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarKai Huang <kai.huang@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      940ed58a
    • Kai Huang's avatar
      kvm: x86: Move kvm_set_mmio_spte_mask() from x86.c to mmu.c · fc095d7e
      Kai Huang authored
      [ Upstream commit 7b6f8a06 ]
      
      As a prerequisite to fix several SPTE reserved bits related calculation
      errors caused by MKTME, which requires kvm_set_mmio_spte_mask() to use
      local static variable defined in mmu.c.
      
      Also move call site of kvm_set_mmio_spte_mask() from kvm_arch_init() to
      kvm_mmu_module_init() so that kvm_set_mmio_spte_mask() can be static.
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarKai Huang <kai.huang@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fc095d7e
    • NeilBrown's avatar
      md: add feature flag MD_FEATURE_RAID0_LAYOUT · f04928c3
      NeilBrown authored
      [ Upstream commit 33f2c35a ]
      
      Due to a bug introduced in Linux 3.14 we cannot determine the
      correctly layout for a multi-zone RAID0 array - there are two
      possibilities.
      
      It is possible to tell the kernel which to chose using a module
      parameter, but this can be clumsy to use.  It would be best if
      the choice were recorded in the metadata.
      So add a feature flag for this purpose.
      If it is set, then the 'layout' field of the superblock is used
      to determine which layout to use.
      
      If this flag is not set, then mddev->layout gets set to -1,
      which causes the module parameter to be required.
      Acked-by: default avatarGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f04928c3
    • Vladimir Oltean's avatar
      Revert "dpaa_eth: fix usage as DSA master, try 3" · 6d6a0392
      Vladimir Oltean authored
      This reverts commit b145710b which is
      commit 5d14c304 upstream.
      
      The patch is not wrong, but the Fixes: tag is. It should have been:
      
      	Fixes: 060ad66f ("dpaa_eth: change DMA device")
      
      which means that it's fixing a commit which was introduced in:
      
      git describe --tags 060ad66f
      v5.4-rc3-783-g060ad66f
      
      which then means it should have not been backported to linux-4.19.y,
      where things _were_ working and now they're not.
      Reported-by: default avatarJoakim Tjernlund <joakim.tjernlund@infinera.com>
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d6a0392
    • Ahmed S. Darwish's avatar
      net: core: device_rename: Use rwsem instead of a seqcount · 29d1d0c7
      Ahmed S. Darwish authored
      [ Upstream commit 11d6011c ]
      
      Sequence counters write paths are critical sections that must never be
      preempted, and blocking, even for CONFIG_PREEMPTION=n, is not allowed.
      
      Commit 5dbe7c17 ("net: fix kernel deadlock with interface rename and
      netdev name retrieval.") handled a deadlock, observed with
      CONFIG_PREEMPTION=n, where the devnet_rename seqcount read side was
      infinitely spinning: it got scheduled after the seqcount write side
      blocked inside its own critical section.
      
      To fix that deadlock, among other issues, the commit added a
      cond_resched() inside the read side section. While this will get the
      non-preemptible kernel eventually unstuck, the seqcount reader is fully
      exhausting its slice just spinning -- until TIF_NEED_RESCHED is set.
      
      The fix is also still broken: if the seqcount reader belongs to a
      real-time scheduling policy, it can spin forever and the kernel will
      livelock.
      
      Disabling preemption over the seqcount write side critical section will
      not work: inside it are a number of GFP_KERNEL allocations and mutex
      locking through the drivers/base/ :: device_rename() call chain.
      
      >From all the above, replace the seqcount with a rwsem.
      
      Fixes: 5dbe7c17 (net: fix kernel deadlock with interface rename and netdev name retrieval.)
      Fixes: 30e6c9fa (net: devnet_rename_seq should be a seqcount)
      Fixes: c91f6df2 (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name)
      Cc: <stable@vger.kernel.org>
      Reported-by: kbuild test robot <lkp@intel.com> [ v1 missing up_read() on error exit ]
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com> [ v1 missing up_read() on error exit ]
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Reviewed-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      29d1d0c7
    • Thomas Gleixner's avatar
      sched/rt, net: Use CONFIG_PREEMPTION.patch · b855db2a
      Thomas Gleixner authored
      [ Upstream commit 2da2b32f ]
      
      CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
      Both PREEMPT and PREEMPT_RT require the same functionality which today
      depends on CONFIG_PREEMPT.
      
      Update the comment to use CONFIG_PREEMPTION.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20191015191821.11479-22-bigeasy@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b855db2a
    • Jiri Olsa's avatar
      kretprobe: Prevent triggering kretprobe from within kprobe_flush_task · 98abe944
      Jiri Olsa authored
      [ Upstream commit 9b38cc70 ]
      
      Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
      My test was also able to trigger lockdep output:
      
       ============================================
       WARNING: possible recursive locking detected
       5.6.0-rc6+ #6 Not tainted
       --------------------------------------------
       sched-messaging/2767 is trying to acquire lock:
       ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0
      
       but task is already holding lock:
       ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&(kretprobe_table_locks[i].lock));
         lock(&(kretprobe_table_locks[i].lock));
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by sched-messaging/2767:
        #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       stack backtrace:
       CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
       Call Trace:
        dump_stack+0x96/0xe0
        __lock_acquire.cold.57+0x173/0x2b7
        ? native_queued_spin_lock_slowpath+0x42b/0x9e0
        ? lockdep_hardirqs_on+0x590/0x590
        ? __lock_acquire+0xf63/0x4030
        lock_acquire+0x15a/0x3d0
        ? kretprobe_hash_lock+0x52/0xa0
        _raw_spin_lock_irqsave+0x36/0x70
        ? kretprobe_hash_lock+0x52/0xa0
        kretprobe_hash_lock+0x52/0xa0
        trampoline_handler+0xf8/0x940
        ? kprobe_fault_handler+0x380/0x380
        ? find_held_lock+0x3a/0x1c0
        kretprobe_trampoline+0x25/0x50
        ? lock_acquired+0x392/0xbc0
        ? _raw_spin_lock_irqsave+0x50/0x70
        ? __get_valid_kprobe+0x1f0/0x1f0
        ? _raw_spin_unlock_irqrestore+0x3b/0x40
        ? finish_task_switch+0x4b9/0x6d0
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
      
      The code within the kretprobe handler checks for probe reentrancy,
      so we won't trigger any _raw_spin_lock_irqsave probe in there.
      
      The problem is in outside kprobe_flush_task, where we call:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave
      
      where _raw_spin_lock_irqsave triggers the kretprobe and installs
      kretprobe_trampoline handler on _raw_spin_lock_irqsave return.
      
      The kretprobe_trampoline handler is then executed with already
      locked kretprobe_table_locks, and first thing it does is to
      lock kretprobe_table_locks ;-) the whole lockup path like:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed
      
              ---> kretprobe_table_locks locked
      
              kretprobe_trampoline
                trampoline_handler
                  kretprobe_hash_lock(current, &head, &flags);  <--- deadlock
      
      Adding kprobe_busy_begin/end helpers that mark code with fake
      probe installed to prevent triggering of another kprobe within
      this code.
      
      Using these helpers in kprobe_flush_task, so the probe recursion
      protection check is hit and the probe is never set to prevent
      above lockup.
      
      Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2
      
      Fixes: ef53d9c5 ("kprobes: improve kretprobe scalability with hashed locking")
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Reported-by: default avatar"Ziqian SUN (Zamir)" <zsun@redhat.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      98abe944
    • Alexander Sverdlin's avatar
      net: octeon: mgmt: Repair filling of RX ring · 10699fb9
      Alexander Sverdlin authored
      commit 0c34bb59 upstream.
      
      The removal of mips_swiotlb_ops exposed a problem in octeon_mgmt Ethernet
      driver. mips_swiotlb_ops had an mb() after most of the operations and the
      removal of the ops had broken the receive functionality of the driver.
      My code inspection has shown no other places except
      octeon_mgmt_rx_fill_ring() where an explicit barrier would be obviously
      missing. The latter function however has to make sure that "ringing the
      bell" doesn't happen before RX ring entry is really written.
      
      The patch has been successfully tested on Octeon II.
      
      Fixes: a999933d ("MIPS: remove mips_swiotlb_ops")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAlexander Sverdlin <alexander.sverdlin@nokia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      10699fb9
    • Chen Yu's avatar
      e1000e: Do not wake up the system via WOL if device wakeup is disabled · f31b2cab
      Chen Yu authored
      commit 6bf6be11 upstream.
      
      Currently the system will be woken up via WOL(Wake On LAN) even if the
      device wakeup ability has been disabled via sysfs:
       cat /sys/devices/pci0000:00/0000:00:1f.6/power/wakeup
       disabled
      
      The system should not be woken up if the user has explicitly
      disabled the wake up ability for this device.
      
      This patch clears the WOL ability of this network device if the
      user has disabled the wake up ability in sysfs.
      
      Fixes: bc7f75fa ("[E1000E]: New pci-express e1000 driver")
      Reported-by: default avatar"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Reviewed-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarChen Yu <yu.c.chen@intel.com>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f31b2cab
    • Masami Hiramatsu's avatar
      kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex · 34895355
      Masami Hiramatsu authored
      commit 1a0aa991 upstream.
      
      In kprobe_optimizer() kick_kprobe_optimizer() is called
      without kprobe_mutex, but this can race with other caller
      which is protected by kprobe_mutex.
      
      To fix that, expand kprobe_mutex protected area to protect
      kick_kprobe_optimizer() call.
      
      Link: http://lkml.kernel.org/r/158927057586.27680.5036330063955940456.stgit@devnote2
      
      Fixes: cd7ebe22 ("kprobes: Use text_poke_smp_batch for optimizing")
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ziqian SUN <zsun@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34895355
    • Eric Biggers's avatar
      crypto: algboss - don't wait during notifier callback · c0ef44cb
      Eric Biggers authored
      commit 77251e41 upstream.
      
      When a crypto template needs to be instantiated, CRYPTO_MSG_ALG_REQUEST
      is sent to crypto_chain.  cryptomgr_schedule_probe() handles this by
      starting a thread to instantiate the template, then waiting for this
      thread to complete via crypto_larval::completion.
      
      This can deadlock because instantiating the template may require loading
      modules, and this (apparently depending on userspace) may need to wait
      for the crc-t10dif module (lib/crc-t10dif.c) to be loaded.  But
      crc-t10dif's module_init function uses crypto_register_notifier() and
      therefore takes crypto_chain.rwsem for write.  That can't proceed until
      the notifier callback has finished, as it holds this semaphore for read.
      
      Fix this by removing the wait on crypto_larval::completion from within
      cryptomgr_schedule_probe().  It's actually unnecessary because
      crypto_alg_mod_lookup() calls crypto_larval_wait() itself after sending
      CRYPTO_MSG_ALG_REQUEST.
      
      This only actually became a problem in v4.20 due to commit b7637754
      ("crc-t10dif: Pick better transform if one becomes available"), but the
      unnecessary wait was much older.
      
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207159Reported-by: default avatarMike Gerow <gerow@google.com>
      Fixes: 39871037 ("crypto: algapi - Move larval completion into algboss")
      Cc: <stable@vger.kernel.org> # v3.6+
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Reported-by: default avatarKai Lüke <kai@kinvolk.io>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c0ef44cb