1. 09 Sep, 2020 3 commits
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 2650be2c
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ===================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Allow conntrack entries with l3num == NFPROTO_IPV4 or == NFPROTO_IPV6
         only via ctnetlink, from Will McVicker.
      
      2) Batch notifications to userspace to improve netlink socket receive
         utilization.
      
      3) Restore mark based dump filtering via ctnetlink, from Martin Willi.
      
      4) nf_conncount_init() fails with -EPROTO with CONFIG_IPV6, from
         Eelco Chaudron.
      
      5) Containers fail to match on meta skuid and skgid, use socket user_ns
         to retrieve meta skuid and skgid.
      ===================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2650be2c
    • Eric Dumazet's avatar
      ipv6: avoid lockdep issue in fib6_del() · 843d926b
      Eric Dumazet authored
      syzbot reported twice a lockdep issue in fib6_del() [1]
      which I think is caused by net->ipv6.fib6_null_entry
      having a NULL fib6_table pointer.
      
      fib6_del() already checks for fib6_null_entry special
      case, we only need to return earlier.
      
      Bug seems to occur very rarely, I have thus chosen
      a 'bug origin' that makes backports not too complex.
      
      [1]
      WARNING: suspicious RCU usage
      5.9.0-rc4-syzkaller #0 Not tainted
      -----------------------------
      net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      4 locks held by syz-executor.5/8095:
       #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
       #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
       #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
       #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
       #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
       #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
      
      stack backtrace:
      CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x198/0x1fd lib/dump_stack.c:118
       fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
       fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
       fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
       fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
       fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
       __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
       fib6_clean_all net/ipv6/ip6_fib.c:2257 [inline]
       fib6_run_gc+0x113/0x2d0 net/ipv6/ip6_fib.c:2320
       ndisc_netdev_event+0x217/0x350 net/ipv6/ndisc.c:1805
       notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
       call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2033
       call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
       call_netdevice_notifiers net/core/dev.c:2059 [inline]
       dev_close_many+0x30b/0x650 net/core/dev.c:1634
       rollback_registered_many+0x3a8/0x1210 net/core/dev.c:9261
       rollback_registered net/core/dev.c:9329 [inline]
       unregister_netdevice_queue+0x2dd/0x570 net/core/dev.c:10410
       unregister_netdevice include/linux/netdevice.h:2774 [inline]
       ppp_release+0x216/0x240 drivers/net/ppp/ppp_generic.c:403
       __fput+0x285/0x920 fs/file_table.c:281
       task_work_run+0xdd/0x190 kernel/task_work.c:141
       tracehook_notify_resume include/linux/tracehook.h:188 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
       exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
       syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 421842ed ("net/ipv6: Add fib6_null_entry")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      843d926b
    • Vladimir Oltean's avatar
      net: dsa: link interfaces with the DSA master to get rid of lockdep warnings · 2f1e8ea7
      Vladimir Oltean authored
      Since commit 845e0ebb ("net: change addr_list_lock back to static
      key"), cascaded DSA setups (DSA switch port as DSA master for another
      DSA switch port) are emitting this lockdep warning:
      
      ============================================
      WARNING: possible recursive locking detected
      5.8.0-rc1-00133-g923e4b5032dd-dirty #208 Not tainted
      --------------------------------------------
      dhcpcd/323 is trying to acquire lock:
      ffff000066dd4268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
      
      but task is already holding lock:
      ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&dsa_master_addr_list_lock_key/1);
        lock(&dsa_master_addr_list_lock_key/1);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      3 locks held by dhcpcd/323:
       #0: ffffdbd1381dda18 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
       #1: ffff00006614b268 (_xmit_ETHER){+...}-{2:2}, at: dev_set_rx_mode+0x28/0x48
       #2: ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
      
      stack backtrace:
      Call trace:
       dump_backtrace+0x0/0x1e0
       show_stack+0x20/0x30
       dump_stack+0xec/0x158
       __lock_acquire+0xca0/0x2398
       lock_acquire+0xe8/0x440
       _raw_spin_lock_nested+0x64/0x90
       dev_mc_sync+0x44/0x90
       dsa_slave_set_rx_mode+0x34/0x50
       __dev_set_rx_mode+0x60/0xa0
       dev_mc_sync+0x84/0x90
       dsa_slave_set_rx_mode+0x34/0x50
       __dev_set_rx_mode+0x60/0xa0
       dev_set_rx_mode+0x30/0x48
       __dev_open+0x10c/0x180
       __dev_change_flags+0x170/0x1c8
       dev_change_flags+0x2c/0x70
       devinet_ioctl+0x774/0x878
       inet_ioctl+0x348/0x3b0
       sock_do_ioctl+0x50/0x310
       sock_ioctl+0x1f8/0x580
       ksys_ioctl+0xb0/0xf0
       __arm64_sys_ioctl+0x28/0x38
       el0_svc_common.constprop.0+0x7c/0x180
       do_el0_svc+0x2c/0x98
       el0_sync_handler+0x9c/0x1b8
       el0_sync+0x158/0x180
      
      Since DSA never made use of the netdev API for describing links between
      upper devices and lower devices, the dev->lower_level value of a DSA
      switch interface would be 1, which would warn when it is a DSA master.
      
      We can use netdev_upper_dev_link() to describe the relationship between
      a DSA slave and a DSA master. To be precise, a DSA "slave" (switch port)
      is an "upper" to a DSA "master" (host port). The relationship is "many
      uppers to one lower", like in the case of VLAN. So, for that reason, we
      use the same function as VLAN uses.
      
      There might be a chance that somebody will try to take hold of this
      interface and use it immediately after register_netdev() and before
      netdev_upper_dev_link(). To avoid that, we do the registration and
      linkage while holding the RTNL, and we use the RTNL-locked cousin of
      register_netdev(), which is register_netdevice().
      
      Since this warning was not there when lockdep was using dynamic keys for
      addr_list_lock, we are blaming the lockdep patch itself. The network
      stack _has_ been using static lockdep keys before, and it _is_ likely
      that stacked DSA setups have been triggering these lockdep warnings
      since forever, however I can't test very old kernels on this particular
      stacked DSA setup, to ensure I'm not in fact introducing regressions.
      
      Fixes: 845e0ebb ("net: change addr_list_lock back to static key")
      Suggested-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2f1e8ea7
  2. 08 Sep, 2020 6 commits
    • Pablo Neira Ayuso's avatar
      netfilter: nft_meta: use socket user_ns to retrieve skuid and skgid · 0c92411b
      Pablo Neira Ayuso authored
      ... instead of using init_user_ns.
      
      Fixes: 96518518 ("netfilter: add nftables")
      Tested-by: default avatarPhil Sutter <phil@nwl.cc>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      0c92411b
    • Eelco Chaudron's avatar
      netfilter: conntrack: nf_conncount_init is failing with IPv6 disabled · 526e81b9
      Eelco Chaudron authored
      The openvswitch module fails initialization when used in a kernel
      without IPv6 enabled. nf_conncount_init() fails because the ct code
      unconditionally tries to initialize the netns IPv6 related bit,
      regardless of the build option. The change below ignores the IPv6
      part if not enabled.
      
      Note that the corresponding _put() function already has this IPv6
      configuration check.
      
      Fixes: 11efd5cb ("openvswitch: Support conntrack zone limit")
      Signed-off-by: default avatarEelco Chaudron <echaudro@redhat.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      526e81b9
    • Martin Willi's avatar
      netfilter: ctnetlink: fix mark based dump filtering regression · 6c0d95d1
      Martin Willi authored
      conntrack mark based dump filtering may falsely skip entries if a mask
      is given: If the mask-based check does not filter out the entry, the
      else-if check is always true and compares the mark without considering
      the mask. The if/else-if logic seems wrong.
      
      Given that the mask during filter setup is implicitly set to 0xffffffff
      if not specified explicitly, the mark filtering flags seem to just
      complicate things. Restore the previously used approach by always
      matching against a zero mask is no filter mark is given.
      
      Fixes: cb8aa9a3 ("netfilter: ctnetlink: add kernel side filtering for dump")
      Signed-off-by: default avatarMartin Willi <martin@strongswan.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      6c0d95d1
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: coalesce multiple notifications into one skbuff · 67cc570e
      Pablo Neira Ayuso authored
      On x86_64, each notification results in one skbuff allocation which
      consumes at least 768 bytes due to the skbuff overhead.
      
      This patch coalesces several notifications into one single skbuff, so
      each notification consumes at least ~211 bytes, that ~3.5 times less
      memory consumption. As a result, this is reducing the chances to exhaust
      the netlink socket receive buffer.
      
      Rule of thumb is that each notification batch only contains netlink
      messages whose report flag is the same, nfnetlink_send() requires this
      to do appropriate delivery to userspace, either via unicast (echo
      mode) or multicast (monitor mode).
      
      The skbuff control buffer is used to annotate the report flag for later
      handling at the new coalescing routine.
      
      The batch skbuff notification size is NLMSG_GOODSIZE, using a larger
      skbuff would allow for more socket receiver buffer savings (to amortize
      the cost of the skbuff even more), however, going over that size might
      break userspace applications, so let's be conservative and stick to
      NLMSG_GOODSIZE.
      Reported-by: default avatarPhil Sutter <phil@nwl.cc>
      Acked-by: default avatarPhil Sutter <phil@nwl.cc>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      67cc570e
    • Will McVicker's avatar
      netfilter: ctnetlink: add a range check for l3/l4 protonum · 1cc5ef91
      Will McVicker authored
      The indexes to the nf_nat_l[34]protos arrays come from userspace. So
      check the tuple's family, e.g. l3num, when creating the conntrack in
      order to prevent an OOB memory access during setup.  Here is an example
      kernel panic on 4.14.180 when userspace passes in an index greater than
      NFPROTO_NUMPROTO.
      
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:...
      Process poc (pid: 5614, stack limit = 0x00000000a3933121)
      CPU: 4 PID: 5614 Comm: poc Tainted: G S      W  O    4.14.180-g051355490483
      Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM
      task: 000000002a3dfffe task.stack: 00000000a3933121
      pc : __cfi_check_fail+0x1c/0x24
      lr : __cfi_check_fail+0x1c/0x24
      ...
      Call trace:
      __cfi_check_fail+0x1c/0x24
      name_to_dev_t+0x0/0x468
      nfnetlink_parse_nat_setup+0x234/0x258
      ctnetlink_parse_nat_setup+0x4c/0x228
      ctnetlink_new_conntrack+0x590/0xc40
      nfnetlink_rcv_msg+0x31c/0x4d4
      netlink_rcv_skb+0x100/0x184
      nfnetlink_rcv+0xf4/0x180
      netlink_unicast+0x360/0x770
      netlink_sendmsg+0x5a0/0x6a4
      ___sys_sendmsg+0x314/0x46c
      SyS_sendmsg+0xb4/0x108
      el0_svc_naked+0x34/0x38
      
      This crash is not happening since 5.4+, however, ctnetlink still
      allows for creating entries with unsupported layer 3 protocol number.
      
      Fixes: c1d10adb ("[NETFILTER]: Add ctnetlink port for nf_conntrack")
      Signed-off-by: default avatarWill McVicker <willmcvicker@google.com>
      [pablo@netfilter.org: rebased original patch on top of nf.git]
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      1cc5ef91
    • Dexuan Cui's avatar
      hv_netvsc: Fix hibernation for mlx5 VF driver · 19162fd4
      Dexuan Cui authored
      mlx5_suspend()/resume() keep the network interface, so during hibernation
      netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
      netvsc_resume() should call netvsc_vf_changed() to switch the data path
      back to the VF after hibernation. Note: after we close and re-open the
      vmbus channel of the netvsc NIC in netvsc_suspend() and netvsc_resume(),
      the data path is implicitly switched to the netvsc NIC. Similarly,
      netvsc_suspend() should not call netvsc_unregister_vf(), otherwise the VF
      can no longer be used after hibernation.
      
      For mlx4, since the VF network interafce is explicitly destroyed and
      re-created during hibernation (see mlx4_suspend()/resume()), hv_netvsc
      already explicitly switches the data path from and to the VF automatically
      via netvsc_register_vf() and netvsc_unregister_vf(), so mlx4 doesn't need
      this fix. Note: mlx4 can still work with the fix because in
      netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4.
      
      Fixes: 0efeea5f ("hv_netvsc: Add the support of hibernation")
      Signed-off-by: default avatarDexuan Cui <decui@microsoft.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      19162fd4
  3. 07 Sep, 2020 7 commits
    • Taehee Yoo's avatar
      Revert "netns: don't disable BHs when locking "nsid_lock"" · e1f469cd
      Taehee Yoo authored
      This reverts commit 8d7e5dee.
      
      To protect netns id, the nsid_lock is used when netns id is being
      allocated and removed by peernet2id_alloc() and unhash_nsid().
      The nsid_lock can be used in BH context but only spin_lock() is used
      in this code.
      Using spin_lock() instead of spin_lock_bh() can result in a deadlock in
      the following scenario reported by the lockdep.
      In order to avoid a deadlock, the spin_lock_bh() should be used instead
      of spin_lock() to acquire nsid_lock.
      
      Test commands:
          ip netns del nst
          ip netns add nst
          ip link add veth1 type veth peer name veth2
          ip link set veth1 netns nst
          ip netns exec nst ip link add name br1 type bridge vlan_filtering 1
          ip netns exec nst ip link set dev br1 up
          ip netns exec nst ip link set dev veth1 master br1
          ip netns exec nst ip link set dev veth1 up
          ip netns exec nst ip link add macvlan0 link br1 up type macvlan
      
      Splat looks like:
      [   33.615860][  T607] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
      [   33.617194][  T607] 5.9.0-rc1+ #665 Not tainted
      [ ... ]
      [   33.670615][  T607] Chain exists of:
      [   33.670615][  T607]   &mc->mca_lock --> &bridge_netdev_addr_lock_key --> &net->nsid_lock
      [   33.670615][  T607]
      [   33.673118][  T607]  Possible interrupt unsafe locking scenario:
      [   33.673118][  T607]
      [   33.674599][  T607]        CPU0                    CPU1
      [   33.675557][  T607]        ----                    ----
      [   33.676516][  T607]   lock(&net->nsid_lock);
      [   33.677306][  T607]                                local_irq_disable();
      [   33.678517][  T607]                                lock(&mc->mca_lock);
      [   33.679725][  T607]                                lock(&bridge_netdev_addr_lock_key);
      [   33.681166][  T607]   <Interrupt>
      [   33.681791][  T607]     lock(&mc->mca_lock);
      [   33.682579][  T607]
      [   33.682579][  T607]  *** DEADLOCK ***
      [ ... ]
      [   33.922046][  T607] stack backtrace:
      [   33.922999][  T607] CPU: 3 PID: 607 Comm: ip Not tainted 5.9.0-rc1+ #665
      [   33.924099][  T607] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [   33.925714][  T607] Call Trace:
      [   33.926238][  T607]  dump_stack+0x78/0xab
      [   33.926905][  T607]  check_irq_usage+0x70b/0x720
      [   33.927708][  T607]  ? iterate_chain_key+0x60/0x60
      [   33.928507][  T607]  ? check_path+0x22/0x40
      [   33.929201][  T607]  ? check_noncircular+0xcf/0x180
      [   33.930024][  T607]  ? __lock_acquire+0x1952/0x1f20
      [   33.930860][  T607]  __lock_acquire+0x1952/0x1f20
      [   33.931667][  T607]  lock_acquire+0xaf/0x3a0
      [   33.932366][  T607]  ? peernet2id_alloc+0x3a/0x170
      [   33.933147][  T607]  ? br_port_fill_attrs+0x54c/0x6b0 [bridge]
      [   33.934140][  T607]  ? br_port_fill_attrs+0x5de/0x6b0 [bridge]
      [   33.935113][  T607]  ? kvm_sched_clock_read+0x14/0x30
      [   33.935974][  T607]  _raw_spin_lock+0x30/0x70
      [   33.936728][  T607]  ? peernet2id_alloc+0x3a/0x170
      [   33.937523][  T607]  peernet2id_alloc+0x3a/0x170
      [   33.938313][  T607]  rtnl_fill_ifinfo+0xb5e/0x1400
      [   33.939091][  T607]  rtmsg_ifinfo_build_skb+0x8a/0xf0
      [   33.939953][  T607]  rtmsg_ifinfo_event.part.39+0x17/0x50
      [   33.940863][  T607]  rtmsg_ifinfo+0x1f/0x30
      [   33.941571][  T607]  __dev_notify_flags+0xa5/0xf0
      [   33.942376][  T607]  ? __irq_work_queue_local+0x49/0x50
      [   33.943249][  T607]  ? irq_work_queue+0x1d/0x30
      [   33.943993][  T607]  ? __dev_set_promiscuity+0x7b/0x1a0
      [   33.944878][  T607]  __dev_set_promiscuity+0x7b/0x1a0
      [   33.945758][  T607]  dev_set_promiscuity+0x1e/0x50
      [   33.946582][  T607]  br_port_set_promisc+0x1f/0x40 [bridge]
      [   33.947487][  T607]  br_manage_promisc+0x8b/0xe0 [bridge]
      [   33.948388][  T607]  __dev_set_promiscuity+0x123/0x1a0
      [   33.949244][  T607]  __dev_set_rx_mode+0x68/0x90
      [   33.950021][  T607]  dev_uc_add+0x50/0x60
      [   33.950720][  T607]  macvlan_open+0x18e/0x1f0 [macvlan]
      [   33.951601][  T607]  __dev_open+0xd6/0x170
      [   33.952269][  T607]  __dev_change_flags+0x181/0x1d0
      [   33.953056][  T607]  rtnl_configure_link+0x2f/0xa0
      [   33.953884][  T607]  __rtnl_newlink+0x6b9/0x8e0
      [   33.954665][  T607]  ? __lock_acquire+0x95d/0x1f20
      [   33.955450][  T607]  ? lock_acquire+0xaf/0x3a0
      [   33.956193][  T607]  ? is_bpf_text_address+0x5/0xe0
      [   33.956999][  T607]  rtnl_newlink+0x47/0x70
      Acked-by: default avatarGuillaume Nault <gnault@redhat.com>
      Fixes: 8d7e5dee ("netns: don't disable BHs when locking "nsid_lock"")
      Reported-by: syzbot+3f960c64a104eaa2c813@syzkaller.appspotmail.com
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e1f469cd
    • Jakub Kicinski's avatar
      ibmvnic: add missing parenthesis in do_reset() · 8ae4dff8
      Jakub Kicinski authored
      Indentation and logic clearly show that this code is missing
      parenthesis.
      
      Fixes: 9f134573 ("ibmvnic fix NULL tx_pools and rx_tools issue at do_reset")
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8ae4dff8
    • Randy Dunlap's avatar
      netdevice.h: fix xdp_state kernel-doc warning · ffa59b0b
      Randy Dunlap authored
      Fix kernel-doc warning in <linux/netdevice.h>:
      
      ../include/linux/netdevice.h:2158: warning: Function parameter or member 'xdp_state' not described in 'net_device'
      
      Fixes: 7f0a8382 ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device")
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ffa59b0b
    • Randy Dunlap's avatar
      netdevice.h: fix proto_down_reason kernel-doc warning · eb02d39a
      Randy Dunlap authored
      Fix kernel-doc warning in <linux/netdevice.h>:
      
      ../include/linux/netdevice.h:2158: warning: Function parameter or member 'proto_down_reason' not described in 'net_device'
      
      Fixes: 829eb208 ("rtnetlink: add support for protodown reason")
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Acked-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      eb02d39a
    • Jakub Kicinski's avatar
      Merge branch 'bnxt_en-Two-bug-fixes' · 72bbee2a
      Jakub Kicinski authored
      Michael Chan says:
      
      ====================
      bnxt_en: Two bug fixes.
      
      The first patch fixes AER recovery by reducing the time from several
      minutes to a more reasonable 20 - 30 seconds.  The second patch fixes
      a possible NULL pointer crash during firmware reset.
      ====================
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      72bbee2a
    • Vasundhara Volam's avatar
      bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task() · b16939b5
      Vasundhara Volam authored
      bnxt_fw_reset_task() which runs from a workqueue can race with
      bnxt_remove_one().  For example, if firmware reset and VF FLR are
      happening at about the same time.
      
      bnxt_remove_one() already cancels the workqueue and waits for it
      to finish, but we need to do this earlier before the devlink
      reporters are destroyed.  This will guarantee that
      the devlink reporters will always be valid when bnxt_fw_reset_task()
      is still running.
      
      Fixes: b148bb23 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().")
      Reviewed-by: default avatarEdwin Peer <edwin.peer@broadcom.com>
      Signed-off-by: default avatarVasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      b16939b5
    • Vasundhara Volam's avatar
      bnxt_en: Avoid sending firmware messages when AER error is detected. · b340dc68
      Vasundhara Volam authored
      When the driver goes through PCIe AER reset in error state, all
      firmware messages will timeout because the PCIe bus is no longer
      accessible.  This can lead to AER reset taking many minutes to
      complete as each firmware command takes time to timeout.
      
      Define a new macro BNXT_NO_FW_ACCESS() to skip these firmware messages
      when either firmware is in fatal error state or when
      pci_channel_offline() is true.  It now takes a more reasonable 20 to
      30 seconds to complete AER recovery.
      
      Fixes: b4fff207 ("bnxt_en: Do not send firmware messages if firmware is in error state.")
      Signed-off-by: default avatarVasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      b340dc68
  4. 06 Sep, 2020 2 commits
  5. 05 Sep, 2020 7 commits
  6. 04 Sep, 2020 15 commits
    • Linus Torvalds's avatar
      Merge tag 's390-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · c70672d8
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - Fix GENERIC_LOCKBREAK dependency on PREEMPTION in Kconfig broken
         because of a typo
      
       - Update defconfigs
      
      * tag 's390-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: update defconfigs
        s390: fix GENERIC_LOCKBREAK dependency typo in Kconfig
      c70672d8
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 09274aed
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Fix the loading of modules built with binutils-2.35. This version
         produces writable and executable .text.ftrace_trampoline section
         which is rejected by the kernel.
      
       - Remove the exporting of cpu_logical_map() as the Tegra driver has now
         been fixed and no longer uses this function.
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE
        arm64: Remove exporting cpu_logical_map symbol
      09274aed
    • Linus Torvalds's avatar
      Merge tag 'mips_fixes_5.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 16bf121b
      Linus Torvalds authored
      Pull MIPS fixes from Thomas Bogendoerfer:
       "A few MIPS fixes:
      
         - fallthrough fallout fix
      
         - BMIPS fixes
      
         - MSA fix to avoid leaking MSA register contents
      
         - Loongson perf and cpu feature fix
      
         - SNI interrupt fix"
      
      * tag 'mips_fixes_5.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: SNI: Fix SCSI interrupt
        MIPS: add missing MSACSR and upper MSA initialization
        MIPS: perf: Fix wrong check condition of Loongson event IDs
        mips/oprofile: Fix fallthrough placement
        MIPS: Loongson64: Remove unnecessary inclusion of boot_param.h
        MIPS: BMIPS: Also call bmips_cpu_setup() for secondary cores
        MIPS: mm: BMIPS5000 has inclusive physical caches
        MIPS: Loongson64: Do not override watch and ejtag feature
      16bf121b
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v5.9-2' of... · 41bef91c
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - fix documents
      
       - fix warning in 'make localmodconfig'
      
      * tag 'kbuild-fixes-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kconfig: remove redundant assignment prompt = prompt
        kbuild: Documentation: clean up makefiles.rst
        kconfig: streamline_config.pl: check defined(ENV variable) before using it
        Documentation/llvm: Improve formatting of commands, variables, and arguments
      41bef91c
    • Linus Torvalds's avatar
      Merge tag 'pm-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · f162626a
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix reference counting in the operating performance points (OPP)
        framework and address a few intel_pstate driver issues, mostly related
        to switching driver operation modes and similar with hardware-managed
        P-states (HWP) enabled.
      
        Specifics:
      
         - Fix reference counting of operating performance points (OPP) tables
           (Viresh Kumar).
      
         - Address intel_pstate driver interface issues, mostly related to
           switching operation modes and handling CPU offline and online and
           system-wide suspend/resume with hardware-managed P-states (HWP)
           enabled (Rafael Wysocki).
      
         - Fix the maximum frequency computation in the intel_pstate driver
           with turbo P-states disabled by the platform firmware and HWP
           enabled (Francisco Jerez)"
      
      * tag 'pm-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled
        cpufreq: intel_pstate: Free memory only when turning off
        cpufreq: intel_pstate: Add ->offline and ->online callbacks
        cpufreq: intel_pstate: Tweak the EPP sysfs interface
        cpufreq: intel_pstate: Update cached EPP in the active mode
        cpufreq: intel_pstate: Refuse to turn off with HWP enabled
        opp: Don't drop reference for an OPP table that was never parsed
      f162626a
    • Linus Torvalds's avatar
      Merge tag 'libata-5.9-2020-09-04' of git://git.kernel.dk/linux-block · d824e080
      Linus Torvalds authored
      Pull libata fixes from Jens Axboe:
      
       - improve Sandisks ATA_HORKAGE on NCQ (Tejun)
      
       - link printk cleanup (Xu)
      
      * tag 'libata-5.9-2020-09-04' of git://git.kernel.dk/linux-block:
        libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks
        ata: ahci: use ata_link_info() instead of ata_link_printk()
      d824e080
    • Linus Torvalds's avatar
      Merge tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-block · 8075fc3b
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A bit larger than usual this week, mostly due to the NVMe fixes
        arriving late for -rc3 and hence didn't make last weeks pull request.
      
         - NVMe:
              - instance leak and io boundary fixes from Keith
              - fc locking fix from Christophe
              - various tcp/rdma reset during traffic fixes from Sagi
              - pci use-after-free fix from Tong
              - tcp target null deref fix from Ziye
      
         - Locking fix for partition removal (Christoph)
      
         - Ensure bdi->io_pages is always set (me)
      
         - Fixup for hd struct reference (Ming)
      
         - Fix for zero length bvecs (Ming)
      
         - Two small blk-iocost fixes (Tejun)"
      
      * tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-block:
        block: allow for_each_bvec to support zero len bvec
        blk-stat: make q->stats->lock irqsafe
        blk-iocost: ioc_pd_free() shouldn't assume irq disabled
        block: fix locking in bdev_del_partition
        block: release disk reference in hd_struct_free_work
        block: ensure bdi->io_pages is always initialized
        nvme-pci: cancel nvme device request before disabling
        nvme: only use power of two io boundaries
        nvme: fix controller instance leak
        nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'
        nvme: Fix NULL dereference for pci nvme controllers
        nvme-rdma: fix reset hang if controller died in the middle of a reset
        nvme-rdma: fix timeout handler
        nvme-rdma: serialize controller teardown sequences
        nvme-tcp: fix reset hang if controller died in the middle of a reset
        nvme-tcp: fix timeout handler
        nvme-tcp: serialize controller teardown sequences
        nvme: have nvme_wait_freeze_timeout return if it timed out
        nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance
        nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu
      8075fc3b
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.9-2020-09-04' of git://git.kernel.dk/linux-block · d849ca48
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - EAGAIN with O_NONBLOCK retry fix
      
       - Two small fixes for registered files (Jiufei)
      
      * tag 'io_uring-5.9-2020-09-04' of git://git.kernel.dk/linux-block:
        io_uring: no read/write-retry on -EAGAIN error and O_NONBLOCK marked file
        io_uring: set table->files[i] to NULL when io_sqe_file_register failed
        io_uring: fix removing the wrong file in __io_sqe_files_update()
      d849ca48
    • Linus Torvalds's avatar
      Merge tag 'thermal-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux · 2fb54791
      Linus Torvalds authored
      Pull thermal fixes from Daniel Lezcano:
      
       - Fix bogus thermal shutdowns for omap4430 where bogus values resulting
         from an incorrect ADC conversion are too high and fire an emergency
         shutdown (Tony Lindgren)
      
       - Don't suppress negative temp for qcom spmi as they are valid and
         userspace needs them (Veera Vegivada)
      
       - Fix use-after-free in thermal_zone_device_unregister reported by
         Kasan (Dmitry Osipenko)
      
      * tag 'thermal-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
        thermal: core: Fix use-after-free in thermal_zone_device_unregister()
        thermal: qcom-spmi-temp-alarm: Don't suppress negative temp
        thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430
      2fb54791
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-fix-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · e2dacf6c
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "A couple of core fixes and odd driver fixes for dmaengine subsystem:
      
        Core:
         - drop ACPI CSRT table reference after using it
         - fix of_dma_router_xlate() error handling
      
        Drivers fixes in idxd, at_hdmac, pl330, dw-edma and jz478"
      
      * tag 'dmaengine-fix-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: ti: k3-udma: Update rchan_oes_offset for am654 SYSFW ABI 3.0
        drivers/dma/dma-jz4780: Fix race condition between probe and irq handler
        dmaengine: dw-edma: Fix scatter-gather address calculation
        dmaengine: ti: k3-udma: Fix the TR initialization for prep_slave_sg
        dmaengine: pl330: Fix burst length if burst size is smaller than bus width
        dmaengine: at_hdmac: add missing kfree() call in at_dma_xlate()
        dmaengine: at_hdmac: add missing put_device() call in at_dma_xlate()
        dmaengine: at_hdmac: check return value of of_find_device_by_node() in at_dma_xlate()
        dmaengine: of-dma: Fix of_dma_router_xlate's of_dma_xlate handling
        dmaengine: idxd: reset states after device disable or reset
        dmaengine: acpi: Put the CSRT table after using it
      e2dacf6c
    • Linus Torvalds's avatar
      Merge tag 'sound-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 86edf52e
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small changes, nothing intrusive:
      
         - remaining tasklet API conversions, now all sound stuff have been
           converted
      
         - a few HD-audio and USB-audio quirks and minor fixes
      
         - FireWire Tascam and Digi00xx fixes
      
         - drop a kernel WARNING from PCM OSS for syzkaller"
      
      * tag 'sound-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
        ALSA: hda/realtek - Improved routing for Thinkpad X1 7th/8th Gen
        ALSA: hda: use consistent HDAudio spelling in comments/docs
        ALSA: hda: add dev_dbg log when driver is not selected
        ALSA: hda: fix a runtime pm issue in SOF when integrated GPU is disabled
        ALSA: hda: hdmi - add Rocketlake support
        ALSA: ua101: convert tasklets to use new tasklet_setup() API
        ALSA: usb-audio: convert tasklets to use new tasklet_setup() API
        ASoC: txx9: convert tasklets to use new tasklet_setup() API
        ASoC: siu: convert tasklets to use new tasklet_setup() API
        ASoC: fsl_esai: convert tasklets to use new tasklet_setup() API
        ALSA: hdsp: convert tasklets to use new tasklet_setup() API
        ALSA: riptide: convert tasklets to use new tasklet_setup() API
        ALSA: pci/asihpi: convert tasklets to use new tasklet_setup() API
        ALSA: firewire: convert tasklets to use new tasklet_setup() API
        ALSA: core: convert tasklets to use new tasklet_setup() API
        ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check
        ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO
        ALSA: hda/hdmi: always check pin power status in i915 pin fixup
        ALSA: hda/realtek: Add quirk for Samsung Galaxy Book Ion NT950XCJ-X716A
        ALSA: usb-audio: Add basic capture support for Pioneer DJ DJM-250MK2
        ...
      86edf52e
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2020-09-04' of git://anongit.freedesktop.org/drm/drm · cf85f5de
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Not much going on this week, nouveau has a display hw bug workaround,
        amdgpu has some PM fixes and CIK regression fixes, one single radeon
        PLL fix, and a couple of i915 display fixes.
      
        amdgpu:
         - Fix for 32bit systems
         - SW CTF fix
         - Update for Sienna Cichlid
         - CIK bug fixes
      
        radeon:
         - PLL fix
      
        i915:
         - Clang build warning fix
         - HDCP fixes
      
        nouveau:
         - display fixes"
      
      * tag 'drm-fixes-2020-09-04' of git://anongit.freedesktop.org/drm/drm:
        drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug
        drm/nouveau/kms/nv50-gp1xx: disable notifies again after core update
        drm/nouveau/kms/nv50-: add some whitespace before debug message
        drm/nouveau/kms/gv100-: Include correct push header in crcc37d.c
        drm/radeon: Prefer lower feedback dividers
        drm/amdgpu: Fix bug in reporting voltage for CIK
        drm/amdgpu: Specify get_argument function for ci_smu_funcs
        drm/amd/pm: enable MP0 DPM for sienna_cichlid
        drm/amd/pm: avoid false alarm due to confusing softwareshutdowntemp setting
        drm/amd/pm: fix is_dpm_running() run error on 32bit system
        drm/i915: Clear the repeater bit on HDCP disable
        drm/i915: Fix sha_text population code
        drm/i915/display: Ensure that ret is always initialized in icl_combo_phy_verify_state
      cf85f5de
    • Or Cohen's avatar
      net/packet: fix overflow in tpacket_rcv · acf69c94
      Or Cohen authored
      Using tp_reserve to calculate netoff can overflow as
      tp_reserve is unsigned int and netoff is unsigned short.
      
      This may lead to macoff receving a smaller value then
      sizeof(struct virtio_net_hdr), and if po->has_vnet_hdr
      is set, an out-of-bounds write will occur when
      calling virtio_net_hdr_from_skb.
      
      The bug is fixed by converting netoff to unsigned int
      and checking if it exceeds USHRT_MAX.
      
      This addresses CVE-2020-14386
      
      Fixes: 8913336a ("packet: add PACKET_RESERVE sockopt")
      Signed-off-by: default avatarOr Cohen <orcohen@paloaltonetworks.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      acf69c94
    • Linus Torvalds's avatar
      Merge branch 'simplify-do_wp_page' · b25d1dc9
      Linus Torvalds authored
      Merge emailed patches from Peter Xu:
       "This is a small series that I picked up from Linus's suggestion to
        simplify cow handling (and also make it more strict) by checking
        against page refcounts rather than mapcounts.
      
        This makes uffd-wp work again (verified by running upmapsort)"
      
      Note: this is horrendously bad timing, and making this kind of
      fundamental vm change after -rc3 is not at all how things should work.
      The saving grace is that it really is a a nice simplification:
      
       8 files changed, 29 insertions(+), 120 deletions(-)
      
      The reason for the bad timing is that it turns out that commit
      17839856 ("gup: document and work around 'COW can break either way'
      issue" broke not just UFFD functionality (as Peter noticed), but Mikulas
      Patocka also reports that it caused issues for strace when running in a
      DAX environment with ext4 on a persistent memory setup.
      
      And we can't just revert that commit without re-introducing the original
      issue that is a potential security hole, so making COW stricter (and in
      the process much simpler) is a step to then undoing the forced COW that
      broke other uses.
      
      Link: https://lore.kernel.org/lkml/alpine.LRH.2.02.2009031328040.6929@file01.intranet.prod.int.rdu2.redhat.com/
      
      * emailed patches from Peter Xu <peterx@redhat.com>:
        mm: Add PGREUSE counter
        mm/gup: Remove enfornced COW mechanism
        mm/ksm: Remove reuse_ksm_page()
        mm: do_wp_page() simplification
      b25d1dc9
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-cpufreq' · f7ce2c3a
      Rafael J. Wysocki authored
      * pm-cpufreq:
        cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled
        cpufreq: intel_pstate: Free memory only when turning off
        cpufreq: intel_pstate: Add ->offline and ->online callbacks
        cpufreq: intel_pstate: Tweak the EPP sysfs interface
        cpufreq: intel_pstate: Update cached EPP in the active mode
        cpufreq: intel_pstate: Refuse to turn off with HWP enabled
      f7ce2c3a