1. 14 Nov, 2019 1 commit
    • slcan: Fix memory leak in error path · ed50e160
      Jouni Hogander authored
      This patch fixes a memory leak reported by Syzkaller:
      
      BUG: memory leak
      unreferenced object 0xffff888067f65500 (size 4096):
        comm "syz-executor043", pid 454, jiffies 4294759719 (age 11.930s)
        hex dump (first 32 bytes):
          73 6c 63 61 6e 30 00 00 00 00 00 00 00 00 00 00 slcan0..........
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
        backtrace:
          [<00000000a06eec0d>] __kmalloc+0x18b/0x2c0
          [<0000000083306e66>] kvmalloc_node+0x3a/0xc0
          [<000000006ac27f87>] alloc_netdev_mqs+0x17a/0x1080
          [<0000000061a996c9>] slcan_open+0x3ae/0x9a0
          [<000000001226f0f9>] tty_ldisc_open.isra.1+0x76/0xc0
          [<0000000019289631>] tty_set_ldisc+0x28c/0x5f0
          [<000000004de5a617>] tty_ioctl+0x48d/0x1590
          [<00000000daef496f>] do_vfs_ioctl+0x1c7/0x1510
          [<0000000059068dbc>] ksys_ioctl+0x99/0xb0
          [<000000009a6eb334>] __x64_sys_ioctl+0x78/0xb0
          [<0000000053d0332e>] do_syscall_64+0x16f/0x580
          [<0000000021b83b99>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [<000000008ea75434>] 0xffffffffffffffff
      
      Cc: Wolfgang Grandegger <wg@grandegger.com>
      Cc: Marc Kleine-Budde <mkl@pengutronix.de>
      Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
  2. 13 Nov, 2019 15 commits
  3. 12 Nov, 2019 6 commits
    • net/smc: fix refcount non-blocking connect() -part 2 · 6d6dd528
      Ursula Braun authored
      If an SMC socket is immediately terminated after a non-blocking connect()
      has been called, a memory leak is possible.
      Due to the sock_hold move in
      commit 301428ea ("net/smc: fix refcounting for non-blocking connect()")
      an extra sock_put() is needed in smc_connect_work(), if the internal
      TCP socket is aborted and cancels the sk_stream_wait_connect() of the
      connect worker.
      
      Reported-by: syzbot+4b73ad6fc767e576e275@syzkaller.appspotmail.com
      Fixes: 301428ea ("net/smc: fix refcounting for non-blocking connect()")
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xfrm: release device reference for invalid state · 4944a4b1
      Xiaodong Xu authored
      An ESP packet could be decrypted in async mode if the input handler for
      this packet returns -EINPROGRESS in xfrm_input(). At this moment the device
      reference in skb is held. Later xfrm_input() will be invoked again to
      resume the processing.
      If the transform state is still valid it would continue to release the
      device reference and there won't be a problem; however if the transform
      state is not valid when async resumption happens, the packet will be
      dropped while the device reference is still being held.
      When the device is deleted for some reason and the reference to this
      device is not properly released, the kernel will keep logging like:
      
      unregister_netdevice: waiting for ppp2 to become free. Usage count = 1
      
      The issue is observed when running IPsec traffic over a PPPoE device based
      on a bridge interface. After the PPPoE connection is terminated on the
      server end multiple times, the PPPoE device on the client side eventually
      gets stuck on the above warning message.
      
      This patch will check the async mode first and continue to release device
      reference in async resumption, before it is dropped due to invalid state.
      
      v2: Do not assign address family from outer_mode in the transform if the
      state is invalid
      
      v3: Release device reference in the error path instead of jumping to resume
      
      Fixes: 4ce3dbe3 ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)")
      Signed-off-by: Xiaodong Xu <stid.smth@gmail.com>
      Reported-by: Bo Chen <chenborfc@163.com>
      Tested-by: Bo Chen <chenborfc@163.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
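The reference handling described in the xfrm commit above can be illustrated with a small user-space sketch. Everything here is hypothetical (`struct dev_sketch`, `input_start_async()`, `input_resume()`); it is not the xfrm_input() code, only a model of the rule that the reference taken before async processing must be released on the invalid-state path too.

```c
#include <assert.h>

/* Hypothetical stand-in for a device with a usage count. */
struct dev_sketch { int refcnt; };

static void dev_hold_sketch(struct dev_sketch *d) { d->refcnt++; }

static void dev_put_sketch(struct dev_sketch *d)
{
	assert(d->refcnt > 0);	/* a put without a matching hold is a bug */
	d->refcnt--;
}

/* First pass: processing goes async, so a device reference is taken. */
void input_start_async(struct dev_sketch *d)
{
	dev_hold_sketch(d);
}

/* Async resumption. Returns 0 on success, -1 if the packet is dropped. */
int input_resume(struct dev_sketch *d, int state_valid)
{
	if (!state_valid) {
		dev_put_sketch(d);	/* release before dropping the packet */
		return -1;
	}
	dev_put_sketch(d);		/* the normal path releases it as well */
	return 0;
}
```

Without the put on the invalid-state branch, the count in this model would stay one too high, which corresponds to the "Usage count = 1" message quoted in the commit.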
    • mdio_bus: Fix PTR_ERR applied after initialization to constant · 1d463956
      YueHaibing authored
      Fix coccinelle warning:
      
      ./drivers/net/phy/mdio_bus.c:67:5-12: ERROR: PTR_ERR applied after initialization to constant on line 62
      ./drivers/net/phy/mdio_bus.c:68:5-12: ERROR: PTR_ERR applied after initialization to constant on line 62
      
      Fix this by using IS_ERR before PTR_ERR.

      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: 71dd6c0d ("net: phy: add support for reset-controller")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
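The "IS_ERR before PTR_ERR" pattern named in the mdio_bus commit above can be sketched in user space as follows. The three helpers are simplified stand-ins for the kernel's err.h helpers, and `get_reset()` and `register_reset()` are hypothetical names made up for this illustration; the sketch is not the code in mdio_bus.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_reset;

/* Hypothetical lookup: a valid pointer (0), -ENOENT (1) or -EIO (other). */
static void *get_reset(int mode)
{
	if (mode == 0)
		return &dummy_reset;
	return ERR_PTR(mode == 1 ? -ENOENT : -EIO);
}

/* Check IS_ERR() first; only then is the PTR_ERR() value meaningful. */
int register_reset(int mode, void **out)
{
	void *reset = get_reset(mode);

	assert(out != NULL);
	if (IS_ERR(reset)) {
		if (PTR_ERR(reset) == -ENOENT)
			reset = NULL;	/* optional resource: treat as absent */
		else
			return (int)PTR_ERR(reset);
	}
	*out = reset;
	return 0;
}
```

In this sketch a missing optional resource is mapped to NULL, while any other error code is propagated to the caller.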
    • NFC: nxp-nci: Fix NULL pointer dereference after I2C communication error · a71a29f5
      Stephan Gerhold authored
      I2C communication errors (-EREMOTEIO) during the IRQ handler of nxp-nci
      result in a NULL pointer dereference at the moment:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000000
          Oops: 0002 [#1] PREEMPT SMP NOPTI
          CPU: 1 PID: 355 Comm: irq/137-nxp-nci Not tainted 5.4.0-rc6 #1
          RIP: 0010:skb_queue_tail+0x25/0x50
          Call Trace:
           nci_recv_frame+0x36/0x90 [nci]
           nxp_nci_i2c_irq_thread_fn+0xd1/0x285 [nxp_nci_i2c]
           ? preempt_count_add+0x68/0xa0
           ? irq_forced_thread_fn+0x80/0x80
           irq_thread_fn+0x20/0x60
           irq_thread+0xee/0x180
           ? wake_threads_waitq+0x30/0x30
           kthread+0xfb/0x130
           ? irq_thread_check_affinity+0xd0/0xd0
           ? kthread_park+0x90/0x90
           ret_from_fork+0x1f/0x40
      
      Afterward the kernel must be rebooted to work properly again.
      
      This happens because it attempts to call nci_recv_frame() with skb == NULL.
      However, unlike nxp_nci_fw_recv_frame(), nci_recv_frame() does not have any
      NULL checks for skb, causing the NULL pointer dereference.
      
      Change the code to call only nxp_nci_fw_recv_frame() in case of an error.
      Make sure to log it so it is obvious that a communication error occurred.
      The error above then becomes:
      
          nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121
          nci: __nci_request: wait_for_completion_interruptible_timeout failed 0
          nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121
      
      Fixes: 6be88670 ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver")
      Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
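The control flow described in the nxp-nci commit above can be modelled with a short user-space sketch. All names (`struct frame`, `read_frame()`, `fw_recv_frame()`, `nci_recv()`, `irq_thread_fn()`) are hypothetical stand-ins, and the literal -121 is simply the error code shown in the log; the real driver code may differ.

```c
#include <assert.h>
#include <stddef.h>

struct frame { int len; };	/* hypothetical stand-in for an skb */

static int fw_frames, nci_frames, read_errors;

/* Tolerates NULL, as the commit says nxp_nci_fw_recv_frame() does. */
static void fw_recv_frame(struct frame *f)
{
	(void)f;
	fw_frames++;
}

/* Must never be handed NULL, like nci_recv_frame(). */
static void nci_recv(struct frame *f)
{
	assert(f != NULL);
	if (f->len > 0)
		nci_frames++;
}

/* Hypothetical bus read: fails with the error code seen in the log. */
static int read_frame(int fail, struct frame *buf, struct frame **out)
{
	if (fail) {
		*out = NULL;
		return -121;
	}
	buf->len = 3;
	*out = buf;
	return 0;
}

void irq_thread_fn(int fail)
{
	struct frame buf, *f = NULL;
	int r = read_frame(fail, &buf, &f);

	if (r < 0) {
		read_errors++;		/* the driver would log the error here */
		fw_recv_frame(NULL);	/* error path: NULL-safe handler only */
		return;
	}
	nci_recv(f);			/* success path: f is known to be valid */
}
```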
    • mlxsw: core: Enable devlink reload only on probe · 73a533ec
      Jiri Pirko authored
      Call devlink enable only during probe time and avoid deadlock
      during reload.

      Reported-by: Shalom Toledo <shalomt@mellanox.com>
      Fixes: 5a508a25 ("devlink: disallow reload operation during device cleanup")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Tested-by: Shalom Toledo <shalomt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Add method for time-stamp on reporter's dump · d279505b
      Aya Levin authored
      When setting the dump's time-stamp, use ktime_get_real in addition to
      jiffies. This simplifies the user space implementation and bypasses
      some inconsistent behavior with translating jiffies to current time.
      The time is reported in nanoseconds, which avoids the y2038 issue.
      
      Fixes: c8e1da0b ("devlink: Add health report functionality")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 11 Nov, 2019 1 commit
  5. 10 Nov, 2019 2 commits
  6. 09 Nov, 2019 9 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 0058b0a5
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) BPF sample build fixes from Björn Töpel
      
       2) Fix powerpc bpf tail call implementation, from Eric Dumazet.
      
       3) DCCP leaks jiffies on the wire, fix also from Eric Dumazet.
      
       4) Fix crash in ebtables when using dnat target, from Florian Westphal.
      
       5) Fix port disable handling when removing bcm_sf2 driver, from Florian
          Fainelli.
      
       6) Fix kTLS sk_msg trim on fallback to copy mode, from Jakub Kicinski.
      
       7) Various KCSAN fixes all over the networking, from Eric Dumazet.
      
       8) Memory leaks in mlx5 driver, from Alex Vesker.
      
       9) SMC interface refcounting fix, from Ursula Braun.
      
      10) TSO descriptor handling fixes in stmmac driver, from Jose Abreu.
      
      11) Add a TX lock to synchronize the kTLS TX path properly with crypto
          operations. From Jakub Kicinski.
      
      12) Sock refcount during shutdown fix in vsock/virtio code, from Stefano
          Garzarella.
      
      13) Infinite loop in Intel ice driver, from Colin Ian King.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits)
        ixgbe: need_wakeup flag might not be set for Tx
        i40e: need_wakeup flag might not be set for Tx
        igb/igc: use ktime accessors for skb->tstamp
        i40e: Fix for ethtool -m issue on X722 NIC
        iavf: initialize ITRN registers with correct values
        ice: fix potential infinite loop because loop counter being too small
        qede: fix NULL pointer deref in __qede_remove()
        net: fix data-race in neigh_event_send()
        vsock/virtio: fix sock refcnt holding during the shutdown
        net: ethernet: octeon_mgmt: Account for second possible VLAN header
        mac80211: fix station inactive_time shortly after boot
        net/fq_impl: Switch to kvmalloc() for memory allocation
        mac80211: fix ieee80211_txq_setup_flows() failure path
        ipv4: Fix table id reference in fib_sync_down_addr
        ipv6: fixes rt6_probe() and fib6_nh->last_probe init
        net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
        net: usb: qmi_wwan: add support for DW5821e with eSIM support
        CDC-NCM: handle incomplete transfer of MTU
        nfc: netlink: fix double device reference drop
        NFC: st21nfca: fix double free
        ...
    • Merge tag 'for-linus-2019-11-08' of git://git.kernel.dk/linux-block · 5cb8418c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Two NVMe device removal crash fixes, and a compat fixup for an
         ioctl that was introduced in this release (Anton, Charles, Max - via
         Keith)
      
       - Missing error path mutex unlock for drbd (Dan)
      
       - cgroup writeback fixup on dead memcg (Tejun)
      
       - blkcg online stats print fix (Tejun)
      
      * tag 'for-linus-2019-11-08' of git://git.kernel.dk/linux-block:
        cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
        block: drbd: remove a stray unlock in __drbd_send_protocol()
        blkcg: make blkcg_print_stat() print stats only for online blkgs
        nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
        nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
        nvme-rdma: fix a segmentation fault during module unload
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · a2582cdc
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Fixes 2019-11-08
      
      This series contains fixes to igb, igc, ixgbe, i40e, iavf and ice
      drivers.
      
      Colin Ian King fixes a potentially wrap-around counter in a for-loop.
      
      Nick fixes the default ITR values for the iavf driver to 50 usecs
      interval.
      
      Arkadiusz fixes 'ethtool -m' for X722 devices where the correct value
      cannot be obtained from the firmware, so add X722 to the check to ensure
      the wrong value is not returned.
      
      Jake fixes igb and igc drivers in their implementation of launch time
      support by declaring skb->tstamp value as ktime_t instead of s64.
      
      Magnus fixes ixgbe and i40e where the need_wakeup flag for transmit may
      not be set for AF_XDP sockets that are only used to send packets.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgbe: need_wakeup flag might not be set for Tx · 0843aa8f
      Magnus Karlsson authored
      The need_wakeup flag for Tx might not be set for AF_XDP sockets that
      are only used to send packets. This happens if there is at least one
      outstanding packet that has not been completed by the hardware and we
      get that corresponding completion (which will not generate an
      interrupt since interrupts are disabled in the napi poll loop) between
      the time we stopped processing the Tx completions and interrupts are
      enabled again. In this case, the need_wakeup flag will have been
      cleared at the end of the Tx completion processing as we believe we
      will get an interrupt from the outstanding completion at a later point
      in time. But if this completion interrupt occurs before interrupts
      are enabled, we lose it and should at that point really have set the
      need_wakeup flag since there are no more outstanding completions that
      can generate an interrupt to continue the processing. When this
      happens, user space will see a Tx queue need_wakeup of 0 and skip
      issuing a syscall, which means we will never get into the Tx processing
      again and we have a deadlock.
      
      This patch introduces a quick fix for this issue by just setting the
      need_wakeup flag for Tx to 1 all the time. I am working on a proper
      fix for this that will toggle the flag appropriately, but it is more
      challenging than I anticipated and I am afraid that this patch will
      not be completed before the merge window closes, therefore this easier
      fix for now. This fix has a negative performance impact in the range
      of 0% to 4%. Towards the higher end of the scale if you have driver
      and application on the same core and issue a lot of packets, and
      towards no negative impact if you use two cores, lower transmission
      speeds and/or a workload that also receives packets.

      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: need_wakeup flag might not be set for Tx · 70563957
      Magnus Karlsson authored
      The need_wakeup flag for Tx might not be set for AF_XDP sockets that
      are only used to send packets. This happens if there is at least one
      outstanding packet that has not been completed by the hardware and we
      get that corresponding completion (which will not generate an
      interrupt since interrupts are disabled in the napi poll loop) between
      the time we stopped processing the Tx completions and interrupts are
      enabled again. In this case, the need_wakeup flag will have been
      cleared at the end of the Tx completion processing as we believe we
      will get an interrupt from the outstanding completion at a later point
      in time. But if this completion interrupt occurs before interrupts
      are enabled, we lose it and should at that point really have set the
      need_wakeup flag since there are no more outstanding completions that
      can generate an interrupt to continue the processing. When this
      happens, user space will see a Tx queue need_wakeup of 0 and skip
      issuing a syscall, which means we will never get into the Tx processing
      again and we have a deadlock.
      
      This patch introduces a quick fix for this issue by just setting the
      need_wakeup flag for Tx to 1 all the time. I am working on a proper
      fix for this that will toggle the flag appropriately, but it is more
      challenging than I anticipated and I am afraid that this patch will
      not be completed before the merge window closes, therefore this easier
      fix for now. This fix has a negative performance impact in the range
      of 0% to 4%. Towards the higher end of the scale if you have driver
      and application on the same core and issue a lot of packets, and
      towards no negative impact if you use two cores, lower transmission
      speeds and/or a workload that also receives packets.

      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • igb/igc: use ktime accessors for skb->tstamp · 6acab13b
      Jacob Keller authored
      When implementing launch time support in the igb and igc drivers, the
      skb->tstamp value is assumed to be a s64, but it's declared as a ktime_t
      value.
      
      Although ktime_t is typedef'd to s64 it wasn't always, and the kernel
      provides accessors for ktime_t values.
      
      Use the ktime_to_timespec64 and ktime_set accessors instead of directly
      assuming that the variable is always an s64.
      
      This improves portability if the code is ever moved to another kernel
      version, or if the definition of ktime_t ever changes again in the
      future.

      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: Fix for ethtool -m issue on X722 NIC · 4c9da6f2
      Arkadiusz Kubalewski authored
      This patch fixes a problem where running
      'ethtool -m <dev>'
      breaks the output of
      'ethtool <dev>'
      on an X722 NIC.

      Updating the link phy_types is now disallowed on X722 NICs because the
      correct value currently cannot be obtained from the FW. Previously, the
      wrong value returned by the FW was used, which was the root cause of the
      incorrect output of the 'ethtool <dev>' command.

      Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • iavf: initialize ITRN registers with correct values · 4eda4e00
      Nicholas Nunley authored
      Since commit 92418fb1 ("i40e/i40evf: Use usec value instead of reg
      value for ITR defines") the driver tracks the interrupt throttling
      intervals in single usec units, although the actual ITRN registers are
      programmed in 2 usec units. Most register programming flows in the driver
      correctly handle the conversion, although it is currently not applied when
      the registers are initialized to their default values. Most of the time
      this doesn't present a problem since the default values are usually
      immediately overwritten through the standard adaptive throttling mechanism,
      or updated manually by the user, but if adaptive throttling is disabled and
      the interval values are left alone then the incorrect value will persist.
      
      Since the intended default interval of 50 usecs (vs. 100 usecs as
      programmed) performs better for most traffic workloads, this can lead to
      performance regressions.
      
      This patch adds the correct conversion when writing the initial values to
      the ITRN registers.

      Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
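The unit mismatch described in the iavf commit above (intervals tracked in 1 usec units, registers programmed in 2 usec units) can be shown with a small arithmetic sketch. The macro and function names are hypothetical and are not the driver's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption taken from the commit text: one register tick is 2 usecs. */
#define ITR_REG_USECS_PER_TICK 2u	/* hypothetical name */

/* Convert a driver-side interval in usecs to a register value. */
uint32_t itr_usecs_to_reg(uint32_t usecs)
{
	return usecs / ITR_REG_USECS_PER_TICK;
}

/* Interval, in usecs, that the hardware applies for a register value. */
uint32_t itr_reg_to_usecs(uint32_t reg)
{
	return reg * ITR_REG_USECS_PER_TICK;
}

/* Writing the stored value unconverted doubles the intended interval. */
void itr_example(void)
{
	uint32_t intended = 50;	/* usecs */

	assert(itr_reg_to_usecs(intended) == 100);	/* raw write */
	assert(itr_reg_to_usecs(itr_usecs_to_reg(intended)) == 50);
}
```

This reproduces the figures in the commit message: a default of 50 usecs written without conversion behaves as 100 usecs.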
    • ice: fix potential infinite loop because loop counter being too small · 615457a2
      Colin Ian King authored
      Currently the for-loop counter i is a u8 however it is being checked
      against a maximum value hw->num_tx_sched_layers which is a u16. Hence
      there is a potential wrap-around of counter i back to zero if
      hw->num_tx_sched_layers is greater than 255.  Fix this by making i
      a u16.
      
      Addresses-Coverity: ("Infinite loop")
      Fixes: b36c598c ("ice: Updates to Tx scheduler code")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
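The wrap-around described in the ice commit above is easy to reproduce in user space. The sketch below uses hypothetical function names and an artificial iteration cap, added only so that the demonstration itself terminates; it is not the driver loop.

```c
#include <assert.h>
#include <stdint.h>

/* Count iterations with an 8-bit counter, giving up after `cap` rounds. */
unsigned long iterations_u8(uint16_t layers, unsigned long cap)
{
	unsigned long n = 0;

	/* i wraps from 255 back to 0, so i < layers never fails for layers > 255 */
	for (uint8_t i = 0; i < layers && n < cap; i++)
		n++;
	return n;
}

/* Same loop with a 16-bit counter, matching the width of the bound. */
unsigned long iterations_u16(uint16_t layers, unsigned long cap)
{
	unsigned long n = 0;

	for (uint16_t i = 0; i < layers && n < cap; i++)
		n++;
	return n;
}

void loop_example(void)
{
	assert(iterations_u16(300, 100000) == 300);
	assert(iterations_u8(300, 100000) == 100000);	/* only the cap stops it */
}
```

For bounds of 255 or less both variants behave the same, which is why such a bug can stay hidden until the bound grows.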
  7. 08 Nov, 2019 6 commits
    • qede: fix NULL pointer deref in __qede_remove() · deabc871
      Manish Chopra authored
      Rebooting the system with SR-IOV VFs enabled leads to the crash
      below, because __qede_remove() runs twice on the VF devices
      (first from the .shutdown() flow of the VF itself and again
      from the PF's .shutdown() flow executing pci_disable_sriov()).
      
      This patch adds a safeguard in the __qede_remove() flow to fix this,
      so that the driver doesn't attempt to remove "already removed" devices.
      
      [  194.360134] BUG: unable to handle kernel NULL pointer dereference at 00000000000008dc
      [  194.360227] IP: [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.360304] PGD 0
      [  194.360325] Oops: 0000 [#1] SMP
      [  194.360360] Modules linked in: tcp_lp fuse tun bridge stp llc devlink bonding ip_set nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rpcrdma sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi dell_smbios iTCO_wdt iTCO_vendor_support dell_wmi_descriptor dcdbas vfat fat pcc_cpufreq skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd qedr ib_core pcspkr ses enclosure joydev ipmi_ssif sg i2c_i801 lpc_ich mei_me mei wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_pad acpi_power_meter xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200
      [  194.361044]  qede i2c_algo_bit drm_kms_helper qed syscopyarea sysfillrect nvme sysimgblt fb_sys_fops ttm nvme_core mpt3sas crc8 ptp drm pps_core ahci raid_class scsi_transport_sas libahci libata drm_panel_orientation_quirks nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip_tables]
      [  194.361297] CPU: 51 PID: 7996 Comm: reboot Kdump: loaded Not tainted 3.10.0-1062.el7.x86_64 #1
      [  194.361359] Hardware name: Dell Inc. PowerEdge MX840c/0740HW, BIOS 2.4.6 10/15/2019
      [  194.361412] task: ffff9cea9b360000 ti: ffff9ceabebdc000 task.ti: ffff9ceabebdc000
      [  194.361463] RIP: 0010:[<ffffffffc03553c4>]  [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.361534] RSP: 0018:ffff9ceabebdfac0  EFLAGS: 00010282
      [  194.361570] RAX: 0000000000000000 RBX: ffff9cd013846098 RCX: 0000000000000000
      [  194.361621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9cd013846098
      [  194.361668] RBP: ffff9ceabebdfae8 R08: 0000000000000000 R09: 0000000000000000
      [  194.361715] R10: 00000000bfe14201 R11: ffff9ceabfe141e0 R12: 0000000000000000
      [  194.361762] R13: ffff9cd013846098 R14: 0000000000000000 R15: ffff9ceab5e48000
      [  194.361810] FS:  00007f799c02d880(0000) GS:ffff9ceacb0c0000(0000) knlGS:0000000000000000
      [  194.361865] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  194.361903] CR2: 00000000000008dc CR3: 0000001bdac76000 CR4: 00000000007607e0
      [  194.361953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  194.362002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  194.362051] PKRU: 55555554
      [  194.362073] Call Trace:
      [  194.362109]  [<ffffffffc0355500>] qede_remove+0x10/0x20 [qede]
      [  194.362180]  [<ffffffffb97d0f3e>] pci_device_remove+0x3e/0xc0
      [  194.362240]  [<ffffffffb98b3c52>] __device_release_driver+0x82/0xf0
      [  194.362285]  [<ffffffffb98b3ce3>] device_release_driver+0x23/0x30
      [  194.362343]  [<ffffffffb97c86d4>] pci_stop_bus_device+0x84/0xa0
      [  194.362388]  [<ffffffffb97c87e2>] pci_stop_and_remove_bus_device+0x12/0x20
      [  194.362450]  [<ffffffffb97f153f>] pci_iov_remove_virtfn+0xaf/0x160
      [  194.362496]  [<ffffffffb97f1aec>] sriov_disable+0x3c/0xf0
      [  194.362534]  [<ffffffffb97f1bc3>] pci_disable_sriov+0x23/0x30
      [  194.362599]  [<ffffffffc02f83c3>] qed_sriov_disable+0x5e3/0x650 [qed]
      [  194.362658]  [<ffffffffb9622df6>] ? kfree+0x106/0x140
      [  194.362709]  [<ffffffffc02cc0c0>] ? qed_free_stream_mem+0x70/0x90 [qed]
      [  194.362754]  [<ffffffffb9622df6>] ? kfree+0x106/0x140
      [  194.362803]  [<ffffffffc02cd659>] qed_slowpath_stop+0x1a9/0x1d0 [qed]
      [  194.362854]  [<ffffffffc035544e>] __qede_remove+0xae/0x130 [qede]
      [  194.362904]  [<ffffffffc03554e0>] qede_shutdown+0x10/0x20 [qede]
      [  194.362956]  [<ffffffffb97cf90a>] pci_device_shutdown+0x3a/0x60
      [  194.363010]  [<ffffffffb98b180b>] device_shutdown+0xfb/0x1f0
      [  194.363066]  [<ffffffffb94b66c6>] kernel_restart_prepare+0x36/0x40
      [  194.363107]  [<ffffffffb94b66e2>] kernel_restart+0x12/0x60
      [  194.363146]  [<ffffffffb94b6959>] SYSC_reboot+0x229/0x260
      [  194.363196]  [<ffffffffb95f200d>] ? handle_mm_fault+0x39d/0x9b0
      [  194.363253]  [<ffffffffb942b621>] ? __switch_to+0x151/0x580
      [  194.363304]  [<ffffffffb9b7ec28>] ? __schedule+0x448/0x9c0
      [  194.363343]  [<ffffffffb94b69fe>] SyS_reboot+0xe/0x10
      [  194.363387]  [<ffffffffb9b8bede>] system_call_fastpath+0x25/0x2a
      [  194.363430] Code: f9 e9 37 ff ff ff 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 4c 8d af 98 00 00 00 41 54 4c 89 ef 41 89 f4 53 e8 4c e4 55 f9 <80> b8 dc 08 00 00 01 48 89 c3 4c 8d b8 c0 08 00 00 4c 8b b0 c0
      [  194.363712] RIP  [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.363764]  RSP <ffff9ceabebdfac0>
      [  194.363791] CR2: 00000000000008dc

      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: Sudarsana Kalluru <skalluru@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
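The safeguard described in the qede commit above amounts to making the remove path safe to call twice. A minimal user-space sketch of that idea follows; the struct and function names are hypothetical, and the actual check in __qede_remove() may look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's device structures. */
struct fake_net_device { int dummy; };
struct fake_pci_dev { struct fake_net_device *drvdata; };

static int teardown_count;

/* Returns 1 if teardown ran, 0 if the device was already removed. */
int remove_once(struct fake_pci_dev *pdev)
{
	struct fake_net_device *ndev;

	assert(pdev != NULL);
	ndev = pdev->drvdata;
	if (!ndev)
		return 0;	/* already removed: nothing left to dereference */

	teardown_count++;	/* real driver teardown would happen here */
	pdev->drvdata = NULL;	/* mark as removed for any later caller */
	return 1;
}
```

In this model the second call, corresponding to the PF's shutdown flow reaching an already removed VF, returns early instead of dereferencing a NULL pointer.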
    • net: fix data-race in neigh_event_send() · 1b53d644
      Eric Dumazet authored
      KCSAN reported the following data-race [1]
      
      The fix will also prevent the compiler from optimizing out
      the condition.
      
      [1]
      
      BUG: KCSAN: data-race in neigh_resolve_output / neigh_resolve_output
      
      write to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 1:
       neigh_event_send include/net/neighbour.h:443 [inline]
       neigh_resolve_output+0x78/0x480 net/core/neighbour.c:1474
       neigh_output include/net/neighbour.h:511 [inline]
       ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:308 [inline]
       __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
       ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
       dst_output include/net/dst.h:436 [inline]
       ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
       __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
       ip_queue_xmit+0x45/0x60 include/net/ip.h:237
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
       tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
       __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
       tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
       tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
       tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598
       tcp_write_timer+0xd1/0xf0 net/ipv4/tcp_timer.c:618
      
      read to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 0:
       neigh_event_send include/net/neighbour.h:442 [inline]
       neigh_resolve_output+0x57/0x480 net/core/neighbour.c:1474
       neigh_output include/net/neighbour.h:511 [inline]
       ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:308 [inline]
       __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
       ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
       dst_output include/net/dst.h:436 [inline]
       ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
       __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
       ip_queue_xmit+0x45/0x60 include/net/ip.h:237
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
       tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
       __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
       tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
       tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
       tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
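The report above shows two CPUs reading and writing the same 8-byte field without annotation. Kernel code usually marks such accesses with READ_ONCE() and WRITE_ONCE(); the sketch below uses C11 relaxed atomics as a user-space stand-in, which is an assumption made for illustration and not the patch itself. `struct neigh_sketch` and `touch_used()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct neigh_sketch {
	_Atomic unsigned long used;	/* last-use timestamp, in ticks */
};

/* Returns 1 if the timestamp was updated, 0 if it was already current. */
int touch_used(struct neigh_sketch *n, unsigned long now)
{
	assert(n != NULL);

	/* Relaxed load: a single read the compiler may not drop or repeat. */
	if (atomic_load_explicit(&n->used, memory_order_relaxed) == now)
		return 0;	/* skip the store when nothing would change */

	/* Relaxed store: a single write with no ordering guarantees. */
	atomic_store_explicit(&n->used, now, memory_order_relaxed);
	return 1;
}
```

Because the accesses are marked, the compiler has to keep the comparison rather than replacing the conditional store with an unconditional one.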
    • Merge tag 'pwm/for-5.4-rc7' of... · abf6c397
      Linus Torvalds authored
      Merge tag 'pwm/for-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm fix from Thierry Reding:
       "One more fix to keep a reference to the driver's module as long as
        there are users of the PWM exposed by the driver"
      
      * tag 'pwm/for-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        pwm: bcm-iproc: Prevent unloading the driver module while in use
    • cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead · 65de03e2
      Tejun Heo authored
      cgroup writeback tries to refresh the associated wb immediately if the
      current wb is dead.  This is to avoid continuing to issue IOs on the
      stale wb after the memcg - blkcg association has changed (i.e. when blkcg
      got disabled / enabled higher up in the hierarchy).
      
      Unfortunately, the logic gets triggered spuriously on inodes which are
      associated with dead cgroups.  When the logic is triggered on dead
      cgroups, the attempt fails only after doing quite a bit of work
      allocating and initializing a new wb.
      
      c3aab9a0 ("mm/filemap.c: don't initiate writeback if mapping has no
      dirty pages") alleviated the issue significantly, as it now only
      triggers when the inode has dirty pages.  However, the condition can
      still be triggered before the inode is switched to a different cgroup
      and the logic simply doesn't make sense.
      
      Skip the immediate switching if the associated memcg is dying.
      
      This is a simplified version of the following two patches:
      
       * https://lore.kernel.org/linux-mm/20190513183053.GA73423@dennisz-mbp/
       * http://lkml.kernel.org/r/156355839560.2063.5265687291430814589.stgit@buzz
      
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Fixes: e8a7abf5 ("writeback: disassociate inodes from dying bdi_writebacks")
      Acked-by: Dennis Zhou <dennis@kernel.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge tag 'ceph-for-5.4-rc7' of git://github.com/ceph/ceph-client · 0689acfa
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "Some late-breaking dentry handling fixes from Al and Jeff, a patch to
        further restrict copy_file_range() to avoid potential data corruption
        from Luis and a fix for !CONFIG_CEPH_FSCACHE kernels.
      
        Everything but the fscache fix is marked for stable"
      
      * tag 'ceph-for-5.4-rc7' of git://github.com/ceph/ceph-client:
        ceph: return -EINVAL if given fsc mount option on kernel w/o support
        ceph: don't allow copy_file_range when stripe_count != 1
        ceph: don't try to handle hashed dentries in non-O_CREAT atomic_open
        ceph: add missing check in d_revalidate snapdir handling
        ceph: fix RCU case handling in ceph_d_revalidate()
        ceph: fix use-after-free in __ceph_remove_cap()
    • vsock/virtio: fix sock refcnt holding during the shutdown · ad8a7220
      Stefano Garzarella authored
      The "42f5cda5" commit rightly set SOCK_DONE on peer shutdown,
      but there is an issue if we receive the SHUTDOWN(RDWR) while the
      virtio_transport_close_timeout() is scheduled.
      In this case, when the timeout fires, the SOCK_DONE is already
      set and the virtio_transport_close_timeout() will not call
      virtio_transport_reset() and virtio_transport_do_close().
      As a result, both sockets remain open and will never be released,
      preventing the unloading of [virtio|vhost]_transport modules.
      
      This patch fixes this issue, calling virtio_transport_reset() and
      virtio_transport_do_close() when we receive the SHUTDOWN(RDWR)
      and there is nothing left to read.
      
      Fixes: 42f5cda5 ("vsock/virtio: set SOCK_DONE on peer shutdown")
      Cc: Stephen Barber <smbarber@chromium.org>
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>