1. 20 Jul, 2018 1 commit
    • Tariq Toukan's avatar
      net/page_pool: Fix inconsistent lock state warning · 4905bd9a
      Tariq Toukan authored
      Fix the warning below by calling the ptr_ring_consume_bh,
      which uses spin_[un]lock_bh.
      
      [  179.064300] ================================
      [  179.069073] WARNING: inconsistent lock state
      [  179.073846] 4.18.0-rc2+ #18 Not tainted
      [  179.078133] --------------------------------
      [  179.082907] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [  179.089637] swapper/21/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      [  179.095478] 00000000963d1995 (&(&r->consumer_lock)->rlock){+.?.}, at:
      __page_pool_empty_ring+0x61/0x100
      [  179.105988] {SOFTIRQ-ON-W} state was registered at:
      [  179.111443]   _raw_spin_lock+0x35/0x50
      [  179.115634]   __page_pool_empty_ring+0x61/0x100
      [  179.120699]   page_pool_destroy+0x32/0x50
      [  179.125204]   mlx5e_free_rq+0x38/0xc0 [mlx5_core]
      [  179.130471]   mlx5e_close_channel+0x20/0x120 [mlx5_core]
      [  179.136418]   mlx5e_close_channels+0x26/0x40 [mlx5_core]
      [  179.142364]   mlx5e_close_locked+0x44/0x50 [mlx5_core]
      [  179.148509]   mlx5e_close+0x42/0x60 [mlx5_core]
      [  179.153936]   __dev_close_many+0xb1/0x120
      [  179.158749]   dev_close_many+0xa2/0x170
      [  179.163364]   rollback_registered_many+0x148/0x460
      [  179.169047]   rollback_registered+0x56/0x90
      [  179.174043]   unregister_netdevice_queue+0x7e/0x100
      [  179.179816]   unregister_netdev+0x18/0x20
      [  179.184623]   mlx5e_remove+0x2a/0x50 [mlx5_core]
      [  179.190107]   mlx5_remove_device+0xe5/0x110 [mlx5_core]
      [  179.196274]   mlx5_unregister_interface+0x39/0x90 [mlx5_core]
      [  179.203028]   cleanup+0x5/0xbfc [mlx5_core]
      [  179.208031]   __x64_sys_delete_module+0x16b/0x240
      [  179.213640]   do_syscall_64+0x5a/0x210
      [  179.218151]   entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  179.224218] irq event stamp: 334398
      [  179.228438] hardirqs last  enabled at (334398): [<ffffffffa511d8b7>]
      rcu_process_callbacks+0x1c7/0x790
      [  179.239178] hardirqs last disabled at (334397): [<ffffffffa511d872>]
      rcu_process_callbacks+0x182/0x790
      [  179.249931] softirqs last  enabled at (334386): [<ffffffffa509732e>] irq_enter+0x5e/0x70
      [  179.259306] softirqs last disabled at (334387): [<ffffffffa509741c>] irq_exit+0xdc/0xf0
      [  179.268584]
      [  179.268584] other info that might help us debug this:
      [  179.276572]  Possible unsafe locking scenario:
      [  179.276572]
      [  179.283877]        CPU0
      [  179.286954]        ----
      [  179.290033]   lock(&(&r->consumer_lock)->rlock);
      [  179.295546]   <Interrupt>
      [  179.298830]     lock(&(&r->consumer_lock)->rlock);
      [  179.304550]
      [  179.304550]  *** DEADLOCK ***
      
      Fixes: ff7d6b27 ("page_pool: refurbish version of page_pool code")
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4905bd9a
  2. 19 Jul, 2018 3 commits
    • Linus Torvalds's avatar
      Merge tag 'sound-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · f39f28ff
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A rawmidi race fix and three trivial HD-audio quirks"
      
      * tag 'sound-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek - Yet another Clevo P950 quirk entry
        ALSA: rawmidi: Change resized buffers atomically
        ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk
        ALSA: hda: add mute led support for HP ProBook 455 G5
      f39f28ff
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · b4394c34
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "This fixes an allocation error-path bug in af_alg discovered by
        syzkaller"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: af_alg - Initialize sg_num_bytes in error code path
      b4394c34
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 024ddc0c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Lots of fixes, here goes:
      
         1) NULL deref in qtnfmac, from Gustavo A. R. Silva.
      
         2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih.
      
         3) Lost completion messages in AF_XDP, from Magnus Karlsson.
      
         4) Correct bogus self-assignment in rhashtable, from Rishabh
            Bhatnagar.
      
         5) Fix regression in ipv6 route append handling, from David Ahern.
      
         6) Fix masking in __set_phy_supported(), from Heiner Kallweit.
      
         7) Missing module owner set in x_tables icmp, from Florian Westphal.
      
         8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire.
      
         9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy.
      
        10) Fix NULL deref when using chains in act_csum, from Davide Caratti.
      
        11) XDP_REDIRECT needs to check if the interface is up and whether the
            MTU is sufficient. From Toshiaki Makita.
      
        12) Net diag can do a double free when killing TCP_NEW_SYN_RECV
            connections, from Lorenzo Colitti.
      
        13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a
            full minute, delaying device unregister. From Eric Dumazet.
      
        14) Update MAC entries in the correct order in ixgbe, from Alexander
            Duyck.
      
        15) Don't leave partial mangles bpf program in jit_subprogs, from
            Daniel Borkmann.
      
        16) Fix pfmemalloc SKB state propagation, from Stefano Brivio.
      
        17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng.
      
        18) Use after free in tun XDP_TX, from Toshiaki Makita.
      
        19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole.
      
        20) Don't reuse remainder of RX page when XDP is set in mlx4, from
            Saeed Mahameed.
      
        21) Fix window probe handling of TCP rapair sockets, from Stefan
            Baranoff.
      
        22) Missing socket locking in smc_ioctl(), from Ursula Braun.
      
        23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann.
      
        24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva.
      
        25) Two spots in ipv6 do a rol32() on a hash value but ignore the
            result. Fixes from Colin Ian King"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits)
        tcp: identify cryptic messages as TCP seq # bugs
        ptp: fix missing break in switch
        hv_netvsc: Fix napi reschedule while receive completion is busy
        MAINTAINERS: Drop inactive Vitaly Bordug's email
        net: cavium: Add fine-granular dependencies on PCI
        net: qca_spi: Fix log level if probe fails
        net: qca_spi: Make sure the QCA7000 reset is triggered
        net: qca_spi: Avoid packet drop during initial sync
        ipv6: fix useless rol32 call on hash
        ipv6: sr: fix useless rol32 call on hash
        net: sched: Using NULL instead of plain integer
        net: usb: asix: replace mii_nway_restart in resume path
        net: cxgb3_main: fix potential Spectre v1
        lib/rhashtable: consider param->min_size when setting initial table size
        net/smc: reset recv timeout after clc handshake
        net/smc: add error handling for get_user()
        net/smc: optimize consumer cursor updates
        net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
        ipv6: ila: select CONFIG_DST_CACHE
        net: usb: rtl8150: demote allmulti message to dev_dbg()
        ...
      024ddc0c
  3. 18 Jul, 2018 32 commits
  4. 17 Jul, 2018 3 commits
    • Linus Torvalds's avatar
      Mark HI and TASKLET softirq synchronous · 3c53776e
      Linus Torvalds authored
      Way back in 4.9, we committed 4cd13c21 ("softirq: Let ksoftirqd do
      its job"), and ever since we've had small nagging issues with it.  For
      example, we've had:
      
        1ff68820 ("watchdog: core: make sure the watchdog_worker is not deferred")
        8d5755b3 ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
        217f6974 ("net: busy-poll: allow preemption in sk_busy_loop()")
      
      all of which worked around some of the effects of that commit.
      
      The DVB people have also complained that the commit causes excessive USB
      URB latencies, which seems to be due to the USB code using tasklets to
      schedule USB traffic.  This seems to be an issue mainly when already
      living on the edge, but waiting for ksoftirqd to handle it really does
      seem to cause excessive latencies.
      
      Now Hanna Hawa reports that this issue isn't just limited to USB URB and
      DVB, but also causes timeout problems for the Marvell SoC team:
      
       "I'm facing kernel panic issue while running raid 5 on sata disks
        connected to Macchiatobin (Marvell community board with Armada-8040
        SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine
        and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2)
        uses a tasklet to clean the done descriptors from the queue"
      
      The latency problem causes a panic:
      
        mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
        Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction
      
      We've discussed simply just reverting the original commit entirely, and
      also much more involved solutions (with per-softirq threads etc).  This
      patch is intentionally stupid and fairly limited, because the issue
      still remains, and the other solutions either got sidetracked or had
      other issues.
      
      We should probably also consider the timer softirqs to be synchronous
      and not be delayed to ksoftirqd (since they were the issue with the
      earlier watchdog problems), but that should be done as a separate patch.
      This does only the tasklet cases.
      Reported-and-tested-by: default avatarHanna Hawa <hannah@marvell.com>
      Reported-and-tested-by: default avatarJosef Griebichler <griebichler.josef@gmx.at>
      Reported-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3c53776e
    • Takashi Iwai's avatar
      ALSA: rawmidi: Change resized buffers atomically · 39675f7a
      Takashi Iwai authored
      The SNDRV_RAWMIDI_IOCTL_PARAMS ioctl may resize the buffers and the
      current code is racy.  For example, the sequencer client may write to
      buffer while it being resized.
      
      As a simple workaround, let's switch to the resized buffer inside the
      stream runtime lock.
      
      Reported-by: syzbot+52f83f0ea8df16932f7f@syzkaller.appspotmail.com
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      39675f7a
    • Qu Wenruo's avatar
      btrfs: scrub: Don't use inode page cache in scrub_handle_errored_block() · 665d4953
      Qu Wenruo authored
      In commit ac0b4145 ("btrfs: scrub: Don't use inode pages for device
      replace") we removed the branch of copy_nocow_pages() to avoid
      corruption for compressed nodatasum extents.
      
      However above commit only solves the problem in scrub_extent(), if
      during scrub_pages() we failed to read some pages,
      sctx->no_io_error_seen will be non-zero and we go to fixup function
      scrub_handle_errored_block().
      
      In scrub_handle_errored_block(), for sctx without csum (no matter if
      we're doing replace or scrub) we go to scrub_fixup_nodatasum() routine,
      which does the similar thing with copy_nocow_pages(), but does it
      without the extra check in copy_nocow_pages() routine.
      
      So for test cases like btrfs/100, where we emulate read errors during
      replace/scrub, we could corrupt compressed extent data again.
      
      This patch will fix it just by avoiding any "optimization" for
      nodatasum, just falls back to the normal fixup routine by try read from
      any good copy.
      
      This also solves WARN_ON() or dead lock caused by lame backref iteration
      in scrub_fixup_nodatasum() routine.
      
      The deadlock or WARN_ON() won't be triggered before commit ac0b4145
      ("btrfs: scrub: Don't use inode pages for device replace") since
      copy_nocow_pages() have better locking and extra check for data extent,
      and it's already doing the fixup work by try to read data from any good
      copy, so it won't go scrub_fixup_nodatasum() anyway.
      
      This patch disables the faulty code and will be removed completely in a
      followup patch.
      
      Fixes: ac0b4145 ("btrfs: scrub: Don't use inode pages for device replace")
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      665d4953
  5. 16 Jul, 2018 1 commit