1. 16 Sep, 2019 23 commits
  2. 15 Sep, 2019 2 commits
    • Linux 5.3 · 4d856f72
      Linus Torvalds authored
      4d856f72
    • Revert "ext4: make __ext4_get_inode_loc plug" · 72dbcf72
      Linus Torvalds authored
      This reverts commit b03755ad.
      
      This is sad, and done for all the wrong reasons.  Because that commit is
      good, and does exactly what it says: avoids a lot of small disk requests
      for the inode table read-ahead.
      
      However, it turns out that it causes an entirely unrelated problem: the
      getrandom() system call was introduced back in 2014 by commit
      c6e9d6f3 ("random: introduce getrandom(2) system call"), and people
      use it as a convenient source of good random numbers.
      
      But part of the current semantics for getrandom() is that it waits for
      the entropy pool to fill at least partially (unlike /dev/urandom).  And
      at least ArchLinux apparently has a systemd that uses getrandom() at
      boot time, and the improvements in IO patterns mean that existing
      installations suddenly start hanging, waiting for entropy that will
      never happen.
      
      It seems to be an unlucky combination of not _quite_ enough entropy,
      together with a particular systemd version and configuration.  Lennart
      says that the systemd-random-seed process (which is what does this early
      access) is supposed to not block any other boot activity, but sadly that
      doesn't actually seem to be the case (possibly due to bogus dependencies on
      cryptsetup for encrypted swapspace).
      
      The correct fix is to fix getrandom() to not block when it's not
      appropriate, but that fix is going to take a lot more discussion.  Do we
      just make it act like /dev/urandom by default, and add a new flag for
      "wait for entropy"? Do we add a boot-time option? Or do we just limit
      the amount of time it will wait for entropy?
      
      So in the meantime, we do the revert to give us time to discuss the
      eventual fix for the fundamental problem, at which point we can re-apply
      the ext4 inode table access optimization.
      Reported-by: Ahmed S. Darwish <darwish.07@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Alexander E. Patrakov <patrakov@gmail.com>
      Cc: Lennart Poettering <mzxreary@0pointer.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      72dbcf72
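The blocking behaviour described in the revert above can be illustrated from userspace. The following is a minimal sketch, not what systemd or the kernel does: it assumes a Linux host, and the helper name `early_random_bytes` is hypothetical. It asks getrandom() for bytes with GRND_NONBLOCK and, if the pool is not yet initialized, falls back to reading /dev/urandom directly, which does not wait for entropy.

```python
import os


def early_random_bytes(nbytes):
    """Return (data, source) without waiting for the entropy pool to fill.

    Hypothetical helper for illustration only.
    """
    getrandom = getattr(os, "getrandom", None)  # only available on Linux
    if getrandom is not None:
        try:
            # GRND_NONBLOCK makes getrandom() fail instead of sleeping when
            # the entropy pool has not been initialized yet.
            return getrandom(nbytes, os.GRND_NONBLOCK), "getrandom"
        except BlockingIOError:
            # A plain getrandom(nbytes) call would have blocked here.
            pass
    # Reading the device node directly gives /dev/urandom semantics:
    # it does not block, even very early in boot.
    with open("/dev/urandom", "rb") as dev:
        return dev.read(nbytes), "urandom"


data, source = early_random_bytes(16)
print(len(data), source)
```

Whether the weaker early-boot guarantee of the fallback is acceptable depends on what the bytes are used for, which is exactly the policy question the commit message leaves open.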
  3. 14 Sep, 2019 12 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 1609d760
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "The main change here is a revert of reverts. We recently simplified
        some code that was thought unnecessary; however, since then KVM has
        grown quite a few cond_resched()s and for that reason the simplified
        code is prone to livelocks: one CPU tries to empty a list of guest
        page tables while the others keep adding to them. This adds back the
        generation-based zapping of guest page tables, which was not
        unnecessary after all.
      
        On top of this, there is a fix for a kernel memory leak and a couple
        of s390 fixlets as well"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot
        KVM: x86: work around leak of uninitialized stack contents
        KVM: nVMX: handle page fault in vmread
        KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
        KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset()
      1609d760
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 1f9c632c
      Linus Torvalds authored
      Pull virtio fix from Michael Tsirkin:
       "A last minute revert
      
        The 32-bit build got broken by the latest defence in depth patch.
        Revert and we'll try again in the next cycle"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        Revert "vhost: block speculation of translated descriptors"
      1f9c632c
    • Merge tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · b03c036e
      Linus Torvalds authored
      Pull RISC-V fix from Paul Walmsley:
       "Last week, Palmer and I learned that there was an error in the RISC-V
        kernel image header format that could make it less compatible with the
        ARM64 kernel image header format. I had missed this error during my
        original reviews of the patch.
      
        The kernel image header format is an interface that impacts
        bootloaders, QEMU, and other user tools. Those packages must be
        updated to align with whatever is merged in the kernel. We would like
        to avoid proliferating these image formats by keeping the RISC-V
        header as close as possible to the existing ARM64 header. Since the
        arch/riscv patch that adds support for the image header was merged
        with our v5.3-rc1 pull request as commit 0f327f2a ("RISC-V: Add
        an Image header that boot loader can parse."), we think it wise to try
        to fix this error before v5.3 is released.
      
        The fix itself should be backwards-compatible with any project that
        has already merged support for premature versions of this interface.
        It primarily involves ensuring that the RISC-V image header has
        something useful in the same field as the ARM64 image header"
      
      * tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: modify the Image header to improve compatibility with the ARM64 header
      b03c036e
    • Revert "vhost: block speculation of translated descriptors" · 0d4a3f2a
      Michael S. Tsirkin authored
      This reverts commit a89db445.
      
      I was hasty to include this patch, and it breaks the build on 32 bit.
      Defence in depth is good but let's do it properly.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      0d4a3f2a
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 36024fcf
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Don't corrupt xfrm_interface parms before validation, from Nicolas
          Dichtel.
      
       2) Revert use of usb-wakeup in btusb, from Mario Limonciello.
      
       3) Block ipv6 packets in bridge netfilter if ipv6 is disabled, from
          Leonardo Bras.
      
       4) IPS_OFFLOAD not honored in ctnetlink, from Pablo Neira Ayuso.
      
       5) Missing ULP check in sock_map, from John Fastabend.
      
       6) Fix receive statistic handling in forcedeth, from Zhu Yanjun.
      
       7) Fix length of SKB allocated in 6pack driver, from Christophe
          JAILLET.
      
       8) ip6_route_info_create() returns an error pointer, not NULL. From
          Maciej Żenczykowski.
      
       9) Only add RDS sock to the hashes after rs_transport is set, from
          Ka-Cheong Poon.
      
      10) Don't double clean TX descriptors in ixgbe, from Ilya Maximets.
      
      11) Presence of transmit IPSEC offload in an SKB is not tested for
          correctly in ixgbe and ixgbevf. From Steffen Klassert and Jeff
          Kirsher.
      
      12) Need rcu_barrier() when register_netdevice() takes one of the
          notifier based failure paths, from Subash Abhinov Kasiviswanathan.
      
      13) Fix leak in sctp_do_bind(), from Mao Wenan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
        cdc_ether: fix rndis support for Mediatek based smartphones
        sctp: destroy bucket if failed to bind addr
        sctp: remove redundant assignment when call sctp_get_port_local
        sctp: change return type of sctp_get_port_local
        ixgbevf: Fix secpath usage for IPsec Tx offload
        sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()'
        ixgbe: Fix secpath usage for IPsec TX offload.
        net: qrtr: fix memort leak in qrtr_tun_write_iter
        net: Fix null de-reference of device refcount
        ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()'
        tun: fix use-after-free when register netdev failed
        tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR
        ixgbe: fix double clean of Tx descriptors with xdp
        ixgbe: Prevent u8 wrapping of ITR value to something less than 10us
        mlx4: fix spelling mistake "veify" -> "verify"
        net: hns3: fix spelling mistake "undeflow" -> "underflow"
        net: lmc: fix spelling mistake "runnin" -> "running"
        NFC: st95hf: fix spelling mistake "receieve" -> "receive"
        net/rds: An rds_sock is added too early to the hash table
        mac80211: Do not send Layer 2 Update frame before authorization
        ...
      36024fcf
    • Merge tag 'mmc-v5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 1c4c5e25
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
      
       - tmio: Fixup runtime PM management during probe and remove
      
       - sdhci-pci-o2micro: Fix eMMC initialization for an AMD SoC
      
       - bcm2835: Prevent lockups when terminating work
      
      * tag 'mmc-v5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: tmio: Fixup runtime PM management during remove
        mmc: tmio: Fixup runtime PM management during probe
        Revert "mmc: tmio: move runtime PM enablement to the driver implementations"
        Revert "mmc: sdhci: Remove unneeded quirk2 flag of O2 SD host controller"
        Revert "mmc: bcm2835: Terminate timeout work synchronously"
      1c4c5e25
    • Merge tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm · 592b8d87
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "From the maintainer summit, just some last minute fixes for final:
      
        lima:
         - fix gem_wait ioctl
      
        core:
         - constify modes list
      
        i915:
         - DP MST high color depth regression
         - GPU hangs on vulkan compute workloads"
      
      * tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm:
        drm/lima: fix lima_gem_wait() return value
        drm/i915: Restore relaxed padding (OCL_OOB_SUPPRES_ENABLE) for skl+
        drm/i915: Limit MST to <= 8bpc once again
        drm/modes: Make the whitelist more const
      592b8d87
    • Merge tag 'kvm-s390-master-5.3-1' of... · a9c20bb0
      Paolo Bonzini authored
      Merge tag 'kvm-s390-master-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master
      
      KVM: s390: Fixes for 5.3
      
      - prevent a user triggerable oops in the migration code
      - do not leak kernel stack content
      a9c20bb0
    • KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot · 002c5f73
      Sean Christopherson authored
      James Harvey reported a livelock that was introduced by commit
      d012a06a ("Revert "KVM: x86/mmu: Zap only the relevant pages when
      removing a memslot"").
      
      The livelock occurs because kvm_mmu_zap_all() as it exists today will
      voluntarily reschedule and drop KVM's mmu_lock, which allows other vCPUs
      to add shadow pages.  With enough vCPUs, kvm_mmu_zap_all() can get stuck
      in an infinite loop as it can never zap all pages before observing lock
      contention or the need to reschedule.  The equivalent of kvm_mmu_zap_all()
      that was in use at the time of the reverted commit (4e103134, "KVM:
      x86/mmu: Zap only the relevant pages when removing a memslot") employed
      a fast invalidate mechanism and was not susceptible to the above livelock.
      
      There are three ways to fix the livelock:
      
      - Reverting the revert (commit d012a06a) is not a viable option as
        the revert is needed to fix a regression that occurs when the guest has
        one or more assigned devices.  It's unlikely we'll root cause the device
        assignment regression soon enough to fix it in a timely manner.
      
      - Remove the conditional reschedule from kvm_mmu_zap_all().  However, although
        removing the reschedule would be a smaller code change, it's less safe
        in the sense that the resulting kvm_mmu_zap_all() hasn't been used in
        the wild for flushing memslots since the fast invalidate mechanism was
        introduced by commit 6ca18b69 ("KVM: x86: use the fast way to
        invalidate all pages"), back in 2013.
      
      - Reintroduce the fast invalidate mechanism and use it when zapping shadow
        pages in response to a memslot being deleted/moved, which is what this
        patch does.
      
      For all intents and purposes, this is a revert of commit ea145aac
      ("Revert "KVM: MMU: fast invalidate all pages"") and a partial revert of
      commit 7390de1e ("Revert "KVM: x86: use the fast way to invalidate
      all pages""), i.e. restores the behavior of commit 5304b8d3 ("KVM:
      MMU: fast invalidate all pages") and commit 6ca18b69 ("KVM: x86:
      use the fast way to invalidate all pages") respectively.
      
      Fixes: d012a06a ("Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"")
      Reported-by: James Harvey <jamespharvey20@gmail.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      002c5f73
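The difference between the livelock-prone zap and the generation-based fast invalidate in the commit above can be modelled with a toy simulation. This is only a sketch under simplifying assumptions: the class `ToyMMU` and its methods are hypothetical names, the `on_yield` callback stands in for other vCPUs adding shadow pages whenever the lock is dropped, and none of it mirrors the real KVM data structures.

```python
from collections import deque


class ToyMMU:
    """Toy model of a shadow-page list; not the real KVM implementation."""

    def __init__(self):
        self.valid_gen = 0
        self.pages = deque()  # entries are (generation, page_id), oldest first

    def add_page(self, page_id):
        # New pages are always tagged with the current generation.
        self.pages.append((self.valid_gen, page_id))

    def naive_zap_all(self, on_yield=None, max_steps=1000):
        """Zap until the list is empty; may never finish if pages keep arriving."""
        steps = 0
        while self.pages:
            self.pages.popleft()
            steps += 1
            if on_yield is not None:
                on_yield(self)  # lock dropped: other vCPUs may add pages
            if steps >= max_steps:
                return False  # gave up: this models the livelock
        return True

    def fast_invalidate(self, on_yield=None):
        """Bump the generation, then zap only pages from older generations."""
        self.valid_gen += 1  # every existing page becomes obsolete at once
        zapped = 0
        # Pages added from now on carry the new generation, so the walk stops
        # at the first current-generation page and is guaranteed to terminate.
        while self.pages and self.pages[0][0] != self.valid_gen:
            self.pages.popleft()
            zapped += 1
            if on_yield is not None:
                on_yield(self)
        return zapped


def busy_vcpu(mmu):
    # Stand-in for another vCPU adding one shadow page per lock drop.
    mmu.add_page("new")


mmu = ToyMMU()
for i in range(5):
    mmu.add_page(i)
print(mmu.naive_zap_all(on_yield=busy_vcpu))    # never empties the list
print(mmu.fast_invalidate(on_yield=busy_vcpu))  # zaps only the obsolete pages
```

The point of the generation counter is that the amount of work is fixed at the moment of invalidation, regardless of how many pages are added while the zapper yields.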
    • KVM: x86: work around leak of uninitialized stack contents · 541ab2ae
      Fuqian Huang authored
      Emulation of VMPTRST can incorrectly inject a page fault
      when passed an operand that points to an MMIO address.
      The page fault will use uninitialized kernel stack memory
      as the CR2 and error code.
      
      The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
      exit to userspace; however, it is not an easy fix, so for now just ensure
      that the error code and CR2 are zero.
      Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
      Cc: stable@vger.kernel.org
      [add comment]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      541ab2ae
    • KVM: nVMX: handle page fault in vmread · f7eea636
      Paolo Bonzini authored
      The implementation of vmread to memory is still incomplete, as it
      lacks the ability to do vmread to I/O memory just like vmptrst.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f7eea636
    • riscv: modify the Image header to improve compatibility with the ARM64 header · 474efecb
      Paul Walmsley authored
      Part of the intention during the definition of the RISC-V kernel image
      header was to lay the groundwork for a future merge with the ARM64
      image header.  One error during my original review was not noticing
      that the RISC-V header's "magic" field was at a different size and
      position than the ARM64's "magic" field.  If the existing ARM64 Image
      header parsing code were to attempt to parse an existing RISC-V kernel
      image header format, it would see a magic number 0.  This is
      undesirable, since it's our intention to align as closely as possible
      with the ARM64 header format.  Another problem was that the original
      "res3" field was not being initialized correctly to zero.
      
      Address these issues by creating a 32-bit "magic2" field in the RISC-V
      header which matches the ARM64 "magic" field.  RISC-V binaries will
      store "RSC\x05" in this field.  The intention is that the use of the
      existing 64-bit "magic" field in the RISC-V header will be deprecated
      over time.  Increment the minor version number of the file format to
      indicate this change, and update the documentation accordingly.  Fix
      the assembler directives in head.S to ensure that reserved fields are
      properly zero-initialized.
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      Reported-by: Palmer Dabbelt <palmer@sifive.com>
      Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
      Cc: Atish Patra <atish.patra@wdc.com>
      Cc: Karsten Merker <merker@debian.org>
      Link: https://lore.kernel.org/linux-riscv/194c2f10c9806720623430dbf0cc59a965e50448.camel@wdc.com/T/#u
      Link: https://lore.kernel.org/linux-riscv/mhng-755b14c4-8f35-4079-a7ff-e421fd1b02bc@palmer-si-x1e/T/#t
      474efecb
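The effect of the "magic2" field in the commit above can be sketched from a parser's point of view. This is an illustration under stated assumptions, not bootloader code: the function name `classify_image` is hypothetical, and the byte offset 56 for the 32-bit ARM64 "magic" field (and hence RISC-V "magic2") is taken as an assumption about the header layout rather than something stated in the commit message. The magic values "RSC\x05" and a magic of 0 for the old RISC-V layout come from the message; "ARM\x64" is assumed for ARM64.

```python
MAGIC_OFFSET = 56  # assumed offset of ARM64 "magic" / RISC-V "magic2"
MAGIC_SIZE = 4     # both fields are 32 bits wide


def classify_image(header):
    """Classify a kernel image by the 32-bit magic at the shared offset."""
    if len(header) < MAGIC_OFFSET + MAGIC_SIZE:
        raise ValueError("header too short to contain the magic field")
    magic = header[MAGIC_OFFSET:MAGIC_OFFSET + MAGIC_SIZE]
    if magic == b"RSC\x05":
        return "riscv"
    if magic == b"ARM\x64":  # assumed ARM64 magic value
        return "arm64"
    # A RISC-V header without magic2 shows a magic number of 0 here,
    # which is the problem described in the commit message.
    return "unknown"


# Build a fake 64-byte header with only the magic2 field populated.
fixed_header = bytes(MAGIC_OFFSET) + b"RSC\x05" + bytes(4)
old_header = bytes(64)
print(classify_image(fixed_header), classify_image(old_header))
```

Keeping the discriminating field at one shared offset lets a single parser handle both formats with one read.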
  4. 13 Sep, 2019 3 commits
    • cdc_ether: fix rndis support for Mediatek based smartphones · 4d7ffcf3
      Bjørn Mork authored
      A Mediatek based smartphone owner reports problems with USB
      tethering in Linux.  The verbose USB listing shows a rndis_host
      interface pair (e0/01/03 + 10/00/00), but the driver fails to
      bind with
      
      [  355.960428] usb 1-4: bad CDC descriptors
      
      The problem is a failsafe test intended to filter out ACM serial
      functions using the same 02/02/ff class/subclass/protocol as RNDIS.
      The serial functions are recognized by their non-zero bmCapabilities.
      
      No RNDIS function with non-zero bmCapabilities was known at the time
      this failsafe was added. But it turns out that some Wireless class
      RNDIS functions are using the bmCapabilities field. These functions
      are uniquely identified as RNDIS by their class/subclass/protocol, so
      the failing test can safely be disabled.  The same applies to the two
      types of Misc class RNDIS functions.
      
      Applying the failsafe to Communication class functions only retains
      the original functionality, and fixes the problem for the Mediatek based
      smartphone.
      
      Two examples of CDC functional descriptors with non-zero bmCapabilities
      from Wireless class RNDIS functions are:
      
      0e8d:000a  Mediatek Crosscall Spider X5 3G Phone
      
            CDC Header:
              bcdCDC               1.10
            CDC ACM:
              bmCapabilities       0x0f
                connection notifications
                sends break
                line coding and serial state
                get/set/clear comm features
            CDC Union:
              bMasterInterface        0
              bSlaveInterface         1
            CDC Call Management:
              bmCapabilities       0x03
                call management
                use DataInterface
              bDataInterface          1
      
      and
      
      19d2:1023  ZTE K4201-z
      
            CDC Header:
              bcdCDC               1.10
            CDC ACM:
              bmCapabilities       0x02
                line coding and serial state
            CDC Call Management:
              bmCapabilities       0x03
                call management
                use DataInterface
              bDataInterface          1
            CDC Union:
              bMasterInterface        0
              bSlaveInterface         1
      
      The Mediatek example is believed to apply to most smartphones with
      Mediatek firmware.  The ZTE example is most likely also part of a larger
      family of devices/firmwares.
      Suggested-by: Lars Melin <larsm17@gmail.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4d7ffcf3
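The narrowed failsafe described in the cdc_ether commit above can be sketched as a small predicate. This is a simplified model and not the driver code: the function name `acm_failsafe_rejects` and the tuple constants are hypothetical, only the Communication class (02/02/ff) and Wireless class (e0/01/03) triples mentioned in the message are modelled, and the Misc class RNDIS variants are left out.

```python
# (bInterfaceClass, bInterfaceSubClass, bInterfaceProtocol) triples
COMM_CLASS_RNDIS = (0x02, 0x02, 0xFF)      # shared with ACM serial functions
WIRELESS_CLASS_RNDIS = (0xE0, 0x01, 0x03)  # uniquely identifies RNDIS


def acm_failsafe_rejects(interface_triple, acm_capabilities):
    """Return True if the ACM failsafe should refuse to bind the function.

    Only Communication class functions are ambiguous with ACM serial
    functions, so only they are checked for non-zero bmCapabilities.
    """
    if interface_triple != COMM_CLASS_RNDIS:
        return False  # class/subclass/protocol already proves it is RNDIS
    return acm_capabilities != 0


# The Mediatek phone from the report: Wireless class, bmCapabilities 0x0f.
print(acm_failsafe_rejects(WIRELESS_CLASS_RNDIS, 0x0F))
# An ACM serial function using the Communication class triple is still filtered.
print(acm_failsafe_rejects(COMM_CLASS_RNDIS, 0x02))
```

Restricting the check to the ambiguous class keeps the original protection while letting uniquely identified RNDIS functions bind.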
    • Merge branch 'sctp_do_bind-leak' · ae3b06ed
      David S. Miller authored
      Mao Wenan says:
      
      ====================
      fix memory leak for sctp_do_bind
      
      The first two patches are cleanups: they remove a redundant assignment
      and change the return type of sctp_get_port_local.
      The third patch fixes a memory leak in sctp_do_bind when binding
      an address fails.
      
      v2: add one patch to change return type of sctp_get_port_local.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae3b06ed
    • sctp: destroy bucket if failed to bind addr · 29b99f54
      Mao Wenan authored
      There is one memory leak bug report:
      BUG: memory leak
      unreferenced object 0xffff8881dc4c5ec0 (size 40):
        comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s)
        hex dump (first 32 bytes):
          02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00  ................
          f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00  .c=.............
        backtrace:
          [<0000000072006339>] sctp_get_port_local+0x2a1/0xa00 [sctp]
          [<00000000c7b379ec>] sctp_do_bind+0x176/0x2c0 [sctp]
          [<000000005be274a2>] sctp_bind+0x5a/0x80 [sctp]
          [<00000000b66b4044>] inet6_bind+0x59/0xd0 [ipv6]
          [<00000000c68c7f42>] __sys_bind+0x120/0x1f0 net/socket.c:1647
          [<000000004513635b>] __do_sys_bind net/socket.c:1658 [inline]
          [<000000004513635b>] __se_sys_bind net/socket.c:1656 [inline]
          [<000000004513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656
          [<0000000061f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
          [<0000000003d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      This happens because, in sctp_do_bind, sctp_get_port_local can
      successfully create a hash bucket and sctp_add_bind_addr can then
      fail to bind the address (e.g. by returning -ENOMEM). The bucket
      is leaked in that case, so the failure path needs to destroy it.
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      29b99f54
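The error path fixed by the sctp commit above follows a common pattern: a resource acquired in one step must be released if a later step fails. The sketch below models that pattern only. The `PortTable` class and `do_bind` function are hypothetical stand-ins, not the kernel's SCTP structures, and the `add_bind_addr` callback is an assumption used to inject the failure.

```python
import errno


class PortTable:
    """Toy stand-in for a port hash: maps port -> set of owning sockets."""

    def __init__(self):
        self.buckets = {}

    def get_port_local(self, sock, port):
        # Creates the bucket on first use, then records the owner.
        self.buckets.setdefault(port, set()).add(sock)
        return 0

    def put_port(self, sock, port):
        owners = self.buckets.get(port)
        if owners is None:
            return
        owners.discard(sock)
        if not owners:
            del self.buckets[port]  # destroy the now-unused bucket


def do_bind(table, sock, port, add_bind_addr):
    """Bind sock to port; release the bucket again if adding the address fails."""
    ret = table.get_port_local(sock, port)
    if ret:
        return ret
    ret = add_bind_addr()
    if ret:
        # Without this cleanup the bucket would stay allocated forever.
        table.put_port(sock, port)
    return ret


table = PortTable()
print(do_bind(table, "sock-a", 9000, lambda: -errno.ENOMEM), table.buckets)
```

Releasing the bucket in the same function that acquired it keeps the cleanup next to the failure it handles.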