1. 06 Jun, 2019 3 commits
    • vfio/mdev: Synchronize device create/remove with parent removal · 5715c4dd
      Parav Pandit authored
      In the following sequences, a child device created while the mdev parent
      device is being removed can be left behind, or removal can race with a
      half-initialized child mdev device.
      
      issue-1:
      --------
             cpu-0                         cpu-1
             -----                         -----
                                        mdev_unregister_device()
                                          device_for_each_child()
                                            mdev_device_remove_cb()
                                              mdev_device_remove()
      create_store()
        mdev_device_create()                   [...]
          device_add()
                                        parent_remove_sysfs_files()
      
      /* BUG: device added by cpu-0
       * whose parent is getting removed
       * and it won't process this mdev.
       */
      
      issue-2:
      --------
      The crash below is observed when a user-initiated remove is in progress
      and mdev_unregister_device() completes parent unregistration.
      
             cpu-0                         cpu-1
             -----                         -----
      remove_store()
         mdev_device_remove()
         active = false;
                                        mdev_unregister_device()
                                        parent device removed.
         [...]
         parent->ops->remove()
       /*
        * BUG: Accessing invalid parent.
        */
      
      This race is similar to create() racing with mdev_unregister_device().
      
      BUG: unable to handle kernel paging request at ffffffffc0585668
      PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0
      Oops: 0000 [#1] SMP PTI
      CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6
      Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016
      RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev]
      Call Trace:
       remove_store+0x71/0x90 [mdev]
       kernfs_fop_write+0x113/0x1a0
       vfs_write+0xad/0x1b0
       ksys_write+0x5a/0xe0
       do_syscall_64+0x5a/0x210
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Therefore, the mdev core is improved as follows to overcome the above issues.
      
      Wait for any ongoing mdev create() and remove() to finish before
      unregistering the parent device.
      This still allows multiple create and remove operations to progress in
      parallel for different mdev devices, which is the most common case.
      At the same time, it guards against parent removal while the parent is
      being accessed by the create() and remove() callbacks.
      create()/remove() and unregister_device() are synchronized by an rwsem.
      
      Refactor the device removal code into mdev_device_remove_common() to avoid
      acquiring the unreg_sem of the parent.
      
      Fixes: 7b96953b ("vfio: Mediated device Core driver")
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio/mdev: Avoid creating sysfs remove file on stale device removal · 26c9e398
      Parav Pandit authored
      If device removal is initiated by two threads as below, the mdev core
      attempts to create a sysfs remove file on a stale device.
      During this flow, the call trace shown below is observed.
      
           cpu-0                                    cpu-1
           -----                                    -----
        mdev_unregister_device()
          device_for_each_child
             mdev_device_remove_cb
                mdev_device_remove
                                             user_syscall
                                               remove_store()
                                                 mdev_device_remove()
                                              [..]
         unregister device();
                                             /* not found in list or
                                              * active=false.
                                              */
                                                sysfs_create_file()
                                                ..Call trace
      
      Now that the mdev core follows the correct device removal sequence of the
      Linux bus model, remove shouldn't fail in normal cases. If it fails, there is
      no point in creating a stale file or checking for a specific error status.
      
      kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327 sysfs_create_file_ns+0x7f/0x90
      kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6
      kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016
      kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90
      kernel: Call Trace:
      kernel: remove_store+0xdc/0x100 [mdev]
      kernel: kernfs_fop_write+0x113/0x1a0
      kernel: vfs_write+0xad/0x1b0
      kernel: ksys_write+0x5a/0xe0
      kernel: do_syscall_64+0x5a/0x210
      kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio/mdev: Improve the create/remove sequence · 522ecce0
      Parav Pandit authored
      This patch addresses the two issues below and prepares the code to address
      the third issue listed below.
      
      1. The mdev device is placed on the mdev bus before it is created in the
      vendor driver. Once a device is placed on the mdev bus without its
      supporting underlying vendor device having been created, the mdev driver's
      probe() gets triggered. However, there isn't a stable mdev available to work on.
      
         create_store()
           mdev_device_create()
             device_register()
                ...
               vfio_mdev_probe()
              [...]
              parent->ops->create()
                vfio_ap_mdev_create()
                  mdev_set_drvdata(mdev, matrix_mdev);
                  /* Valid pointer set above */
      
      With this initialization order, an mdev driver that wants to use the mdev
      doesn't have a valid mdev to work on.
      
      2. The current creation sequence is:
         parent->ops->create()
         groups_register()
      
      The remove sequence is:
         parent->ops->remove()
         groups_unregister()
      
      However, the remove sequence should be an exact mirror of the creation
      sequence. Once this is achieved, all users of the mdev are terminated
      before the underlying vendor device is removed, which follows the standard
      Linux driver model.
      At that point the vendor's remove() op shouldn't fail, because taking the
      device off the bus should terminate any usage.
      
      3. When the remove operation fails, mdev sysfs removal attempts to add the
      file back on an already removed device. The following call trace [1] is observed.
      
      [1] call trace:
      kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327 sysfs_create_file_ns+0x7f/0x90
      kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6
      kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016
      kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90
      kernel: Call Trace:
      kernel: remove_store+0xdc/0x100 [mdev]
      kernel: kernfs_fop_write+0x113/0x1a0
      kernel: vfs_write+0xad/0x1b0
      kernel: ksys_write+0x5a/0xe0
      kernel: do_syscall_64+0x5a/0x210
      kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Therefore, the mdev core is improved in the following ways.
      
      1. Split the device registration/deregistration sequence so that some
      things can be done between initializing the device and hooking it
      up to the bus and, respectively, after deregistering it from the bus but
      before giving up our final reference.
      In particular, this means invoking the ->create() and ->remove()
      callbacks in those new windows. This gives the vendor driver an
      initialized mdev device to work with during creation.
      At the same time, a bus driver that wishes to bind to the mdev also
      gets an initialized mdev device.
      
      This follows the standard Linux kernel bus and device model.
      
      2. During the remove flow, first remove the device from the bus. This
      ensures that any bus-specific devices are removed.
      Once the device is taken off the mdev bus, invoke the vendor driver's
      remove() for the mdev.
      
      3. The driver core device model provides a way to register and
      automatically unregister the device sysfs attribute groups through
      dev->groups. Use dev->groups to let the core create the groups, and
      eliminate the code for explicit groups creation and removal.
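The ordering described in points 1 and 2 can be sketched as a small userspace model. The stage names and `model_*` helpers are hypothetical. The kernel calls named in the comments (device_initialize(), device_add(), device_del(), put_device()) are the standard halves of device_register() and device_unregister(); mapping them onto this patch is a reading of the text above, not a copy of the mdev code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stages of an mdev device's life in the new sequence. */
enum model_stage { STAGE_INIT, STAGE_VENDOR, STAGE_BUS };

#define MODEL_MAX_STEPS 3

struct model_trace {
	enum model_stage s[MODEL_MAX_STEPS];
	size_t n;
};

static void model_record(struct model_trace *t, enum model_stage st)
{
	assert(t->n < MODEL_MAX_STEPS);
	t->s[t->n++] = st;
}

/* Creation: initialize, let the vendor driver create, then join the bus. */
static void model_create(struct model_trace *t)
{
	model_record(t, STAGE_INIT);   /* device_initialize() */
	model_record(t, STAGE_VENDOR); /* parent->ops->create() */
	model_record(t, STAGE_BUS);    /* device_add(): probe() may run from here */
}

/* Removal: leave the bus first, then vendor remove, then drop the reference. */
static void model_remove(struct model_trace *t)
{
	model_record(t, STAGE_BUS);    /* device_del(): bus drivers unbind first */
	model_record(t, STAGE_VENDOR); /* parent->ops->remove() */
	model_record(t, STAGE_INIT);   /* put_device(): final reference */
}

/* True when the removal trace is the exact reverse of the creation trace. */
static bool model_is_mirror(const struct model_trace *c,
			    const struct model_trace *r)
{
	if (c->n != r->n)
		return false;
	for (size_t i = 0; i < c->n; i++)
		if (c->s[i] != r->s[r->n - 1 - i])
			return false;
	return true;
}
```

Because the vendor stage sits between initialization and bus membership on both paths, a bus driver's probe() only ever sees a device the vendor driver has already set up.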
      
      To verify that the new sequence is solid, the stack dump below was taken
      from a process that attempts to remove the device while the device is in
      use by the vfio driver and a user application.
      This stack dump validates that the vfio driver guards against such device
      removal while the device is in use.
      
       cat /proc/21962/stack
      [<0>] vfio_del_group_dev+0x216/0x3c0 [vfio]
      [<0>] mdev_remove+0x21/0x40 [mdev]
      [<0>] device_release_driver_internal+0xe8/0x1b0
      [<0>] bus_remove_device+0xf9/0x170
      [<0>] device_del+0x168/0x350
      [<0>] mdev_device_remove_common+0x1d/0x50 [mdev]
      [<0>] mdev_device_remove+0x8c/0xd0 [mdev]
      [<0>] remove_store+0x71/0x90 [mdev]
      [<0>] kernfs_fop_write+0x113/0x1a0
      [<0>] vfs_write+0xad/0x1b0
      [<0>] ksys_write+0x5a/0xe0
      [<0>] do_syscall_64+0x5a/0x210
      [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [<0>] 0xffffffffffffffff
      
      This prepares the code to eliminate calling device_create_file() in a
      subsequent patch.
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  2. 02 Jun, 2019 14 commits
    • Linux 5.2-rc3 · f2c7c76c
      Linus Torvalds authored
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7bd1d5ed
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Two fixes: a quirk for KVM guests running on certain AMD CPUs, and a
        KASAN related build fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor
        x86/boot: Provide KASAN compatible aliases for string routines
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6751b8d9
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "On the kernel side there's a bunch of ring-buffer ordering fixes for a
        reproducible bug, plus a PEBS constraints regression fix.
      
        Plus tooling fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tools headers UAPI: Sync kvm.h headers with the kernel sources
        perf record: Fix s390 missing module symbol and warning for non-root users
        perf machine: Read also the end of the kernel
        perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms
        perf session: Add missing swap ops for namespace events
        perf namespace: Protect reading thread's namespace
        tools headers UAPI: Sync drm/drm.h with the kernel
        tools headers UAPI: Sync drm/i915_drm.h with the kernel
        tools headers UAPI: Sync linux/fs.h with the kernel
        tools headers UAPI: Sync linux/sched.h with the kernel
        tools arch x86: Sync asm/cpufeatures.h with the with the kernel
        tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls
        perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel
        perf data: Fix 'strncat may truncate' build failure with recent gcc
        perf/ring-buffer: Use regular variables for nesting
        perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data
        perf/ring_buffer: Add ordering to rb->nest increment
        perf/ring_buffer: Fix exposing a temporarily decreased data_head
        perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints
    • Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · af042452
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "Two EFI fixes: a quirk for weird systabs, plus add more robust error
        handling in the old 1:1 mapping code"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Allow the number of EFI configuration tables entries to be zero
        efi/x86/Add missing error handling to old_memmap 1:1 mapping code
    • Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4fb5741c
      Linus Torvalds authored
      Pull stacktrace fix from Ingo Molnar:
       "Fix a stack_trace_save_tsk_reliable() regression"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        stacktrace: Unbreak stack_trace_save_tsk_reliable()
    • Merge tag 'spdx-5.2-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · a68dc618
      Linus Torvalds authored
      Pull SPDX fixes from Greg KH:
       "Here are just two small patches, that fix up some found SPDX
        identifier issues.
      
        The first patch fixes an error in a previous SPDX fixup patch, that
        causes build errors when doing 'make clean' on the tree (the fact that
        almost no one noticed it reflects the fact that kernel developers
        don't like doing that option very often...)
      
        The second patch fixes up a number of places in the tree where people
        mistyped the string "SPDX-License-Identifier". Given that people can
        not even type their own name all the time without mistakes, this was
        bound to happen, and odds are, we will have to add some type of check
        for this to checkpatch.pl to catch this happening in the future.
      
        Both of these have passed testing by 0-day"
      
      * tag 'spdx-5.2-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        treewide: fix typos of SPDX-License-Identifier
        crypto: ux500 - fix license comment syntax error
    • Merge tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 460b48a0
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "A minor fix to our IMC PMU code to print a less confusing error
        message when the driver can't initialise properly.
      
        A fix for a bug where a user requesting an unsupported branch sampling
        filter can corrupt PMU state, preventing the PMU from counting
        properly.
      
        And finally a fix for a bug in our support for kexec_file_load(),
        which prevented loading a kernel and initramfs. Most versions of kexec
        don't yet use kexec_file_load().
      
        Thanks to: Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi
        Bangoria, Thiago Jung Bauermann"
      
      * tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load()
        powerpc/perf: Fix MMCRA corruption by bhrb_filter
        powerpc/powernv: Return for invalid IMC domain
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · b44a1dd3
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Fixes for PPC and s390"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: PPC: Book3S HV: Restore SPRG3 in kvmhv_p9_guest_entry()
        KVM: PPC: Book3S HV: Fix lockdep warning when entering guest on POWER9
        KVM: PPC: Book3S HV: XIVE: Fix page offset when clearing ESB pages
        KVM: PPC: Book3S HV: XIVE: Take the srcu read lock when accessing memslots
        KVM: PPC: Book3S HV: XIVE: Do not clear IRQ data of passthrough interrupts
        KVM: PPC: Book3S HV: XIVE: Introduce a new mutex for the XIVE device
        KVM: PPC: Book3S HV: XIVE: Fix the enforced limit on the vCPU identifier
        KVM: PPC: Book3S HV: XIVE: Do not test the EQ flag validity when resetting
        KVM: PPC: Book3S HV: XIVE: Clear file mapping when device is released
        KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu
        KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
        KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup
        KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions
        KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID
        kvm: fix compile on s390 part 2
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 38baf0bb
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "A memleak fix for the core, two driver bugfixes, as well as fixing
        missing file patterns to MAINTAINERS"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        MAINTAINERS: add I2C DT bindings to ARM platforms
        MAINTAINERS: add DT bindings to i2c drivers
        i2c: synquacer: fix synquacer_i2c_doxfer() return value
        i2c: mlxcpld: Fix wrong initialization order in probe
        i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal · 378e853f
      Linus Torvalds authored
      Pull thermal SoC fix from Eduardo Valentin:
       "A single revert, detected to cause issues on the tsens driver"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
        Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled"
    • Merge tag 'led-fixes-for-5.2-rc3' of... · f58c356e
      Linus Torvalds authored
      Merge tag 'led-fixes-for-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
      
      Pull LED fix from Jacek Anaszewski:
       "Fix for a recent change in LED core, that didn't take into account the
        possibility of calling led_blink_setup() from atomic context"
      
      * tag 'led-fixes-for-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
        leds: avoid flush_work in atomic context
    • Merge tag 'for-linus-20190601' of git://git.kernel.dk/linux-block · 9221dced
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - A set of patches fixing code comments / kerneldoc (Bart)
      
       - Don't allow loop file change for exclusive open (Jan)
      
       - Fix revalidate of hidden genhd (Jan)
      
       - Init queue failure memory free fix (Jes)
      
       - Improve rq limits failure print (John)
      
       - Fixup for queue removal/addition (Ming)
      
       - Missed error propagation for io_uring buffer registration (Pavel)
      
      * tag 'for-linus-20190601' of git://git.kernel.dk/linux-block:
        block: print offending values when cloned rq limits are exceeded
        blk-mq: Document the blk_mq_hw_queue_to_node() arguments
        blk-mq: Fix spelling in a source code comment
        block: Fix bsg_setup_queue() kernel-doc header
        block: Fix rq_qos_wait() kernel-doc header
        block: Fix blk_mq_*_map_queues() kernel-doc headers
        block: Fix throtl_pending_timer_fn() kernel-doc header
        block: Convert blk_invalidate_devt() header into a non-kernel-doc header
        block/partitions/ldm: Convert a kernel-doc header into a non-kernel-doc header
        blk-mq: Fix memory leak in error handling
        block: don't protect generic_make_request_checks with blk_queue_enter
        block: move blk_exit_queue into __blk_release_queue
        block: Don't revalidate bdev of hidden gendisk
        loop: Don't change loop device under exclusive opener
        io_uring: Fix __io_uring_register() false success
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 1975b337
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Six minor fixes to device drivers and one to the multipath alua
        handler.
      
        The most extensive fix is the zfcp port remove prevention one, but
        its impact is only s390"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: libsas: delete sas port if expander discover failed
        scsi: libsas: only clear phy->in_shutdown after shutdown event done
        scsi: scsi_dh_alua: Fix possible null-ptr-deref
        scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask
        scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
        scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
        scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
    • Merge branch 'akpm' (patches from Andrew) · 7b3064f0
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "Various fixes and followups"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm, compaction: make sure we isolate a valid PFN
        include/linux/generic-radix-tree.h: fix kerneldoc comment
        kernel/signal.c: trace_signal_deliver when signal_group_exit
        drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used
        spdxcheck.py: fix directory structures
        kasan: initialize tag to 0xff in __kasan_kmalloc
        z3fold: fix sheduling while atomic
        scripts/gdb: fix invocation when CONFIG_COMMON_CLK is not set
        mm/gup: continue VM_FAULT_RETRY processing even for pre-faults
        ocfs2: fix error path kobject memory leak
        memcg: make it work on sparse non-0-node systems
        mm, memcg: consider subtrees in memory.events
        prctl_set_mm: downgrade mmap_sem to read lock
        prctl_set_mm: refactor checks from validate_prctl_map
        kernel/fork.c: make max_threads symbol static
        arch/arm/boot/compressed/decompress.c: fix build error due to lz4 changes
        arch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAK
        mm/vmalloc.c: fix typo in comment
        lib/sort.c: fix kernel-doc notation warnings
        mm: fix Documentation/vm/hmm.rst Sphinx warnings
  3. 01 Jun, 2019 23 commits