1. 03 Jul, 2017 2 commits
  2. 25 Jun, 2017 2 commits
    • xen: allocate page for shared info page from low memory · a5d5f328
      Juergen Gross authored
      In an HVM guest the kernel currently allocates the page for mapping
      the shared info structure via extend_brk(). This degrades performance:
      because the single shared info page is located in hypervisor memory,
      the underlying large EPT entry has to be split up into 4kB entries.
      
      The issue has been detected by using the libmicro munmap test:
      unmapping 8kB of memory was faster by nearly a factor of two when no
      pv interfaces were active in the HVM guest.
      
      So instead of taking a page from memory which might be mapped via
      large EPT entries use a page which is already mapped via a 4kB EPT
      entry: we can take a page from the first 1MB of memory as the video
      memory at 640kB disallows using larger EPT entries.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen: avoid deadlock in xenbus driver · 1a3fc2c4
      Juergen Gross authored
      There has been a report about a deadlock in the xenbus driver:
      
      [  247.979498] ======================================================
      [  247.985688] WARNING: possible circular locking dependency detected
      [  247.991882] 4.12.0-rc4-00022-gc4b25c0 #575 Not tainted
      [  247.997040] ------------------------------------------------------
      [  248.003232] xenbus/91 is trying to acquire lock:
      [  248.007875]  (&u->msgbuffer_mutex){+.+.+.}, at: [<ffff00000863e904>]
      xenbus_dev_queue_reply+0x3c/0x230
      [  248.017163]
      [  248.017163] but task is already holding lock:
      [  248.023096]  (xb_write_mutex){+.+...}, at: [<ffff00000863a940>]
      xenbus_thread+0x5f0/0x798
      [  248.031267]
      [  248.031267] which lock already depends on the new lock.
      [  248.031267]
      [  248.039615]
      [  248.039615] the existing dependency chain (in reverse order) is:
      [  248.047176]
      [  248.047176] -> #1 (xb_write_mutex){+.+...}:
      [  248.052943]        __lock_acquire+0x1728/0x1778
      [  248.057498]        lock_acquire+0xc4/0x288
      [  248.061630]        __mutex_lock+0x84/0x868
      [  248.065755]        mutex_lock_nested+0x3c/0x50
      [  248.070227]        xs_send+0x164/0x1f8
      [  248.074015]        xenbus_dev_request_and_reply+0x6c/0x88
      [  248.079427]        xenbus_file_write+0x260/0x420
      [  248.084073]        __vfs_write+0x48/0x138
      [  248.088113]        vfs_write+0xa8/0x1b8
      [  248.091983]        SyS_write+0x54/0xb0
      [  248.095768]        el0_svc_naked+0x24/0x28
      [  248.099897]
      [  248.099897] -> #0 (&u->msgbuffer_mutex){+.+.+.}:
      [  248.106088]        print_circular_bug+0x80/0x2e0
      [  248.110730]        __lock_acquire+0x1768/0x1778
      [  248.115288]        lock_acquire+0xc4/0x288
      [  248.119417]        __mutex_lock+0x84/0x868
      [  248.123545]        mutex_lock_nested+0x3c/0x50
      [  248.128016]        xenbus_dev_queue_reply+0x3c/0x230
      [  248.133005]        xenbus_thread+0x788/0x798
      [  248.137306]        kthread+0x110/0x140
      [  248.141087]        ret_from_fork+0x10/0x40
      
      It is rather easy to avoid by dropping xb_write_mutex before calling
      xenbus_dev_queue_reply().
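
The lock ordering fix described above can be sketched in user space with pthreads. This is only an illustration under stated assumptions: the two mutexes are stand-ins named after the locks in the lockdep report, and queue_reply(), file_write_path() and reply_path_fixed() are hypothetical helpers, not the driver's actual functions. The writer path is assumed to hold msgbuffer_mutex while taking xb_write_mutex, as the reported dependency chain implies.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in the lockdep report. */
static pthread_mutex_t xb_write_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t msgbuffer_mutex = PTHREAD_MUTEX_INITIALIZER;

static int replies_queued;

/* Stand-in for xenbus_dev_queue_reply(): it takes msgbuffer_mutex. */
static void queue_reply(void)
{
    pthread_mutex_lock(&msgbuffer_mutex);
    replies_queued++;
    pthread_mutex_unlock(&msgbuffer_mutex);
}

/* Writer path: msgbuffer_mutex first, then xb_write_mutex. */
static void file_write_path(void)
{
    pthread_mutex_lock(&msgbuffer_mutex);
    pthread_mutex_lock(&xb_write_mutex);
    /* ... enqueue the request ... */
    pthread_mutex_unlock(&xb_write_mutex);
    pthread_mutex_unlock(&msgbuffer_mutex);
}

/*
 * Reply path after the fix: xb_write_mutex is released before
 * queue_reply() is called, so this path never holds xb_write_mutex
 * while waiting for msgbuffer_mutex and the cycle cannot form.
 */
static void reply_path_fixed(void)
{
    pthread_mutex_lock(&xb_write_mutex);
    /* ... take the finished request off the list ... */
    pthread_mutex_unlock(&xb_write_mutex);

    queue_reply();
}
```

With this ordering the only nested acquisition left is msgbuffer_mutex followed by xb_write_mutex, so both paths agree on a single lock order.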
      
      Fixes: fd8aa909 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
      
      Cc: <stable@vger.kernel.org> # 4.11
      Reported-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Tested-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
  3. 15 Jun, 2017 4 commits
  4. 13 Jun, 2017 6 commits
    • xen/vcpu: Handle xen_vcpu_setup() failure at boot · ae039001
      Ankur Arora authored
      On PVH and PVHVM, if the VCPUOP_register_vcpu_info hypercall fails,
      we limit the number of cpus to MAX_VIRT_CPUS. However, if this
      failure occurred for a cpu beyond MAX_VIRT_CPUS, we continue
      to function with > MAX_VIRT_CPUS.
      
      This leads to problems at the next save/restore cycle when there
      are > MAX_VIRT_CPUS threads going into stop_machine() but coming
      back up there's valid state for only the first MAX_VIRT_CPUS.
      
      This patch pulls the excess CPUs down via cpu_down().
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen/vcpu: Handle xen_vcpu_setup() failure in hotplug · c9b5d98b
      Ankur Arora authored
      The hypercall VCPUOP_register_vcpu_info can fail. This failure is
      handled by making per_cpu(xen_vcpu, cpu) point to its shared_info
      slot and those without one (cpu >= MAX_VIRT_CPUS) be NULL.
      
      For PVH/PVHVM, this is not enough, because we also need to pull
      these VCPUs out of circulation.
      
      Fix for PVH/PVHVM: on registration failure in the cpuhp prepare
      callback (xen_cpu_up_prepare_hvm()), return an error to the cpuhp
      state-machine so it can fail the CPU init.
      
      Fix for PV: the registration happens before smp_init(), so, in the
      failure case we clamp setup_max_cpus and limit the number of VCPUs
      that smp_init() will bring-up to MAX_VIRT_CPUS.
      This is functionally correct but it makes the code a bit simpler
      if we get rid of this explicit clamping: for VCPUs that don't have
      valid xen_vcpu, fail the CPU init in the cpuhp prepare callback
      (xen_cpu_up_prepare_pv()).
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen/pv: Fix OOPS on restore for a PV, !SMP domain · 0e4d5837
      Ankur Arora authored
      If CONFIG_SMP is disabled, xen_setup_vcpu_info_placement() is called from
      xen_setup_shared_info(). This is fine as far as boot goes, but it means
      that we also call it in the restore path. This results in an OOPS
      because we assign to pv_mmu_ops.read_cr2 which is __ro_after_init.
      
      Also, though less problematically, this means we call xen_vcpu_setup()
      twice at restore -- once from the vcpu info placement call and the
      second time from xen_vcpu_restore().
      
      Fix by calling xen_setup_vcpu_info_placement() at boot only.
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen/pvh*: Support > 32 VCPUs at domain restore · 0b64ffb8
      Ankur Arora authored
      When Xen restores a PVHVM or PVH guest, its shared_info only holds
      up to 32 CPUs. The hypercall VCPUOP_register_vcpu_info allows
      us to set up per-page areas for VCPUs. This means we can boot
      PVH* guests with more than 32 VCPUs. During restore the per-cpu
      structure is allocated freshly by the hypervisor (vcpu_info_mfn is
      set to INVALID_MFN) so that the newly restored guest can make a
      VCPUOP_register_vcpu_info hypercall.
      
      However, we end up triggering this condition in Xen:
      /* Run this command on yourself or on other offline VCPUS. */
       if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) )
      
      which means we are unable to set up the per-cpu VCPU structures
      for running VCPUs. The Linux PV code path makes this work by
      iterating over cpu_possible in xen_vcpu_restore() with:
      
       1) is the target CPU up (VCPUOP_is_up hypercall)?
       2) if yes, then VCPUOP_down to pause it
       3) VCPUOP_register_vcpu_info
       4) if it was up before step 2, then VCPUOP_up to bring it back up
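
The four steps above can be sketched as a small user-space simulation. Everything here is an assumption made for illustration: the mock_* helpers and the two state arrays stand in for the real hypercalls and hypervisor state, NR_CPUS_SKETCH is an arbitrary small count, and the current CPU is skipped for the down/up steps because the quoted Xen check lets a VCPU register itself while running.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 4 /* hypothetical CPU count, kept small */

/* Mock hypervisor state; the kernel would issue vcpu_op hypercalls instead. */
static bool vcpu_up[NR_CPUS_SKETCH];
static bool vcpu_info_registered[NR_CPUS_SKETCH];

static bool mock_is_up(int cpu) { return vcpu_up[cpu]; }
static void mock_down(int cpu)  { vcpu_up[cpu] = false; }
static void mock_up(int cpu)    { vcpu_up[cpu] = true; }

/*
 * Mirrors the Xen check quoted above: registration is refused for a
 * VCPU that is neither the caller nor offline.
 */
static int mock_register_vcpu_info(int cpu, int current_cpu)
{
    if (cpu != current_cpu && vcpu_up[cpu])
        return -1;
    vcpu_info_registered[cpu] = true;
    return 0;
}

/* Walk all CPUs and perform the down / register / up sequence. */
static int restore_vcpus(int current_cpu)
{
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        bool other_cpu = (cpu != current_cpu);
        bool was_up = mock_is_up(cpu);          /* step 1 */

        if (other_cpu && was_up)
            mock_down(cpu);                     /* step 2 */

        if (mock_register_vcpu_info(cpu, current_cpu) != 0)
            return -1;                          /* step 3 failed */

        if (other_cpu && was_up)
            mock_up(cpu);                       /* step 4 */
    }
    return 0;
}
```

A CPU that was offline before the loop stays offline afterwards; only CPUs that were taken down in step 2 are brought back up.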
      
      With Xen commit 192df6f9122d ("xen/x86: allow HVM guests to use
      hypercalls to bring up vCPUs") this is available for non-PV guests.
      As such first check if VCPUOP_is_up is actually possible before
      trying this dance.
      
      As most of this dance code is done already in xen_vcpu_restore()
      let's make it callable on PV, PVH and PVHVM.
      Based-on-patch-by: Konrad Wilk <konrad.wilk@oracle.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen/vcpu: Simplify xen_vcpu related code · ad73fd59
      Ankur Arora authored
      Largely mechanical changes to aid unification of xen_vcpu_restore()
      logic for PV, PVH and PVHVM.
      
      xen_vcpu_setup(): the only change in logic is that clamp_max_cpus()
      is now handled inside the "if (!xen_have_vcpu_info_placement)" block.
      
      xen_vcpu_restore(): code movement from enlighten_pv.c to enlighten.c.
      
      xen_vcpu_info_reset(): pulls together all the code where xen_vcpu
      is set to default.
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU · c48f64ab
      Anoob Soman authored
      An HVM domain booting generates around 200K (evtchn:qemu-dm xen-dyn)
      interrupts in a short period of time. All these evtchn:qemu-dm are bound
      to VCPU 0 until irqbalance sees these IRQs and moves them to a different VCPU.
      In one configuration, irqbalance runs every 10 seconds, which means
      irqbalance doesn't get to see this burst of interrupts and doesn't
      re-balance interrupts most of the time, so all evtchn:qemu-dm are
      processed by VCPU0. This causes VCPU0 to spend most of its time processing
      hardirqs and very little time on softirqs. Moreover, if dom0 kernel PREEMPTION
      is disabled, VCPU0 never runs the watchdog (process context), triggering the
      softlockup detection code to panic.
      
      Binding evtchn:qemu-dm to the next online VCPU will spread hardirq
      processing evenly across different CPUs. Later, irqbalance will try to balance
      evtchn:qemu-dm, if required.
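
The idea of binding each new event channel to the next online VCPU can be sketched as a round-robin pick over a bitmask of online CPUs. This is a minimal sketch, not the driver's code: next_online_cpu(), MAX_CPUS_SKETCH and the 64-bit mask representation are assumptions for illustration, and prev is assumed to lie in the range -1 to 63.

```c
#include <stdint.h>

#define MAX_CPUS_SKETCH 64 /* hypothetical limit: one bit per CPU */

/*
 * Return the next online CPU after 'prev', wrapping around at the end
 * of the mask. Pass prev = -1 to start from CPU 0. Returns -1 when no
 * CPU is online. If 'prev' is the only online CPU, it is returned again.
 */
static int next_online_cpu(uint64_t online_mask, int prev)
{
    for (int step = 1; step <= MAX_CPUS_SKETCH; step++) {
        int cpu = (prev + step) % MAX_CPUS_SKETCH;

        if (online_mask & (UINT64_C(1) << cpu))
            return cpu;
    }
    return -1;
}
```

A caller would remember the CPU chosen for the previous binding and pass it as prev, so that consecutive bindings land on different online CPUs instead of all on VCPU 0.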
      Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
  5. 08 Jun, 2017 2 commits
  6. 07 Jun, 2017 1 commit
  7. 05 Jun, 2017 3 commits
  8. 04 Jun, 2017 9 commits
    • Linux 4.12-rc4 · 3c2993b8
      Linus Torvalds authored
    • fs/ufs: Set UFS default maximum bytes per file · 239e250e
      Richard Narron authored
      This fixes a problem with reading files larger than 2GB from a UFS-2
      file system:
      
          https://bugzilla.kernel.org/show_bug.cgi?id=195721
      
      The incorrect UFS s_maxbytes limit became a problem as of commit
      c2a9737f ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
      which started using s_maxbytes to avoid a page index overflow in
      do_generic_file_read().
      
      That caused files to be truncated on UFS-2 file systems because the
      default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it.
      
      Here I simply increase the default to a common value used by other file
      systems.
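
The effect of the too-small limit can be sketched with a simplified clamp. This is an assumption-laden illustration, not the kernel's read path: clamp_read() and the *_SKETCH constants are hypothetical names, and the larger limit is just an arbitrary value above 2GB chosen for the example.

```c
#include <stdint.h>

/* 2GB - 1, the value the message refers to as MAX_NON_LFS. */
#define MAX_NON_LFS_SKETCH ((INT64_C(1) << 31) - 1)

/* Arbitrary larger limit, chosen only for this illustration. */
#define LARGE_LIMIT_SKETCH (INT64_C(1) << 40)

/*
 * Simplified model of a read bounded by a per-filesystem maximum:
 * nothing at or beyond 'maxbytes' is returned, and a read that
 * crosses the limit is cut short.
 */
static int64_t clamp_read(int64_t pos, int64_t count, int64_t maxbytes)
{
    if (pos >= maxbytes)
        return 0;
    if (count > maxbytes - pos)
        count = maxbytes - pos;
    return count;
}
```

With the 2GB default, a read starting at or past offset 2^31 - 1 returns nothing, which matches the truncation described above; with a larger limit the same read goes through unchanged.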
      Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Will B <will.brokenbourgh2877@gmail.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <stable@vger.kernel.org> # v4.9 and backports of c2a9737f
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 125f42b0
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Bugfixes include:
      
         - Fix a typo in commit e0926934 ("NFS append COMMIT after
           synchronous COPY") that breaks copy offload
      
         - Fix the connect error propagation in xs_tcp_setup_socket()
      
         - Fix a lock leak in nfs40_walk_client_list
      
         - Verify that pNFS requests lie within the offset range of the layout
           segment"
      
      * tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        nfs: Mark unnecessarily extern functions as static
        SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()
        NFSv4.0: Fix a lock leak in nfs40_walk_client_list
        pnfs: Fix the check for requests in range of layout segment
        xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup()
        pNFS/flexfiles: missing error code in ff_layout_alloc_lseg()
        NFS fix COMMIT after COPY
    • Merge tag 'tty-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 3c06e6cb
      Linus Torvalds authored
      Pull tty fix from Greg KH:
       "Here is a single tty core fix for 4.12-rc4. It reverts a patch that a
        lot of people reported as causing lockdep and other warnings.
      
        Right after I reverted this in my tree, it seems like another
        "correct" fix might have shown up, but it's too late in the release
        cycle to be messing with tty core locking, so let's just revert this
        for now to go back to how things always have been and try it again for
        4.13.
      
        This has not been in linux-next as I only reverted it a few hours ago"
      
      * tag 'tty-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "tty: fix port buffer locking"
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · e00811b4
      Linus Torvalds authored
      Pull input subsystem fixes from Dmitry Torokhov:
      
       - a couple of regression fixes in synaptics and axp20x-pek drivers
      
       - try to ease transition from PS/2 to RMI for Synaptics touchpad users
         by ensuring we do not try to activate RMI mode when RMI SMBus support
         is not enabled, and nag users a bit to enable it
      
       - plus a couple of other changes that seemed worthwhile for this
         release
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: axp20x-pek - switch to acpi_dev_present and check for ACPI0011 too
        Input: axp20x-pek - only check for "INTCFD9" ACPI device on Cherry Trail
        Input: tm2-touchkey - use LED_ON as boolean value instead of LED_FULL
        Input: synaptics - tell users to report when they should be using rmi-smbus
        Input: synaptics - warn the users when there is a better mode
        Input: synaptics - keep PS/2 around when RMI4_SMB is not enabled
        Input: synaptics - clear device info before filling in
        Input: silead - disable interrupt during suspend
    • Merge tag 'rtc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 9f03b2c7
      Linus Torvalds authored
      Pull RTC fixlet from Alexandre Belloni:
       "A single patch, not really a fix but I don't think there is any reason
        to delay it.
      
        Change the mailing list address"
      
      * tag 'rtc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
        MAINTAINERS: update RTC mailing list
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 1f915b7f
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is nine fixes, seven of which are for the qedi driver (new as of
        4.10); the other two are a use after free in the cxgbi drivers and a
        potential NULL dereference in the rdac device handler"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: libcxgbi: fix skb use after free
        scsi: qedi: Fix endpoint NULL panic during recovery.
        scsi: qedi: set max_fin_rt default value
        scsi: qedi: Set firmware tcp msl timer value.
        scsi: qedi: Fix endpoint NULL panic in qedi_set_path.
        scsi: qedi: Set dma_boundary to 0xfff.
        scsi: qedi: Correctly set firmware max supported BDs.
        scsi: qedi: Fix bad pte call trace when iscsiuio is stopped.
        scsi: scsi_dh_rdac: Use ctlr directly in rdac_failover_get()
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 55cbdaf6
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "For the most part this is just a minor -rc cycle for the rdma
        subsystem. Even given that this is all of the -rc patches since the
        merge window closed, it's still only about 25 patches:
      
         - Multiple i40iw, nes, iw_cxgb4, hfi1, qib, mlx4, mlx5 fixes
      
         - A few upper layer protocol fixes (IPoIB, iSER, SRP)
      
         - A modest number of core fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (26 commits)
        RDMA/SA: Fix kernel panic in CMA request handler flow
        RDMA/umem: Fix missing mmap_sem in get umem ODP call
        RDMA/core: not to set page dirty bit if it's already set.
        RDMA/uverbs: Declare local function static and add brackets to sizeof
        RDMA/netlink: Reduce exposure of RDMA netlink functions
        RDMA/srp: Fix NULL deref at srp_destroy_qp()
        RDMA/IPoIB: Limit the ipoib_dev_uninit_default scope
        RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings
        RDMA/qedr: add null check before pointer dereference
        RDMA/mlx5: set UMR wqe fence according to HCA cap
        net/mlx5: Define interface bits for fencing UMR wqe
        RDMA/mlx4: Fix MAD tunneling when SRIOV is enabled
        RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
        RDMA/hfi1: Defer setting VL15 credits to link-up interrupt
        RDMA/hfi1: change PCI bar addr assignments to Linux API functions
        RDMA/hfi1: fix array termination by appending NULL to attr array
        RDMA/iw_cxgb4: fix the calculation of ipv6 header size
        RDMA/iw_cxgb4: calculate t4_eq_status_entries properly
        RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers
        RDMA/nes: ACK MPA Reply frame
        ...
    • Revert "tty: fix port buffer locking" · fc098af1
      Greg Kroah-Hartman authored
      This reverts commit 925bb1ce.
      
      It causes lots of warnings and problems so for now, let's just revert
      it.
      
      Reported-by: <valdis.kletnieks@vt.edu>
      Reported-by: Russell King <linux@armlinux.org.uk>
      Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Reported-by: Jiri Slaby <jslaby@suse.cz>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Acked-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. 03 Jun, 2017 8 commits
  10. 02 Jun, 2017 3 commits