1. 24 May, 2016 5 commits
    • Ross Lagerwall's avatar
      xen/events: Don't move disabled irqs · f0f39387
      Ross Lagerwall authored
      Commit ff1e22e7 ("xen/events: Mask a moving irq") open-coded
      irq_move_irq() but left out checking if the IRQ is disabled. This broke
      resuming from suspend since it tries to move a (disabled) irq without
      holding the IRQ's desc->lock. Fix it by adding in a check for disabled
      IRQs.
      
      The resulting stacktrace was:
      kernel BUG at /build/linux-UbQGH5/linux-4.4.0/kernel/irq/migration.c:31!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: xenfs xen_privcmd ...
      CPU: 0 PID: 9 Comm: migration/0 Not tainted 4.4.0-22-generic #39-Ubuntu
      Hardware name: Xen HVM domU, BIOS 4.6.1-xs125180 05/04/2016
      task: ffff88003d75ee00 ti: ffff88003d7bc000 task.ti: ffff88003d7bc000
      RIP: 0010:[<ffffffff810e26e2>]  [<ffffffff810e26e2>] irq_move_masked_irq+0xd2/0xe0
      RSP: 0018:ffff88003d7bfc50  EFLAGS: 00010046
      RAX: 0000000000000000 RBX: ffff88003d40ba00 RCX: 0000000000000001
      RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff88003d40bad8
      RBP: ffff88003d7bfc68 R08: 0000000000000000 R09: ffff88003d000000
      R10: 0000000000000000 R11: 000000000000023c R12: ffff88003d40bad0
      R13: ffffffff81f3a4a0 R14: 0000000000000010 R15: 00000000ffffffff
      FS:  0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fd4264de624 CR3: 0000000037922000 CR4: 00000000003406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Stack:
       ffff88003d40ba38 0000000000000024 0000000000000000 ffff88003d7bfca0
       ffffffff814c8d92 00000010813ef89d 00000000805ea732 0000000000000009
       0000000000000024 ffff88003cc39b80 ffff88003d7bfce0 ffffffff814c8f66
      Call Trace:
       [<ffffffff814c8d92>] eoi_pirq+0xb2/0xf0
       [<ffffffff814c8f66>] __startup_pirq+0xe6/0x150
       [<ffffffff814ca659>] xen_irq_resume+0x319/0x360
       [<ffffffff814c7e75>] xen_suspend+0xb5/0x180
       [<ffffffff81120155>] multi_cpu_stop+0xb5/0xe0
       [<ffffffff811200a0>] ? cpu_stop_queue_work+0x80/0x80
       [<ffffffff811203d0>] cpu_stopper_thread+0xb0/0x140
       [<ffffffff810a94e6>] ? finish_task_switch+0x76/0x220
       [<ffffffff810ca731>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
       [<ffffffff810a3935>] smpboot_thread_fn+0x105/0x160
       [<ffffffff810a3830>] ? sort_range+0x30/0x30
       [<ffffffff810a0588>] kthread+0xd8/0xf0
       [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
       [<ffffffff8182568f>] ret_from_fork+0x3f/0x70
       [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
      Signed-off-by: default avatarRoss Lagerwall <ross.lagerwall@citrix.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      f0f39387
    • Stefano Stabellini's avatar
      xen/x86: actually allocate legacy interrupts on PV guests · 702f9260
      Stefano Stabellini authored
      b4ff8389 is incomplete: relies on nr_legacy_irqs() to get the number
      of legacy interrupts when actually nr_legacy_irqs() returns 0 after
      probe_8259A(). Use NR_IRQS_LEGACY instead.
      Signed-off-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      CC: stable@vger.kernel.org
      702f9260
    • Arnd Bergmann's avatar
      Xen: don't warn about 2-byte wchar_t in efi · 971a69db
      Arnd Bergmann authored
      The XEN UEFI code has become available on the ARM architecture
      recently, but now causes a link-time warning:
      
      ld: warning: drivers/xen/efi.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail
      
      This seems harmless, because the efi code only uses 2-byte
      characters when interacting with EFI, so we don't pass on those
      strings to elsewhere in the system, and we just need to
      silence the warning.
      
      It is not clear to me whether we actually need to build the file
      with the -fshort-wchar flag, but if we do, then we should also
      pass --no-wchar-size-warning to the linker, to avoid the warning.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Fixes: 37060935dc04 ("ARM64: XEN: Add a function to initialize Xen specific UEFI runtime services")
      971a69db
    • David Vrabel's avatar
      xen/gntdev: reduce copy batch size to 16 · 36ae220a
      David Vrabel authored
      IOCTL_GNTDEV_GRANT_COPY batches copy operations to reduce the number
      of hypercalls.  The stack is used to avoid a memory allocation in a
      hot path. However, a batch size of 24 requires more than 1024 bytes of
      stack which in some configurations causes a compiler warning.
      
          xen/gntdev.c: In function ‘gntdev_ioctl_grant_copy’:
          xen/gntdev.c:949:1: warning: the frame size of 1248 bytes is
          larger than 1024 bytes [-Wframe-larger-than=]
      
      This is a harmless warning as there is still plenty of stack spare,
      but people keep trying to "fix" it.  Reduce the batch size to 16 to
      reduce stack usage to less than 1024 bytes.  This should have minimal
      impact on performance.
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      36ae220a
    • Stefano Stabellini's avatar
      xen/x86: don't lose event interrupts · c06b6d70
      Stefano Stabellini authored
      On slow platforms with unreliable TSC, such as QEMU emulated machines,
      it is possible for the kernel to request the next event in the past. In
      that case, in the current implementation of xen_vcpuop_clockevent, we
      simply return -ETIME. To be precise the Xen returns -ETIME and we pass
      it on. However the result of this is a missed event, which simply causes
      the kernel to hang.
      
      Instead it is better to always ask the hypervisor for a timer event,
      even if the timeout is in the past. That way there are no lost
      interrupts and the kernel survives. To do that, remove the
      VCPU_SSHOTTMR_future flag.
      Signed-off-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Acked-by: default avatarJuergen Gross <jgross@suse.com>
      c06b6d70
  2. 18 Apr, 2016 1 commit
  3. 17 Apr, 2016 5 commits
  4. 16 Apr, 2016 7 commits
  5. 15 Apr, 2016 17 commits
  6. 14 Apr, 2016 5 commits
    • Mike Snitzer's avatar
      dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros · 9567366f
      Mike Snitzer authored
      The READ_LOCK macro was incorrectly returning -EINVAL if
      dm_bm_is_read_only() was true -- it will always be true once the cache
      metadata transitions to read-only by dm_cache_metadata_set_read_only().
      
      Wrap READ_LOCK and WRITE_LOCK multi-statement macros in do {} while(0).
      Also, all accesses of the 'cmd' argument passed to these related macros
      are now encapsulated in parenthesis.
      
      A follow-up patch can be developed to eliminate the use of macros in
      favor of pure C code.  Avoiding that now given that this needs to apply
      to stable@.
      Reported-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Fixes: d14fcf3d ("dm cache: make sure every metadata function checks fail_io")
      Cc: stable@vger.kernel.org
      9567366f
    • Keith Busch's avatar
      NVMe: Always use MSI/MSI-x interrupts · a5229050
      Keith Busch authored
      Multiple users have reported device initialization failure due the driver
      not receiving legacy PCI interrupts. This is not unique to any particular
      controller, but has been observed on multiple platforms.
      
      There have been no issues reported or observed when with message signaled
      interrupts, so this patch attempts to use MSI-x during initialization,
      falling back to MSI. If that fails, legacy would become the default.
      
      The setup_io_queues error handling had to change as a result: the admin
      queue's msix_entry used to be initialized to the legacy IRQ. The case
      where nr_io_queues is 0 would fail request_irq when setting up the admin
      queue's interrupt since re-enabling MSI-x fails with 0 vectors, leaving
      the admin queue's msix_entry invalid. Instead, return success immediately.
      Reported-by: default avatarTim Muhlemmer <muhlemmer@gmail.com>
      Reported-by: default avatarJon Derrick <jonathan.derrick@intel.com>
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      a5229050
    • Linus Torvalds's avatar
      /proc/iomem: only expose physical resource addresses to privileged users · 51d7b120
      Linus Torvalds authored
      In commit c4004b02 ("x86: remove the kernel code/data/bss resources
      from /proc/iomem") I was hoping to remove the phyiscal kernel address
      data from /proc/iomem entirely, but that had to be reverted because some
      system programs actually use it.
      
      This limits all the detailed resource information to properly
      credentialed users instead.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      51d7b120
    • Linus Torvalds's avatar
      pci-sysfs: use proper file capability helper function · ab0fa82b
      Linus Torvalds authored
      The PCI config access checked the file capabilities correctly, but used
      the itnernal security capability check rather than the helper function
      that is actually meant for that.
      
      The security_capable() has unusual return values and is not meant to be
      used elsewhere (the only other use is in the capability checking
      functions that we actually intend people to use, and this odd PCI usage
      really stood out when looking around the capability code.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ab0fa82b
    • Linus Torvalds's avatar
      Make file credentials available to the seqfile interfaces · 34dbbcdb
      Linus Torvalds authored
      A lot of seqfile users seem to be using things like %pK that uses the
      credentials of the current process, but that is actually completely
      wrong for filesystem interfaces.
      
      The unix semantics for permission checking files is to check permissions
      at _open_ time, not at read or write time, and that is not just a small
      detail: passing off stdin/stdout/stderr to a suid application and making
      the actual IO happen in privileged context is a classic exploit
      technique.
      
      So if we want to be able to look at permissions at read time, we need to
      use the file open credentials, not the current ones.  Normal file
      accesses can just use "f_cred" (or any of the helper functions that do
      that, like file_ns_capable()), but the seqfile interfaces do not have
      any such options.
      
      It turns out that seq_file _does_ save away the user_ns information of
      the file, though.  Since user_ns is just part of the full credential
      information, replace that special case with saving off the cred pointer
      instead, and suddenly seq_file has all the permission information it
      needs.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      34dbbcdb