1. 12 Apr, 2017 27 commits
  2. 08 Apr, 2017 13 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.10.9 · f6392b77
      Greg Kroah-Hartman authored
      f6392b77
    • Zhi Wang's avatar
      drm/i915: A hotfix for making aliasing PPGTT work for GVT-g · 59529be9
      Zhi Wang authored
      commit 3e52d71e upstream.
      
      This patch makes PPGTT page table non-shrinkable when using aliasing PPGTT
      mode. It's just a temporary solution for making GVT-g work.
      
      Fixes: 2ce5179f ("drm/i915/gtt: Free unused lower-level page tables")
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Michal Winiarski <michal.winiarski@intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
      Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
      Signed-off-by: default avatarZhi Wang <zhi.a.wang@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1486559013-25251-2-git-send-email-zhi.a.wang@intel.comReviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      (cherry picked from commit e81ecb5e)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      59529be9
    • Zhi Wang's avatar
      drm/i915: Let execlist_update_context() cover !FULL_PPGTT mode. · 0efab45f
      Zhi Wang authored
      commit 26d12c61 upstream.
      
      execlist_update_context() will try to update PDPs in a context before a
      ELSP submission only for full PPGTT mode, while PDPs was populated during
      context initialization. Now the latter code path is removed. Let
      execlist_update_context() also cover !FULL_PPGTT mode.
      
      Fixes: 34869776 ("drm/i915: check ppgtt validity when init reg state")
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Michal Winiarski <michal.winiarski@intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
      Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
      Signed-off-by: default avatarZhi Wang <zhi.a.wang@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1486377436-15380-1-git-send-email-zhi.a.wang@intel.comReviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      (cherry picked from commit 04da811b)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0efab45f
    • Zhi Wang's avatar
      drm/i915: Move the release of PT page to the upper caller · e47bc4fb
      Zhi Wang authored
      commit a18dbba8 upstream.
      
      a PT page will be released if it doesn't contain any meaningful mappings
      during PPGTT page table shrinking. The PT entry in the upper level will
      be set to a scratch entry.
      
      Normally this works nicely, but in virtualization world, the PPGTT page
      table is tracked by hypervisor. Releasing the PT page before modifying
      the upper level PT entry would cause extra efforts.
      
      As the tracked page has been returned to OS before losing track from
      hypervisor, it could be written in any pattern. Hypervisor has to recognize
      if a page is still being used as a PT page by validating these writing
      patterns. It's complicated. Better let the guest modify the PT entry in
      upper level PT first, then release the PT page.
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarMichał Winiarski <michal.winiarski@intel.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
      Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
      Signed-off-by: default avatarZhi Wang <zhi.a.wang@intel.com>
      Link: https://patchwork.freedesktop.org/patch/122697/msgid/1479728666-25333-1-git-send-email-zhi.a.wang@intel.comSigned-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1480402516-22275-1-git-send-email-zhi.a.wang@intel.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e47bc4fb
    • Keith Busch's avatar
      nvme/pci: Disable on removal when disconnected · e33cb974
      Keith Busch authored
      commit 6db28eda upstream.
      
      If the device is not present, the driver should disable the queues
      immediately. Prior to this, the driver was relying on the watchdog timer
      to kill the queues if requests were outstanding to the device, and that
      just delays removal up to one second.
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e33cb974
    • Keith Busch's avatar
      nvme/core: Fix race kicking freed request_queue · 2bfe1b12
      Keith Busch authored
      commit f33447b9 upstream.
      
      If a namespace has already been marked dead, we don't want to kick the
      request_queue again since we may have just freed it from another thread.
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bfe1b12
    • Jason A. Donenfeld's avatar
      padata: avoid race in reordering · 311cd5ae
      Jason A. Donenfeld authored
      commit de5540d0 upstream.
      
      Under extremely heavy uses of padata, crashes occur, and with list
      debugging turned on, this happens instead:
      
      [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
      __list_add+0xae/0x130
      [87487.301868] list_add corruption. prev->next should be next
      (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
      [87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
      [87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
      [87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
      [87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
      [87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
      [87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
      [87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
      [87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
      
      padata_reorder calls list_add_tail with the list to which its adding
      locked, which seems correct:
      
      spin_lock(&squeue->serial.lock);
      list_add_tail(&padata->list, &squeue->serial.list);
      spin_unlock(&squeue->serial.lock);
      
      This therefore leaves only place where such inconsistency could occur:
      if padata->list is added at the same time on two different threads.
      This pdata pointer comes from the function call to
      padata_get_next(pd), which has in it the following block:
      
      next_queue = per_cpu_ptr(pd->pqueue, cpu);
      padata = NULL;
      reorder = &next_queue->reorder;
      if (!list_empty(&reorder->list)) {
             padata = list_entry(reorder->list.next,
                                 struct padata_priv, list);
             spin_lock(&reorder->lock);
             list_del_init(&padata->list);
             atomic_dec(&pd->reorder_objects);
             spin_unlock(&reorder->lock);
      
             pd->processed++;
      
             goto out;
      }
      out:
      return padata;
      
      I strongly suspect that the problem here is that two threads can race
      on reorder list. Even though the deletion is locked, call to
      list_entry is not locked, which means it's feasible that two threads
      pick up the same padata object and subsequently call list_add_tail on
      them at the same time. The fix is thus be hoist that lock outside of
      that block.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Acked-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      311cd5ae
    • NeilBrown's avatar
      blk: Ensure users for current->bio_list can see the full list. · a591a05f
      NeilBrown authored
      commit f5fe1b51 upstream.
      
      Commit 79bd9959 ("blk: improve order of bio handling in generic_make_request()")
      changed current->bio_list so that it did not contain *all* of the
      queued bios, but only those submitted by the currently running
      make_request_fn.
      
      There are two places which walk the list and requeue selected bios,
      and others that check if the list is empty.  These are no longer
      correct.
      
      So redefine current->bio_list to point to an array of two lists, which
      contain all queued bios, and adjust various code to test or walk both
      lists.
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Fixes: 79bd9959 ("blk: improve order of bio handling in generic_make_request()")
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Cc: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a591a05f
    • NeilBrown's avatar
      blk: improve order of bio handling in generic_make_request() · 75a778ed
      NeilBrown authored
      commit 79bd9959 upstream.
      
      To avoid recursion on the kernel stack when stacked block devices
      are in use, generic_make_request() will, when called recursively,
      queue new requests for later handling.  They will be handled when the
      make_request_fn for the current bio completes.
      
      If any bios are submitted by a make_request_fn, these will ultimately
      be handled seqeuntially.  If the handling of one of those generates
      further requests, they will be added to the end of the queue.
      
      This strict first-in-first-out behaviour can lead to deadlocks in
      various ways, normally because a request might need to wait for a
      previous request to the same device to complete.  This can happen when
      they share a mempool, and can happen due to interdependencies
      particular to the device.  Both md and dm have examples where this happens.
      
      These deadlocks can be erradicated by more selective ordering of bios.
      Specifically by handling them in depth-first order.  That is: when the
      handling of one bio generates one or more further bios, they are
      handled immediately after the parent, before any siblings of the
      parent.  That way, when generic_make_request() calls make_request_fn
      for some particular device, we can be certain that all previously
      submited requests for that device have been completely handled and are
      not waiting for anything in the queue of requests maintained in
      generic_make_request().
      
      An easy way to achieve this would be to use a last-in-first-out stack
      instead of a queue.  However this will change the order of consecutive
      bios submitted by a make_request_fn, which could have unexpected consequences.
      Instead we take a slightly more complex approach.
      A fresh queue is created for each call to a make_request_fn.  After it completes,
      any bios for a different device are placed on the front of the main queue, followed
      by any bios for the same device, followed by all bios that were already on
      the queue before the make_request_fn was called.
      This provides the depth-first approach without reordering bios on the same level.
      
      This, by itself, it not enough to remove all deadlocks.  It just makes
      it possible for drivers to take the extra step required themselves.
      
      To avoid deadlocks, drivers must never risk waiting for a request
      after submitting one to generic_make_request.  This includes never
      allocing from a mempool twice in the one call to a make_request_fn.
      
      A common pattern in drivers is to call bio_split() in a loop, handling
      the first part and then looping around to possibly split the next part.
      Instead, a driver that finds it needs to split a bio should queue
      (with generic_make_request) the second part, handle the first part,
      and then return.  The new code in generic_make_request will ensure the
      requests to underlying bios are processed first, then the second bio
      that was split off.  If it splits again, the same process happens.  In
      each case one bio will be completely handled before the next one is attempted.
      
      With this is place, it should be possible to disable the
      punt_bios_to_recover() recovery thread for many block devices, and
      eventually it may be possible to remove it completely.
      
      Ref: http://www.spinics.net/lists/raid/msg54680.htmlTested-by: default avatarJinpu Wang <jinpu.wang@profitbricks.com>
      Inspired-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Cc: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      75a778ed
    • Felix Fietkau's avatar
      MIPS: Lantiq: Fix cascaded IRQ setup · b576c583
      Felix Fietkau authored
      commit 6c356eda upstream.
      
      With the IRQ stack changes integrated, the XRX200 devices started
      emitting a constant stream of kernel messages like this:
      
      [  565.415310] Spurious IRQ: CAUSE=0x1100c300
      
      This is caused by IP0 getting handled by plat_irq_dispatch() rather than
      its vectored interrupt handler, which is fixed by commit de856416e714
      ("MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch").
      
      Fix plat_irq_dispatch() to handle non-vectored IPI interrupts correctly
      by setting up IP2-6 as proper chained IRQ handlers and calling do_IRQ
      for all MIPS CPU interrupts.
      Signed-off-by: default avatarFelix Fietkau <nbd@nbd.name>
      Acked-by: default avatarJohn Crispin <john@phrozen.org>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/15077/
      [james.hogan@imgtec.com: tweaked commit message]
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b576c583
    • Jon Mason's avatar
      ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags · 77149f08
      Jon Mason authored
      commit 0c2bf9f9 upstream.
      
      GIC_PPI flags were misconfigured for the timers, resulting in errors
      like:
      [    0.000000] GIC: PPI11 is secure or misconfigured
      
      Changing them to being edge triggered corrects the issue
      Suggested-by: default avatarRafał Miłecki <rafal@milecki.pl>
      Signed-off-by: default avatarJon Mason <jon.mason@broadcom.com>
      Fixes: d27509f1 ("ARM: BCM5301X: add dts files for BCM4708 SoC")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      77149f08
    • Daniel Vetter's avatar
      drm/armada: Fix compile fail · 1229cd2f
      Daniel Vetter authored
      commit 7357f899 upstream.
      
      I reported the include issue for tracepoints a while ago, but nothing
      seems to have happened. Now it bit us, since the drm_mm_print
      conversion was broken for armada. Fix it, so I can re-enable armada
      in the drm-misc build configs.
      
      v2: Rebase just the compile fix on top of Chris' build fix.
      
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Acked: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1483115932-19584-1-git-send-email-daniel.vetter@ffwll.chSigned-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1229cd2f
    • Naoya Horiguchi's avatar
      mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd() · 847f0ffc
      Naoya Horiguchi authored
      commit c9d398fa upstream.
      
      I found the race condition which triggers the following bug when
      move_pages() and soft offline are called on a single hugetlb page
      concurrently.
      
          Soft offlining page 0x119400 at 0x700000000000
          BUG: unable to handle kernel paging request at ffffea0011943820
          IP: follow_huge_pmd+0x143/0x190
          PGD 7ffd2067
          PUD 7ffd1067
          PMD 0
              [61163.582052] Oops: 0000 [#1] SMP
          Modules linked in: binfmt_misc ppdev virtio_balloon parport_pc pcspkr i2c_piix4 parport i2c_core acpi_cpufreq ip_tables xfs libcrc32c ata_generic pata_acpi virtio_blk 8139too crc32c_intel ata_piix serio_raw libata virtio_pci 8139cp virtio_ring virtio mii floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: cap_check]
          CPU: 0 PID: 22573 Comm: iterate_numa_mo Tainted: P           OE   4.11.0-rc2-mm1+ #2
          Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
          RIP: 0010:follow_huge_pmd+0x143/0x190
          RSP: 0018:ffffc90004bdbcd0 EFLAGS: 00010202
          RAX: 0000000465003e80 RBX: ffffea0004e34d30 RCX: 00003ffffffff000
          RDX: 0000000011943800 RSI: 0000000000080001 RDI: 0000000465003e80
          RBP: ffffc90004bdbd18 R08: 0000000000000000 R09: ffff880138d34000
          R10: ffffea0004650000 R11: 0000000000c363b0 R12: ffffea0011943800
          R13: ffff8801b8d34000 R14: ffffea0000000000 R15: 000077ff80000000
          FS:  00007fc977710740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffffea0011943820 CR3: 000000007a746000 CR4: 00000000001406f0
          Call Trace:
           follow_page_mask+0x270/0x550
           SYSC_move_pages+0x4ea/0x8f0
           SyS_move_pages+0xe/0x10
           do_syscall_64+0x67/0x180
           entry_SYSCALL64_slow_path+0x25/0x25
          RIP: 0033:0x7fc976e03949
          RSP: 002b:00007ffe72221d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000117
          RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc976e03949
          RDX: 0000000000c22390 RSI: 0000000000001400 RDI: 0000000000005827
          RBP: 00007ffe72221e00 R08: 0000000000c2c3a0 R09: 0000000000000004
          R10: 0000000000c363b0 R11: 0000000000000246 R12: 0000000000400650
          R13: 00007ffe72221ee0 R14: 0000000000000000 R15: 0000000000000000
          Code: 81 e4 ff ff 1f 00 48 21 c2 49 c1 ec 0c 48 c1 ea 0c 4c 01 e2 49 bc 00 00 00 00 00 ea ff ff 48 c1 e2 06 49 01 d4 f6 45 bc 04 74 90 <49> 8b 7c 24 20 40 f6 c7 01 75 2b 4c 89 e7 8b 47 1c 85 c0 7e 2a
          RIP: follow_huge_pmd+0x143/0x190 RSP: ffffc90004bdbcd0
          CR2: ffffea0011943820
          ---[ end trace e4f81353a2d23232 ]---
          Kernel panic - not syncing: Fatal exception
          Kernel Offset: disabled
      
      This bug is triggered when pmd_present() returns true for non-present
      hugetlb, so fixing the present check in follow_huge_pmd() prevents it.
      Using pmd_present() to determine present/non-present for hugetlb is not
      correct, because pmd_present() checks multiple bits (not only
      _PAGE_PRESENT) for historical reason and it can misjudge hugetlb state.
      
      Fixes: e66f17ff ("mm/hugetlb: take page table lock in follow_huge_pmd()")
      Link: http://lkml.kernel.org/r/1490149898-20231-1-git-send-email-n-horiguchi@ah.jp.nec.comSigned-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      847f0ffc